diff --git a/.claude/skills/cosmos-utils-vlm-migration/SKILL.md b/.claude/skills/cosmos-utils-vlm-migration/SKILL.md new file mode 100644 index 00000000..ecd9aeb4 --- /dev/null +++ b/.claude/skills/cosmos-utils-vlm-migration/SKILL.md @@ -0,0 +1,185 @@ +--- +name: cosmos-utils-vlm-migration +description: > + Redirect edits, patches, or PRs that target pre-2026-05-18 paths under + cosmos_training/cosmos/utils/. Covers (a) the vfm/vlm consolidation — utils/vfm/vlm/, + utils/vfm/fused_adam.py, utils/vlm/compute_flops_qwen3vl.py — and (b) the follow-up + cleanup pass that removed dead subsystems: utils/one_logger/, utils/optim_instantiate.py, + utils/configs/lr_scheduler.py, utils/training_telemetry/context_managers.py, + utils/vlm/flop_calculator.py, utils/env_parsers/customization_env_parser.py, + utils/env_parsers/inference_env_parser.py, and the load_config(enable_one_logger=...) + parameter. Use this skill whenever a diff, cherry-pick, rebase, code-review suggestion, + blame trail, or external snippet references the old paths/imports, OR when applying + any upstream change that touches utils/. Triggers on: "cherry-pick", "rebase", + "apply patch", "port this change", "merge upstream", "from cosmos.utils.vfm.vlm", + "from cosmos.utils.vfm.fused_adam", "from cosmos.utils.one_logger", + "compute_flops_qwen3vl", "FlopCalculator", "optim_instantiate", "enable_one_logger", + "OneLoggerCallback", "CustomizationEnvParser", "InferenceEnvParser", or any edit whose + target file path is under utils/vfm/vlm/, utils/one_logger/, utils/configs/, or any + other path listed in the redirect table below. Use proactively before applying any + change to these areas of the repo. +--- + +# Cosmos utils/ refactor + cleanup — pre-2026-05-18 → post-refactor mapping + +On 2026-05-18 several related changes landed in `cosmos_training/cosmos/utils/`: + +1. The duplicated `utils/vfm/vlm/` tree was **merged into `utils/vlm/`** (feature union). +2. 
`utils/vfm/fused_adam.py` was **promoted to top-level `utils/fused_adam.py`** (DTensor-aware version). +3. A follow-up cleanup pass **deleted ~2400 lines** of dead/orphan code across the utils + tree: the broken `utils/one_logger/` subsystem, `utils/optim_instantiate.py`, + `utils/configs/`, `utils/training_telemetry/context_managers.py`, + `utils/vlm/flop_calculator.py`, and two `utils/env_parsers/*` files. + +Any change targeting these old paths must be **rewritten** against the new layout before +it can be applied — the old files have been deleted from HEAD. + +## Hard rules + +1. **Never** restore deleted files. If a patch tries to add back any of: + `utils/vfm/vlm/*`, `utils/vfm/fused_adam.py`, `utils/vlm/compute_flops_qwen3vl.py`, + `utils/vlm/flop_calculator.py`, `utils/one_logger/*`, `utils/optim_instantiate.py`, + `utils/configs/lr_scheduler.py`, `utils/training_telemetry/context_managers.py`, + `utils/env_parsers/customization_env_parser.py`, + `utils/env_parsers/inference_env_parser.py` — it's stale; redirect or drop. +2. **Always** verify the new file already contains the equivalent feature before adding + it. The merged files are a *superset* of both forks' behavior; the change you're + porting may already be present. +3. If you can't find an equivalent symbol in the new location, stop and ask — silent + feature loss is worse than the consolidation churn. +4. **For deletions that originated as dead code** (one_logger, optim_instantiate, + flop_calculator, env_parsers, etc.), do not re-introduce them just because a patch + asks. Verify the requested behavior is genuinely needed first; the original code was + broken or orphan when deleted. 
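Hard rule 1 can be checked mechanically before a patch is applied. The following is a minimal sketch, not part of the repository: the names `DELETED` and `stale_targets` are hypothetical, the path list is copied from the rule above, and the patch is assumed to be a git-style unified diff whose target files appear in `+++ b/<path>` headers.

```python
import re

# Deleted locations copied from hard rule 1 (illustrative list, keep in sync by hand).
# Entries ending in "/" stand for whole directories.
DELETED = (
    "utils/vfm/vlm/",
    "utils/vfm/fused_adam.py",
    "utils/vlm/compute_flops_qwen3vl.py",
    "utils/vlm/flop_calculator.py",
    "utils/one_logger/",
    "utils/optim_instantiate.py",
    "utils/configs/lr_scheduler.py",
    "utils/training_telemetry/context_managers.py",
    "utils/env_parsers/customization_env_parser.py",
    "utils/env_parsers/inference_env_parser.py",
)

# Assumption: git-style unified diff, where "+++ b/<path>" names each written file.
_TARGET = re.compile(r"^\+\+\+ b/(\S+)", re.MULTILINE)


def stale_targets(patch_text: str) -> list[str]:
    """Return patch target paths that fall under a deleted location."""
    hits = []
    for path in _TARGET.findall(patch_text):
        for deleted in DELETED:
            if deleted.endswith("/"):
                # Directory entry: any file beneath it counts as stale.
                matched = f"/{deleted}" in f"/{path}"
            else:
                # File entry: match the exact file at any repository prefix.
                matched = path == deleted or path.endswith(f"/{deleted}")
            if matched:
                hits.append(path)
                break
    return hits
```

A non-empty result means the patch is stale and should be redirected or dropped rather than applied. The check only looks at file headers, so stale `import` lines inside otherwise valid files still need a manual pass.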
+ +## Path redirect table + +### Consolidated (merged into new location) + +| Old import / file path | New import / file path | Notes | +|---|---|---| +| `cosmos.utils.vfm.vlm.constant` | `cosmos.utils.vlm.constant` | identical names exported | +| `cosmos.utils.vfm.vlm.create_position_ids` | `cosmos.utils.vlm.create_position_ids` | `get_position_ids`, `get_rope_index_qwen3_vl` | +| `cosmos.utils.vfm.vlm.optimizer` | `cosmos.utils.vlm.optimizer` | `OptimizerConfig`, `build_optimizers`, `build_lr_schedulers` | +| `cosmos.utils.vfm.vlm.pretrained_models_downloader` | `cosmos.utils.vlm.pretrained_models_downloader` | `maybe_download_hf_model_from_s3`, `parallel_download_s3_prefix_to_dir`, `s3_dir_exists`, `has_model_weights`, `_load_s3_credentials`, `_download_from_hf_hub` | +| `cosmos.utils.vfm.fused_adam` | `cosmos.utils.fused_adam` | `FusedAdam` (DTensor-aware) | +| `cosmos.utils.vlm.compute_flops_qwen3vl` | `cosmos.tools.flops.qwen3_vl` | `compute_qwen3vl_flops_from_config` now accepts `is_causal` (defaults to True). Numeric output verified bit-identical when `is_causal=False` against the deleted local impl. The only caller (`utils/vlm/flop_calculator.py`) has since been deleted as well, so this redirect is mostly historical. | + +### Deleted entirely (do NOT re-add) + +| Old import / file path | Why removed | +|---|---| +| `cosmos.utils.vfm.vlm.flop_calculator` | Preserved during initial vfm/vlm→vlm merge; confirmed zero in-tree refs and removed. `FlopCalculator` class no longer exists anywhere. | +| `cosmos.utils.vlm.flop_calculator` | Same class, post-merge location. Also deleted. | +| `cosmos.utils.one_logger.*` (whole subdir, 5 files + README) | `one_logger_override_utils.py` imported `OneLoggerCallback` from `cosmos.utils.callback`, but that class never existed in the post-imaginaire4 tree (also not in the `one-logger` PyPI package). 
The only call site (`load_config(..., enable_one_logger=True)`) silently swallowed the `ImportError`, so OneLogger has effectively been off for the duration. The whole subdir + the `enable_one_logger` parameter on `load_config` were removed. | +| `cosmos.utils.optim_instantiate` (`get_regular_param_group`, `get_base_optimizer`, `get_base_scheduler`) | Superseded by per-pipeline optimizer builders in `utils/vfm/optimizer.py` and `utils/vlm/optimizer.py`. Zero importers when deleted. | +| `cosmos.utils.configs.lr_scheduler` (`LambdaLinearSchedulerConfig`) | LazyCall wrapper around `LambdaLinearScheduler`. Zero importers when deleted. The implementation it wrapped — `cosmos.utils.functional.lr_scheduler.LambdaLinearScheduler` — is still alive. Whole `utils/configs/` subdir gone (it only had this one file). | +| `cosmos.utils.training_telemetry.context_managers` | Orphan. The live entry point `utils.training_telemetry.telemetry` uses `utils.py` and lazy-imports `TelemetryCallback` from `callback.py`; neither path touches the context_managers file. Zero external **or** internal cross-refs at delete time. | +| `cosmos.utils.env_parsers.customization_env_parser` (`CustomizationEnvParser`) | Inference-side AWS Fleet/Lambda env vars (FT_AWS_*, LAMBDA_STAGE, FLEET_FUNCTION). Zero importers in the training tree. | +| `cosmos.utils.env_parsers.inference_env_parser` (`InferenceEnvParser`) | Inference deployment env vars (TRT_ENABLED, NIM_DEPLOYMENT, PORT, MODEL_MODULE, …). Zero importers in the training tree. | + +### Signature changes + +| Function | Change | +|---|---| +| `cosmos.utils.config.load_config(config_path, opts, enable_one_logger=...)` | The `enable_one_logger` keyword argument was removed when `utils/one_logger/` was deleted. New signature: `load_config(config_path: str, opts: list[str]) -> Config`. Drop the kwarg from any patch that adds it back. 
| + +## Feature mapping inside merged files (what came from where) + +If a backport touches a specific symbol/feature, this tells you whether the new file +already has it. + +### `utils/vlm/optimizer.py` — `OptimizerConfig` +Contains the union of both forks: +- Legacy named freeze flags: `freeze_vision_encoder`, `freeze_mm_projector`, `freeze_llm` (both forks) +- `freeze_llm_moe_gates: bool = False` — was vlm-only (declared but not yet referenced in code as of merge) +- `trainable_params: Optional[list[str]] = None` — was vfm/vlm-only; regex whitelist; enforced in `__attrs_post_init__` +- `frozen_params: Optional[list[str]] = None` — was vfm/vlm-only; regex blacklist; mutually exclusive with `trainable_params` +- `betas` now wrapped in `tuple(...)` inside `build_optimizers` — was vfm/vlm-only bugfix + +### `utils/vlm/pretrained_models_downloader.py` +Contains the union: +- `resolve_hf_model_store(credentials, bucket)` — was vlm-only; maps checkpoint-store creds to permanent HF model store +- `_load_s3_credentials(credential_path)` — was vfm/vlm-only; env-var-aware via `cosmos.utils.easy_io.backends.auto_auth` (replaces raw `json.load(open(...))`) +- `_download_from_hf_hub(model_name_or_path, include_model_weights)` — was vfm/vlm-only; HF Hub fallback when no S3 creds +- `_stream_download` (inside `parallel_download_s3_prefix_to_dir`) — was vlm-only; bypasses ETag validation for GCS-compatible buckets +- `maybe_download_hf_model_from_s3` body: + - Local-dir short-circuit (`if os.path.isdir(model_name_or_path)`) — was vfm/vlm-only + - No-credentials → `_download_from_hf_hub` branch — was vfm/vlm-only + - `not INTERNAL` → `CheckpointConfig.maybe_from_uri` + `download_checkpoint_v2` branch — was vfm/vlm-only + - Cache check accepts `vocab.json` OR `tokenizer.json` — was vlm-only (vfm/vlm checked only `vocab.json`) + +### `utils/vlm/flop_calculator.py` — DELETED +Initially merged from vfm/vlm/. 
Subsequently determined to have zero in-tree +references (the dynamic batcher this was built for never wired it up here) and +deleted on 2026-05-18. The bit-identical FLOP numeric verification still holds +for `cosmos.tools.flops.qwen3_vl.compute_qwen3vl_flops_from_config(..., is_causal=False)` +if you ever need to rebuild this calculator. + +### `utils/vlm/create_position_ids.py`, `utils/vlm/constant.py` +The vfm/vlm version was adopted wholesale. Logic-identical to the prior `utils/vlm/` +version — only docstrings, type annotations, and `Optional[T]` → `T | None` differ. + +### `utils/fused_adam.py` (was `utils/vfm/fused_adam.py`) +DTensor-aware via `cosmos.utils.misc.get_local_tensor_if_DTensor`. For non-DTensor params +(the only kind the old top-level `utils/fused_adam.py` handled), behavior is equivalent: +`get_local_tensor_if_DTensor(x)` is a no-op for regular tensors. TE import path is +`transformer_engine_torch as tex` (unchanged from top-level pre-refactor). + +## NOT consolidated — two `fused_adam.py` remain by design + +`utils/fused_adam.py` and `utils/vlm/fused_adam.py` both still exist. They are **not** +duplicates: + +- `utils/fused_adam.py`: imports `transformer_engine_torch as tex`, uses + `cosmos.utils.misc.get_local_tensor_if_DTensor`. +- `utils/vlm/fused_adam.py`: imports `transformer_engine as te`, uses + `te.pytorch.optimizers.multi_tensor_adam`, with an inlined `get_local_tensor_if_DTensor`. + +These differ in their TE module path. Unifying them requires verifying that +`transformer_engine_torch.multi_tensor_adam*` and +`te.pytorch.optimizers.multi_tensor_adam*` resolve to equivalent CUDA kernels at the +runtime TE version. 
**Do not unify without that runtime verification.** + +## Import sites that were redirected on 2026-05-18 + +These already point at the new paths in HEAD; if you see an external patch still using +the old paths, redirect: + +**vfm/vlm consolidation:** +- `cosmos/model/vfm/vlm_model.py` (3 import sites) +- `cosmos/model/vfm/algorithm/loss/cross_entropy.py` +- `cosmos/data/vfm/augmentors/vlm/tokenize_data.py` +- `cosmos/data/vfm/processors/base.py` +- `cosmos/data/vfm/processors/__init__.py` +- `cosmos/utils/vfm/optimizer.py` (the `FusedAdam` lazy import) + +**one_logger removal:** +- `cosmos/utils/config.py` — `load_config` lost the `enable_one_logger` parameter and the gated lazy-import block +- `scripts/train.py` — dropped `enable_one_logger=True` kwarg +- `cosmos/data/vfm/action/compute_action_stats.py` — dropped `enable_one_logger=False` kwarg + +## Workflow when applying any change to the utils tree + +1. **Read the patch target path.** If it matches an entry in the redirect table + (consolidated), rewrite the path before applying. If it matches an entry in the + "Deleted entirely" table, the patch is targeting code that no longer exists — drop + it or escalate. +2. **Check feature mapping above.** If the change adds/modifies a feature listed under + "Feature mapping inside merged files," confirm the merged file's current state — the + change may already be present (in which case it's a no-op), partially present (so you + need to merge carefully), or absent (port it). +3. **For anything calling `compute_qwen3vl_flops_from_config`:** if the change touches + the FLOP computation, re-run the equivalence check (see `[[utils-vfm-vlm-forks]]` + memory for context) before assuming the dynamic batcher calibration still holds. +4. **For `load_config` call sites:** if a patch passes `enable_one_logger=...`, drop + that kwarg — the parameter was removed. +5. 
**Never** create a new `utils/vfm/vlm/`, `utils/one_logger/`, or `utils/configs/` + directory, and never restore the deleted files listed above. If a patch can't be + cleanly applied to the new layout, stop and ask the user. + +## Related memory + +`[[utils-vfm-vlm-forks]]` in the project memory captures the consolidation history, +the reasoning behind the leftover `utils/vlm/fused_adam.py`, and the follow-up +deletions in this same session. diff --git a/.config/rumdl.toml b/.config/rumdl.toml new file mode 100644 index 00000000..24e72483 --- /dev/null +++ b/.config/rumdl.toml @@ -0,0 +1,43 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +# https://rumdl.dev/global-settings/ +[global] +flavor = "standard" +exclude = [ + "ATTRIBUTIONS.md", + "_src", +] +disable = [ + "MD013", # line-length + "MD033", # inline-html + "MD040", # fenced-code-language +] + +# https://rumdl.dev/rules/ + +[per-file-ignores] +"README.md" = [ + "MD041" # first-line-heading +] + +# ul-style +[MD004] +style = "dash" + +# table-format +[MD060] +enabled = true +style = "aligned" diff --git a/.coveragerc b/.coveragerc new file mode 100644 index 00000000..93b0d6e5 --- /dev/null +++ b/.coveragerc @@ -0,0 +1,32 @@ +# https://coverage.readthedocs.io/en/latest/subprocess.html + +[run] +data_file = outputs/coverage/coverage +disable_warnings = + module-not-imported + no-data-collected +parallel = True +patch = subprocess + +[report] +exclude_lines = + @overload + def __repr__ + if __name__ == .__main__.: + if TYPE_CHECKING: + pragma: no cover + raise AssertionError + raise NotImplementedError +omit = + *_test.py +skip_empty = True +show_missing = True + +[html] +directory = outputs/coverage/html + +[json] +output = outputs/coverage/coverage.json + +[xml] +output = outputs/coverage/coverage.xml diff --git a/.dockerignore b/.dockerignore new file mode 100644 index 00000000..0dfe444b --- /dev/null +++ b/.dockerignore @@ -0,0 +1,8 @@ +.venv +.git +/checkpoints +/datasets +/output +/examples/**/checkpoints +/examples/**/output +/examples/**/datasets diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 00000000..75fb88b0 --- /dev/null +++ b/.gitattributes @@ -0,0 +1,30 @@ +*.lock linguist-generated=true +tests/data/** linguist-generated=true +ATTRIBUTIONS.md linguist-generated=true + +assets/** filter=lfs diff=lfs merge=lfs -text + +# Video files +*.mp4 filter=lfs diff=lfs merge=lfs -text +*.avi filter=lfs diff=lfs merge=lfs -text +*.mov filter=lfs diff=lfs merge=lfs -text +*.mkv filter=lfs diff=lfs merge=lfs -text +*.webm filter=lfs diff=lfs merge=lfs -text + +# Audio files +*.wav filter=lfs diff=lfs merge=lfs -text 
+*.mp3 filter=lfs diff=lfs merge=lfs -text +*.flac filter=lfs diff=lfs merge=lfs -text +*.aac filter=lfs diff=lfs merge=lfs -text + +# Image files +*.jpg filter=lfs diff=lfs merge=lfs -text +*.jpeg filter=lfs diff=lfs merge=lfs -text +*.png filter=lfs diff=lfs merge=lfs -text +*.tiff filter=lfs diff=lfs merge=lfs -text +*.bmp filter=lfs diff=lfs merge=lfs -text + +# Logo thumbnail is small and was committed as a regular git blob before +# LFS rules were introduced. Keep it out of LFS to preserve the existing +# blob. +cosmos-logo-thumbnail.png -filter -diff -merge text=auto diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md new file mode 100644 index 00000000..2ee4d0ec --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.md @@ -0,0 +1,65 @@ +--- +name: Bug Report +about: Report a reproducible bug or unexpected behavior +title: "[BUG] " +labels: 'bug' +assignees: + - spectralflight + - jeanachoi + +--- + +## Bug Description + + + +## Reproduction Steps + +```bash +# Minimal command or script to reproduce +``` + +**Reproducibility:** + +- [ ] Always +- [ ] Intermittently (~___% of the time) +- [ ] Only once + +## Expected vs. Actual Behavior + +| | Description | +| ------------ | --------------------------- | +| **Expected** | What you expected to happen | +| **Actual** | What actually happened | + +## Outputs + +
<details> +<summary>Error / Stack Trace</summary> + + + +</details>
+ +<details>
+<summary>Log Files</summary> + + + +</details>
+ +## System Information + +| Field | Value | +| ---------------------------- | ------------------------------------------- | +| **Environment** | | +| **Hardware** | | +| **OS** | | +| **GPU Driver** | | +| **CUDA Version** | | +| **Python Version** | | +| **Package Version / Commit** | | + +## Additional Context + + diff --git a/.github/workflows/pre-commit.yml b/.github/workflows/pre-commit.yml new file mode 100644 index 00000000..1b11ee3d --- /dev/null +++ b/.github/workflows/pre-commit.yml @@ -0,0 +1,31 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +name: Pre-commit +on: + pull_request: + push: + branches: [main] +jobs: + pre-commit: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v6 + with: + lfs: true + - uses: actions/setup-python@v6 + - uses: astral-sh/setup-uv@v7 + - run: uvx pre-commit@4.5.1 run -a -c ci/.pre-commit-config-base.yaml + - run: uvx pre-commit@4.5.1 run -a diff --git a/.gitignore b/.gitignore new file mode 100644 index 00000000..dde4f984 --- /dev/null +++ b/.gitignore @@ -0,0 +1,224 @@ +# Python bytecode caches — never check in. +__pycache__/ +*.pyc +*.pyo +*.pyd + +# Editor / OS noise. +*.swp +.DS_Store + +# Local training output (generated, large). +training_output/ +outputs/ + +# Release-tool metadata (regenerated on every release run). 
+cosmos_training_meta/ +/assets +/credentials +/datasets +/outputs +/tmp +.cuda-name +*.env + +# ------------------------ BELOW IS AUTO-GENERATED FOR PYTHON REPOS ------------------------ + +# Byte-compiled / optimized / DLL files +__pycache__/ +*.py[cod] +*$py.class + +# C extensions +*.so + +# Distribution / packaging +.Python +build/ +develop-eggs/ +dist/ +downloads/ +eggs/ +.eggs/ +lib/ +lib64/ +parts/ +sdist/ +var/ +wheels/ +share/python-wheels/ +*.egg-info/ +.installed.cfg +*.egg +MANIFEST + +# PyInstaller +# Usually these files are written by a python script from a template +# before PyInstaller builds the exe, so as to inject date/other infos into it. +*.manifest +*.spec + +# Installer logs +pip-log.txt +pip-delete-this-directory.txt + +# Unit test / coverage reports +htmlcov/ +.tox/ +.nox/ +.coverage +.coverage.* +.cache +nosetests.xml +coverage.xml +*.cover +*.py,cover +.hypothesis/ +.pytest_cache/ +cover/ + +# Translations +*.mo +*.pot + +# Django stuff: +*.log +local_settings.py +db.sqlite3 +db.sqlite3-journal + +# Flask stuff: +instance/ +.webassets-cache + +# Scrapy stuff: +.scrapy + +# Sphinx documentation +docs/_build/ + +# PyBuilder +.pybuilder/ +target/ + +# Jupyter Notebook +.ipynb_checkpoints + +# IPython +profile_default/ +ipython_config.py + +# pyenv +# For a library or package, you might want to ignore these files since the code is +# intended to run in multiple environments; otherwise, check them in: +# .python-version + +# pipenv +# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. +# However, in case of collaboration, if having platform-specific dependencies or dependencies +# having no cross-platform support, pipenv may install dependencies that don't work, or not +# install all needed dependencies. +#Pipfile.lock + +# UV +# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control. 
+# This is especially recommended for binary packages to ensure reproducibility, and is more +# commonly ignored for libraries. +#uv.lock + +# poetry +# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. +# This is especially recommended for binary packages to ensure reproducibility, and is more +# commonly ignored for libraries. +# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control +#poetry.lock +#poetry.toml + +# pdm +# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. +#pdm.lock +# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it +# in version control. +# https://pdm.fming.dev/latest/usage/project/#working-with-version-control +.pdm.toml +.pdm-python +.pdm-build/ + +# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm +__pypackages__/ + +# Celery stuff +celerybeat-schedule +celerybeat.pid + +# SageMath parsed files +*.sage.py + +# Environments +.env +.venv +env/ +venv/ +ENV/ +env.bak/ +venv.bak/ + +# Spyder project settings +.spyderproject +.spyproject + +# Rope project settings +.ropeproject + +# mkdocs documentation +/site + +# mypy +.mypy_cache/ +.dmypy.json +dmypy.json + +# Pyre type checker +.pyre/ + +# pytype static type analyzer +.pytype/ + +# Cython debug symbols +cython_debug/ + +# PyCharm +# JetBrains specific template is maintained in a separate JetBrains.gitignore that can +# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore +# and can be added to the global gitignore or merged into this file. For a more nuclear +# option (not recommended) you can uncomment the following to ignore the entire idea folder. +#.idea/ + +# Abstra +# Abstra is an AI-powered process automation framework. +# Ignore directories containing user credentials, local state, and settings. 
+# Learn more at https://abstra.io/docs +.abstra/ + +# Visual Studio Code +# Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore +# that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore +# and can be added to the global gitignore or merged into this file. However, if you prefer, +# you could uncomment the following to ignore the entire vscode folder +# .vscode/ + +# Ruff stuff: +.ruff_cache/ + +# rumdl markdown linter local cache +.rumdl_cache/ + +# PyPI configuration file +.pypirc + +# Cursor +# Cursor is an AI-powered code editor. `.cursorignore` specifies files/directories to +# exclude from AI features like autocomplete and code analysis. Recommended for sensitive data +# refer to https://docs.cursor.com/context/ignore-files +.cursorignore +.cursorindexingignore diff --git a/.gitleaks.toml b/.gitleaks.toml new file mode 100644 index 00000000..cf8f9ea5 --- /dev/null +++ b/.gitleaks.toml @@ -0,0 +1,19 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +[[allowlists]] +regexes = [ + '''Qwen3MoeForCausalLM''' +] diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml new file mode 100644 index 00000000..4a82b685 --- /dev/null +++ b/.pre-commit-config.yaml @@ -0,0 +1,70 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +default_language_version: + node: 25.2.1 + python: python3.13 +exclude: (?x)( + ^tests/data/| + ^cosmos-inference/| + ^\.claude/ + ) +repos: + - repo: https://github.com/google/addlicense + rev: v1.2.0 + hooks: + - id: addlicense + args: ["-f", "ci/license.txt"] + exclude: \.(png|jpg|jpeg|gif|svg|ico|pdf|bin|safetensors|pt|pth|webp|mp4|mp3|wav|woff2?|ttf)$ + - repo: https://github.com/jsh9/markdown-toc-creator + rev: 0.1.3 + hooks: + - id: markdown-toc-creator + args: ["--config=ci/.markdown-toc-creator.toml"] + - repo: https://github.com/rvben/rumdl-pre-commit + rev: v0.1.62 + hooks: + - id: rumdl-fmt + - repo: https://github.com/pre-commit/pre-commit-hooks + rev: v6.0.0 + hooks: + - id: check-symlinks + - id: check-executables-have-shebangs + exclude: /_src/ + - id: check-shebang-scripts-are-executable + exclude: /_src/ + - repo: local + hooks: + - id: uv-lock + name: Generate uv lock files for projects + entry: ./ci/uv_lock.sh + language: script + files: pyproject\.toml$ + - id: uv-lock-script + name: Generate uv lock files for scripts + entry: 
./ci/uv_lock_script.sh + language: script + types: [python] + - repo: https://github.com/tcort/markdown-link-check + rev: v3.14.2 + hooks: + - alias: link-check + name: link check + id: markdown-link-check + args: [--config, "ci/.link-check.json", --quiet] + stages: [manual] + exclude: (?x)( + \bATTRIBUTIONS\b| + /_src/ + ) diff --git a/.pytest.toml b/.pytest.toml new file mode 100644 index 00000000..fd72fb32 --- /dev/null +++ b/.pytest.toml @@ -0,0 +1,47 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +[pytest] +python_files = [ + "*_test.py", +] +norecursedirs = [ + "_src", + "cosmos-inference", + "packages", + "projects", +] +addopts = [ + "--suppress-no-test-exit-code", +] +filterwarnings = [ + "ignore::DeprecationWarning", + "ignore::FutureWarning", +] +markers = [ + "manual: Test requires --manual.", + "level(l): Test level in [0, 1, 2].", + "gpus(n): Test requires GPUs.", +] + +[pytest_env] +COSMOS_VERBOSE = { value = "0", skip_if_set = true } +CUDA_VISIBLE_DEVICES = { unset = true } +PYTORCH_CUDA_ALLOC_CONF = "expandable_segments:True" # Reduce chance of OOM errors +# Limit threading to reduce contention +MKL_NUM_THREADS = "1" +NUMEXPR_NUM_THREADS = "1" +OMP_NUM_THREADS = "1" +OPENBLAS_NUM_THREADS = "1" diff --git a/.python-version b/.python-version new file mode 100644 index 00000000..24ee5b1b --- /dev/null +++ b/.python-version @@ -0,0 +1 @@ +3.13 diff --git a/.ruff.toml b/.ruff.toml new file mode 100644 index 00000000..30bebd5b --- /dev/null +++ b/.ruff.toml @@ -0,0 +1,36 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +line-length = 120 +target-version = "py310" + +[lint] +select = [ + "E", # pycodestyle errors + "F", # pyflakes + "I", # isort + "TID252", # relative-imports + "T10", # debugger +] +ignore = [ + "E402", # module-import-not-at-top-of-file + "E501", # line-too-long + "E721", # type-comparison + "E741", # ambiguous-variable-name + "F541", # f-string-missing-placeholders + "F811", # redefined-while-unused + "F841", # unused-variable +] +fixable = ["ALL"] diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 00000000..cb377772 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,124 @@ +# AGENTS.md — Cosmos Framework + +Read this file first — it is the canonical map for navigating the Cosmos repository and stays up to date. + +**Cosmos** is a framework for training and serving world foundation models. It is organized as two halves: + +- **Training infrastructure** — the `cosmos/` package and user-facing documentation in `docs/`. +- **Inference infrastructure** — the `cosmos-inference/` subtree (Diffusers / Transformers / vLLM integrations, online serving with Ray + Gradio). + +Both halves share the same dependency manifest at the repository root. + +> All paths below are relative to the repository root (the directory containing `pyproject.toml`, the `cosmos/` Python package, and the `cosmos-inference/` subtree). + +## Commands + +| Task | Command | +| ---------------------- | --------------------------------------------------- | +| Lint | `uv run ruff check .` | +| Format check | `uv run ruff format --check .` | +| Auto-fix lint + format | `uv run ruff check --fix . && uv run ruff format .` | +| Type-check | `uv run pyrefly check` | +| Test (all) | `uv run pytest` | +| Test (single file) | `uv run pytest --capture=no ` | + +Config files: `.ruff.toml` (ruff), `pyrefly.toml` (pyrefly), `.pytest.toml` (pytest), `conftest.py` (pytest fixtures). + +A `justfile` is provided at the root with longer recipes (`just install`, `just lint`, `just test`, `just docker-cu130`). 
+ +## Rules + +- Always answer questions with references to code or documentation in `file:line` format. +- When unsure, point the user to the closest doc rather than guessing. +- Keep this file short. Link out to skills and docs for detail — this file is included in every prompt. +- Do not duplicate inference behavior into `cosmos/`; it belongs in `cosmos-inference/`. Do not duplicate training behavior into `cosmos-inference/`; it belongs in `cosmos/`. + +## Key File Locations + +### Training (`cosmos/`) + +| What | Where | +| ---------------------------------------------------- | ------------------------------------- | +| Algorithms (losses, RL, reward) | `cosmos/algorithm/{loss,reward,rl}` | +| Training loop | `cosmos/trainer/` | +| Models + parallelism | `cosmos/model/` | +| Datasets / data loading | `cosmos/data/` | +| Checkpoint I/O | `cosmos/checkpoint/` | +| Callbacks (logging, eval) | `cosmos/callbacks/` | +| RL workers (rollout, reward, reference, simulations) | `cosmos/workers/` | +| Controller / orchestrator | `cosmos/controller/` | +| Launchers (Slurm, torchrun, k8s) | `cosmos/launcher/` | +| Evaluation harness | `cosmos/evaluation/` | +| CLI tools | `cosmos/tools/`, `tools/` (repo root) | + +For a per-subpackage tour with descriptions, see [`docs/code_structure.md`](./docs/code_structure.md). 
+ +### Inference (`cosmos-inference/`) + +| What | Where | +| -------------------------- | --------------------------------------------------------------------------------------------- | +| CLI entry point | `cosmos-inference/cosmos3/scripts/inference.py` | +| Args / param definitions | `cosmos-inference/cosmos3/args.py` | +| Per-modality defaults | `cosmos-inference/cosmos3/defaults//sample_args.json` | +| Model / inference core | `cosmos-inference/cosmos3/model.py`, `cosmos-inference/cosmos3/inference.py` | +| Ray serving configs | `cosmos-inference/cosmos3/ray/configs/latency.yaml`, `.../throughput.yaml` | +| Backend packages | `cosmos-inference/packages/{diffusers,transformers,vllm}-cosmos3/` | +| Example inputs | `cosmos-inference/inputs/omni/*.json` | + +## Documentation + +### Training (root `docs/`) + +| Doc | What it covers | +| -------------------------------------------------- | -------------------------------------------------------------------- | +| [docs/setup.md](./docs/setup.md) | Install, NGC base image, CUDA variants, base-checkpoint download. | +| [docs/code_structure.md](./docs/code_structure.md) | Repo layout and per-subpackage tour of `cosmos/`. | +| [docs/configs.md](./docs/configs.md) | LazyConfig / experiment system and overrides. | +| [docs/dataset.md](./docs/dataset.md) | JSONL / WebDataset / LeRobot formats and data prep. | +| [docs/training.md](./docs/training.md) | Single- and multi-node launches, parallelism, mixed precision. | +| [docs/checkpoints.md](./docs/checkpoints.md) | DCP vs. HuggingFace safetensors, conversion, resume. | +| [docs/inference.md](./docs/inference.md) | Bridge from a trained checkpoint to the inference backends. | +| [docs/examples.md](./docs/examples.md) | End-to-end training, fine-tuning, and inference walkthroughs. | +| [docs/faq.md](./docs/faq.md) | Troubleshooting (OOM, NCCL, slow training) + env vars. 
| + +### Inference (`cosmos-inference/docs/`) + +| Doc | What it covers | +| ---------------------------------------------------------------------------------------- | ------------------------------------------- | +| [cosmos-inference/docs/setup.md](./cosmos-inference/docs/setup.md) | Inference-side install/env (subtree-local). | +| [cosmos-inference/docs/inference.md](./cosmos-inference/docs/inference.md) | Sample arguments, default values, schemas. | +| [cosmos-inference/docs/inference_online.md](./cosmos-inference/docs/inference_online.md) | Online serving with Ray Serve and Gradio. | +| [cosmos-inference/docs/prompting.md](./cosmos-inference/docs/prompting.md) | Prompt engineering, upsampling with vLLM. | +| [cosmos-inference/docs/faq.md](./cosmos-inference/docs/faq.md) | Inference-side FAQ and troubleshooting. | + +Inference-side agent skills (codebase navigation, env troubleshooting, inference, post-training, setup) live in [`cosmos-inference/.agents/skills/`](./cosmos-inference/.agents/skills) and [`cosmos-inference/.claude/skills/`](./cosmos-inference/.claude/skills); they activate when working inside the `cosmos-inference/` subtree. 
+ +## Common Tasks + +### Training + +| Task | Command | +| ------------------------ | ---------------------------------------------------------------------------------------- | +| Single-GPU train (smoke) | `python -m cosmos.scripts.train --config ` | +| Multi-GPU train | `torchrun --nproc-per-node=8 -m cosmos.scripts.train --config ` | +| Resume from checkpoint | `python -m cosmos.scripts.train --config --resume ` | +| Export DCP → HF | `python -m cosmos.scripts.export_checkpoint --src --dst ` | +| Run a config sweep | `just run python -m cosmos.scripts.train --config --overrides "..."` | + +### Inference (in `cosmos-inference/`) + +| Task | Command | +| ----------------------- | ------------------------------------------------------------------------------------------------------------------------- | +| Single-GPU inference | `python -m cosmos3.scripts.inference -i cosmos-inference/inputs/omni/t2v.json -o outputs/ --checkpoint-path Cosmos3-Nano` | +| Multi-GPU inference | `torchrun --nproc-per-node=4 -m cosmos3.scripts.inference --parallelism-preset=latency -i ... -o outputs/ ...` | +| Start online Ray server | `python -m cosmos3.ray.serve --parallelism-preset=latency -o outputs/ray_serve --checkpoint-path Cosmos3-Nano` | +| Launch Gradio UI | `python -m cosmos3.ray.gradio --port=8080` | +| See all CLI flags | `python -m cosmos3.scripts.inference --help` | + +## Gotchas + +- **NGC / PyTorch containers**: run `export LD_LIBRARY_PATH=''` before any `python` call or you'll hit a `torch._C` import error. See [`docs/setup.md`](./docs/setup.md#pytorch-import-issue). +- **Reproducibility**: always pass `--seed `. Without it a random seed is used each run. +- **JSON paths**: relative paths inside input JSON files resolve relative to the JSON file's directory, not the working directory. +- **Resume**: re-running the same inference command skips already-generated outputs automatically. 
+- **Don't cross the streams**: training code in `cosmos/` must not import from `cosmos-inference/`. Inference code in `cosmos-inference/` must not import from `cosmos/`. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000..3c8fdc4e --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,121 @@ +# Contributing + + + +______________________________________________________________________ + +**Table of Contents** + +- [Setup](#setup) +- [Test](#test) + - [Run Linting and Formatting](#run-linting-and-formatting) + - [Run Tests](#run-tests) + - [Run a Single Test](#run-a-single-test) +- [Code Reviews](#code-reviews) +- [Signing Your Work](#signing-your-work) + +______________________________________________________________________ + + + +We'd love to receive your patches and contributions. Please keep your PRs as drafts until you would like us to review them. + +## Setup + +Install system dependencies: + +[just](https://just.systems/man/en/pre-built-binaries.html#pre-built-binaries) + +```shell +uv tool install -U rust-just +``` + +To see all available `just` commands, run: + +```shell +just +``` + +## Test + +### Run Linting and Formatting + +```shell +just lint +``` + +This also applies auto-fixes, so we recommend committing your changes first. + +### Run Tests + +```shell +just test +``` + +Test levels (`--levels`): + +0. Smoke tests. Requires >= 1 GPU. +1. Partial E2E tests. Requires >= 8 GPUs. +2. Full E2E tests. Requires >= 8 GPUs. + +Test outputs are saved to `outputs/pytest/`. To monitor a test, open `console.log`/`debug.log`. + +### Run a Single Test + +```shell +# List tests to get the test name +just test-list +# Run the test +just test-single [--pdb] +``` + +## Code Reviews + +All submissions, including submissions by project members, require review. We use GitHub pull requests for this purpose.
Consult +[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more information on using pull requests. + +## Signing Your Work + +- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. + + - Any contribution which contains commits that are not Signed-Off will not be accepted. + +- To sign off on a commit you simply use the `--signoff` (or `-s`) option when committing your changes: + + ```bash + git commit -s -m "Add cool feature." + ``` + + This will append the following to your commit message: + + ```text + Signed-off-by: Your Name + ``` + +- Full text of the DCO: + + ```text + Developer Certificate of Origin + Version 1.1 + + Copyright (C) 2004, 2006 The Linux Foundation and its contributors. + 1 Letterman Drive + Suite D4700 + San Francisco, CA, 94129 + + Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. + ``` + + ```text + Developer's Certificate of Origin 1.1 + + By making a contribution to this project, I certify that: + + (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or + + (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or + + (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it. 
+ + (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved. + ``` diff --git a/LICENSE b/LICENSE new file mode 100644 index 00000000..1bffec96 --- /dev/null +++ b/LICENSE @@ -0,0 +1,222 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +================================================================================ + THIRD-PARTY ATTRIBUTIONS +================================================================================ + +This product includes code adapted from HuggingFace Transformers +(https://github.com/huggingface/transformers), licensed under the Apache +License, Version 2.0. + + Copyright 2024 The Qwen team, Alibaba Group and the HuggingFace Inc. team. + Copyright 2025 The Qwen team, Alibaba Group and the HuggingFace Inc. team. + Copyright 2025 The Qwen Team and The HuggingFace Inc. team. + All rights reserved. + +The following files are adapted from HuggingFace Transformers: + + cosmos3/_src/vfm/models/llm/qwen3/configuration_qwen3.py + cosmos3/_src/vfm/models/llm/qwen3/qwen3.py + cosmos3/_src/vfm/models/vlm/qwen3_vl/configuration_qwen3_vl.py + cosmos3/_src/vfm/models/vlm/qwen3_vl/qwen3_vl.py + cosmos3/_src/vfm/models/vlm/qwen3_vl/video_processing_qwen3_vl.py diff --git a/README.md b/README.md index 3c833802..92d2fc59 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,87 @@

-NVIDIA Cosmos Logo + NVIDIA Cosmos

-

- New GitHub page for NVIDIA Cosmos:
- https://github.com/nvidia-cosmos -

-**This repository has been deprecated and is no longer maintained.** To view the initial release of NVIDIA Cosmos from this repository, please check out branch `archived-ces2025`. +

🤗 Hugging Face | Paper Draft

+ +# Cosmos + +**Cosmos** is an end-to-end framework for training and serving world foundation models. It is organized as two tightly-integrated halves: + +- **Training infrastructure** — the `cosmos/` package and the documentation in [`docs/`](./docs), covering setup, dataset preparation, distributed training, checkpointing, evaluation, and configuration. +- **Inference infrastructure** — the [`cosmos-inference/`](./cosmos-inference) subtree, containing the Diffusers / Transformers / vLLM integrations and end-to-end inference pipelines. + +Both halves share the same dependency manifest at the repository root, so a single `uv sync` installs everything needed to train a model and serve its checkpoints. + +- [Quickstart](#setup) +- [Training](#training) +- [Inference](#inference) + +## Overview + +The training side provides: + +- A distributed trainer with FSDP / tensor / context / pipeline parallelism (see [`cosmos/trainer/`](./cosmos_training/cosmos/trainer) and [`cosmos/model/`](./cosmos_training/cosmos/model)). +- A worker-based RL / post-training topology with `controller`, `rollout`, `reward`, `reference`, and `simulations` workers (see [`cosmos/workers/`](./cosmos_training/cosmos/workers)). +- A pluggable algorithm layer for losses, reward models, and RL update rules (see [`cosmos/algorithm/`](./cosmos_training/cosmos/algorithm)). +- Native DCP checkpointing with HuggingFace `safetensors` import/export. +- Dataset abstractions for JSONL, WebDataset, and LeRobot formats. + +The inference side under [`cosmos-inference/`](./cosmos-inference) exposes ready-to-use pipelines for offline batch generation and online serving (Ray, Gradio), with backends for HuggingFace Transformers, vLLM, and Diffusers. + +## Setup + +For full instructions and alternative installation methods, see [Setup](./docs/setup.md). + +Before installing, make sure your machine meets the [System Requirements](./docs/setup.md#system-requirements). 
If you want a curated PyTorch + CUDA environment, start from the [recommended base image](./docs/setup.md#recommended-base-image). + +Install system dependencies: + +```shell +sudo apt-get install -y --no-install-recommends curl ffmpeg libx11-dev tree wget +``` + +Install the package with `uv`: + +```shell +uv sync --all-extras --group=cu130-train +source .venv/bin/activate && export LD_LIBRARY_PATH= +``` + +If you are starting from the [recommended NVIDIA NGC base image](./docs/setup.md#recommended-base-image) (`nvcr.io/nvidia/pytorch:25.09-py3`), see the [one-shot quickstart](./docs/setup.md#quickstart-from-the-recommended-base-image). + +## Training + +The training infrastructure lives in [`cosmos/`](./cosmos_training/cosmos), with user-facing documentation in [`docs/`](./docs): + +| Topic | What it covers | +| ------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------- | +| [Setup](./docs/setup.md) | Hardware/software prerequisites, `uv` install paths, CUDA variants, Docker base image, and base-checkpoint downloading. | +| [Code Structure](./docs/code_structure.md) | Repository layout and a per-subpackage tour of `cosmos/` — where each concern lives and where to add new code. | +| [Configs](./docs/configs.md) | The LazyConfig / experiment system, CLI and YAML overrides, and how to register a new experiment. | +| [Dataset](./docs/dataset.md) | Supported data formats (JSONL, WebDataset, LeRobot), preparation steps, augmentations, and multi-dataset weighting. | +| [Training](./docs/training.md) | Launching single-GPU, multi-GPU, and multi-node runs; parallelism strategies; mixed precision; resuming. | +| [Checkpoints](./docs/checkpoints.md) | DCP vs. HuggingFace `safetensors`, conversion utilities, and resuming a training run from a saved checkpoint. 
| +| [Inference (from a trained checkpoint)](./docs/inference.md) | Loading a trained checkpoint into one of the inference backends — points back into `cosmos-inference/`. | +| [Examples](./docs/examples.md) | End-to-end training, fine-tuning, and inference walkthroughs; runnable scripts in [`examples/`](./examples). | +| [FAQ](./docs/faq.md) | Troubleshooting (OOM, NCCL hangs, slow training), environment variables, and common pitfalls. | + +A minimal single-GPU training launch looks like: + +```shell +python -m cosmos.scripts.train --config +``` + +See [Training](./docs/training.md) for multi-GPU / multi-node launches and the full set of CLI arguments. + +### Reference + +- [Code Structure](./docs/code_structure.md) — repository layout and a tour of each `cosmos/` subpackage. +- [FAQ](./docs/faq.md) — troubleshooting OOM, NCCL hangs, slow training, and environment variables. +- [AGENTS.md](./AGENTS.md) — contributor-facing guidance for AI agents working in this repo. + +## Inference + +End-to-end inference — offline batch generation, online serving with Ray and Gradio, and integration with HuggingFace Transformers, vLLM, and Diffusers — is documented in [`cosmos-inference/README.md`](./cosmos-inference/README.md). + +Once a checkpoint has been trained in this repo, export it (see [Checkpoints](./docs/checkpoints.md)) and follow the inference README for serving. diff --git a/RELEASE.md b/RELEASE.md index 99fc9ffb..b31bd8f4 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -1,7 +1,3 @@ # Release Cadence - -| Version | Description | Date | -|------------|----------|----------| -| [v1.0](release_notes/v0p1.md) | Initial diffusion and autoregressive WFMs release | 2025-01-06 | -| [v0.1](release_notes/v0p1.md) | Initial tokenizer release | 2024-11-06 | +Release notes will be published here as the framework reaches tagged releases. 
diff --git a/ci/.link-check.json b/ci/.link-check.json new file mode 100644 index 00000000..3546a997 --- /dev/null +++ b/ci/.link-check.json @@ -0,0 +1,10 @@ +{ + "ignorePatterns": [ + { + "pattern": "localhost" + }, + { + "pattern": "^https://github-production-user-asset" + } + ] +} diff --git a/ci/.markdown-toc-creator.toml b/ci/.markdown-toc-creator.toml new file mode 100644 index 00000000..650901b0 --- /dev/null +++ b/ci/.markdown-toc-creator.toml @@ -0,0 +1,19 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +[tool.markdown_toc_creator] +proactive = false +exclude = '/_src/' +quiet = true diff --git a/ci/.pre-commit-config-base.yaml b/ci/.pre-commit-config-base.yaml new file mode 100644 index 00000000..4a7374ba --- /dev/null +++ b/ci/.pre-commit-config-base.yaml @@ -0,0 +1,29 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +exclude: (?x)( + ^cosmos-inference/ + ) +repos: + - repo: https://github.com/pre-commit/pre-commit-hooks + rev: v6.0.0 + hooks: + - id: check-added-large-files + args: ['--maxkb=10000'] # 10MB + - id: forbid-submodules + - repo: https://github.com/gitleaks/gitleaks + rev: v8.30.0 + hooks: + - id: gitleaks diff --git a/ci/license.txt b/ci/license.txt new file mode 100644 index 00000000..8f179ca4 --- /dev/null +++ b/ci/license.txt @@ -0,0 +1,14 @@ +SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: Apache-2.0 + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. diff --git a/ci/uv_lock.sh b/ci/uv_lock.sh new file mode 100755 index 00000000..3a6189c9 --- /dev/null +++ b/ci/uv_lock.sh @@ -0,0 +1,22 @@ +#!/usr/bin/env bash +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# + +# Generate uv lock files for projects. + +set -euo pipefail + +for file in "$@"; do + project_dir="$(dirname "$file")" + if ! uv lock -q --check --project "$project_dir" &>/dev/null; then + echo "Updating lock file for '$project_dir'" >&2 + uv lock -q --project "$project_dir" + fi +done diff --git a/ci/uv_lock_script.sh b/ci/uv_lock_script.sh new file mode 100755 index 00000000..83975315 --- /dev/null +++ b/ci/uv_lock_script.sh @@ -0,0 +1,23 @@ +#!/usr/bin/env bash +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# + +# Generate uv lock files for scripts. + +set -euo pipefail + +for file in "$@"; do + if head -n1 "$file" | grep -q '^#!/usr/bin/env -S uv run --script'; then + if ! uv lock -q --check --script "$file" &>/dev/null; then + echo "Updating lock file for '$file'" >&2 + uv lock -q --script "$file" + fi + fi +done diff --git a/conftest.py b/conftest.py new file mode 100644 index 00000000..cbf97c4d --- /dev/null +++ b/conftest.py @@ -0,0 +1,228 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +"""Root pytest configuration. + +This is a slimmed-down adaptation of ``cosmos-inference/conftest.py``. +The richer Args / fixture infrastructure (``cosmos3.fixtures.args``, +``cosmos3._src.imaginaire.lazy_config``, ``cosmos3.common.init``) is +not yet ported to the root ``cosmos/`` package. As ``cosmos/`` grows +the equivalent helpers, re-port the additional fixtures (init logging, +seed, lazy_call._CONVERT_TARGET_TO_STRING, etc.) from cosmos-inference. +""" + +from __future__ import annotations + +import gc +import os +from functools import cache +from pathlib import Path + +import pytest + +ALL_NUM_GPUS = (0, 1, 2, 4, 8) +ALL_LEVELS = (0, 1, 2) +# Tests at level ``l`` are allowed to request ``ALLOWED_GPUS_BY_LEVEL[l]`` GPUs. +ALLOWED_GPUS_BY_LEVEL: dict[int, tuple[int, ...]] = { + 0: (0, 1), + 1: (0, 1, 2, 4), + 2: ALL_NUM_GPUS, +} + + +@pytest.fixture(scope="module") +def original_datadir(request: pytest.FixtureRequest) -> Path: + root_dir = request.config.rootpath + relative_path = request.path.with_suffix("").relative_to(root_dir) + return root_dir / "tests/data" / relative_path + + +@cache +def _get_available_gpus() -> int: + try: + import pynvml + except ImportError: + return 0 + try: + pynvml.nvmlInit() + device_count = pynvml.nvmlDeviceGetCount() + pynvml.nvmlShutdown() + return device_count + except pynvml.NVMLError as e: + print(f"WARNING: Failed to get available GPUs: {e}") + return 0 + + +def pytest_addoption(parser: pytest.Parser): + parser.addoption("--manual", action="store_true", default=False, help="Run manual tests") + parser.addoption( + "--num-gpus", + default=None, + type=int, + choices=ALL_NUM_GPUS, + help="Run tests with the specified number of GPUs", + ) + parser.addoption("--levels", default=None, help="Run tests with the specified levels (comma-separated list)") + + +def pytest_xdist_auto_num_workers(config: pytest.Config) -> int | 
None: + num_gpus: int | None = config.option.num_gpus + if num_gpus is None: + return 1 + if num_gpus == 0: + return None + + available_gpus = _get_available_gpus() + if available_gpus < num_gpus: + raise ValueError(f"Not enough GPUs available. Required: {num_gpus}, Available: {available_gpus}") + return available_gpus // num_gpus + + +def _parse_levels(value: str | None) -> tuple[int, ...] | None: + if value is None: + return None + levels = tuple(int(x) for x in value.split(",")) + for level in levels: + if level not in ALL_LEVELS: + raise ValueError(f"Invalid level {level} not in {ALL_LEVELS}") + return levels + + +def _get_marker(item: pytest.Item, name: str) -> pytest.Mark | None: + markers = list(item.iter_markers(name=name)) + if not markers: + return None + marker = markers[0] + for other_marker in markers[1:]: + if other_marker != marker: + raise ValueError(f"Multiple different markers found for {name}: {markers}") + return marker + + +def _parse_level_marker(mark: pytest.Mark) -> int: + if len(mark.args) != 1: + raise ValueError(f"Invalid arguments: {mark.args}") + if mark.kwargs: + raise ValueError(f"Invalid keyword arguments: {mark.kwargs}") + level = mark.args[0] + if level not in ALL_LEVELS: + raise ValueError(f"Invalid level {level} not in {ALL_LEVELS}") + return level + + +def _parse_gpus_marker(mark: pytest.Mark) -> int: + if len(mark.args) != 1: + raise ValueError(f"Invalid arguments: {mark.args}") + if mark.kwargs: + raise ValueError(f"Invalid keyword arguments: {mark.kwargs}") + required_gpus = int(mark.args[0]) + if required_gpus not in ALL_NUM_GPUS: + raise ValueError(f"Invalid number of GPUs {required_gpus} not in {ALL_NUM_GPUS}") + return required_gpus + + +def pytest_collection_modifyitems(config: pytest.Config, items: list[pytest.Item]): + enable_manual: bool = config.getoption("--manual") + num_gpus: int | None = config.option.num_gpus + levels = _parse_levels(config.getoption("--levels")) + + for item in items: + manual_mark = 
_get_marker(item, "manual") + level_mark = _get_marker(item, "level") + gpus_mark = _get_marker(item, "gpus") + try: + level = _parse_level_marker(level_mark) if level_mark else 0 + gpus = _parse_gpus_marker(gpus_mark) if gpus_mark else 0 + except ValueError as e: + pytest.fail(f"Invalid marker on test {item.name}: {e}") + assert False, "unreachable" + + allowed_gpus = ALLOWED_GPUS_BY_LEVEL[level] + if gpus not in allowed_gpus: + pytest.fail(f"Level {level} tests must have {allowed_gpus} GPUs, but {item.name} has {gpus} GPUs") + + if not enable_manual and manual_mark is not None: + item.add_marker(pytest.mark.skip(reason="test requires --manual")) + if levels is not None and level not in levels: + item.add_marker(pytest.mark.skip(reason=f"test requires --levels={level}")) + if num_gpus is not None and gpus != num_gpus: + item.add_marker(pytest.mark.skip(reason=f"test requires --num-gpus={gpus}")) + available_gpus = _get_available_gpus() + if gpus > available_gpus: + item.add_marker( + pytest.mark.skip(reason=f"test requires {gpus} GPUs, but only {available_gpus} are available") + ) + + selected_items = [] + deselected_items = [] + for item in items: + if item.get_closest_marker("skip"): + deselected_items.append(item) + continue + selected_items.append(item) + items[:] = selected_items + config.hook.pytest_deselected(items=deselected_items) + + +@pytest.fixture(autouse=True) +def init_torch_test(): + try: + import torch + except ImportError: + yield + return + yield + gc.collect() + if torch.cuda.is_available(): + torch.cuda.empty_cache() + + +_WHITELIST_ENV_VARS = { + "LD_LIBRARY_PATH", + "QT_QPA_FONTDIR", + "QT_QPA_PLATFORM_PLUGIN_PATH", + "TORCHINDUCTOR_CACHE_DIR", +} + + +@pytest.fixture(autouse=True) +def detect_env_modifications(): + original_env = dict(os.environ) + + yield + + new_env = dict(os.environ) + + for env in [original_env, new_env]: + for k in list(env.keys()): + if k.startswith("PYTEST_") or k in _WHITELIST_ENV_VARS: + del env[k] + if new_env != 
original_env: + added, removed, modified = _compare_dict(new_env, original_env) + os.environ.clear() + os.environ.update(original_env) + raise ValueError( + f"Environment variables modified by test! Use 'monkeypatch.setenv' to temporarily modify environment variables. \n" + f"Added: {added}\n" + f"Removed: {removed}\n" + f"Modified: {modified}" + ) + + +def _compare_dict(actual: dict[str, str], expected: dict[str, str]) -> tuple[set[str], set[str], set[str]]: + added = set(actual) - set(expected) + removed = set(expected) - set(actual) + modified = {k for k in expected if k in actual and expected[k] != actual[k]} + return added, removed, modified diff --git a/cosmos-inference/.agents/skills/cosmos3-codebase-nav/SKILL.md b/cosmos-inference/.agents/skills/cosmos3-codebase-nav/SKILL.md new file mode 100644 index 00000000..810695e4 --- /dev/null +++ b/cosmos-inference/.agents/skills/cosmos3-codebase-nav/SKILL.md @@ -0,0 +1,99 @@ +--- +name: cosmos3-codebase-nav +description: > + Navigate the Cosmos3 package codebase to find where parameters, configs, defaults, + scripts, and documentation live. Use when the user asks "where is X in cosmos3", + "how do I find the config for Y", "where are the defaults", "where do I change a + parameter", or any question about locating files, modules, or settings. Also use + when the user opens or edits files and needs orientation. +--- + +# Cosmos3 Codebase Navigation + +## When to use this skill + +- Use this skill when an agent is navigating the Cosmos3 package +- Use this skill to answer "where is X", "how do I find the config for Y", or any file-location question +- Use this skill when the user opens or edits cosmos3 files and needs orientation + +## Path convention + +All paths below are relative to this file's location (`.agents/skills/cosmos3-codebase-nav/`). 
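The path convention above can be checked mechanically. The following is a minimal sketch, not part of the package: `resolve_skill_path` is a hypothetical helper name, and the `/repo/cosmos-inference` prefix is an assumed checkout location used only for illustration. It normalizes a skill-relative path such as `../../../cosmos3/args.py` without touching the filesystem.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_skill_path(skill_dir: str, relative: str) -> PurePosixPath:
    """Resolve a path written relative to a skill directory.

    Hypothetical helper for illustration only. It collapses the ``..``
    segments lexically, so the files do not need to exist.
    """
    joined = posixpath.join(skill_dir, relative)
    return PurePosixPath(posixpath.normpath(joined))


# Assumed checkout location; only the trailing skill directory comes from the doc.
SKILL_DIR = "/repo/cosmos-inference/.agents/skills/cosmos3-codebase-nav"

args_path = resolve_skill_path(SKILL_DIR, "../../../cosmos3/args.py")
print(args_path)
```

Under the assumed checkout location, three `..` segments climb from the skill directory to the package root, so every `../../../` entry in the tables resolves against that root.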
+ +## Quick Reference + +### Where parameters and defaults live + +| What you're looking for | File | +| ------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | +| Sampling params (num_steps, guidance, shift, fps, etc.) | `../../../cosmos3/args.py` → `SamplingArgs`, `SamplingOverrides` | +| Per-modality default values | `../../../cosmos3/defaults//sample_args.json` | +| Setup params (parallelism, checkpoints, model path) | `../../../cosmos3/args.py` → `OmniSetupArgs`, `OmniSetupOverrides` | +| Common args base classes | `../../../cosmos3/common/args.py` → `ArgsBase`, `OverridesBase` | +| Ray serving parallelism presets | `../../../cosmos3/ray/configs/latency.yaml`, `../../../cosmos3/ray/configs/throughput.yaml` | +| Feature flags | `../../../cosmos3/flags.py` | +| Prompt upsampler system prompt | `../../../cosmos3/defaults/prompt_upsampler.txt` | +| Example inputs | `../../../inputs/omni/t2i.json`, `../../../inputs/omni/t2v.json`, `../../../inputs/omni/i2v.json` | + +Available modality modes for defaults: `text2image`, `text2video`, `image2video`. + +### Config defaults resolution chain + +When a user runs inference, default parameter values are resolved in this order: + +``` +cosmos3/defaults//sample_args.json # 1. Per-modality JSON defaults (num_steps, guidance, shift, fps, etc.) + ↓ +_load_modality_defaults() in cosmos3/args.py # 2. Loaded and cached at import time + ↓ +SamplingArgs / SamplingOverrides # 3. Pydantic models with field-level validation + ↓ +OmniSampleOverrides.build_sample() # 4. Merges user overrides → final resolved args + ↓ +_RESOLUTION_SHIFT_DEFAULTS[model_size, resolution] # 5. Model+resolution shift override (if user didn't set shift) + ↓ +CLI flags (--guidance, --shift, etc.) # 6. 
User overrides from command line +``` + +The `_RESOLUTION_SHIFT_DEFAULTS` table in `../../../cosmos3/args.py` overrides the default `shift` based on model size and resolution, unless the user explicitly specified `--shift`. + +| Mode | Default file | Key defaults | +| ------------- | -------------------------------------------------------- | ---------------------------------------------- | +| `text2image` | `../../../cosmos3/defaults/text2image/sample_args.json` | `num_frames=1`, `guidance=6.0`, `shift=10.0` | +| `text2video` | `../../../cosmos3/defaults/text2video/sample_args.json` | `num_frames=189`, `guidance=6.0`, `shift=10.0` | +| `image2video` | `../../../cosmos3/defaults/image2video/sample_args.json` | `num_frames=189`, `guidance=6.0`, `shift=10.0` | + +Users can also supply a custom defaults file per-request via the `defaults_file` field in sample arguments (see `../../../docs/inference.md`). + +### Where to make changes + +| Task | Edit | +| ------------------------------- | ------------------------------------------------------------------------------------------------------- | +| Change a built-in default value | `../../../cosmos3/defaults//sample_args.json` | +| Add a new CLI parameter | `SamplingArgs` + `SamplingOverrides` in `../../../cosmos3/args.py`, then add to each `sample_args.json` | +| Change parallelism presets | `../../../cosmos3/ray/configs/latency.yaml` or `throughput.yaml` | +| Add a new script | `../../../cosmos3/scripts/` — follow `inference.py` as the pattern | + +### Key entry points + +| Entry point | How to run | +| -------------------- | ------------------------------------------------------ | +| Batch inference | `python -m cosmos3.scripts.inference` | +| Online serving (Ray) | `python -m cosmos3.ray.serve` | +| Submit to Ray server | `python -m cosmos3.ray.submit` | +| Gradio UI | `python -m cosmos3.ray.gradio` | +| Prompt upsampling | `python -m cosmos3.scripts.upsample_prompts` | +| Model export | `python -m 
cosmos3.scripts.export_model` | +| Diffusers conversion | `python -m cosmos3.scripts.convert_model_to_diffusers` | + +### Documentation + +| Doc | Covers | +| ----------------------------------- | ----------------------------------------------------- | +| `../../../AGENTS.md` | Commands, rules, key file locations (read this first) | +| `../../../README.md` | Overview, quickstart, examples | +| `../../../docs/setup.md` | Installation, environment, checkpoints | +| `../../../docs/inference.md` | Sample args, default values, custom defaults | +| `../../../docs/inference_online.md` | Ray Serve and Gradio | +| `../../../docs/prompting.md` | Prompt engineering, upsampling | +| `../../../docs/faq.md` | FAQ, tips, and troubleshooting | diff --git a/cosmos-inference/.agents/skills/cosmos3-env-troubleshoot/SKILL.md b/cosmos-inference/.agents/skills/cosmos3-env-troubleshoot/SKILL.md new file mode 100644 index 00000000..9df15188 --- /dev/null +++ b/cosmos-inference/.agents/skills/cosmos3-env-troubleshoot/SKILL.md @@ -0,0 +1,106 @@ +--- +name: cosmos3-env-troubleshoot +description: > + Diagnose and fix Cosmos3 environment, installation, and runtime errors. + Use when the user encounters an ImportError, ModuleNotFoundError, CUDA error, + Docker error, checkpoint download failure, or any traceback during setup or inference. +--- + +# Cosmos3 Environment Troubleshooting + +## When to use this skill + +- Use when a user hits an error during installation, environment setup, or first run +- Use when a traceback mentions torch, CUDA, missing modules, or shared libraries +- Use when Docker or container setup fails +- Use when checkpoint downloads fail or HuggingFace auth errors appear + +## Path convention + +All paths below are relative to this file's location (`.agents/skills/cosmos3-env-troubleshoot/`). + +## Step 1: Match against known errors + +Check the error message against the table below. Each row links to the canonical fix in the docs. 
+ +| Error signature | Cause | Fix location | +| ----------------------------------------------------------------------------- | ------------------------------------ | ------------------------------------------------------------------------------------------------------ | +| `ImportError: cannot import name '_functionalization' from 'torch._C'` | NGC container library conflict | `../../../docs/setup.md` § PyTorch Import Issue — run `export LD_LIBRARY_PATH=''` | +| `ModuleNotFoundError: No module named 'cosmos3'` | Package not installed | `../../../docs/setup.md` § Dependency Issue — run `uv sync --all-extras --group=cu130 --reinstall` | +| `ModuleNotFoundError: No module named ` | Dependency missing | `../../../docs/setup.md` § Dependency Issue — reinstall venv | +| `fatal error: Python.h: No such file or directory` | Broken Python / uv install | `../../../docs/setup.md` § Python Issue — reinstall uv + venv from scratch | +| `OSError: : cannot open shared object file` | CUDA version mismatch | `../../../docs/setup.md` § CUDA Issue — install matching `cuda-toolkit-` | +| `docker: Error response from daemon: unknown or invalid runtime name: nvidia` | Docker nvidia runtime not configured | `../../../docs/setup.md` § Docker Container — run `sudo nvidia-ctk runtime configure --runtime=docker` | +| HuggingFace 401 / download failures | Auth or license not accepted | `../../../docs/setup.md` § Downloading Checkpoints — check `HF_TOKEN`, accept license agreement | + +## Step 2: If no documented fix matches, try common remediation + +Run these diagnostic commands to collect information, then attempt fixes in order: + +### Diagnostic commands + +```shell +# System +uname -a +cat /etc/os-release | head -5 + +# Python +python --version +which python + +# CUDA +nvidia-smi +python -c "import torch; print(f'torch={torch.__version__}, cuda={torch.version.cuda}')" + +# Package +uv pip list | head -20 +``` + +### Remediation ladder (try in order) + +1. 
**Clear library path**: `export LD_LIBRARY_PATH=''` +2. **Reinstall venv**: `uv sync --all-extras --group=cu130 --reinstall` +3. **Reinstall uv + venv from scratch**: + + ```shell + curl -LsSf https://astral.sh/uv/install.sh | sh + uv python install --reinstall + rm -rf .venv + uv sync --all-extras --group=cu130 --reinstall + source .venv/bin/activate + ``` + +4. **Check CUDA version alignment**: the major CUDA version from `nvidia-smi` must match `torch.version.cuda` +5. **Try Docker**: if the host environment is too broken, fall back to the Docker container (see `../../../docs/setup.md`) + +## Step 3: If still unresolved, generate a bug report + +If none of the above resolves the issue, collect environment information and present the user with a pre-filled bug report they can submit as a GitHub issue. + +Fill in the template below by running the diagnostic commands and inserting the results: + +````markdown +## Environment + +- **OS**: +- **Python**: +- **CUDA (system)**: +- **CUDA (torch)**: `> +- **torch version**: `> +- **cosmos3 version**: +- **Installation method**: + +## Error + +``` + +``` + +## What was tried + +1. + +## Additional context + + +```` diff --git a/cosmos-inference/.agents/skills/cosmos3-inference/SKILL.md b/cosmos-inference/.agents/skills/cosmos3-inference/SKILL.md new file mode 100644 index 00000000..c2c0379a --- /dev/null +++ b/cosmos-inference/.agents/skills/cosmos3-inference/SKILL.md @@ -0,0 +1,59 @@ +--- +name: cosmos3-inference +description: > + Guide users through running Cosmos3 inference — offline batch generation, online + serving with Ray and Gradio, parallelism options, input formats, sampling parameters, + and prompt upsampling. Use when the user asks "how do I run inference", + "how do I generate a video", "how do I serve the model", "what parameters should I use", + or any question about running the model to produce outputs. 
+--- + +# Cosmos3 Inference + +## When to use this skill + +- Use when a user wants to generate images or videos with Cosmos3 +- Use when a user asks about inference parameters, input formats, or parallelism +- Use when a user wants to set up online serving (Ray Serve, Gradio) +- Use when a user asks about prompt engineering or upsampling +- For environment or import errors, hand off to **cosmos3-env-troubleshoot** + +## Path convention + +All paths below are relative to the cosmos3 package root (`../../../` from this skill file). All `uv run` / `python` commands should also be run from there. + +## Where to find answers + +| User question | Go to | +| ----------------------------------------------------------------------- | ------------------------------------------------------------------------------ | +| How do I run inference? (single-GPU, multi-GPU) | `README.md` § Offline Batch Inference | +| Which model should I use? (Nano vs Super, memory, shift) | `README.md` § Models | +| Which modality? (t2i, t2v, i2v, examples) | `README.md` § Modalities | +| What parallelism preset? (latency vs throughput) | `README.md` § Offline Batch Inference | +| What input fields are available? (prompt, vision_path, num_frames, ...) | `docs/inference.md` § Sample Arguments | +| What are the default parameter values? | `docs/inference.md` § Default Values | +| How do I use custom defaults? | `docs/inference.md` § Custom Defaults | +| How do I override a parameter? (precedence) | `docs/faq.md` § How do I override a default parameter? | +| What is the shift parameter? | `docs/faq.md` § What is the `shift` parameter? | +| How many frames can I generate? (resolution caps) | `docs/faq.md` § How many frames can I generate? | +| How do I start Ray Serve / Gradio / submit requests? | `docs/inference_online.md` | +| How do I write good prompts? | `docs/prompting.md` | +| How do I upsample short prompts? | `docs/prompting.md` § Upsampling | +| How do I use the low-level API? 
(examples/) | `docs/inference.md` § Supplementary Examples | +| All CLI flags | `uv run --all-extras --group=cu130 python -m cosmos3.scripts.inference --help` | + +## Things not obvious from the docs + +- **Path resolution**: relative paths in input JSON files are resolved relative to the **JSON file's directory**, not the working directory. +- **Seed**: always pass `--seed` for reproducible results. Without it, a random seed is used each time. +- **Resume**: interrupted runs can be resumed by re-running the same command — existing outputs are skipped automatically. +- **`--keep-going`**: continues processing remaining samples after a per-sample failure (e.g. guardrail rejection). Used in online serving by default. +- **Unique names**: every sample in a run must have a unique `name` field, or the script will error. + +## Related skills + +| Skill | When to use | +| -------------------------------------- | ---------------------------------------------- | +| `../cosmos3-setup/SKILL.md` | Installation and environment setup | +| `../cosmos3-codebase-nav/SKILL.md` | Finding files, parameters, and configs in code | +| `../cosmos3-env-troubleshoot/SKILL.md` | Debugging environment and runtime errors | diff --git a/cosmos-inference/.agents/skills/cosmos3-post-training/SKILL.md b/cosmos-inference/.agents/skills/cosmos3-post-training/SKILL.md new file mode 100644 index 00000000..78220ca4 --- /dev/null +++ b/cosmos-inference/.agents/skills/cosmos3-post-training/SKILL.md @@ -0,0 +1,95 @@ +--- +name: cosmos3-post-training +description: > + Guide users through Cosmos3 supervised fine-tuning (SFT) post-training: + preparing the example dataset and DCP base checkpoint, editing the experiment + config, launching distributed training with `torchrun`, running T2V/I2V/V2V + inference with the trained DCP checkpoint, optionally exporting it to + Hugging Face safetensors, running **action evaluation** (`cosmos3.scripts.eval`) + on action checkpoints (forward / inverse dynamics, policy) 
for PSNR / + action MSE, and the optional Video Captioning pipeline. Use when the user + asks how to post-train Cosmos3, fine-tune on a custom video dataset, export + a trained checkpoint, evaluate an action checkpoint, run `eval.py`, or + caption videos for training — or any question about `cu130-train` / + `cu128-train`, `mixed_modality_sft_nano.yaml`, `convert_model_to_dcp` / + `export_model` / `train` / `eval` / `caption_from_video` / + `captions_to_sft_jsonl`, action-eval metrics, or SFT output paths. eval.py + is action-only; for T2V/I2V/V2V use inference. +--- + +# Cosmos3 Post-Training (SFT) + +## When to use this skill + +- User wants to fine-tune Cosmos3-Nano on their own video dataset (SFT) +- User asks which fields in `mixed_modality_sft_nano.yaml` to override (lr, FSDP shard, max_iter, jsonl_paths, ...) +- User wants to convert a base Hugging Face checkpoint to DCP, or convert a trained DCP back to safetensors +- User wants to score an **action** checkpoint (forward / inverse dynamics, policy) against a held-out dataset with `cosmos3.scripts.eval` — per-sample PSNR / action MSE plus an aggregate. Eval is action-only; do not invoke this skill's eval guidance for T2V/I2V/V2V checkpoints +- User wants to caption raw videos with a VLM to build a training dataset +- User wants to assemble a JSONL manifest from videos + captions +- For installation, `--group=cu130-train` / `cu128-train`, or LD_LIBRARY_PATH issues, hand off to **cosmos3-setup** +- For inference parameters, parallelism presets, or online serving, hand off to **cosmos3-inference** + +## Path convention + +All paths below are relative to the cosmos3 package root (`../../../` from this skill file). All `uv run` / `python` / `torchrun` commands should also be run from there. + +## Where to find answers + +The canonical reference is `docs/training.md`. 
Use this table to route questions: + +| User question | Go to | +| -------------------------------------------------------------------- | ------------------------------------------------------------------- | +| Full step-by-step SFT workflow | `docs/training.md` | +| Which install group? (`cu130-train` vs `cu128-train`) | `docs/training.md` § Setup, `docs/setup.md` § CUDA Variants | +| How do I download the example bridge dataset? | `docs/training.md` § Step 1 — Prepare data and checkpoint | +| How do I convert a base HF checkpoint to DCP? | `docs/training.md` § Step 1 — Prepare data and checkpoint | +| Which `mixed_modality_sft_nano.yaml` fields are commonly overridden? | `docs/training.md` § Step 2 — Prepare config | +| How do I launch distributed training? | `docs/training.md` § Step 3 — Run training | +| How do I validate the config without actually training? | `docs/training.md` § Step 3 (the `--dry-run` flag) | +| How do I export the trained DCP back to safetensors? | `docs/training.md` § Export checkpoint to Hugging Face safetensors | +| How do I run inference with the trained checkpoint? | `docs/training.md` § Run inference with trained checkpoint | +| How do I evaluate an action checkpoint (forward/inverse/policy)? | `docs/training.md` § Evaluation | +| How do I run `cosmos3.scripts.eval` / `eval.py`? | `docs/training.md` § Run action evaluation with trained checkpoint | +| Latency vs throughput preset for action eval? | `docs/training.md` § Run action evaluation with trained checkpoint | +| What metrics does action eval report (PSNR, action MSE)? | `docs/training.md` § Evaluation | +| Where do training and action-eval artifacts land? | `docs/training.md` § Outputs | +| How do I caption raw videos for SFT? | `docs/training.md` § Video Captioning for Training Data Processing | +| How do I serve the captioning VLM? | `docs/training.md` § Server setup | +| How do I build a JSONL dataset from captions + videos? 
| `docs/training.md` § Creating Video Dataset JSONL File for Training | + +## Workflow at a glance + +1. **Setup** — install the training extras: `uv sync --all-extras --group=cu130-train` (or `cu128-train` on older drivers), then `source .venv/bin/activate && export LD_LIBRARY_PATH=`. +2. **Step 1 — Prepare data and checkpoint** — download the example bridge dataset to `$DATASET_PATH` (Hugging Face cache) and `convert_model_to_dcp` the base checkpoint into `$BASE_CHECKPOINT_PATH` (default: `/tmp/$USER/checkpoints/cosmos3_nano`). +3. **Step 2 — Prepare config** — the provided `cosmos3/configs/experiment/mixed_modality_sft_nano.yaml` runs as-is on the example dataset (~100 iterations); override `model.config.parallelism.data_parallel_shard_degree`, `dataloader_train.dataloader.datasets.*.jsonl_paths`, `optimizer.lr`, `trainer.max_iter`, etc. for custom runs. +4. **Step 3 — Run training** — `torchrun --nproc_per_node=8 -m cosmos3.scripts.train -o outputs/train --config-file cosmos3/configs/experiment/mixed_modality_sft_nano.yaml --config-overrides "checkpoint.load_path=$BASE_CHECKPOINT_PATH" "dataloader_train.dataloader.datasets.video.dataset.jsonl_paths=$DATASET_PATH/train/video_dataset_file.jsonl"` (use `--dry-run` first when iterating on config). DCP checkpoints land in `outputs/train/job/checkpoints/iter_`. +5. **Inference** — read `outputs/train/job/checkpoints/latest_checkpoint.txt`, point `cosmos3.scripts.inference` at the resulting `outputs/train/job/checkpoints/iter_` DCP path with `--config-file outputs/train/config.yaml`. The example input glob `"$DATASET_PATH/val/inference_prompt*/episode_049683_clip000.json"` covers T2V, I2V, and V2V (see `cosmos3-inference` skill for presets / input formats). +6. 
**Action evaluation (action checkpoints only)** — `torchrun --nproc-per-node=8 -m cosmos3.scripts.eval --parallelism-preset=throughput -o outputs/train_eval --checkpoint-path $CHECKPOINT_PATH --config-file outputs/train/config.yaml --root-override /path/to/eval/dataset`. Resolves the dataloader from the training config (`val` split, falling back to `dataloader_train`), generates each sample through the inference engine, and scores against GT — `psnr` for video modes (`forward_dynamics`, `policy`) and `action_mse` for action modes (`inverse_dynamics`, `policy`). Per-sample `metrics.json` lives next to each `vision.mp4`; rank-0 aggregate is `outputs/train_eval/metrics_aggregate.json`. Skip this step entirely for T2V/I2V/V2V checkpoints — eval.py only computes action-mode metrics. +7. **Export (optional)** — `cosmos3.scripts.export_model` converts the DCP iter to Hugging Face safetensors at `outputs/train/model`. Not required for the standard inference flow above. + +## Things not obvious from the docs + +- **Training extras are a separate group**: SFT requires the `cu130-train` / `cu128-train` install group, not the inference-only `cu130` / `cu128`. Re-running `uv sync` with the wrong group silently leaves training deps uninstalled. +- **`-o` controls the entire output tree**: passing `-o outputs/train` to `cosmos3.scripts.train` makes everything land under `outputs/train/job/...` (logs, `config.yaml`, `checkpoints/iter_`, callback outputs). Without `-o`, training falls back to `${IMAGINAIRE_OUTPUT_ROOT:-/tmp/imaginaire4-output}/{job.project}/{job.group}/{job.name}/`. +- **Inference uses the DCP checkpoint directly**: the standard flow points `cosmos3.scripts.inference` at `outputs/train/job/checkpoints/iter_` together with `--config-file outputs/train/config.yaml`. The Hugging Face safetensors export (`outputs/train/model`) is optional — only needed if you want a portable single-file checkpoint. 
+- **Mixed-modality input glob**: the example uses `"$DATASET_PATH/val/inference_prompt*/episode_049683_clip000.json"` with a `*` so a single command runs T2V, I2V, and V2V (the dataset has `inference_prompt/`, `inference_prompt_i2v/`, `inference_prompt_v2v/` siblings under `val/`). +- **`data_parallel_shard_degree` must equal `WORLD_SIZE`**: it has to match `--nproc_per_node` on the `torchrun` command. Mismatch → FSDP init failure. +- **`--dry-run`**: `cosmos3.scripts.train` accepts `--dry-run` to validate the config end-to-end without launching training. Use it whenever iterating on YAML overrides. +- **`eval.py` is action-only**: `cosmos3.scripts.eval` only scores action-mode generations (PSNR for predicted video, MSE for predicted action). It does *not* score the bridge-v2 video SFT walkthrough or any T2V/I2V/V2V checkpoint — those use `cosmos3.scripts.inference` (no GT scoring). Pointing `eval.py` at a non-action dataloader fails with "mode requires GT video/action but data_batch had none". +- **Throughput preset for full-dataset action eval**: `--parallelism-preset` defaults to `latency` (model sharded across all ranks, one sample at a time — required when the checkpoint is too large to fit on a single GPU). For full-dataset action eval when the model fits on one GPU, pass `--parallelism-preset=throughput` so wall-clock scales as `N / num_gpus × per_sample_time` instead of `N × per_sample_time`. +- **Action eval reuses the training dataloader via `--config-file`**: pointing `eval.py` at `outputs/train/config.yaml` resolves the same dataloader the model was trained against (`val` split by default; falls back to `dataloader_train` when there is no `dataloader_val`). Use `--root-override /path/to/eval/dataset` to swap in held-out data without editing the config; alternatives are `--gcs-root-override ` (downloads via `--cache-dir`) and `--gcs-path-map`. Use `--dataset ` when the dataloader has multiple entries. 
+- **`--dataset-model-mode` defaults to `joint`**: every dataset entry is evaluated under all three action modes — total generation count = `len(modes) × ceil(len(val_split) / sample_stride)`, capped by `--num-samples`. Restrict during development with `--num-samples N`, `--sample-stride K`, or `--dataset-model-mode `. Mode is also encoded in each sample's name (`//`) and is what the metric dispatcher reads back when scoring. +- **Captioning server flags**: `vllm serve ... --allowed-local-media-path /` is required so the VLM can read the `file://` paths the captioning script sends. Use `Qwen/Qwen3-VL-8B-Instruct-FP8` as the recommended model; first launch downloads weights and may take several minutes (server is ready when you see `Application startup complete.`). +- **Captioning input modes**: `cosmos3.scripts.caption_from_video` accepts `--video ` (single file or directory of `.mp4`s) or `-i ` where each line has a `vision_path` field — same JSONL format used downstream by training. +- **Captioning output layout**: each input video produces a directory containing `caption.txt` and `sample_args.json`; `captions_to_sft_jsonl` then assembles those plus the source videos into a training-ready JSONL. +- **`uv run` over bare `python`**: per repo convention, prefer `uv run` for new commands. The `python -m cosmos3.scripts.caption_from_video ...` snippets in `docs/training.md` are a known pending migration. 
+ +## Related skills + +| Skill | When to use | +| -------------------------------------- | ---------------------------------------------------------------------------- | +| `../cosmos3-setup/SKILL.md` | Initial install, CUDA variant selection, container/`LD_LIBRARY_PATH` setup | +| `../cosmos3-inference/SKILL.md` | Inference parameters, parallelism presets, input JSON format, online serving | +| `../cosmos3-codebase-nav/SKILL.md` | Locating configs, scripts, and defaults inside the package | +| `../cosmos3-env-troubleshoot/SKILL.md` | Debugging environment / runtime errors during training | diff --git a/cosmos-inference/.agents/skills/cosmos3-setup/SKILL.md b/cosmos-inference/.agents/skills/cosmos3-setup/SKILL.md new file mode 100644 index 00000000..f8b9afce --- /dev/null +++ b/cosmos-inference/.agents/skills/cosmos3-setup/SKILL.md @@ -0,0 +1,62 @@ +--- +name: cosmos3-setup +description: > + Guide users through Cosmos3 installation, environment setup, checkpoint downloading, + and verification. Use when the user asks "how do I install cosmos3", "how do I set up + the environment", "how do I download checkpoints", "how do I use Docker", or any + question about getting the package running for the first time. +--- + +# Cosmos3 Setup + +## When to use this skill + +- Use when a user wants to install Cosmos3 or set up a development environment +- Use when a user asks about system requirements, CUDA versions, or GPU compatibility +- Use when a user needs to download model checkpoints or configure HuggingFace auth +- Use when a user wants to run Cosmos3 inside a Docker container or NGC container +- For errors during setup, hand off to the **cosmos3-env-troubleshoot** skill + +## Path convention + +All paths below are relative to the cosmos3 package root (`../../../` from this skill file). All `uv run` / `python` commands should also be run from there. + +## Where to find answers + +The canonical setup reference is `docs/setup.md`. 
The README (`README.md` § Setup) has the shortest quickstart. + +| User question | Go to | +| ------------------------------------------------------ | -------------------------------------------------------------------- | +| What are the system requirements? | `docs/setup.md` § System Requirements | +| How do I install with uv? (sync, pip venv, pip system) | `docs/setup.md` § Virtual Environment | +| How do I install with Docker? | `docs/setup.md` § Docker Container | +| Custom torch/CUDA versions or attention backends? | `docs/setup.md` § Advanced | +| Which CUDA version? (cu130 vs cu128) | `docs/setup.md` § CUDA Variants, `docs/faq.md` § Which CUDA version? | +| How do I download checkpoints? | `docs/setup.md` § Downloading Checkpoints | +| NGC container issues? | `docs/setup.md` § PyTorch Import Issue | +| Any installation error | `../cosmos3-env-troubleshoot/SKILL.md` | + +## Setup steps at a glance + +1. **Clone** the repository and `cd` into the project root (the directory containing `pyproject.toml`) +2. **System deps**: `sudo apt-get install -y --no-install-recommends curl ffmpeg libx11-dev tree wget` +3. **Install uv**: `curl -LsSf https://astral.sh/uv/install.sh | sh && source $HOME/.local/bin/env` +4. **Install package**: `uv sync --all-extras --group=cu130 && source .venv/bin/activate` (see docs for alternative methods) +5. **Optional extras**: `.[serve]` for Ray/Gradio, `.[guardrail]` for safety, `.[train]` for post-training +6. **Checkpoints**: auto-downloaded during inference; requires HuggingFace auth (see docs) +7. **Verify**: `uv run --all-extras --group=cu130 python -c "import cosmos3; print('ok')"` + +## Things not obvious from the docs + +- **NGC container caveat**: you must run `export LD_LIBRARY_PATH=''` *before* any Python imports when inside an NGC PyTorch container. Easy to miss. +- **CUDA version alignment**: the major CUDA version from `nvidia-smi` must match `torch.version.cuda`. Mismatches cause cryptic shared-library errors. 
+- **`HF_HOME`**: controls where checkpoints are cached (default: `~/.cache/huggingface`). Set this if disk space is tight or you want a shared cache. +- **Conflicting env vars**: stale `HF_TOKEN` or `HUGGING_FACE_HUB_TOKEN` env vars can silently override CLI auth. Check with `printenv | grep HF_`. + +## Related skills + +| Skill | When to use | +| -------------------------------------- | ---------------------------------------------- | +| `../cosmos3-inference/SKILL.md` | Running inference after setup is complete | +| `../cosmos3-codebase-nav/SKILL.md` | Finding files, parameters, and configs in code | +| `../cosmos3-env-troubleshoot/SKILL.md` | Debugging environment and runtime errors | diff --git a/cosmos-inference/.claude/skills/cosmos3-codebase-nav/SKILL.md b/cosmos-inference/.claude/skills/cosmos3-codebase-nav/SKILL.md new file mode 100644 index 00000000..810695e4 --- /dev/null +++ b/cosmos-inference/.claude/skills/cosmos3-codebase-nav/SKILL.md @@ -0,0 +1,99 @@ +--- +name: cosmos3-codebase-nav +description: > + Navigate the Cosmos3 package codebase to find where parameters, configs, defaults, + scripts, and documentation live. Use when the user asks "where is X in cosmos3", + "how do I find the config for Y", "where are the defaults", "where do I change a + parameter", or any question about locating files, modules, or settings. Also use + when the user opens or edits files and needs orientation. +--- + +# Cosmos3 Codebase Navigation + +## When to use this skill + +- Use this skill when an agent is navigating the Cosmos3 package +- Use this skill to answer "where is X", "how do I find the config for Y", or any file-location question +- Use this skill when the user opens or edits cosmos3 files and needs orientation + +## Path convention + +All paths below are relative to this file's location (`.agents/skills/cosmos3-codebase-nav/`). 
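
As a rough illustration of the path convention above, the sketch below resolves one of the `../../../` paths used in the tables that follow. The helper name `resolve_from_skill` is hypothetical, and the `cosmos-inference/` prefix is assumed from the file paths in this diff; only the string arithmetic is shown, nothing is read from disk.

```python
import posixpath


def resolve_from_skill(skill_dir, relative_path):
    """Join a skill-relative path onto the skill directory and collapse the `..` segments."""
    return posixpath.normpath(posixpath.join(skill_dir, relative_path))


# Assumed location of the skill directory inside the repository.
skill_dir = "cosmos-inference/.agents/skills/cosmos3-codebase-nav"

# Three `..` segments climb out of the skill dir, `skills/`, and `.agents/`.
print(resolve_from_skill(skill_dir, "../../../cosmos3/args.py"))
# cosmos-inference/cosmos3/args.py
```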
+ +## Quick Reference + +### Where parameters and defaults live + +| What you're looking for | File | +| ------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | +| Sampling params (num_steps, guidance, shift, fps, etc.) | `../../../cosmos3/args.py` → `SamplingArgs`, `SamplingOverrides` | +| Per-modality default values | `../../../cosmos3/defaults//sample_args.json` | +| Setup params (parallelism, checkpoints, model path) | `../../../cosmos3/args.py` → `OmniSetupArgs`, `OmniSetupOverrides` | +| Common args base classes | `../../../cosmos3/common/args.py` → `ArgsBase`, `OverridesBase` | +| Ray serving parallelism presets | `../../../cosmos3/ray/configs/latency.yaml`, `../../../cosmos3/ray/configs/throughput.yaml` | +| Feature flags | `../../../cosmos3/flags.py` | +| Prompt upsampler system prompt | `../../../cosmos3/defaults/prompt_upsampler.txt` | +| Example inputs | `../../../inputs/omni/t2i.json`, `../../../inputs/omni/t2v.json`, `../../../inputs/omni/i2v.json` | + +Available modality modes for defaults: `text2image`, `text2video`, `image2video`. + +### Config defaults resolution chain + +When a user runs inference, default parameter values are resolved in this order: + +``` +cosmos3/defaults//sample_args.json # 1. Per-modality JSON defaults (num_steps, guidance, shift, fps, etc.) + ↓ +_load_modality_defaults() in cosmos3/args.py # 2. Loaded and cached at import time + ↓ +SamplingArgs / SamplingOverrides # 3. Pydantic models with field-level validation + ↓ +OmniSampleOverrides.build_sample() # 4. Merges user overrides → final resolved args + ↓ +_RESOLUTION_SHIFT_DEFAULTS[model_size, resolution] # 5. Model+resolution shift override (if user didn't set shift) + ↓ +CLI flags (--guidance, --shift, etc.) # 6. 
User overrides from command line +``` + +The `_RESOLUTION_SHIFT_DEFAULTS` table in `../../../cosmos3/args.py` overrides the default `shift` based on model size and resolution, unless the user explicitly specified `--shift`. + +| Mode | Default file | Key defaults | +| ------------- | -------------------------------------------------------- | ---------------------------------------------- | +| `text2image` | `../../../cosmos3/defaults/text2image/sample_args.json` | `num_frames=1`, `guidance=6.0`, `shift=10.0` | +| `text2video` | `../../../cosmos3/defaults/text2video/sample_args.json` | `num_frames=189`, `guidance=6.0`, `shift=10.0` | +| `image2video` | `../../../cosmos3/defaults/image2video/sample_args.json` | `num_frames=189`, `guidance=6.0`, `shift=10.0` | + +Users can also supply a custom defaults file per-request via the `defaults_file` field in sample arguments (see `../../../docs/inference.md`). + +### Where to make changes + +| Task | Edit | +| ------------------------------- | ------------------------------------------------------------------------------------------------------- | +| Change a built-in default value | `../../../cosmos3/defaults//sample_args.json` | +| Add a new CLI parameter | `SamplingArgs` + `SamplingOverrides` in `../../../cosmos3/args.py`, then add to each `sample_args.json` | +| Change parallelism presets | `../../../cosmos3/ray/configs/latency.yaml` or `throughput.yaml` | +| Add a new script | `../../../cosmos3/scripts/` — follow `inference.py` as the pattern | + +### Key entry points + +| Entry point | How to run | +| -------------------- | ------------------------------------------------------ | +| Batch inference | `python -m cosmos3.scripts.inference` | +| Online serving (Ray) | `python -m cosmos3.ray.serve` | +| Submit to Ray server | `python -m cosmos3.ray.submit` | +| Gradio UI | `python -m cosmos3.ray.gradio` | +| Prompt upsampling | `python -m cosmos3.scripts.upsample_prompts` | +| Model export | `python -m 
cosmos3.scripts.export_model` | +| Diffusers conversion | `python -m cosmos3.scripts.convert_model_to_diffusers` | + +### Documentation + +| Doc | Covers | +| ----------------------------------- | ----------------------------------------------------- | +| `../../../AGENTS.md` | Commands, rules, key file locations (read this first) | +| `../../../README.md` | Overview, quickstart, examples | +| `../../../docs/setup.md` | Installation, environment, checkpoints | +| `../../../docs/inference.md` | Sample args, default values, custom defaults | +| `../../../docs/inference_online.md` | Ray Serve and Gradio | +| `../../../docs/prompting.md` | Prompt engineering, upsampling | +| `../../../docs/faq.md` | FAQ, tips, and troubleshooting | diff --git a/cosmos-inference/.claude/skills/cosmos3-env-troubleshoot/SKILL.md b/cosmos-inference/.claude/skills/cosmos3-env-troubleshoot/SKILL.md new file mode 100644 index 00000000..9df15188 --- /dev/null +++ b/cosmos-inference/.claude/skills/cosmos3-env-troubleshoot/SKILL.md @@ -0,0 +1,106 @@ +--- +name: cosmos3-env-troubleshoot +description: > + Diagnose and fix Cosmos3 environment, installation, and runtime errors. + Use when the user encounters an ImportError, ModuleNotFoundError, CUDA error, + Docker error, checkpoint download failure, or any traceback during setup or inference. +--- + +# Cosmos3 Environment Troubleshooting + +## When to use this skill + +- Use when a user hits an error during installation, environment setup, or first run +- Use when a traceback mentions torch, CUDA, missing modules, or shared libraries +- Use when Docker or container setup fails +- Use when checkpoint downloads fail or HuggingFace auth errors appear + +## Path convention + +All paths below are relative to this file's location (`.agents/skills/cosmos3-env-troubleshoot/`). + +## Step 1: Match against known errors + +Check the error message against the table below. Each row links to the canonical fix in the docs. 
+ +| Error signature | Cause | Fix location | +| ----------------------------------------------------------------------------- | ------------------------------------ | ------------------------------------------------------------------------------------------------------ | +| `ImportError: cannot import name '_functionalization' from 'torch._C'` | NGC container library conflict | `../../../docs/setup.md` § PyTorch Import Issue — run `export LD_LIBRARY_PATH=''` | +| `ModuleNotFoundError: No module named 'cosmos3'` | Package not installed | `../../../docs/setup.md` § Dependency Issue — run `uv sync --all-extras --group=cu130 --reinstall` | +| `ModuleNotFoundError: No module named ` | Dependency missing | `../../../docs/setup.md` § Dependency Issue — reinstall venv | +| `fatal error: Python.h: No such file or directory` | Broken Python / uv install | `../../../docs/setup.md` § Python Issue — reinstall uv + venv from scratch | +| `OSError: : cannot open shared object file` | CUDA version mismatch | `../../../docs/setup.md` § CUDA Issue — install matching `cuda-toolkit-` | +| `docker: Error response from daemon: unknown or invalid runtime name: nvidia` | Docker nvidia runtime not configured | `../../../docs/setup.md` § Docker Container — run `sudo nvidia-ctk runtime configure --runtime=docker` | +| HuggingFace 401 / download failures | Auth or license not accepted | `../../../docs/setup.md` § Downloading Checkpoints — check `HF_TOKEN`, accept license agreement | + +## Step 2: If no documented fix matches, try common remediation + +Run these diagnostic commands to collect information, then attempt fixes in order: + +### Diagnostic commands + +```shell +# System +uname -a +cat /etc/os-release | head -5 + +# Python +python --version +which python + +# CUDA +nvidia-smi +python -c "import torch; print(f'torch={torch.__version__}, cuda={torch.version.cuda}')" + +# Package +uv pip list | head -20 +``` + +### Remediation ladder (try in order) + +1. 
**Clear library path**: `export LD_LIBRARY_PATH=''` +2. **Reinstall venv**: `uv sync --all-extras --group=cu130 --reinstall` +3. **Reinstall uv + venv from scratch**: + + ```shell + curl -LsSf https://astral.sh/uv/install.sh | sh + uv python install --reinstall + rm -rf .venv + uv sync --all-extras --group=cu130 --reinstall + source .venv/bin/activate + ``` + +4. **Check CUDA version alignment**: the major CUDA version from `nvidia-smi` must match `torch.version.cuda` +5. **Try Docker**: if the host environment is too broken, fall back to the Docker container (see `../../../docs/setup.md`) + +## Step 3: If still unresolved, generate a bug report + +If none of the above resolves the issue, collect environment information and present the user with a pre-filled bug report they can submit as a GitHub issue. + +Fill in the template below by running the diagnostic commands and inserting the results: + +````markdown +## Environment + +- **OS**: +- **Python**: +- **CUDA (system)**: +- **CUDA (torch)**: +- **torch version**: +- **cosmos3 version**: +- **Installation method**: + +## Error + +``` + +``` + +## What was tried + +1. + +## Additional context + + +```` diff --git a/cosmos-inference/.claude/skills/cosmos3-inference/SKILL.md b/cosmos-inference/.claude/skills/cosmos3-inference/SKILL.md new file mode 100644 index 00000000..c2c0379a --- /dev/null +++ b/cosmos-inference/.claude/skills/cosmos3-inference/SKILL.md @@ -0,0 +1,59 @@ +--- +name: cosmos3-inference +description: > + Guide users through running Cosmos3 inference: offline batch generation, online + serving with Ray and Gradio, parallelism options, input formats, sampling parameters, + and prompt upsampling. Use when the user asks "how do I run inference", + "how do I generate a video", "how do I serve the model", "what parameters should I use", + or any question about running the model to produce outputs.
+--- + +# Cosmos3 Inference + +## When to use this skill + +- Use when a user wants to generate images or videos with Cosmos3 +- Use when a user asks about inference parameters, input formats, or parallelism +- Use when a user wants to set up online serving (Ray Serve, Gradio) +- Use when a user asks about prompt engineering or upsampling +- For environment or import errors, hand off to **cosmos3-env-troubleshoot** + +## Path convention + +All paths below are relative to the cosmos3 package root (`../../../` from this skill file). All `uv run` / `python` commands should also be run from there. + +## Where to find answers + +| User question | Go to | +| ----------------------------------------------------------------------- | ------------------------------------------------------------------------------ | +| How do I run inference? (single-GPU, multi-GPU) | `README.md` § Offline Batch Inference | +| Which model should I use? (Nano vs Super, memory, shift) | `README.md` § Models | +| Which modality? (t2i, t2v, i2v, examples) | `README.md` § Modalities | +| What parallelism preset? (latency vs throughput) | `README.md` § Offline Batch Inference | +| What input fields are available? (prompt, vision_path, num_frames, ...) | `docs/inference.md` § Sample Arguments | +| What are the default parameter values? | `docs/inference.md` § Default Values | +| How do I use custom defaults? | `docs/inference.md` § Custom Defaults | +| How do I override a parameter? (precedence) | `docs/faq.md` § How do I override a default parameter? | +| What is the shift parameter? | `docs/faq.md` § What is the `shift` parameter? | +| How many frames can I generate? (resolution caps) | `docs/faq.md` § How many frames can I generate? | +| How do I start Ray Serve / Gradio / submit requests? | `docs/inference_online.md` | +| How do I write good prompts? | `docs/prompting.md` | +| How do I upsample short prompts? | `docs/prompting.md` § Upsampling | +| How do I use the low-level API? 
(examples/) | `docs/inference.md` § Supplementary Examples | +| All CLI flags | `uv run --all-extras --group=cu130 python -m cosmos3.scripts.inference --help` | + +## Things not obvious from the docs + +- **Path resolution**: relative paths in input JSON files are resolved relative to the **JSON file's directory**, not the working directory. +- **Seed**: always pass `--seed` for reproducible results. Without it, a random seed is used each time. +- **Resume**: interrupted runs can be resumed by re-running the same command — existing outputs are skipped automatically. +- **`--keep-going`**: continues processing remaining samples after a per-sample failure (e.g. guardrail rejection). Used in online serving by default. +- **Unique names**: every sample in a run must have a unique `name` field, or the script will error. + +## Related skills + +| Skill | When to use | +| -------------------------------------- | ---------------------------------------------- | +| `../cosmos3-setup/SKILL.md` | Installation and environment setup | +| `../cosmos3-codebase-nav/SKILL.md` | Finding files, parameters, and configs in code | +| `../cosmos3-env-troubleshoot/SKILL.md` | Debugging environment and runtime errors | diff --git a/cosmos-inference/.claude/skills/cosmos3-post-training/SKILL.md b/cosmos-inference/.claude/skills/cosmos3-post-training/SKILL.md new file mode 100644 index 00000000..78220ca4 --- /dev/null +++ b/cosmos-inference/.claude/skills/cosmos3-post-training/SKILL.md @@ -0,0 +1,95 @@ +--- +name: cosmos3-post-training +description: > + Guide users through Cosmos3 supervised fine-tuning (SFT) post-training: + preparing the example dataset and DCP base checkpoint, editing the experiment + config, launching distributed training with `torchrun`, running T2V/I2V/V2V + inference with the trained DCP checkpoint, optionally exporting it to + Hugging Face safetensors, running **action evaluation** (`cosmos3.scripts.eval`) + on action checkpoints (forward / inverse dynamics, policy) 
for PSNR / + action MSE, and the optional Video Captioning pipeline. Use when the user + asks how to post-train Cosmos3, fine-tune on a custom video dataset, export + a trained checkpoint, evaluate an action checkpoint, run `eval.py`, or + caption videos for training — or any question about `cu130-train` / + `cu128-train`, `mixed_modality_sft_nano.yaml`, `convert_model_to_dcp` / + `export_model` / `train` / `eval` / `caption_from_video` / + `captions_to_sft_jsonl`, action-eval metrics, or SFT output paths. eval.py + is action-only; for T2V/I2V/V2V use inference. +--- + +# Cosmos3 Post-Training (SFT) + +## When to use this skill + +- User wants to fine-tune Cosmos3-Nano on their own video dataset (SFT) +- User asks which fields in `mixed_modality_sft_nano.yaml` to override (lr, FSDP shard, max_iter, jsonl_paths, ...) +- User wants to convert a base Hugging Face checkpoint to DCP, or convert a trained DCP back to safetensors +- User wants to score an **action** checkpoint (forward / inverse dynamics, policy) against a held-out dataset with `cosmos3.scripts.eval` — per-sample PSNR / action MSE plus an aggregate. Eval is action-only; do not invoke this skill's eval guidance for T2V/I2V/V2V checkpoints +- User wants to caption raw videos with a VLM to build a training dataset +- User wants to assemble a JSONL manifest from videos + captions +- For installation, `--group=cu130-train` / `cu128-train`, or LD_LIBRARY_PATH issues, hand off to **cosmos3-setup** +- For inference parameters, parallelism presets, or online serving, hand off to **cosmos3-inference** + +## Path convention + +All paths below are relative to the cosmos3 package root (`../../../` from this skill file). All `uv run` / `python` / `torchrun` commands should also be run from there. + +## Where to find answers + +The canonical reference is `docs/training.md`. 
Use this table to route questions: + +| User question | Go to | +| -------------------------------------------------------------------- | ------------------------------------------------------------------- | +| Full step-by-step SFT workflow | `docs/training.md` | +| Which install group? (`cu130-train` vs `cu128-train`) | `docs/training.md` § Setup, `docs/setup.md` § CUDA Variants | +| How do I download the example bridge dataset? | `docs/training.md` § Step 1 — Prepare data and checkpoint | +| How do I convert a base HF checkpoint to DCP? | `docs/training.md` § Step 1 — Prepare data and checkpoint | +| Which `mixed_modality_sft_nano.yaml` fields are commonly overridden? | `docs/training.md` § Step 2 — Prepare config | +| How do I launch distributed training? | `docs/training.md` § Step 3 — Run training | +| How do I validate the config without actually training? | `docs/training.md` § Step 3 (the `--dry-run` flag) | +| How do I export the trained DCP back to safetensors? | `docs/training.md` § Export checkpoint to Hugging Face safetensors | +| How do I run inference with the trained checkpoint? | `docs/training.md` § Run inference with trained checkpoint | +| How do I evaluate an action checkpoint (forward/inverse/policy)? | `docs/training.md` § Evaluation | +| How do I run `cosmos3.scripts.eval` / `eval.py`? | `docs/training.md` § Run action evaluation with trained checkpoint | +| Latency vs throughput preset for action eval? | `docs/training.md` § Run action evaluation with trained checkpoint | +| What metrics does action eval report (PSNR, action MSE)? | `docs/training.md` § Evaluation | +| Where do training and action-eval artifacts land? | `docs/training.md` § Outputs | +| How do I caption raw videos for SFT? | `docs/training.md` § Video Captioning for Training Data Processing | +| How do I serve the captioning VLM? | `docs/training.md` § Server setup | +| How do I build a JSONL dataset from captions + videos? 
| `docs/training.md` § Creating Video Dataset JSONL File for Training | + +## Workflow at a glance + +1. **Setup** — install the training extras: `uv sync --all-extras --group=cu130-train` (or `cu128-train` on older drivers), then `source .venv/bin/activate && export LD_LIBRARY_PATH=`. +2. **Step 1 — Prepare data and checkpoint** — download the example bridge dataset to `$DATASET_PATH` (Hugging Face cache) and `convert_model_to_dcp` the base checkpoint into `$BASE_CHECKPOINT_PATH` (default: `/tmp/$USER/checkpoints/cosmos3_nano`). +3. **Step 2 — Prepare config** — the provided `cosmos3/configs/experiment/mixed_modality_sft_nano.yaml` runs as-is on the example dataset (~100 iterations); override `model.config.parallelism.data_parallel_shard_degree`, `dataloader_train.dataloader.datasets.*.jsonl_paths`, `optimizer.lr`, `trainer.max_iter`, etc. for custom runs. +4. **Step 3 — Run training** — `torchrun --nproc_per_node=8 -m cosmos3.scripts.train -o outputs/train --config-file cosmos3/configs/experiment/mixed_modality_sft_nano.yaml --config-overrides "checkpoint.load_path=$BASE_CHECKPOINT_PATH" "dataloader_train.dataloader.datasets.video.dataset.jsonl_paths=$DATASET_PATH/train/video_dataset_file.jsonl"` (use `--dry-run` first when iterating on config). DCP checkpoints land in `outputs/train/job/checkpoints/iter_`. +5. **Inference** — read `outputs/train/job/checkpoints/latest_checkpoint.txt`, point `cosmos3.scripts.inference` at the resulting `outputs/train/job/checkpoints/iter_` DCP path with `--config-file outputs/train/config.yaml`. The example input glob `"$DATASET_PATH/val/inference_prompt*/episode_049683_clip000.json"` covers T2V, I2V, and V2V (see `cosmos3-inference` skill for presets / input formats). +6. 
**Action evaluation (action checkpoints only)** — `torchrun --nproc-per-node=8 -m cosmos3.scripts.eval --parallelism-preset=throughput -o outputs/train_eval --checkpoint-path $CHECKPOINT_PATH --config-file outputs/train/config.yaml --root-override /path/to/eval/dataset`. Resolves the dataloader from the training config (`val` split, falling back to `dataloader_train`), generates each sample through the inference engine, and scores against GT — `psnr` for video modes (`forward_dynamics`, `policy`) and `action_mse` for action modes (`inverse_dynamics`, `policy`). Per-sample `metrics.json` lives next to each `vision.mp4`; rank-0 aggregate is `outputs/train_eval/metrics_aggregate.json`. Skip this step entirely for T2V/I2V/V2V checkpoints — eval.py only computes action-mode metrics. +7. **Export (optional)** — `cosmos3.scripts.export_model` converts the DCP iter to Hugging Face safetensors at `outputs/train/model`. Not required for the standard inference flow above. + +## Things not obvious from the docs + +- **Training extras are a separate group**: SFT requires the `cu130-train` / `cu128-train` install group, not the inference-only `cu130` / `cu128`. Re-running `uv sync` with the wrong group silently leaves training deps uninstalled. +- **`-o` controls the entire output tree**: passing `-o outputs/train` to `cosmos3.scripts.train` makes everything land under `outputs/train/job/...` (logs, `config.yaml`, `checkpoints/iter_`, callback outputs). Without `-o`, training falls back to `${IMAGINAIRE_OUTPUT_ROOT:-/tmp/imaginaire4-output}/{job.project}/{job.group}/{job.name}/`. +- **Inference uses the DCP checkpoint directly**: the standard flow points `cosmos3.scripts.inference` at `outputs/train/job/checkpoints/iter_` together with `--config-file outputs/train/config.yaml`. The Hugging Face safetensors export (`outputs/train/model`) is optional — only needed if you want a portable single-file checkpoint. 
+- **Mixed-modality input glob**: the example uses `"$DATASET_PATH/val/inference_prompt*/episode_049683_clip000.json"` with a `*` so a single command runs T2V, I2V, and V2V (the dataset has `inference_prompt/`, `inference_prompt_i2v/`, `inference_prompt_v2v/` siblings under `val/`). +- **`data_parallel_shard_degree` must equal `WORLD_SIZE`**: it has to match `--nproc_per_node` on the `torchrun` command. Mismatch → FSDP init failure. +- **`--dry-run`**: `cosmos3.scripts.train` accepts `--dry-run` to validate the config end-to-end without launching training. Use it whenever iterating on YAML overrides. +- **`eval.py` is action-only**: `cosmos3.scripts.eval` only scores action-mode generations (PSNR for predicted video, MSE for predicted action). It does *not* score the bridge-v2 video SFT walkthrough or any T2V/I2V/V2V checkpoint — those use `cosmos3.scripts.inference` (no GT scoring). Pointing `eval.py` at a non-action dataloader fails with "mode requires GT video/action but data_batch had none". +- **Throughput preset for full-dataset action eval**: `--parallelism-preset` defaults to `latency` (model sharded across all ranks, one sample at a time — required when the checkpoint is too large to fit on a single GPU). For full-dataset action eval when the model fits on one GPU, pass `--parallelism-preset=throughput` so wall-clock scales as `N / num_gpus × per_sample_time` instead of `N × per_sample_time`. +- **Action eval reuses the training dataloader via `--config-file`**: pointing `eval.py` at `outputs/train/config.yaml` resolves the same dataloader the model was trained against (`val` split by default; falls back to `dataloader_train` when there is no `dataloader_val`). Use `--root-override /path/to/eval/dataset` to swap in held-out data without editing the config; alternatives are `--gcs-root-override ` (downloads via `--cache-dir`) and `--gcs-path-map`. Use `--dataset ` when the dataloader has multiple entries. 
+- **`--dataset-model-mode` defaults to `joint`**: every dataset entry is evaluated under all three action modes — total generation count = `len(modes) × ceil(len(val_split) / sample_stride)`, capped by `--num-samples`. Restrict during development with `--num-samples N`, `--sample-stride K`, or `--dataset-model-mode `. Mode is also encoded in each sample's name (`//`) and is what the metric dispatcher reads back when scoring. +- **Captioning server flags**: `vllm serve ... --allowed-local-media-path /` is required so the VLM can read the `file://` paths the captioning script sends. Use `Qwen/Qwen3-VL-8B-Instruct-FP8` as the recommended model; first launch downloads weights and may take several minutes (server is ready when you see `Application startup complete.`). +- **Captioning input modes**: `cosmos3.scripts.caption_from_video` accepts `--video ` (single file or directory of `.mp4`s) or `-i ` where each line has a `vision_path` field — same JSONL format used downstream by training. +- **Captioning output layout**: each input video produces a directory containing `caption.txt` and `sample_args.json`; `captions_to_sft_jsonl` then assembles those plus the source videos into a training-ready JSONL. +- **`uv run` over bare `python`**: per repo convention, prefer `uv run` for new commands. The `python -m cosmos3.scripts.caption_from_video ...` snippets in `docs/training.md` are a known pending migration. 
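
The two wall-clock expressions in the throughput-preset bullet above can be compared with a short back-of-the-envelope calculation. This is a sketch under stated assumptions: the function name `estimated_wall_clock_seconds` is hypothetical, it ignores model loading and communication overhead, and it reuses one per-sample time for both presets as the bullet's formula does (in practice the two presets may not share the same per-sample time).

```python
def estimated_wall_clock_seconds(num_samples, per_sample_seconds, num_gpus, preset="latency"):
    """Rough eval wall-clock estimate for the two parallelism presets.

    latency:    model sharded across all ranks, one sample at a time -> N * t
    throughput: one model replica per GPU, samples split across GPUs -> N / num_gpus * t
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be >= 1")
    if preset == "latency":
        return num_samples * per_sample_seconds
    if preset == "throughput":
        return num_samples / num_gpus * per_sample_seconds
    raise ValueError(f"unknown preset: {preset!r}")


# Illustrative numbers only: 800 samples, 30 s per sample, 8 GPUs.
latency = estimated_wall_clock_seconds(800, 30.0, 8, preset="latency")
throughput = estimated_wall_clock_seconds(800, 30.0, 8, preset="throughput")
print(latency, throughput)  # 24000.0 3000.0
```

Under these assumptions the throughput preset is faster by a factor of `num_gpus`, which is why it is suggested for full-dataset action eval when the model fits on one GPU.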
+ +## Related skills + +| Skill | When to use | +| -------------------------------------- | ---------------------------------------------------------------------------- | +| `../cosmos3-setup/SKILL.md` | Initial install, CUDA variant selection, container/`LD_LIBRARY_PATH` setup | +| `../cosmos3-inference/SKILL.md` | Inference parameters, parallelism presets, input JSON format, online serving | +| `../cosmos3-codebase-nav/SKILL.md` | Locating configs, scripts, and defaults inside the package | +| `../cosmos3-env-troubleshoot/SKILL.md` | Debugging environment / runtime errors during training | diff --git a/cosmos-inference/.claude/skills/cosmos3-setup/SKILL.md b/cosmos-inference/.claude/skills/cosmos3-setup/SKILL.md new file mode 100644 index 00000000..f8b9afce --- /dev/null +++ b/cosmos-inference/.claude/skills/cosmos3-setup/SKILL.md @@ -0,0 +1,62 @@ +--- +name: cosmos3-setup +description: > + Guide users through Cosmos3 installation, environment setup, checkpoint downloading, + and verification. Use when the user asks "how do I install cosmos3", "how do I set up + the environment", "how do I download checkpoints", "how do I use Docker", or any + question about getting the package running for the first time. +--- + +# Cosmos3 Setup + +## When to use this skill + +- Use when a user wants to install Cosmos3 or set up a development environment +- Use when a user asks about system requirements, CUDA versions, or GPU compatibility +- Use when a user needs to download model checkpoints or configure HuggingFace auth +- Use when a user wants to run Cosmos3 inside a Docker container or NGC container +- For errors during setup, hand off to the **cosmos3-env-troubleshoot** skill + +## Path convention + +All paths below are relative to the cosmos3 package root (`../../../` from this skill file). All `uv run` / `python` commands should also be run from there. + +## Where to find answers + +The canonical setup reference is `docs/setup.md`. 
The README (`README.md` § Setup) has the shortest quickstart. + +| User question | Go to | +| ------------------------------------------------------ | -------------------------------------------------------------------- | +| What are the system requirements? | `docs/setup.md` § System Requirements | +| How do I install with uv? (sync, pip venv, pip system) | `docs/setup.md` § Virtual Environment | +| How do I install with Docker? | `docs/setup.md` § Docker Container | +| Custom torch/CUDA versions or attention backends? | `docs/setup.md` § Advanced | +| Which CUDA version? (cu130 vs cu128) | `docs/setup.md` § CUDA Variants, `docs/faq.md` § Which CUDA version? | +| How do I download checkpoints? | `docs/setup.md` § Downloading Checkpoints | +| NGC container issues? | `docs/setup.md` § PyTorch Import Issue | +| Any installation error | `../cosmos3-env-troubleshoot/SKILL.md` | + +## Setup steps at a glance + +1. **Clone** the repository and `cd` into the project root (the directory containing `pyproject.toml`) +2. **System deps**: `sudo apt-get install -y --no-install-recommends curl ffmpeg libx11-dev tree wget` +3. **Install uv**: `curl -LsSf https://astral.sh/uv/install.sh | sh && source $HOME/.local/bin/env` +4. **Install package**: `uv sync --all-extras --group=cu130 && source .venv/bin/activate` (see docs for alternative methods) +5. **Optional extras**: `.[serve]` for Ray/Gradio, `.[guardrail]` for safety, `.[train]` for post-training +6. **Checkpoints**: auto-downloaded during inference; requires HuggingFace auth (see docs) +7. **Verify**: `uv run --all-extras --group=cu130 python -c "import cosmos3; print('ok')"` + +## Things not obvious from the docs + +- **NGC container caveat**: you must run `export LD_LIBRARY_PATH=''` *before* any Python imports when inside an NGC PyTorch container. Easy to miss. +- **CUDA version alignment**: the major CUDA version from `nvidia-smi` must match `torch.version.cuda`. Mismatches cause cryptic shared-library errors. 
+- **`HF_HOME`**: controls where checkpoints are cached (default: `~/.cache/huggingface`). Set this if disk space is tight or you want a shared cache. +- **Conflicting env vars**: stale `HF_TOKEN` or `HUGGING_FACE_HUB_TOKEN` env vars can silently override CLI auth. Check with `printenv | grep HF_`. + +## Related skills + +| Skill | When to use | +| -------------------------------------- | ---------------------------------------------- | +| `../cosmos3-inference/SKILL.md` | Running inference after setup is complete | +| `../cosmos3-codebase-nav/SKILL.md` | Finding files, parameters, and configs in code | +| `../cosmos3-env-troubleshoot/SKILL.md` | Debugging environment and runtime errors | diff --git a/cosmos-inference/.config/rumdl.toml b/cosmos-inference/.config/rumdl.toml new file mode 100644 index 00000000..24e72483 --- /dev/null +++ b/cosmos-inference/.config/rumdl.toml @@ -0,0 +1,43 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +# https://rumdl.dev/global-settings/ +[global] +flavor = "standard" +exclude = [ + "ATTRIBUTIONS.md", + "_src", +] +disable = [ + "MD013", # line-length + "MD033", # inline-html + "MD040", # fenced-code-language +] + +# https://rumdl.dev/rules/ + +[per-file-ignores] +"README.md" = [ + "MD041" # first-line-heading +] + +# ul-style +[MD004] +style = "dash" + +# table-format +[MD060] +enabled = true +style = "aligned" diff --git a/cosmos-inference/.coveragerc b/cosmos-inference/.coveragerc new file mode 100644 index 00000000..93b0d6e5 --- /dev/null +++ b/cosmos-inference/.coveragerc @@ -0,0 +1,32 @@ +# https://coverage.readthedocs.io/en/latest/subprocess.html + +[run] +data_file = outputs/coverage/coverage +disable_warnings = + module-not-imported + no-data-collected +parallel = True +patch = subprocess + +[report] +exclude_lines = + @overload + def __repr__ + if __name__ == .__main__.: + if TYPE_CHECKING: + pragma: no cover + raise AssertionError + raise NotImplementedError +omit = + *_test.py +skip_empty = True +show_missing = True + +[html] +directory = outputs/coverage/html + +[json] +output = outputs/coverage/coverage.json + +[xml] +output = outputs/coverage/coverage.xml diff --git a/cosmos-inference/.dockerignore b/cosmos-inference/.dockerignore new file mode 100644 index 00000000..0dfe444b --- /dev/null +++ b/cosmos-inference/.dockerignore @@ -0,0 +1,8 @@ +.venv +.git +/checkpoints +/datasets +/output +/examples/**/checkpoints +/examples/**/output +/examples/**/datasets diff --git a/cosmos-inference/.gitattributes b/cosmos-inference/.gitattributes new file mode 100644 index 00000000..dba17f50 --- /dev/null +++ b/cosmos-inference/.gitattributes @@ -0,0 +1,25 @@ +*.lock linguist-generated=true +tests/data/** linguist-generated=true +ATTRIBUTIONS.md linguist-generated=true + +assets/** filter=lfs diff=lfs merge=lfs -text + +# Video files +*.mp4 filter=lfs diff=lfs merge=lfs -text +*.avi filter=lfs diff=lfs merge=lfs -text +*.mov filter=lfs diff=lfs merge=lfs 
-text +*.mkv filter=lfs diff=lfs merge=lfs -text +*.webm filter=lfs diff=lfs merge=lfs -text + +# Audio files +*.wav filter=lfs diff=lfs merge=lfs -text +*.mp3 filter=lfs diff=lfs merge=lfs -text +*.flac filter=lfs diff=lfs merge=lfs -text +*.aac filter=lfs diff=lfs merge=lfs -text + +# Image files +*.jpg filter=lfs diff=lfs merge=lfs -text +*.jpeg filter=lfs diff=lfs merge=lfs -text +*.png filter=lfs diff=lfs merge=lfs -text +*.tiff filter=lfs diff=lfs merge=lfs -text +*.bmp filter=lfs diff=lfs merge=lfs -text diff --git a/cosmos-inference/.github/ISSUE_TEMPLATE/bug_report.md b/cosmos-inference/.github/ISSUE_TEMPLATE/bug_report.md new file mode 100644 index 00000000..2ee4d0ec --- /dev/null +++ b/cosmos-inference/.github/ISSUE_TEMPLATE/bug_report.md @@ -0,0 +1,65 @@ +--- +name: Bug Report +about: Report a reproducible bug or unexpected behavior +title: "[BUG] " +labels: 'bug' +assignees: + - spectralflight + - jeanachoi + +--- + +## Bug Description + + + +## Reproduction Steps + +```bash +# Minimal command or script to reproduce +``` + +**Reproducibility:** + +- [ ] Always +- [ ] Intermittently (~___% of the time) +- [ ] Only once + +## Expected vs. Actual Behavior + +| | Description | +| ------------ | --------------------------- | +| **Expected** | What you expected to happen | +| **Actual** | What actually happened | + +## Outputs + +
+<details> +<summary>Error / Stack Trace</summary> + + + +</details>
+ +
+<details> +<summary>Log Files</summary> + + + +</details>
+ +## System Information + +| Field | Value | +| ---------------------------- | ------------------------------------------- | +| **Environment** | | +| **Hardware** | | +| **OS** | | +| **GPU Driver** | | +| **CUDA Version** | | +| **Python Version** | | +| **Package Version / Commit** | | + +## Additional Context + + diff --git a/cosmos-inference/.github/workflows/pre-commit.yml b/cosmos-inference/.github/workflows/pre-commit.yml new file mode 100644 index 00000000..1b11ee3d --- /dev/null +++ b/cosmos-inference/.github/workflows/pre-commit.yml @@ -0,0 +1,31 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +name: Pre-commit +on: + pull_request: + push: + branches: [main] +jobs: + pre-commit: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v6 + with: + lfs: true + - uses: actions/setup-python@v6 + - uses: astral-sh/setup-uv@v7 + - run: uvx pre-commit@4.5.1 run -a -c ci/.pre-commit-config-base.yaml + - run: uvx pre-commit@4.5.1 run -a diff --git a/cosmos-inference/.gitignore b/cosmos-inference/.gitignore new file mode 100644 index 00000000..3b3378e1 --- /dev/null +++ b/cosmos-inference/.gitignore @@ -0,0 +1,205 @@ +/assets +/credentials +/datasets +/outputs +/tmp +.cuda-name +*.env + +# ------------------------ BELOW IS AUTO-GENERATED FOR PYTHON REPOS ------------------------ + +# Byte-compiled / optimized / DLL files +__pycache__/ +*.py[cod] +*$py.class + +# C extensions +*.so + +# Distribution / packaging +.Python +build/ +develop-eggs/ +dist/ +downloads/ +eggs/ +.eggs/ +lib/ +lib64/ +parts/ +sdist/ +var/ +wheels/ +share/python-wheels/ +*.egg-info/ +.installed.cfg +*.egg +MANIFEST + +# PyInstaller +# Usually these files are written by a python script from a template +# before PyInstaller builds the exe, so as to inject date/other infos into it. 
+*.manifest +*.spec + +# Installer logs +pip-log.txt +pip-delete-this-directory.txt + +# Unit test / coverage reports +htmlcov/ +.tox/ +.nox/ +.coverage +.coverage.* +.cache +nosetests.xml +coverage.xml +*.cover +*.py,cover +.hypothesis/ +.pytest_cache/ +cover/ + +# Translations +*.mo +*.pot + +# Django stuff: +*.log +local_settings.py +db.sqlite3 +db.sqlite3-journal + +# Flask stuff: +instance/ +.webassets-cache + +# Scrapy stuff: +.scrapy + +# Sphinx documentation +docs/_build/ + +# PyBuilder +.pybuilder/ +target/ + +# Jupyter Notebook +.ipynb_checkpoints + +# IPython +profile_default/ +ipython_config.py + +# pyenv +# For a library or package, you might want to ignore these files since the code is +# intended to run in multiple environments; otherwise, check them in: +# .python-version + +# pipenv +# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. +# However, in case of collaboration, if having platform-specific dependencies or dependencies +# having no cross-platform support, pipenv may install dependencies that don't work, or not +# install all needed dependencies. +#Pipfile.lock + +# UV +# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control. +# This is especially recommended for binary packages to ensure reproducibility, and is more +# commonly ignored for libraries. +#uv.lock + +# poetry +# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. +# This is especially recommended for binary packages to ensure reproducibility, and is more +# commonly ignored for libraries. +# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control +#poetry.lock +#poetry.toml + +# pdm +# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. +#pdm.lock +# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it +# in version control. 
+# https://pdm.fming.dev/latest/usage/project/#working-with-version-control +.pdm.toml +.pdm-python +.pdm-build/ + +# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm +__pypackages__/ + +# Celery stuff +celerybeat-schedule +celerybeat.pid + +# SageMath parsed files +*.sage.py + +# Environments +.env +.venv +env/ +venv/ +ENV/ +env.bak/ +venv.bak/ + +# Spyder project settings +.spyderproject +.spyproject + +# Rope project settings +.ropeproject + +# mkdocs documentation +/site + +# mypy +.mypy_cache/ +.dmypy.json +dmypy.json + +# Pyre type checker +.pyre/ + +# pytype static type analyzer +.pytype/ + +# Cython debug symbols +cython_debug/ + +# PyCharm +# JetBrains specific template is maintained in a separate JetBrains.gitignore that can +# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore +# and can be added to the global gitignore or merged into this file. For a more nuclear +# option (not recommended) you can uncomment the following to ignore the entire idea folder. +#.idea/ + +# Abstra +# Abstra is an AI-powered process automation framework. +# Ignore directories containing user credentials, local state, and settings. +# Learn more at https://abstra.io/docs +.abstra/ + +# Visual Studio Code +# Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore +# that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore +# and can be added to the global gitignore or merged into this file. However, if you prefer, +# you could uncomment the following to ignore the entire vscode folder +# .vscode/ + +# Ruff stuff: +.ruff_cache/ + +# PyPI configuration file +.pypirc + +# Cursor +# Cursor is an AI-powered code editor. `.cursorignore` specifies files/directories to +# exclude from AI features like autocomplete and code analysis. 
Recommended for sensitive data +# refer to https://docs.cursor.com/context/ignore-files +.cursorignore +.cursorindexingignore diff --git a/cosmos-inference/.gitleaks.toml b/cosmos-inference/.gitleaks.toml new file mode 100644 index 00000000..cf8f9ea5 --- /dev/null +++ b/cosmos-inference/.gitleaks.toml @@ -0,0 +1,19 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +[[allowlists]] +regexes = [ + '''Qwen3MoeForCausalLM''' +] diff --git a/cosmos-inference/.pre-commit-config.yaml b/cosmos-inference/.pre-commit-config.yaml new file mode 100644 index 00000000..6baa0ec9 --- /dev/null +++ b/cosmos-inference/.pre-commit-config.yaml @@ -0,0 +1,67 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+default_language_version: + node: 25.2.1 + python: python3.13 +exclude: (?x)( + ^tests/data/ + ) +repos: + - repo: https://github.com/google/addlicense + rev: v1.2.0 + hooks: + - id: addlicense + args: ["-f", "ci/license.txt"] + - repo: https://github.com/jsh9/markdown-toc-creator + rev: 0.1.3 + hooks: + - id: markdown-toc-creator + args: ["--config=ci/.markdown-toc-creator.toml"] + - repo: https://github.com/rvben/rumdl-pre-commit + rev: v0.1.62 + hooks: + - id: rumdl-fmt + - repo: https://github.com/pre-commit/pre-commit-hooks + rev: v6.0.0 + hooks: + - id: check-symlinks + - id: check-executables-have-shebangs + exclude: /_src/ + - id: check-shebang-scripts-are-executable + exclude: /_src/ + - repo: local + hooks: + - id: uv-lock + name: Generate uv lock files for projects + entry: ./ci/uv_lock.sh + language: script + files: pyproject\.toml$ + - id: uv-lock-script + name: Generate uv lock files for scripts + entry: ./ci/uv_lock_script.sh + language: script + types: [python] + - repo: https://github.com/tcort/markdown-link-check + rev: v3.14.2 + hooks: + - alias: link-check + name: link check + id: markdown-link-check + args: [--config, "ci/.link-check.json", --quiet] + stages: [manual] + exclude: (?x)( + \bATTRIBUTIONS\b| + /_src/ + ) diff --git a/cosmos-inference/.pytest.toml b/cosmos-inference/.pytest.toml new file mode 100644 index 00000000..aae7f969 --- /dev/null +++ b/cosmos-inference/.pytest.toml @@ -0,0 +1,47 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +[pytest] +python_files = [ + "*_test.py", +] +norecursedirs = [ + "_src", + "cosmos3._src.imaginaire", + "packages", + "projects", +] +addopts = [ + "--suppress-no-test-exit-code", +] +filterwarnings = [ + "ignore::DeprecationWarning", + "ignore::FutureWarning", +] +markers = [ + "manual: Test requires --manual.", + "level(l): Test level in [0, 1, 2].", + "gpus(n): Test requires GPUs.", +] + +[pytest_env] +COSMOS_VERBOSE = { value = "0", skip_if_set = true } +CUDA_VISIBLE_DEVICES = { unset = true } +PYTORCH_CUDA_ALLOC_CONF = "expandable_segments:True" # Reduce chance of OOM errors +# Limit threading to reduce contention +MKL_NUM_THREADS = "1" +NUMEXPR_NUM_THREADS = "1" +OMP_NUM_THREADS = "1" +OPENBLAS_NUM_THREADS = "1" diff --git a/cosmos-inference/.python-version b/cosmos-inference/.python-version new file mode 100644 index 00000000..24ee5b1b --- /dev/null +++ b/cosmos-inference/.python-version @@ -0,0 +1 @@ +3.13 diff --git a/cosmos-inference/.ruff.toml b/cosmos-inference/.ruff.toml new file mode 100644 index 00000000..30bebd5b --- /dev/null +++ b/cosmos-inference/.ruff.toml @@ -0,0 +1,36 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +line-length = 120 +target-version = "py310" + +[lint] +select = [ + "E", # pycodestyle errors + "F", # pyflakes + "I", # isort + "TID252", # relative-imports + "T10", # debugger +] +ignore = [ + "E402", # module-import-not-at-top-of-file + "E501", # line-too-long + "E721", # type-comparison + "E741", # ambiguous-variable-name + "F541", # f-string-missing-placeholders + "F811", # redefined-while-unused + "F841", # unused-variable +] +fixable = ["ALL"] diff --git a/cosmos-inference/AGENTS.md b/cosmos-inference/AGENTS.md new file mode 100644 index 00000000..90f515cb --- /dev/null +++ b/cosmos-inference/AGENTS.md @@ -0,0 +1,72 @@ +# AGENTS.md — Cosmos3 Package + +Read this file first — it is the canonical map for navigating the Cosmos3 codebase and stays up to date. + +Cosmos3 is a Mixture-of-Transformer (MoT) world foundation model supporting text-to-image, text-to-video, and image-to-video generation. This package covers inference, online serving, and the public API surface. + +> All paths below are relative to the repository root (the directory containing `pyproject.toml` and the `cosmos3/` Python package). + +## Commands + +| Task | Command | +| ---------------------- | --------------------------------------------------- | +| Lint | `uv run ruff check .` | +| Format check | `uv run ruff format --check .` | +| Auto-fix lint + format | `uv run ruff check --fix . 
&& uv run ruff format .` | +| Type-check | `uv run pyrefly check` | +| Test (all) | `uv run pytest` | +| Test (single file) | `uv run pytest --capture=no ` | + +Config files: `.ruff.toml` (ruff), `pyrefly.toml` (pyrefly). + +## Rules + +- Always answer questions with references to code or documentation in `file:line` format. +- When unsure, point the user to the closest doc rather than guessing. +- Keep this file short. Link out to skills and docs for detail — this file is included in every prompt. + +## Key File Locations + +| What | Where | +| ------------------------ | --------------------------------------------------------------------------------------------- | +| CLI entry point | `cosmos3/scripts/inference.py` | +| Args / param definitions | `cosmos3/args.py` → `SamplingArgs`, `SamplingOverrides`, `OmniSampleArgs`, `OmniSetupArgs` | +| Per-modality defaults | `cosmos3/defaults//sample_args.json` (modes: `text2image`, `text2video`, `image2video`) | +| Model / inference core | `cosmos3/model.py`, `cosmos3/inference.py` | +| Feature flags | `cosmos3/flags.py` | +| Ray serving configs | `cosmos3/ray/configs/latency.yaml`, `cosmos3/ray/configs/throughput.yaml` | +| Example inputs | `inputs/omni/t2i.json`, `inputs/omni/t2v.json`, `inputs/omni/i2v.json` | + +For the full config-defaults resolution chain, modality tables, and "where to make changes" guidance, see the **cosmos3-codebase-nav** skill (`.agents/skills/cosmos3-codebase-nav/SKILL.md`). 
+ +## Documentation + +| Doc | What it covers | +| ------------------------------------------------------ | ----------------------------------------------------- | +| [docs/setup.md](./docs/setup.md) | Installation, environment, NGC container, checkpoints | +| [docs/prompting.md](./docs/prompting.md) | Prompt engineering, upsampling with vLLM | +| [docs/inference.md](./docs/inference.md) | Sample arguments, default values, custom defaults | +| [docs/inference_online.md](./docs/inference_online.md) | Online serving with Ray Serve and Gradio | +| [docs/faq.md](./docs/faq.md) | FAQ, tips, and troubleshooting | + +## Common Tasks + +| Task | Command | +| ----------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Single-GPU inference | `python -m cosmos3.scripts.inference -i inputs/omni/t2v.json -o outputs/ --checkpoint-path Cosmos3-Nano` | +| Multi-GPU inference | `torchrun --nproc-per-node=4 -m cosmos3.scripts.inference --parallelism-preset=latency -i inputs/omni/t2v.json -o outputs/ --checkpoint-path Cosmos3-Nano` | +| Start online Ray server | `python -m cosmos3.ray.serve --parallelism-preset=latency -o outputs/ray_serve --checkpoint-path Cosmos3-Nano` | +| Launch Gradio UI | `python -m cosmos3.ray.gradio --port=8080` | +| See all CLI flags | `python -m cosmos3.scripts.inference --help` | + +## Inference + +For full parameter reference, input formats, parallelism presets, and online serving, read the **cosmos3-inference** skill (`.agents/skills/cosmos3-inference/SKILL.md`). + +Key things not obvious from the CLI help: + +- **NGC/PyTorch containers**: run `export LD_LIBRARY_PATH=''` before any `python` call or you'll hit a `torch._C` import error. See `docs/setup.md` § PyTorch Import Issue. +- **Reproducibility**: always pass `--seed `. Without it a random seed is used each run. 
+- **JSON paths**: relative paths inside input JSON files resolve relative to the JSON file's directory, not the working directory. +- **Resume**: re-running the same command skips already-generated outputs automatically. +- **Parameters / defaults**: `docs/inference.md` is the reference for all sampling args and their defaults. diff --git a/cosmos-inference/ATTRIBUTIONS.md b/cosmos-inference/ATTRIBUTIONS.md new file mode 100644 index 00000000..f832c49f --- /dev/null +++ b/cosmos-inference/ATTRIBUTIONS.md @@ -0,0 +1,57745 @@ +DISTS-pytorch +MIT +https://github.com/dingkeyan93/DISTS +MIT License + +Copyright (c) 2020 Keyan Ding + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +Deprecated +MIT License +https://github.com/laurent-laporte-pro/deprecated +The MIT License (MIT) + +Copyright (c) 2017 Laurent LAPORTE + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +Flask +BSD-3-Clause +https://github.com/pallets/flask/ +Copyright 2010 Pallets + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +GitPython +BSD-3-Clause +https://github.com/gitpython-developers/GitPython +Copyright (C) 2008, 2009 Michael Trier and contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +* Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +* Neither the name of the GitPython project nor the names of +its contributors may be used to endorse or promote products derived +from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +ImageIO +BSD-2-Clause +https://github.com/imageio/imageio +Copyright (c) 2014-2025, imageio developers +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + + +Jinja2 +BSD License +https://github.com/pallets/jinja/ +Copyright 2007 Pallets + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +Markdown +BSD-3-Clause +https://Python-Markdown.github.io/ +BSD 3-Clause License + +Copyright 2007, 2008 The Python Markdown Project (v. 1.7 and later) +Copyright 2004, 2005, 2006 Yuri Takhteyev (v. 0.2-1.6b) +Copyright 2004 Manfred Stienstra (the original version) + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. 
Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +MarkupSafe +BSD-3-Clause +https://github.com/pallets/markupsafe/ +Copyright 2010 Pallets + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. 
Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +NATTEN +UNKNOWN +https://natten.org +MIT License + +Copyright (c) 2022 - 2026 Ali Hassani. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +PyDispatcher +BSD License +https://github.com/mcfletch/pydispatcher +UNKNOWN + +PyOpenGL +BSD License +https://mcfletch.github.io/pyopengl/ +UNKNOWN + +PyYAML +MIT License +https://pyyaml.org/ +Copyright (c) 2017-2021 Ingy döt Net +Copyright (c) 2006-2016 Kirill Simonov + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +Pygments +BSD License +https://pygments.org +Copyright (c) 2006-2022 by the respective authors (see AUTHORS file). +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +* Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +SecretStorage +BSD-3-Clause +https://github.com/mitya57/secretstorage +Copyright 2012-2025 Dmitry Shachnev +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. +3. 
Neither the name of the University nor the names of its contributors may be + used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY +DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +Send2Trash +BSD-3-Clause +https://github.com/arsenetar/send2trash +Copyright (c) 2017, Virgil Dupras +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. + * Neither the name of Hardcoded Software Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +Werkzeug +BSD-3-Clause +https://github.com/pallets/werkzeug/ +Copyright 2007 Pallets + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +absl-py +Apache-2.0 +https://github.com/abseil/abseil-py + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +accelerate +Apache Software License +https://github.com/huggingface/accelerate + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +aioboto3 +Apache Software License +https://github.com/terricain/aioboto3 +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2015-2016 Nikolai Novik + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +aiobotocore +Apache-2.0 +https://github.com/aio-libs/aiobotocore +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2015-2016 Nikolai Novik + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +aiofiles +Apache Software License +https://github.com/Tinche/aiofiles +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright {yyyy} {name of copyright owner} + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + + +aiohappyeyeballs +Python Software Foundation License +https://github.com/aio-libs/aiohappyeyeballs +A. HISTORY OF THE SOFTWARE +========================== + +Python was created in the early 1990s by Guido van Rossum at Stichting +Mathematisch Centrum (CWI, see https://www.cwi.nl) in the Netherlands +as a successor of a language called ABC. Guido remains Python's +principal author, although it includes many contributions from others. + +In 1995, Guido continued his work on Python at the Corporation for +National Research Initiatives (CNRI, see https://www.cnri.reston.va.us) +in Reston, Virginia where he released several versions of the +software. + +In May 2000, Guido and the Python core development team moved to +BeOpen.com to form the BeOpen PythonLabs team. In October of the same +year, the PythonLabs team moved to Digital Creations, which became +Zope Corporation. In 2001, the Python Software Foundation (PSF, see +https://www.python.org/psf/) was formed, a non-profit organization +created specifically to own Python-related Intellectual Property. +Zope Corporation was a sponsoring member of the PSF. + +All Python releases are Open Source (see https://opensource.org for +the Open Source Definition). Historically, most, but not all, Python +releases have also been GPL-compatible; the table below summarizes +the various releases. 
+ + Release Derived Year Owner GPL- + from compatible? (1) + + 0.9.0 thru 1.2 1991-1995 CWI yes + 1.3 thru 1.5.2 1.2 1995-1999 CNRI yes + 1.6 1.5.2 2000 CNRI no + 2.0 1.6 2000 BeOpen.com no + 1.6.1 1.6 2001 CNRI yes (2) + 2.1 2.0+1.6.1 2001 PSF no + 2.0.1 2.0+1.6.1 2001 PSF yes + 2.1.1 2.1+2.0.1 2001 PSF yes + 2.1.2 2.1.1 2002 PSF yes + 2.1.3 2.1.2 2002 PSF yes + 2.2 and above 2.1.1 2001-now PSF yes + +Footnotes: + +(1) GPL-compatible doesn't mean that we're distributing Python under + the GPL. All Python licenses, unlike the GPL, let you distribute + a modified version without making your changes open source. The + GPL-compatible licenses make it possible to combine Python with + other software that is released under the GPL; the others don't. + +(2) According to Richard Stallman, 1.6.1 is not GPL-compatible, + because its license has a choice of law clause. According to + CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1 + is "not incompatible" with the GPL. + +Thanks to the many outside volunteers who have worked under Guido's +direction to make these releases possible. + + +B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON +=============================================================== + +Python software and documentation are licensed under the +Python Software Foundation License Version 2. + +Starting with Python 3.8.6, examples, recipes, and other code in +the documentation are dual licensed under the PSF License Version 2 +and the Zero-Clause BSD license. + +Some software incorporated into Python is under different licenses. +The licenses are listed with code falling under that license. + + +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. 
Subject to the terms and conditions of this License Agreement, PSF hereby +grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, +analyze, test, perform and/or display publicly, prepare derivative works, +distribute, and otherwise use Python alone or in any derivative version, +provided, however, that PSF's License Agreement and PSF's notice of copyright, +i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, +2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023 Python Software Foundation; +All Rights Reserved" are retained in Python alone or in any derivative version +prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. 
This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0 +------------------------------------------- + +BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1 + +1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an +office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the +Individual or Organization ("Licensee") accessing and otherwise using +this software in source or binary form and its associated +documentation ("the Software"). + +2. Subject to the terms and conditions of this BeOpen Python License +Agreement, BeOpen hereby grants Licensee a non-exclusive, +royalty-free, world-wide license to reproduce, analyze, test, perform +and/or display publicly, prepare derivative works, distribute, and +otherwise use the Software alone or in any derivative version, +provided, however, that the BeOpen Python License is retained in the +Software, alone or in any derivative version prepared by Licensee. + +3. BeOpen is making the Software available to Licensee on an "AS IS" +basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE +SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS +AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY +DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +5. 
This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +6. This License Agreement shall be governed by and interpreted in all +respects by the law of the State of California, excluding conflict of +law provisions. Nothing in this License Agreement shall be deemed to +create any relationship of agency, partnership, or joint venture +between BeOpen and Licensee. This License Agreement does not grant +permission to use BeOpen trademarks or trade names in a trademark +sense to endorse or promote products or services of Licensee, or any +third party. As an exception, the "BeOpen Python" logos available at +http://www.pythonlabs.com/logos.html may be used according to the +permissions granted on that web page. + +7. By copying, installing or otherwise using the software, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1 +--------------------------------------- + +1. This LICENSE AGREEMENT is between the Corporation for National +Research Initiatives, having an office at 1895 Preston White Drive, +Reston, VA 20191 ("CNRI"), and the Individual or Organization +("Licensee") accessing and otherwise using Python 1.6.1 software in +source or binary form and its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, CNRI +hereby grants Licensee a nonexclusive, royalty-free, world-wide +license to reproduce, analyze, test, perform and/or display publicly, +prepare derivative works, distribute, and otherwise use Python 1.6.1 +alone or in any derivative version, provided, however, that CNRI's +License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) +1995-2001 Corporation for National Research Initiatives; All Rights +Reserved" are retained in Python 1.6.1 alone or in any derivative +version prepared by Licensee. 
Alternately, in lieu of CNRI's License +Agreement, Licensee may substitute the following text (omitting the +quotes): "Python 1.6.1 is made available subject to the terms and +conditions in CNRI's License Agreement. This Agreement together with +Python 1.6.1 may be located on the internet using the following +unique, persistent identifier (known as a handle): 1895.22/1013. This +Agreement may also be obtained from a proxy server on the internet +using the following URL: http://hdl.handle.net/1895.22/1013". + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python 1.6.1 or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python 1.6.1. + +4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS" +basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. This License Agreement shall be governed by the federal +intellectual property law of the United States, including without +limitation the federal copyright law, and, to the extent such +U.S. federal law does not apply, by the law of the Commonwealth of +Virginia, excluding Virginia's conflict of law provisions. 
+Notwithstanding the foregoing, with regard to derivative works based +on Python 1.6.1 that incorporate non-separable material that was +previously distributed under the GNU General Public License (GPL), the +law of the Commonwealth of Virginia shall govern this License +Agreement only as to issues arising under or with respect to +Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this +License Agreement shall be deemed to create any relationship of +agency, partnership, or joint venture between CNRI and Licensee. This +License Agreement does not grant permission to use CNRI trademarks or +trade name in a trademark sense to endorse or promote products or +services of Licensee, or any third party. + +8. By clicking on the "ACCEPT" button where indicated, or by copying, +installing or otherwise using Python 1.6.1, Licensee agrees to be +bound by the terms and conditions of this License Agreement. + + ACCEPT + + +CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2 +-------------------------------------------------- + +Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, +The Netherlands. All rights reserved. + +Permission to use, copy, modify, and distribute this software and its +documentation for any purpose and without fee is hereby granted, +provided that the above copyright notice appear in all copies and that +both that copyright notice and this permission notice appear in +supporting documentation, and that the name of Stichting Mathematisch +Centrum or CWI not be used in advertising or publicity pertaining to +distribution of the software without specific, written prior +permission. 
+ +STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO +THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE +FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT +OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + +ZERO-CLAUSE BSD LICENSE FOR CODE IN THE PYTHON DOCUMENTATION +---------------------------------------------------------------------- + +Permission to use, copy, modify, and/or distribute this software for any +purpose with or without fee is hereby granted. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH +REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY +AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, +INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM +LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR +OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + + +aiohttp +Apache-2.0 AND MIT +https://github.com/aio-libs/aiohttp + Copyright aio-libs contributors. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+ + +aiohttp-cors +Apache Software License +https://github.com/aio-libs/aiohttp-cors + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). 
+ + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. 
Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative 
Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2015-2018 Vladimir Rutsky and aio-libs team + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +aioitertools +MIT +https://aioitertools.omnilib.dev/en/latest/changelog.html +MIT License + +Copyright (c) 2022 Amethyst Reese + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +aiosignal +Apache Software License +https://github.com/aio-libs/aiosignal +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2013-2019 Nikolay Kim and Andrew Svetlov + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +annotated-doc +MIT +https://github.com/fastapi/annotated-doc +The MIT License (MIT) + +Copyright (c) 2025 Sebastián Ramírez + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +annotated-types +MIT License +https://github.com/annotated-types/annotated-types +The MIT License (MIT) + +Copyright (c) 2022 the contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +antlr4-python3-runtime +BSD +http://www.antlr.org +UNKNOWN + +anyio +MIT License +https://anyio.readthedocs.io/en/stable/versionhistory.html +The MIT License (MIT) + +Copyright (c) 2018 Alex Grönholm + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of +the Software, and to permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS +FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR +COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +argon2-cffi +MIT +https://github.com/hynek/argon2-cffi/blob/main/CHANGELOG.md +The MIT License (MIT) + +Copyright (c) 2015 Hynek Schlawack and the argon2-cffi contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +argon2-cffi-bindings +MIT +https://github.com/hynek/argon2-cffi-bindings/blob/main/CHANGELOG.md +The MIT License (MIT) + +Copyright (c) 2021 Hynek Schlawack + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +arrgh +MIT License + +Copyright (c) 2023 Nicholas Sharp + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +https://github.com/nmwsharp/arrgh + + +arrow +Apache Software License +https://github.com/arrow-py/arrow + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document.
+ + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2023 Chris Smith + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +asttokens +Apache 2.0 +https://github.com/gristlabs/asttokens + Apache License + Version 2.0, January 2004 + https://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright {yyyy} {name of copyright owner} + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + https://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +astunparse +BSD License +https://github.com/simonpercivall/astunparse +LICENSE +======= + +Copyright (c) 2014, Simon Percivall +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. + +* Neither the name of AST Unparser nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, PSF hereby +grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, +analyze, test, perform and/or display publicly, prepare derivative works, +distribute, and otherwise use Python alone or in any derivative version, +provided, however, that PSF's License Agreement and PSF's notice of copyright, +i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, +2011, 2012, 2013, 2014 Python Software Foundation; All Rights Reserved" are retained +in Python alone or in any derivative version prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. 
BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +async-lru +MIT License +https://github.com/aio-libs/async-lru +The MIT License + +Copyright (c) 2018 aio-libs team https://github.com/aio-libs/ +Copyright (c) 2017 Ocean S. A. https://ocean.io/ +Copyright (c) 2016-2017 WikiBusiness Corporation http://wikibusiness.org/ + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +attrs +MIT +https://www.attrs.org/en/stable/changelog.html +The MIT License (MIT) + +Copyright (c) 2015 Hynek Schlawack and the attrs contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +audioop-lts +PSF-2.0 +https://github.com/AbstractUmbra/audioop +A. HISTORY OF THE SOFTWARE +========================== + +Python was created in the early 1990s by Guido van Rossum at Stichting +Mathematisch Centrum (CWI, see https://www.cwi.nl) in the Netherlands +as a successor of a language called ABC. 
Guido remains Python's +principal author, although it includes many contributions from others. + +In 1995, Guido continued his work on Python at the Corporation for +National Research Initiatives (CNRI, see https://www.cnri.reston.va.us) +in Reston, Virginia where he released several versions of the +software. + +In May 2000, Guido and the Python core development team moved to +BeOpen.com to form the BeOpen PythonLabs team. In October of the same +year, the PythonLabs team moved to Digital Creations, which became +Zope Corporation. In 2001, the Python Software Foundation (PSF, see +https://www.python.org/psf/) was formed, a non-profit organization +created specifically to own Python-related Intellectual Property. +Zope Corporation was a sponsoring member of the PSF. + +All Python releases are Open Source (see https://opensource.org for +the Open Source Definition). Historically, most, but not all, Python +releases have also been GPL-compatible; the table below summarizes +the various releases. + + Release Derived Year Owner GPL- + from compatible? (1) + + 0.9.0 thru 1.2 1991-1995 CWI yes + 1.3 thru 1.5.2 1.2 1995-1999 CNRI yes + 1.6 1.5.2 2000 CNRI no + 2.0 1.6 2000 BeOpen.com no + 1.6.1 1.6 2001 CNRI yes (2) + 2.1 2.0+1.6.1 2001 PSF no + 2.0.1 2.0+1.6.1 2001 PSF yes + 2.1.1 2.1+2.0.1 2001 PSF yes + 2.1.2 2.1.1 2002 PSF yes + 2.1.3 2.1.2 2002 PSF yes + 2.2 and above 2.1.1 2001-now PSF yes + +Footnotes: + +(1) GPL-compatible doesn't mean that we're distributing Python under + the GPL. All Python licenses, unlike the GPL, let you distribute + a modified version without making your changes open source. The + GPL-compatible licenses make it possible to combine Python with + other software that is released under the GPL; the others don't. + +(2) According to Richard Stallman, 1.6.1 is not GPL-compatible, + because its license has a choice of law clause. 
According to + CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1 + is "not incompatible" with the GPL. + +Thanks to the many outside volunteers who have worked under Guido's +direction to make these releases possible. + + +B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON +=============================================================== + +Python software and documentation are licensed under the +Python Software Foundation License Version 2. + +Starting with Python 3.8.6, examples, recipes, and other code in +the documentation are dual licensed under the PSF License Version 2 +and the Zero-Clause BSD license. + +Some software incorporated into Python is under different licenses. +The licenses are listed with code falling under that license. + + +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, PSF hereby +grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, +analyze, test, perform and/or display publicly, prepare derivative works, +distribute, and otherwise use Python alone or in any derivative version, +provided, however, that PSF's License Agreement and PSF's notice of copyright, +i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, +2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023 Python Software Foundation; +All Rights Reserved" are retained in Python alone or in any derivative version +prepared by Licensee. + +3. 
In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0 +------------------------------------------- + +BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1 + +1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an +office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the +Individual or Organization ("Licensee") accessing and otherwise using +this software in source or binary form and its associated +documentation ("the Software"). + +2. 
Subject to the terms and conditions of this BeOpen Python License +Agreement, BeOpen hereby grants Licensee a non-exclusive, +royalty-free, world-wide license to reproduce, analyze, test, perform +and/or display publicly, prepare derivative works, distribute, and +otherwise use the Software alone or in any derivative version, +provided, however, that the BeOpen Python License is retained in the +Software, alone or in any derivative version prepared by Licensee. + +3. BeOpen is making the Software available to Licensee on an "AS IS" +basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE +SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS +AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY +DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +5. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +6. This License Agreement shall be governed by and interpreted in all +respects by the law of the State of California, excluding conflict of +law provisions. Nothing in this License Agreement shall be deemed to +create any relationship of agency, partnership, or joint venture +between BeOpen and Licensee. This License Agreement does not grant +permission to use BeOpen trademarks or trade names in a trademark +sense to endorse or promote products or services of Licensee, or any +third party. As an exception, the "BeOpen Python" logos available at +http://www.pythonlabs.com/logos.html may be used according to the +permissions granted on that web page. + +7. 
By copying, installing or otherwise using the software, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1 +--------------------------------------- + +1. This LICENSE AGREEMENT is between the Corporation for National +Research Initiatives, having an office at 1895 Preston White Drive, +Reston, VA 20191 ("CNRI"), and the Individual or Organization +("Licensee") accessing and otherwise using Python 1.6.1 software in +source or binary form and its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, CNRI +hereby grants Licensee a nonexclusive, royalty-free, world-wide +license to reproduce, analyze, test, perform and/or display publicly, +prepare derivative works, distribute, and otherwise use Python 1.6.1 +alone or in any derivative version, provided, however, that CNRI's +License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) +1995-2001 Corporation for National Research Initiatives; All Rights +Reserved" are retained in Python 1.6.1 alone or in any derivative +version prepared by Licensee. Alternately, in lieu of CNRI's License +Agreement, Licensee may substitute the following text (omitting the +quotes): "Python 1.6.1 is made available subject to the terms and +conditions in CNRI's License Agreement. This Agreement together with +Python 1.6.1 may be located on the internet using the following +unique, persistent identifier (known as a handle): 1895.22/1013. This +Agreement may also be obtained from a proxy server on the internet +using the following URL: http://hdl.handle.net/1895.22/1013". + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python 1.6.1 or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python 1.6.1. + +4. 
CNRI is making Python 1.6.1 available to Licensee on an "AS IS" +basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. This License Agreement shall be governed by the federal +intellectual property law of the United States, including without +limitation the federal copyright law, and, to the extent such +U.S. federal law does not apply, by the law of the Commonwealth of +Virginia, excluding Virginia's conflict of law provisions. +Notwithstanding the foregoing, with regard to derivative works based +on Python 1.6.1 that incorporate non-separable material that was +previously distributed under the GNU General Public License (GPL), the +law of the Commonwealth of Virginia shall govern this License +Agreement only as to issues arising under or with respect to +Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this +License Agreement shall be deemed to create any relationship of +agency, partnership, or joint venture between CNRI and Licensee. This +License Agreement does not grant permission to use CNRI trademarks or +trade name in a trademark sense to endorse or promote products or +services of Licensee, or any third party. + +8. 
By clicking on the "ACCEPT" button where indicated, or by copying, +installing or otherwise using Python 1.6.1, Licensee agrees to be +bound by the terms and conditions of this License Agreement. + + ACCEPT + + +CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2 +-------------------------------------------------- + +Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, +The Netherlands. All rights reserved. + +Permission to use, copy, modify, and distribute this software and its +documentation for any purpose and without fee is hereby granted, +provided that the above copyright notice appear in all copies and that +both that copyright notice and this permission notice appear in +supporting documentation, and that the name of Stichting Mathematisch +Centrum or CWI not be used in advertising or publicity pertaining to +distribution of the software without specific, written prior +permission. + +STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO +THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE +FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT +OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + +ZERO-CLAUSE BSD LICENSE FOR CODE IN THE PYTHON DOCUMENTATION +---------------------------------------------------------------------- + +Permission to use, copy, modify, and/or distribute this software for any +purpose with or without fee is hereby granted. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH +REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY +AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, +INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM +LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR +OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + + +av +BSD-3-Clause +https://pyav.basswood-io.com +Copyright retained by original committers. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + * Neither the name of the project nor the names of its contributors may be + used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY +OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, +EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +babel +BSD License +https://babel.pocoo.org/ +Copyright (c) 2013-2026 by the Babel Team, see AUTHORS for more information. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + + 1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + 3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +backports.zstd +PSF-2.0 +https://github.com/rogdham/backports.zstd +A. HISTORY OF THE SOFTWARE +========================== + +Python was created in the early 1990s by Guido van Rossum at Stichting +Mathematisch Centrum (CWI, see https://www.cwi.nl) in the Netherlands +as a successor of a language called ABC. Guido remains Python's +principal author, although it includes many contributions from others. 
+ +In 1995, Guido continued his work on Python at the Corporation for +National Research Initiatives (CNRI, see https://www.cnri.reston.va.us) +in Reston, Virginia where he released several versions of the +software. + +In May 2000, Guido and the Python core development team moved to +BeOpen.com to form the BeOpen PythonLabs team. In October of the same +year, the PythonLabs team moved to Digital Creations, which became +Zope Corporation. In 2001, the Python Software Foundation (PSF, see +https://www.python.org/psf/) was formed, a non-profit organization +created specifically to own Python-related Intellectual Property. +Zope Corporation was a sponsoring member of the PSF. + +All Python releases are Open Source (see https://opensource.org for +the Open Source Definition). Historically, most, but not all, Python +releases have also been GPL-compatible; the table below summarizes +the various releases. + + Release Derived Year Owner GPL- + from compatible? (1) + + 0.9.0 thru 1.2 1991-1995 CWI yes + 1.3 thru 1.5.2 1.2 1995-1999 CNRI yes + 1.6 1.5.2 2000 CNRI no + 2.0 1.6 2000 BeOpen.com no + 1.6.1 1.6 2001 CNRI yes (2) + 2.1 2.0+1.6.1 2001 PSF no + 2.0.1 2.0+1.6.1 2001 PSF yes + 2.1.1 2.1+2.0.1 2001 PSF yes + 2.1.2 2.1.1 2002 PSF yes + 2.1.3 2.1.2 2002 PSF yes + 2.2 and above 2.1.1 2001-now PSF yes + +Footnotes: + +(1) GPL-compatible doesn't mean that we're distributing Python under + the GPL. All Python licenses, unlike the GPL, let you distribute + a modified version without making your changes open source. The + GPL-compatible licenses make it possible to combine Python with + other software that is released under the GPL; the others don't. + +(2) According to Richard Stallman, 1.6.1 is not GPL-compatible, + because its license has a choice of law clause. According to + CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1 + is "not incompatible" with the GPL. 
+ +Thanks to the many outside volunteers who have worked under Guido's +direction to make these releases possible. + + +B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON +=============================================================== + +Python software and documentation are licensed under the +Python Software Foundation License Version 2. + +Starting with Python 3.8.6, examples, recipes, and other code in +the documentation are dual licensed under the PSF License Version 2 +and the Zero-Clause BSD license. + +Some software incorporated into Python is under different licenses. +The licenses are listed with code falling under that license. + + +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, PSF hereby +grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, +analyze, test, perform and/or display publicly, prepare derivative works, +distribute, and otherwise use Python alone or in any derivative version, +provided, however, that PSF's License Agreement and PSF's notice of copyright, +i.e., "Copyright (c) 2001 Python Software Foundation; All Rights Reserved" +are retained in Python alone or in any derivative version prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. 
BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0 +------------------------------------------- + +BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1 + +1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an +office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the +Individual or Organization ("Licensee") accessing and otherwise using +this software in source or binary form and its associated +documentation ("the Software"). + +2. 
Subject to the terms and conditions of this BeOpen Python License +Agreement, BeOpen hereby grants Licensee a non-exclusive, +royalty-free, world-wide license to reproduce, analyze, test, perform +and/or display publicly, prepare derivative works, distribute, and +otherwise use the Software alone or in any derivative version, +provided, however, that the BeOpen Python License is retained in the +Software, alone or in any derivative version prepared by Licensee. + +3. BeOpen is making the Software available to Licensee on an "AS IS" +basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE +SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS +AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY +DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +5. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +6. This License Agreement shall be governed by and interpreted in all +respects by the law of the State of California, excluding conflict of +law provisions. Nothing in this License Agreement shall be deemed to +create any relationship of agency, partnership, or joint venture +between BeOpen and Licensee. This License Agreement does not grant +permission to use BeOpen trademarks or trade names in a trademark +sense to endorse or promote products or services of Licensee, or any +third party. As an exception, the "BeOpen Python" logos available at +http://www.pythonlabs.com/logos.html may be used according to the +permissions granted on that web page. + +7. 
By copying, installing or otherwise using the software, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1 +--------------------------------------- + +1. This LICENSE AGREEMENT is between the Corporation for National +Research Initiatives, having an office at 1895 Preston White Drive, +Reston, VA 20191 ("CNRI"), and the Individual or Organization +("Licensee") accessing and otherwise using Python 1.6.1 software in +source or binary form and its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, CNRI +hereby grants Licensee a nonexclusive, royalty-free, world-wide +license to reproduce, analyze, test, perform and/or display publicly, +prepare derivative works, distribute, and otherwise use Python 1.6.1 +alone or in any derivative version, provided, however, that CNRI's +License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) +1995-2001 Corporation for National Research Initiatives; All Rights +Reserved" are retained in Python 1.6.1 alone or in any derivative +version prepared by Licensee. Alternately, in lieu of CNRI's License +Agreement, Licensee may substitute the following text (omitting the +quotes): "Python 1.6.1 is made available subject to the terms and +conditions in CNRI's License Agreement. This Agreement together with +Python 1.6.1 may be located on the internet using the following +unique, persistent identifier (known as a handle): 1895.22/1013. This +Agreement may also be obtained from a proxy server on the internet +using the following URL: http://hdl.handle.net/1895.22/1013". + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python 1.6.1 or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python 1.6.1. + +4. 
CNRI is making Python 1.6.1 available to Licensee on an "AS IS" +basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. This License Agreement shall be governed by the federal +intellectual property law of the United States, including without +limitation the federal copyright law, and, to the extent such +U.S. federal law does not apply, by the law of the Commonwealth of +Virginia, excluding Virginia's conflict of law provisions. +Notwithstanding the foregoing, with regard to derivative works based +on Python 1.6.1 that incorporate non-separable material that was +previously distributed under the GNU General Public License (GPL), the +law of the Commonwealth of Virginia shall govern this License +Agreement only as to issues arising under or with respect to +Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this +License Agreement shall be deemed to create any relationship of +agency, partnership, or joint venture between CNRI and Licensee. This +License Agreement does not grant permission to use CNRI trademarks or +trade name in a trademark sense to endorse or promote products or +services of Licensee, or any third party. + +8. 
By clicking on the "ACCEPT" button where indicated, or by copying, +installing or otherwise using Python 1.6.1, Licensee agrees to be +bound by the terms and conditions of this License Agreement. + + ACCEPT + + +CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2 +-------------------------------------------------- + +Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, +The Netherlands. All rights reserved. + +Permission to use, copy, modify, and distribute this software and its +documentation for any purpose and without fee is hereby granted, +provided that the above copyright notice appear in all copies and that +both that copyright notice and this permission notice appear in +supporting documentation, and that the name of Stichting Mathematisch +Centrum or CWI not be used in advertising or publicity pertaining to +distribution of the software without specific, written prior +permission. + +STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO +THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE +FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT +OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + +ZERO-CLAUSE BSD LICENSE FOR CODE IN THE PYTHON DOCUMENTATION +---------------------------------------------------------------------- + +Permission to use, copy, modify, and/or distribute this software for any +purpose with or without fee is hereby granted. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH +REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY +AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, +INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM +LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR +OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + + +beautifulsoup4 +MIT License +https://www.crummy.com/software/BeautifulSoup/bs4/ +Beautiful Soup is made available under the MIT license: + + Copyright (c) Leonard Richardson + + Permission is hereby granted, free of charge, to any person obtaining + a copy of this software and associated documentation files (the + "Software"), to deal in the Software without restriction, including + without limitation the rights to use, copy, modify, merge, publish, + distribute, sublicense, and/or sell copies of the Software, and to + permit persons to whom the Software is furnished to do so, subject to + the following conditions: + + The above copyright notice and this permission notice shall be + included in all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + +Beautiful Soup incorporates code from the html5lib library, which is +also made available under the MIT license. Copyright (c) James Graham +and other contributors + +Beautiful Soup has an optional dependency on the soupsieve library, +which is also made available under the MIT license. 
Copyright (c) +Isaac Muse + + +better-profanity +MIT License +https://github.com/snguyenthanh/better_profanity +Copyright (c) 2018 The Python Packaging Authority + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +bleach +Apache Software License +https://github.com/mozilla/bleach +Copyright (c) 2014-2017, Mozilla Foundation + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+ + +blinker +MIT License +https://github.com/pallets-eco/blinker/ +Copyright 2010 Jason Kirtland + +Permission is hereby granted, free of charge, to any person obtaining a +copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be included +in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +blobfile +Public Domain +https://github.com/blobfile/blobfile +This is free and unencumbered software released into the public domain. + +Anyone is free to copy, modify, publish, use, compile, sell, or +distribute this software, either in source code form or as a compiled +binary, for any purpose, commercial or non-commercial, and by any +means. + +In jurisdictions that recognize copyright laws, the author or authors +of this software dedicate any and all copyright interest in the +software to the public domain. We make this dedication for the benefit +of the public at large and to the detriment of our heirs and +successors. We intend this dedication to be an overt act of +relinquishment in perpetuity of all present and future rights to this +software under copyright law. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR +OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, +ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +OTHER DEALINGS IN THE SOFTWARE. + +For more information, please refer to + +boto3 +Apache-2.0 +https://github.com/boto/boto3 + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + +botocore +Apache-2.0 +https://github.com/boto/botocore + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + +braceexpand +MIT License +https://github.com/trendels/braceexpand +The MIT License (MIT) + +Copyright (c) 2015 Stanis Trendelenburg + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +bracex +MIT +https://github.com/facelessuser/bracex +MIT License + +Copyright (c) 2018 - 2025 Isaac Muse + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +brotli +MIT +https://github.com/google/brotli +Copyright (c) 2009, 2010, 2013-2016 by the Brotli Authors. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +cattrs +MIT License +https://catt.rs + +MIT License + +Copyright (c) 2016, Tin Tvrtković + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + + +certifi +Mozilla Public License 2.0 (MPL 2.0) +https://github.com/certifi/python-certifi +This package contains a modified version of ca-bundle.crt: + +ca-bundle.crt -- Bundle of CA Root Certificates + +This is a bundle of X.509 certificates of public Certificate Authorities +(CA). These were automatically extracted from Mozilla's root certificates +file (certdata.txt). This file can be found in the mozilla source tree: +https://hg.mozilla.org/mozilla-central/file/tip/security/nss/lib/ckfw/builtins/certdata.txt +It contains the certificates in PEM format and therefore +can be directly used with curl / libcurl / php_curl, or with +an Apache+mod_ssl webserver for SSL client authentication. 
+Just configure this file as the SSLCACertificateFile.# + +***** BEGIN LICENSE BLOCK ***** +This Source Code Form is subject to the terms of the Mozilla Public License, +v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain +one at http://mozilla.org/MPL/2.0/. + +***** END LICENSE BLOCK ***** +@(#) $RCSfile: certdata.txt,v $ $Revision: 1.80 $ $Date: 2011/11/03 15:11:58 $ + + +cffi +MIT +https://cffi.readthedocs.io/en/latest/whatsnew.html + +Except when otherwise stated (look for LICENSE files in directories or +information at the beginning of each file) all software and +documentation is licensed as follows: + + MIT No Attribution + + Permission is hereby granted, free of charge, to any person + obtaining a copy of this software and associated documentation + files (the "Software"), to deal in the Software without + restriction, including without limitation the rights to use, + copy, modify, merge, publish, distribute, sublicense, and/or + sell copies of the Software, and to permit persons to whom the + Software is furnished to do so. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + DEALINGS IN THE SOFTWARE. + + + +charset-normalizer +MIT +https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md +MIT License + +Copyright (c) 2025 TAHRI Ahmed R. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +click +BSD-3-Clause +https://github.com/pallets/click/ +Copyright 2014 Pallets + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +cmake +Apache Software License; BSD License +https://cmake.org +Apache License +Version 2.0, January 2004 +http://www.apache.org/licenses/ + +TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + +1. Definitions. + +"License" shall mean the terms and conditions for use, reproduction, and +distribution as defined by Sections 1 through 9 of this document. + +"Licensor" shall mean the copyright owner or entity authorized by the copyright +owner that is granting the License. + +"Legal Entity" shall mean the union of the acting entity and all other entities +that control, are controlled by, or are under common control with that entity. +For the purposes of this definition, "control" means (i) the power, direct or +indirect, to cause the direction or management of such entity, whether by +contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the +outstanding shares, or (iii) beneficial ownership of such entity. + +"You" (or "Your") shall mean an individual or Legal Entity exercising +permissions granted by this License. 
+ +"Source" form shall mean the preferred form for making modifications, including +but not limited to software source code, documentation source, and configuration +files. + +"Object" form shall mean any form resulting from mechanical transformation or +translation of a Source form, including but not limited to compiled object code, +generated documentation, and conversions to other media types. + +"Work" shall mean the work of authorship, whether in Source or Object form, made +available under the License, as indicated by a copyright notice that is included +in or attached to the work (an example is provided in the Appendix below). + +"Derivative Works" shall mean any work, whether in Source or Object form, that +is based on (or derived from) the Work and for which the editorial revisions, +annotations, elaborations, or other modifications represent, as a whole, an +original work of authorship. For the purposes of this License, Derivative Works +shall not include works that remain separable from, or merely link (or bind by +name) to the interfaces of, the Work and Derivative Works thereof. + +"Contribution" shall mean any work of authorship, including the original version +of the Work and any modifications or additions to that Work or Derivative Works +thereof, that is intentionally submitted to Licensor for inclusion in the Work +by the copyright owner or by an individual or Legal Entity authorized to submit +on behalf of the copyright owner. 
For the purposes of this definition, +"submitted" means any form of electronic, verbal, or written communication sent +to the Licensor or its representatives, including but not limited to +communication on electronic mailing lists, source code control systems, and +issue tracking systems that are managed by, or on behalf of, the Licensor for +the purpose of discussing and improving the Work, but excluding communication +that is conspicuously marked or otherwise designated in writing by the copyright +owner as "Not a Contribution." + +"Contributor" shall mean Licensor and any individual or Legal Entity on behalf +of whom a Contribution has been received by Licensor and subsequently +incorporated within the Work. + +2. Grant of Copyright License. + +Subject to the terms and conditions of this License, each Contributor hereby +grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, +irrevocable copyright license to reproduce, prepare Derivative Works of, +publicly display, publicly perform, sublicense, and distribute the Work and such +Derivative Works in Source or Object form. + +3. Grant of Patent License. + +Subject to the terms and conditions of this License, each Contributor hereby +grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, +irrevocable (except as stated in this section) patent license to make, have +made, use, offer to sell, sell, import, and otherwise transfer the Work, where +such license applies only to those patent claims licensable by such Contributor +that are necessarily infringed by their Contribution(s) alone or by combination +of their Contribution(s) with the Work to which such Contribution(s) was +submitted. 
If You institute patent litigation against any entity (including a +cross-claim or counterclaim in a lawsuit) alleging that the Work or a +Contribution incorporated within the Work constitutes direct or contributory +patent infringement, then any patent licenses granted to You under this License +for that Work shall terminate as of the date such litigation is filed. + +4. Redistribution. + +You may reproduce and distribute copies of the Work or Derivative Works thereof +in any medium, with or without modifications, and in Source or Object form, +provided that You meet the following conditions: + +You must give any other recipients of the Work or Derivative Works a copy of +this License; and +You must cause any modified files to carry prominent notices stating that You +changed the files; and +You must retain, in the Source form of any Derivative Works that You distribute, +all copyright, patent, trademark, and attribution notices from the Source form +of the Work, excluding those notices that do not pertain to any part of the +Derivative Works; and +If the Work includes a "NOTICE" text file as part of its distribution, then any +Derivative Works that You distribute must include a readable copy of the +attribution notices contained within such NOTICE file, excluding those notices +that do not pertain to any part of the Derivative Works, in at least one of the +following places: within a NOTICE text file distributed as part of the +Derivative Works; within the Source form or documentation, if provided along +with the Derivative Works; or, within a display generated by the Derivative +Works, if and wherever such third-party notices normally appear. The contents of +the NOTICE file are for informational purposes only and do not modify the +License. 
You may add Your own attribution notices within Derivative Works that +You distribute, alongside or as an addendum to the NOTICE text from the Work, +provided that such additional attribution notices cannot be construed as +modifying the License. +You may add Your own copyright statement to Your modifications and may provide +additional or different license terms and conditions for use, reproduction, or +distribution of Your modifications, or for any such Derivative Works as a whole, +provided Your use, reproduction, and distribution of the Work otherwise complies +with the conditions stated in this License. + +5. Submission of Contributions. + +Unless You explicitly state otherwise, any Contribution intentionally submitted +for inclusion in the Work by You to the Licensor shall be under the terms and +conditions of this License, without any additional terms or conditions. +Notwithstanding the above, nothing herein shall supersede or modify the terms of +any separate license agreement you may have executed with Licensor regarding +such Contributions. + +6. Trademarks. + +This License does not grant permission to use the trade names, trademarks, +service marks, or product names of the Licensor, except as required for +reasonable and customary use in describing the origin of the Work and +reproducing the content of the NOTICE file. + +7. Disclaimer of Warranty. + +Unless required by applicable law or agreed to in writing, Licensor provides the +Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, +including, without limitation, any warranties or conditions of TITLE, +NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are +solely responsible for determining the appropriateness of using or +redistributing the Work and assume any risks associated with Your exercise of +permissions under this License. + +8. Limitation of Liability. 
+ +In no event and under no legal theory, whether in tort (including negligence), +contract, or otherwise, unless required by applicable law (such as deliberate +and grossly negligent acts) or agreed to in writing, shall any Contributor be +liable to You for damages, including any direct, indirect, special, incidental, +or consequential damages of any character arising as a result of this License or +out of the use or inability to use the Work (including but not limited to +damages for loss of goodwill, work stoppage, computer failure or malfunction, or +any and all other commercial damages or losses), even if such Contributor has +been advised of the possibility of such damages. + +9. Accepting Warranty or Additional Liability. + +While redistributing the Work or Derivative Works thereof, You may choose to +offer, and charge a fee for, acceptance of support, warranty, indemnity, or +other liability obligations and/or rights consistent with this License. However, +in accepting such obligations, You may act only on Your own behalf and on Your +sole responsibility, not on behalf of any other Contributor, and only if You +agree to indemnify, defend, and hold each Contributor harmless for any liability +incurred by, or claims asserted against, such Contributor by reason of your +accepting any such warranty or additional liability. + +END OF TERMS AND CONDITIONS + +APPENDIX: How to apply the Apache License to your work + +To apply the Apache License to your work, attach the following boilerplate +notice, with the fields enclosed by brackets "[]" replaced with your own +identifying information. (Don't include the brackets!) The text should be +enclosed in the appropriate comment syntax for the file format. We also +recommend that a file or class name and description of purpose be included on +the same "printed page" as the copyright notice for easier identification within +third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +colorful +MIT License +http://github.com/timofurrer/colorful +The MIT License (MIT) + +Copyright (c) 2017 Timo Furrer + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +comm +BSD License +https://github.com/ipython/comm +BSD 3-Clause License + +Copyright (c) 2022, Jupyter +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +contourpy +BSD License +https://github.com/contourpy/contourpy +BSD 3-Clause License + +Copyright (c) 2021-2025, ContourPy Developers. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. 
Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +cosmos3 +Apache Software License +https://research.nvidia.com/labs/dir/cosmos3 + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +coverage +Apache-2.0 +https://github.com/coveragepy/coveragepy + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + +cramjam +MIT +https://github.com/milesgranger/pyrus-cramjam +MIT License + +Copyright (c) 2020 Miles Granger + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +cryptography +Apache-2.0 OR BSD-3-Clause +https://github.com/pyca/cryptography +This software is made available under the terms of *either* of the licenses +found in LICENSE.APACHE or LICENSE.BSD. Contributions to cryptography are made +under the terms of *both* these licenses. + + +cuda-bindings +LicenseRef-NVIDIA-SOFTWARE-LICENSE +https://github.com/NVIDIA/cuda-python +NVIDIA SOFTWARE LICENSE + +This license is a legal agreement between you and NVIDIA Corporation ("NVIDIA") and governs your use of the NVIDIA CUDA Python software and materials provided hereunder ("SOFTWARE"). + +This license can be accepted only by an adult of legal age of majority in the country in which the SOFTWARE is used. 
If you are under the legal age of majority, you must ask your parent or legal guardian to consent to this license. By taking delivery of the SOFTWARE, you affirm that you have reached the legal age of majority, you accept the terms of this license, and you take legal and financial responsibility for the actions of your permitted users. + +You agree to use the SOFTWARE only for purposes that are permitted by (a) this license, and (b) any applicable law, regulation or generally accepted practices or guidelines in the relevant jurisdictions. + +1. LICENSE. Subject to the terms of this license, NVIDIA grants you a non-exclusive limited license to: (a) install and use the SOFTWARE, and (b) distribute the SOFTWARE subject to the distribution requirements described in this license. NVIDIA reserves all rights, title and interest in and to the SOFTWARE not expressly granted to you under this license. + +2. DISTRIBUTION REQUIREMENTS. These are the distribution requirements for you to exercise the distribution grant: +a. The terms under which you distribute the SOFTWARE must be consistent with the terms of this license, including (without limitation) terms relating to the license grant and license restrictions and protection of NVIDIA's intellectual property rights. +b. You agree to notify NVIDIA in writing of any known or suspected distribution or use of the SOFTWARE not in compliance with the requirements of this license, and to enforce the terms of your agreements with respect to distributed SOFTWARE. + +3. LIMITATIONS. Your license to use the SOFTWARE is restricted as follows: +a. The SOFTWARE is licensed for you to develop applications only for use in systems with NVIDIA GPUs. +b. You may not reverse engineer, decompile or disassemble, or remove copyright or other proprietary notices from any portion of the SOFTWARE or copies of the SOFTWARE. +c. You may not modify or create derivative works of any portion of the SOFTWARE. +d. 
You may not bypass, disable, or circumvent any technical measure, encryption, security, digital rights management or authentication mechanism in the SOFTWARE. +e. You may not use the SOFTWARE in any manner that would cause it to become subject to an open source software license. As examples, licenses that require as a condition of use, modification, and/or distribution that the SOFTWARE be (i) disclosed or distributed in source code form; (ii) licensed for the purpose of making derivative works; or (iii) redistributable at no charge. +f. Unless you have an agreement with NVIDIA for this purpose, you may not use the SOFTWARE with any system or application where the use or failure of the system or application can reasonably be expected to threaten or result in personal injury, death, or catastrophic loss. Examples include use in avionics, navigation, military, medical, life support or other life critical applications. NVIDIA does not design, test or manufacture the SOFTWARE for these critical uses and NVIDIA shall not be liable to you or any third party, in whole or in part, for any claims or damages arising from such uses. +g. You agree to defend, indemnify and hold harmless NVIDIA and its affiliates, and their respective employees, contractors, agents, officers and directors, from and against any and all claims, damages, obligations, losses, liabilities, costs or debt, fines, restitutions and expenses (including but not limited to attorney's fees and costs incident to establishing the right of indemnification) arising out of or related to use of the SOFTWARE outside of the scope of this Agreement, or not in compliance with its terms. + +4. PRE-RELEASE. SOFTWARE versions identified as alpha, beta, preview, early access or otherwise as pre-release may not be fully functional, may contain errors or design flaws, and may have reduced or different security, privacy, availability, and reliability standards relative to commercial versions of NVIDIA software and materials. 
You may use a pre-release SOFTWARE version at your own risk, understanding that these versions are not intended for use in production or business-critical systems. + +5. OWNERSHIP. The SOFTWARE and the related intellectual property rights therein are and will remain the sole and exclusive property of NVIDIA or its licensors. The SOFTWARE is copyrighted and protected by the laws of the United States and other countries, and international treaty provisions. NVIDIA may make changes to the SOFTWARE, at any time without notice, but is not obligated to support or update the SOFTWARE. + +6. COMPONENTS UNDER OTHER LICENSES. The SOFTWARE may include NVIDIA or third-party components with separate legal notices or terms as may be described in proprietary notices accompanying the SOFTWARE. If and to the extent there is a conflict between the terms in this license and the license terms associated with a component, the license terms associated with the components control only to the extent necessary to resolve the conflict. + +7. FEEDBACK. You may, but don't have to, provide to NVIDIA any Feedback. "Feedback" means any suggestions, bug fixes, enhancements, modifications, feature requests or other feedback regarding the SOFTWARE. For any Feedback that you voluntarily provide, you hereby grant NVIDIA and its affiliates a perpetual, non-exclusive, worldwide, irrevocable license to use, reproduce, modify, license, sublicense (through multiple tiers of sublicensees), and distribute (through multiple tiers of distributors) the Feedback without the payment of any royalties or fees to you. NVIDIA will use Feedback at its choice. + +8. NO WARRANTIES. THE SOFTWARE IS PROVIDED "AS IS" WITHOUT ANY EXPRESS OR IMPLIED WARRANTY OF ANY KIND INCLUDING, BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, NONINFRINGEMENT, OR FITNESS FOR A PARTICULAR PURPOSE. 
NVIDIA DOES NOT WARRANT THAT THE SOFTWARE WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION THEREOF WILL BE UNINTERRUPTED OR ERROR-FREE, OR THAT ALL ERRORS WILL BE CORRECTED. + +9. LIMITATIONS OF LIABILITY. TO THE MAXIMUM EXTENT PERMITTED BY LAW, NVIDIA AND ITS AFFILIATES SHALL NOT BE LIABLE FOR ANY SPECIAL, INCIDENTAL, PUNITIVE OR CONSEQUENTIAL DAMAGES, OR ANY LOST PROFITS, PROJECT DELAYS, LOSS OF USE, LOSS OF DATA OR LOSS OF GOODWILL, OR THE COSTS OF PROCURING SUBSTITUTE PRODUCTS, ARISING OUT OF OR IN CONNECTION WITH THIS LICENSE OR THE USE OR PERFORMANCE OF THE SOFTWARE, WHETHER SUCH LIABILITY ARISES FROM ANY CLAIM BASED UPON BREACH OF CONTRACT, BREACH OF WARRANTY, TORT (INCLUDING NEGLIGENCE), PRODUCT LIABILITY OR ANY OTHER CAUSE OF ACTION OR THEORY OF LIABILITY, EVEN IF NVIDIA HAS PREVIOUSLY BEEN ADVISED OF, OR COULD REASONABLY HAVE FORESEEN, THE POSSIBILITY OF SUCH DAMAGES. IN NO EVENT WILL NVIDIA'S AND ITS AFFILIATES TOTAL CUMULATIVE LIABILITY UNDER OR ARISING OUT OF THIS LICENSE EXCEED US$10.00. THE NATURE OF THE LIABILITY OR THE NUMBER OF CLAIMS OR SUITS SHALL NOT ENLARGE OR EXTEND THIS LIMIT. + +10. TERMINATION. Your rights under this license will terminate automatically without notice from NVIDIA if you fail to comply with any term and condition of this license or if you commence or participate in any legal proceeding against NVIDIA with respect to the SOFTWARE. NVIDIA may terminate this license with advance written notice to you if NVIDIA decides to no longer provide the SOFTWARE in a country or, in NVIDIA's sole discretion, the continued use of it is no longer commercially viable. Upon any termination of this license, you agree to promptly discontinue use of the SOFTWARE and destroy all copies in your possession or control. Your prior distributions in accordance with this license are not affected by the termination of this license. All provisions of this license will survive termination, except for the license granted to you. + +11. APPLICABLE LAW. 
This license will be governed in all respects by the laws of the United States and of the State of Delaware as those laws are applied to contracts entered into and performed entirely within Delaware by Delaware residents, without regard to the conflicts of laws principles. The United Nations Convention on Contracts for the International Sale of Goods is specifically disclaimed. You agree to all terms of this Agreement in the English language. The state or federal courts residing in Santa Clara County, California shall have exclusive jurisdiction over any dispute or claim arising out of this license. Notwithstanding this, you agree that NVIDIA shall still be allowed to apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction. + +12. NO ASSIGNMENT. This license and your rights and obligations thereunder may not be assigned by you by any means or operation of law without NVIDIA's permission. Any attempted assignment not approved by NVIDIA in writing shall be void and of no effect. + +13. EXPORT. The SOFTWARE is subject to United States export laws and regulations. You agree that you will not ship, transfer or export the SOFTWARE into any country, or use the SOFTWARE in any manner, prohibited by the United States Bureau of Industry and Security or economic sanctions regulations administered by the U.S. Department of Treasury's Office of Foreign Assets Control (OFAC), or any applicable export laws, restrictions or regulations. These laws include restrictions on destinations, end users and end use. By accepting this license, you confirm that you are not a resident or citizen of any country currently embargoed by the U.S. and that you are not otherwise prohibited from receiving the SOFTWARE. + +14. GOVERNMENT USE. The SOFTWARE has been developed entirely at private expense and is "commercial items" consisting of "commercial computer software" and "commercial computer software documentation" provided with RESTRICTED RIGHTS. 
Use, duplication or disclosure by the U.S. Government or a U.S. Government subcontractor is subject to the restrictions in this license pursuant to DFARS 227.7202-3(a) or as set forth in subparagraphs (b)(1) and (2) of the Commercial Computer Software - Restricted Rights clause at FAR 52.227-19, as applicable. Contractor/manufacturer is NVIDIA, 2788 San Tomas Expressway, Santa Clara, CA 95051. + +15. ENTIRE AGREEMENT. This license is the final, complete and exclusive agreement between the parties relating to the subject matter of this license and supersedes all prior or contemporaneous understandings and agreements relating to this subject matter, whether oral or written. If any court of competent jurisdiction determines that any provision of this license is illegal, invalid or unenforceable, the remaining provisions will remain in full force and effect. This license may only be modified in a writing signed by an authorized representative of each party. + +(v. May 12, 2021) + + +cuda-pathfinder +Apache-2.0 +https://github.com/NVIDIA/cuda-python + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. 
+ + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + +cycler +BSD License +https://matplotlib.org/cycler/ +Copyright (c) 2015, matplotlib project +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. 
+ +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the matplotlib project nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +dataclasses-json +MIT License +https://github.com/lidatong/dataclasses-json +MIT License + +Copyright (c) 2019 Charles Li and contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +datasets +Apache Software License +https://github.com/huggingface/datasets + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +debugpy +MIT License +https://aka.ms/debugpy + debugpy + + Copyright (c) Microsoft Corporation + All rights reserved. + + MIT License + + Permission is hereby granted, free of charge, to any person obtaining a copy of + this software and associated documentation files (the "Software"), to deal in + the Software without restriction, including without limitation the rights to + use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of + the Software, and to permit persons to whom the Software is furnished to do so, + subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS + FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR + COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER + IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + + +decorator +BSD License +UNKNOWN +Copyright (c) 2005-2025, Michele Simionato +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +* Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS +OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR +TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE +USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. 
+ + +deepdiff +MIT License +https://zepworks.com/deepdiff/ +The MIT License (MIT) + +Copyright (c) 2014 - 2021 Sep Dehpour (Seperman) and contributors +www.zepworks.com + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + + +defusedxml +Python Software Foundation License +https://github.com/tiran/defusedxml +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. 
Subject to the terms and conditions of this License Agreement, PSF +hereby grants Licensee a nonexclusive, royalty-free, world-wide +license to reproduce, analyze, test, perform and/or display publicly, +prepare derivative works, distribute, and otherwise use Python +alone or in any derivative version, provided, however, that PSF's +License Agreement and PSF's notice of copyright, i.e., "Copyright (c) +2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Python Software Foundation; +All Rights Reserved" are retained in Python alone or in any derivative +version prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. 
By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + + +diffusers +Apache Software License +https://github.com/huggingface/diffusers + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). 
+ + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. 
Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative 
Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability.
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +dill +BSD License +https://github.com/uqfoundation/dill +Copyright (c) 2004-2016 California Institute of Technology. +Copyright (c) 2016-2025 The Uncertainty Quantification Foundation. +All rights reserved. + +This software is available subject to the conditions and terms laid +out below. By downloading and using this software you are agreeing +to the following conditions. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. 
+ + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + - Neither the names of the copyright holders nor the names of any of + the contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED +TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; +OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR +OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + +distlib +Python Software Foundation License +https://github.com/pypa/distlib +A. HISTORY OF THE SOFTWARE +========================== + +Python was created in the early 1990s by Guido van Rossum at Stichting +Mathematisch Centrum (CWI, see http://www.cwi.nl) in the Netherlands +as a successor of a language called ABC. Guido remains Python's +principal author, although it includes many contributions from others. + +In 1995, Guido continued his work on Python at the Corporation for +National Research Initiatives (CNRI, see http://www.cnri.reston.va.us) +in Reston, Virginia where he released several versions of the +software. + +In May 2000, Guido and the Python core development team moved to +BeOpen.com to form the BeOpen PythonLabs team. 
In October of the same +year, the PythonLabs team moved to Digital Creations (now Zope +Corporation, see http://www.zope.com). In 2001, the Python Software +Foundation (PSF, see http://www.python.org/psf/) was formed, a +non-profit organization created specifically to own Python-related +Intellectual Property. Zope Corporation is a sponsoring member of +the PSF. + +All Python releases are Open Source (see http://www.opensource.org for +the Open Source Definition). Historically, most, but not all, Python +releases have also been GPL-compatible; the table below summarizes +the various releases. + + Release Derived Year Owner GPL- + from compatible? (1) + + 0.9.0 thru 1.2 1991-1995 CWI yes + 1.3 thru 1.5.2 1.2 1995-1999 CNRI yes + 1.6 1.5.2 2000 CNRI no + 2.0 1.6 2000 BeOpen.com no + 1.6.1 1.6 2001 CNRI yes (2) + 2.1 2.0+1.6.1 2001 PSF no + 2.0.1 2.0+1.6.1 2001 PSF yes + 2.1.1 2.1+2.0.1 2001 PSF yes + 2.2 2.1.1 2001 PSF yes + 2.1.2 2.1.1 2002 PSF yes + 2.1.3 2.1.2 2002 PSF yes + 2.2.1 2.2 2002 PSF yes + 2.2.2 2.2.1 2002 PSF yes + 2.2.3 2.2.2 2003 PSF yes + 2.3 2.2.2 2002-2003 PSF yes + 2.3.1 2.3 2002-2003 PSF yes + 2.3.2 2.3.1 2002-2003 PSF yes + 2.3.3 2.3.2 2002-2003 PSF yes + 2.3.4 2.3.3 2004 PSF yes + 2.3.5 2.3.4 2005 PSF yes + 2.4 2.3 2004 PSF yes + 2.4.1 2.4 2005 PSF yes + 2.4.2 2.4.1 2005 PSF yes + 2.4.3 2.4.2 2006 PSF yes + 2.4.4 2.4.3 2006 PSF yes + 2.5 2.4 2006 PSF yes + 2.5.1 2.5 2007 PSF yes + 2.5.2 2.5.1 2008 PSF yes + 2.5.3 2.5.2 2008 PSF yes + 2.6 2.5 2008 PSF yes + 2.6.1 2.6 2008 PSF yes + 2.6.2 2.6.1 2009 PSF yes + 2.6.3 2.6.2 2009 PSF yes + 2.6.4 2.6.3 2009 PSF yes + 2.6.5 2.6.4 2010 PSF yes + 3.0 2.6 2008 PSF yes + 3.0.1 3.0 2009 PSF yes + 3.1 3.0.1 2009 PSF yes + 3.1.1 3.1 2009 PSF yes + 3.1.2 3.1 2010 PSF yes + 3.2 3.1 2010 PSF yes + +Footnotes: + +(1) GPL-compatible doesn't mean that we're distributing Python under + the GPL. All Python licenses, unlike the GPL, let you distribute + a modified version without making your changes open source. 
The + GPL-compatible licenses make it possible to combine Python with + other software that is released under the GPL; the others don't. + +(2) According to Richard Stallman, 1.6.1 is not GPL-compatible, + because its license has a choice of law clause. According to + CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1 + is "not incompatible" with the GPL. + +Thanks to the many outside volunteers who have worked under Guido's +direction to make these releases possible. + + +B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON +=============================================================== + +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, PSF hereby +grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, +analyze, test, perform and/or display publicly, prepare derivative works, +distribute, and otherwise use Python alone or in any derivative version, +provided, however, that PSF's License Agreement and PSF's notice of copyright, +i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 +Python Software Foundation; All Rights Reserved" are retained in Python alone or +in any derivative version prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. 
BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0 +------------------------------------------- + +BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1 + +1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an +office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the +Individual or Organization ("Licensee") accessing and otherwise using +this software in source or binary form and its associated +documentation ("the Software"). + +2. 
Subject to the terms and conditions of this BeOpen Python License +Agreement, BeOpen hereby grants Licensee a non-exclusive, +royalty-free, world-wide license to reproduce, analyze, test, perform +and/or display publicly, prepare derivative works, distribute, and +otherwise use the Software alone or in any derivative version, +provided, however, that the BeOpen Python License is retained in the +Software, alone or in any derivative version prepared by Licensee. + +3. BeOpen is making the Software available to Licensee on an "AS IS" +basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE +SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS +AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY +DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +5. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +6. This License Agreement shall be governed by and interpreted in all +respects by the law of the State of California, excluding conflict of +law provisions. Nothing in this License Agreement shall be deemed to +create any relationship of agency, partnership, or joint venture +between BeOpen and Licensee. This License Agreement does not grant +permission to use BeOpen trademarks or trade names in a trademark +sense to endorse or promote products or services of Licensee, or any +third party. As an exception, the "BeOpen Python" logos available at +http://www.pythonlabs.com/logos.html may be used according to the +permissions granted on that web page. + +7. 
By copying, installing or otherwise using the software, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1 +--------------------------------------- + +1. This LICENSE AGREEMENT is between the Corporation for National +Research Initiatives, having an office at 1895 Preston White Drive, +Reston, VA 20191 ("CNRI"), and the Individual or Organization +("Licensee") accessing and otherwise using Python 1.6.1 software in +source or binary form and its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, CNRI +hereby grants Licensee a nonexclusive, royalty-free, world-wide +license to reproduce, analyze, test, perform and/or display publicly, +prepare derivative works, distribute, and otherwise use Python 1.6.1 +alone or in any derivative version, provided, however, that CNRI's +License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) +1995-2001 Corporation for National Research Initiatives; All Rights +Reserved" are retained in Python 1.6.1 alone or in any derivative +version prepared by Licensee. Alternately, in lieu of CNRI's License +Agreement, Licensee may substitute the following text (omitting the +quotes): "Python 1.6.1 is made available subject to the terms and +conditions in CNRI's License Agreement. This Agreement together with +Python 1.6.1 may be located on the Internet using the following +unique, persistent identifier (known as a handle): 1895.22/1013. This +Agreement may also be obtained from a proxy server on the Internet +using the following URL: http://hdl.handle.net/1895.22/1013". + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python 1.6.1 or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python 1.6.1. + +4. 
CNRI is making Python 1.6.1 available to Licensee on an "AS IS" +basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. This License Agreement shall be governed by the federal +intellectual property law of the United States, including without +limitation the federal copyright law, and, to the extent such +U.S. federal law does not apply, by the law of the Commonwealth of +Virginia, excluding Virginia's conflict of law provisions. +Notwithstanding the foregoing, with regard to derivative works based +on Python 1.6.1 that incorporate non-separable material that was +previously distributed under the GNU General Public License (GPL), the +law of the Commonwealth of Virginia shall govern this License +Agreement only as to issues arising under or with respect to +Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this +License Agreement shall be deemed to create any relationship of +agency, partnership, or joint venture between CNRI and Licensee. This +License Agreement does not grant permission to use CNRI trademarks or +trade name in a trademark sense to endorse or promote products or +services of Licensee, or any third party. + +8. 
By clicking on the "ACCEPT" button where indicated, or by copying, +installing or otherwise using Python 1.6.1, Licensee agrees to be +bound by the terms and conditions of this License Agreement. + + ACCEPT + + +CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2 +-------------------------------------------------- + +Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, +The Netherlands. All rights reserved. + +Permission to use, copy, modify, and distribute this software and its +documentation for any purpose and without fee is hereby granted, +provided that the above copyright notice appear in all copies and that +both that copyright notice and this permission notice appear in +supporting documentation, and that the name of Stichting Mathematisch +Centrum or CWI not be used in advertising or publicity pertaining to +distribution of the software without specific, written prior +permission. + +STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO +THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE +FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT +OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + + +distro +Apache Software License +https://github.com/python-distro/distro +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright {yyyy} {name of copyright owner} + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + + +dm-tree +Apache Software License +https://github.com/deepmind/tree + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +docstring_parser +MIT License +https://github.com/rr-/docstring_parser +The MIT License (MIT) + +Copyright (c) 2018 Marcin Kurczewski + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +donfig +MIT License +https://github.com/pytroll/donfig +Copyright (c) 2018- Donfig Developers +Copyright (c) 2014-2018, Anaconda, Inc. 
and contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +einops +MIT License +https://github.com/arogozhnikov/einops +MIT License + +Copyright (c) 2018 Alex Rogozhnikov + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +einx +MIT +https://github.com/fferflo/einx +MIT License + +Copyright (c) 2023- Florian Fervers + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +evdev +BSD-3-Clause +https://github.com/gvalkov/python-evdev +Copyright (c) 2012-2025 Georgi Valkov. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + 1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + 2. 
Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + 3. Neither the name of author nor the names of its contributors may + be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR +COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +execnet +MIT +https://execnet.readthedocs.io/en/latest/ + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + + +executing +MIT License +https://github.com/alexmojaki/executing +MIT License + +Copyright (c) 2019 Alex Hall + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +fastapi +MIT +https://github.com/fastapi/fastapi +The MIT License (MIT) + +Copyright (c) 2018 Sebastián Ramírez + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +fastjsonschema +BSD License +https://github.com/horejsek/python-fastjsonschema +Copyright (c) 2018, Michal Horejsek +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + Redistributions in binary form must reproduce the above copyright notice, this + list of conditions and the following disclaimer in the documentation and/or + other materials provided with the distribution. + + Neither the name of the {organization} nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +fastparquet +Apache Software License +https://github.com/dask/fastparquet/ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + +ffmpeg-python +Apache Software License +https://github.com/kkroening/ffmpeg-python + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2017 Karl Kroening + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +ffmpegcv +UNKNOWN +https://github.com/chenxinfeng4/ffmpegcv +UNKNOWN + +ffmpy +MIT +UNKNOWN +UNKNOWN + +filelock +MIT +https://github.com/tox-dev/py-filelock +MIT License + +Copyright (c) 2025 Bernát Gábor and contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +flopth +UNKNOWN +UNKNOWN +MIT License + +Copyright (c) 2019 Yunfeng Wang + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +fonttools +MIT +http://github.com/fonttools/fonttools +MIT License + +Copyright (c) 2017 Just van Rossum + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +fqdn +Mozilla Public License 2.0 (MPL 2.0) +https://github.com/ypcrts/fqdn +Mozilla Public License Version 2.0 +================================== + +1. Definitions +-------------- + +1.1. "Contributor" + means each individual or legal entity that creates, contributes to + the creation of, or owns Covered Software. + +1.2. "Contributor Version" + means the combination of the Contributions of others (if any) used + by a Contributor and that particular Contributor's Contribution. + +1.3. "Contribution" + means Covered Software of a particular Contributor. + +1.4. "Covered Software" + means Source Code Form to which the initial Contributor has attached + the notice in Exhibit A, the Executable Form of such Source Code + Form, and Modifications of such Source Code Form, in each case + including portions thereof. + +1.5. 
"Incompatible With Secondary Licenses" + means + + (a) that the initial Contributor has attached the notice described + in Exhibit B to the Covered Software; or + + (b) that the Covered Software was made available under the terms of + version 1.1 or earlier of the License, but not also under the + terms of a Secondary License. + +1.6. "Executable Form" + means any form of the work other than Source Code Form. + +1.7. "Larger Work" + means a work that combines Covered Software with other material, in + a separate file or files, that is not Covered Software. + +1.8. "License" + means this document. + +1.9. "Licensable" + means having the right to grant, to the maximum extent possible, + whether at the time of the initial grant or subsequently, any and + all of the rights conveyed by this License. + +1.10. "Modifications" + means any of the following: + + (a) any file in Source Code Form that results from an addition to, + deletion from, or modification of the contents of Covered + Software; or + + (b) any new file in Source Code Form that contains any Covered + Software. + +1.11. "Patent Claims" of a Contributor + means any patent claim(s), including without limitation, method, + process, and apparatus claims, in any patent Licensable by such + Contributor that would be infringed, but for the grant of the + License, by the making, using, selling, offering for sale, having + made, import, or transfer of either its Contributions or its + Contributor Version. + +1.12. "Secondary License" + means either the GNU General Public License, Version 2.0, the GNU + Lesser General Public License, Version 2.1, the GNU Affero General + Public License, Version 3.0, or any later versions of those + licenses. + +1.13. "Source Code Form" + means the form of the work preferred for making modifications. + +1.14. "You" (or "Your") + means an individual or a legal entity exercising rights under this + License. 
For legal entities, "You" includes any entity that + controls, is controlled by, or is under common control with You. For + purposes of this definition, "control" means (a) the power, direct + or indirect, to cause the direction or management of such entity, + whether by contract or otherwise, or (b) ownership of more than + fifty percent (50%) of the outstanding shares or beneficial + ownership of such entity. + +2. License Grants and Conditions +-------------------------------- + +2.1. Grants + +Each Contributor hereby grants You a world-wide, royalty-free, +non-exclusive license: + +(a) under intellectual property rights (other than patent or trademark) + Licensable by such Contributor to use, reproduce, make available, + modify, display, perform, distribute, and otherwise exploit its + Contributions, either on an unmodified basis, with Modifications, or + as part of a Larger Work; and + +(b) under Patent Claims of such Contributor to make, use, sell, offer + for sale, have made, import, and otherwise transfer either its + Contributions or its Contributor Version. + +2.2. Effective Date + +The licenses granted in Section 2.1 with respect to any Contribution +become effective for each Contribution on the date the Contributor first +distributes such Contribution. + +2.3. Limitations on Grant Scope + +The licenses granted in this Section 2 are the only rights granted under +this License. No additional rights or licenses will be implied from the +distribution or licensing of Covered Software under this License. 
+Notwithstanding Section 2.1(b) above, no patent license is granted by a +Contributor: + +(a) for any code that a Contributor has removed from Covered Software; + or + +(b) for infringements caused by: (i) Your and any other third party's + modifications of Covered Software, or (ii) the combination of its + Contributions with other software (except as part of its Contributor + Version); or + +(c) under Patent Claims infringed by Covered Software in the absence of + its Contributions. + +This License does not grant any rights in the trademarks, service marks, +or logos of any Contributor (except as may be necessary to comply with +the notice requirements in Section 3.4). + +2.4. Subsequent Licenses + +No Contributor makes additional grants as a result of Your choice to +distribute the Covered Software under a subsequent version of this +License (see Section 10.2) or under the terms of a Secondary License (if +permitted under the terms of Section 3.3). + +2.5. Representation + +Each Contributor represents that the Contributor believes its +Contributions are its original creation(s) or it has sufficient rights +to grant the rights to its Contributions conveyed by this License. + +2.6. Fair Use + +This License is not intended to limit any rights You have under +applicable copyright doctrines of fair use, fair dealing, or other +equivalents. + +2.7. Conditions + +Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted +in Section 2.1. + +3. Responsibilities +------------------- + +3.1. Distribution of Source Form + +All distribution of Covered Software in Source Code Form, including any +Modifications that You create or to which You contribute, must be under +the terms of this License. You must inform recipients that the Source +Code Form of the Covered Software is governed by the terms of this +License, and how they can obtain a copy of this License. You may not +attempt to alter or restrict the recipients' rights in the Source Code +Form. + +3.2. 
Distribution of Executable Form + +If You distribute Covered Software in Executable Form then: + +(a) such Covered Software must also be made available in Source Code + Form, as described in Section 3.1, and You must inform recipients of + the Executable Form how they can obtain a copy of such Source Code + Form by reasonable means in a timely manner, at a charge no more + than the cost of distribution to the recipient; and + +(b) You may distribute such Executable Form under the terms of this + License, or sublicense it under different terms, provided that the + license for the Executable Form does not attempt to limit or alter + the recipients' rights in the Source Code Form under this License. + +3.3. Distribution of a Larger Work + +You may create and distribute a Larger Work under terms of Your choice, +provided that You also comply with the requirements of this License for +the Covered Software. If the Larger Work is a combination of Covered +Software with a work governed by one or more Secondary Licenses, and the +Covered Software is not Incompatible With Secondary Licenses, this +License permits You to additionally distribute such Covered Software +under the terms of such Secondary License(s), so that the recipient of +the Larger Work may, at their option, further distribute the Covered +Software under the terms of either this License or such Secondary +License(s). + +3.4. Notices + +You may not remove or alter the substance of any license notices +(including copyright notices, patent notices, disclaimers of warranty, +or limitations of liability) contained within the Source Code Form of +the Covered Software, except that You may alter any license notices to +the extent required to remedy known factual inaccuracies. + +3.5. Application of Additional Terms + +You may choose to offer, and to charge a fee for, warranty, support, +indemnity or liability obligations to one or more recipients of Covered +Software. 
However, You may do so only on Your own behalf, and not on +behalf of any Contributor. You must make it absolutely clear that any +such warranty, support, indemnity, or liability obligation is offered by +You alone, and You hereby agree to indemnify every Contributor for any +liability incurred by such Contributor as a result of warranty, support, +indemnity or liability terms You offer. You may include additional +disclaimers of warranty and limitations of liability specific to any +jurisdiction. + +4. Inability to Comply Due to Statute or Regulation +--------------------------------------------------- + +If it is impossible for You to comply with any of the terms of this +License with respect to some or all of the Covered Software due to +statute, judicial order, or regulation then You must: (a) comply with +the terms of this License to the maximum extent possible; and (b) +describe the limitations and the code they affect. Such description must +be placed in a text file included with all distributions of the Covered +Software under this License. Except to the extent prohibited by statute +or regulation, such description must be sufficiently detailed for a +recipient of ordinary skill to be able to understand it. + +5. Termination +-------------- + +5.1. The rights granted under this License will terminate automatically +if You fail to comply with any of its terms. However, if You become +compliant, then the rights granted under this License from a particular +Contributor are reinstated (a) provisionally, unless and until such +Contributor explicitly and finally terminates Your grants, and (b) on an +ongoing basis, if such Contributor fails to notify You of the +non-compliance by some reasonable means prior to 60 days after You have +come back into compliance. 
Moreover, Your grants from a particular +Contributor are reinstated on an ongoing basis if such Contributor +notifies You of the non-compliance by some reasonable means, this is the +first time You have received notice of non-compliance with this License +from such Contributor, and You become compliant prior to 30 days after +Your receipt of the notice. + +5.2. If You initiate litigation against any entity by asserting a patent +infringement claim (excluding declaratory judgment actions, +counter-claims, and cross-claims) alleging that a Contributor Version +directly or indirectly infringes any patent, then the rights granted to +You by any and all Contributors for the Covered Software under Section +2.1 of this License shall terminate. + +5.3. In the event of termination under Sections 5.1 or 5.2 above, all +end user license agreements (excluding distributors and resellers) which +have been validly granted by You or Your distributors under this License +prior to termination shall survive termination. + +************************************************************************ +* * +* 6. Disclaimer of Warranty * +* ------------------------- * +* * +* Covered Software is provided under this License on an "as is" * +* basis, without warranty of any kind, either expressed, implied, or * +* statutory, including, without limitation, warranties that the * +* Covered Software is free of defects, merchantable, fit for a * +* particular purpose or non-infringing. The entire risk as to the * +* quality and performance of the Covered Software is with You. * +* Should any Covered Software prove defective in any respect, You * +* (not any Contributor) assume the cost of any necessary servicing, * +* repair, or correction. This disclaimer of warranty constitutes an * +* essential part of this License. No use of any Covered Software is * +* authorized under this License except under this disclaimer. 
* +* * +************************************************************************ + +************************************************************************ +* * +* 7. Limitation of Liability * +* -------------------------- * +* * +* Under no circumstances and under no legal theory, whether tort * +* (including negligence), contract, or otherwise, shall any * +* Contributor, or anyone who distributes Covered Software as * +* permitted above, be liable to You for any direct, indirect, * +* special, incidental, or consequential damages of any character * +* including, without limitation, damages for lost profits, loss of * +* goodwill, work stoppage, computer failure or malfunction, or any * +* and all other commercial damages or losses, even if such party * +* shall have been informed of the possibility of such damages. This * +* limitation of liability shall not apply to liability for death or * +* personal injury resulting from such party's negligence to the * +* extent applicable law prohibits such limitation. Some * +* jurisdictions do not allow the exclusion or limitation of * +* incidental or consequential damages, so this exclusion and * +* limitation may not apply to You. * +* * +************************************************************************ + +8. Litigation +------------- + +Any litigation relating to this License may be brought only in the +courts of a jurisdiction where the defendant maintains its principal +place of business and such litigation shall be governed by laws of that +jurisdiction, without reference to its conflict-of-law provisions. +Nothing in this Section shall prevent a party's ability to bring +cross-claims or counter-claims. + +9. Miscellaneous +---------------- + +This License represents the complete agreement concerning the subject +matter hereof. If any provision of this License is held to be +unenforceable, such provision shall be reformed only to the extent +necessary to make it enforceable. 
Any law or regulation which provides +that the language of a contract shall be construed against the drafter +shall not be used to construe this License against a Contributor. + +10. Versions of the License +--------------------------- + +10.1. New Versions + +Mozilla Foundation is the license steward. Except as provided in Section +10.3, no one other than the license steward has the right to modify or +publish new versions of this License. Each version will be given a +distinguishing version number. + +10.2. Effect of New Versions + +You may distribute the Covered Software under the terms of the version +of the License under which You originally received the Covered Software, +or under the terms of any subsequent version published by the license +steward. + +10.3. Modified Versions + +If you create software not governed by this License, and you want to +create a new license for such software, you may create and use a +modified version of this License if you rename the license and remove +any references to the name of the license steward (except to note that +such modified license differs from this License). + +10.4. Distributing Source Code Form that is Incompatible With Secondary +Licenses + +If You choose to distribute Source Code Form that is Incompatible With +Secondary Licenses under the terms of this version of the License, the +notice described in Exhibit B of this License must be attached. + +Exhibit A - Source Code Form License Notice +------------------------------------------- + + This Source Code Form is subject to the terms of the Mozilla Public + License, v. 2.0. If a copy of the MPL was not distributed with this + file, You can obtain one at http://mozilla.org/MPL/2.0/. + +If it is not possible or desirable to put the notice in a particular +file, then You may include the notice in a location (such as a LICENSE +file in a relevant directory) where a recipient would be likely to look +for such a notice. 
+ +You may add additional accurate notices of copyright ownership. + +Exhibit B - "Incompatible With Secondary Licenses" Notice +--------------------------------------------------------- + + This Source Code Form is "Incompatible With Secondary Licenses", as + defined by the Mozilla Public License, v. 2.0. + + +frozendict +GNU Lesser General Public License v3 (LGPLv3) +https://github.com/Marco-Sulla/python-frozendict + GNU LESSER GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + + This version of the GNU Lesser General Public License incorporates +the terms and conditions of version 3 of the GNU General Public +License, supplemented by the additional permissions listed below. + + 0. Additional Definitions. + + As used herein, "this License" refers to version 3 of the GNU Lesser +General Public License, and the "GNU GPL" refers to version 3 of the GNU +General Public License. + + "The Library" refers to a covered work governed by this License, +other than an Application or a Combined Work as defined below. + + An "Application" is any work that makes use of an interface provided +by the Library, but which is not otherwise based on the Library. +Defining a subclass of a class defined by the Library is deemed a mode +of using an interface provided by the Library. + + A "Combined Work" is a work produced by combining or linking an +Application with the Library. The particular version of the Library +with which the Combined Work was made is also called the "Linked +Version". + + The "Minimal Corresponding Source" for a Combined Work means the +Corresponding Source for the Combined Work, excluding any source code +for portions of the Combined Work that, considered in isolation, are +based on the Application, and not on the Linked Version. 
+ + The "Corresponding Application Code" for a Combined Work means the +object code and/or source code for the Application, including any data +and utility programs needed for reproducing the Combined Work from the +Application, but excluding the System Libraries of the Combined Work. + + 1. Exception to Section 3 of the GNU GPL. + + You may convey a covered work under sections 3 and 4 of this License +without being bound by section 3 of the GNU GPL. + + 2. Conveying Modified Versions. + + If you modify a copy of the Library, and, in your modifications, a +facility refers to a function or data to be supplied by an Application +that uses the facility (other than as an argument passed when the +facility is invoked), then you may convey a copy of the modified +version: + + a) under this License, provided that you make a good faith effort to + ensure that, in the event an Application does not supply the + function or data, the facility still operates, and performs + whatever part of its purpose remains meaningful, or + + b) under the GNU GPL, with none of the additional permissions of + this License applicable to that copy. + + 3. Object Code Incorporating Material from Library Header Files. + + The object code form of an Application may incorporate material from +a header file that is part of the Library. You may convey such object +code under terms of your choice, provided that, if the incorporated +material is not limited to numerical parameters, data structure +layouts and accessors, or small macros, inline functions and templates +(ten or fewer lines in length), you do both of the following: + + a) Give prominent notice with each copy of the object code that the + Library is used in it and that the Library and its use are + covered by this License. + + b) Accompany the object code with a copy of the GNU GPL and this license + document. + + 4. Combined Works. 
+ + You may convey a Combined Work under terms of your choice that, +taken together, effectively do not restrict modification of the +portions of the Library contained in the Combined Work and reverse +engineering for debugging such modifications, if you also do each of +the following: + + a) Give prominent notice with each copy of the Combined Work that + the Library is used in it and that the Library and its use are + covered by this License. + + b) Accompany the Combined Work with a copy of the GNU GPL and this license + document. + + c) For a Combined Work that displays copyright notices during + execution, include the copyright notice for the Library among + these notices, as well as a reference directing the user to the + copies of the GNU GPL and this license document. + + d) Do one of the following: + + 0) Convey the Minimal Corresponding Source under the terms of this + License, and the Corresponding Application Code in a form + suitable for, and under terms that permit, the user to + recombine or relink the Application with a modified version of + the Linked Version to produce a modified Combined Work, in the + manner specified by section 6 of the GNU GPL for conveying + Corresponding Source. + + 1) Use a suitable shared library mechanism for linking with the + Library. A suitable mechanism is one that (a) uses at run time + a copy of the Library already present on the user's computer + system, and (b) will operate properly with a modified version + of the Library that is interface-compatible with the Linked + Version. + + e) Provide Installation Information, but only if you would otherwise + be required to provide such information under section 6 of the + GNU GPL, and only to the extent that such information is + necessary to install and execute a modified version of the + Combined Work produced by recombining or relinking the + Application with a modified version of the Linked Version. 
(If + you use option 4d0, the Installation Information must accompany + the Minimal Corresponding Source and Corresponding Application + Code. If you use option 4d1, you must provide the Installation + Information in the manner specified by section 6 of the GNU GPL + for conveying Corresponding Source.) + + 5. Combined Libraries. + + You may place library facilities that are a work based on the +Library side by side in a single library together with other library +facilities that are not Applications and are not covered by this +License, and convey such a combined library under terms of your +choice, if you do both of the following: + + a) Accompany the combined library with a copy of the same work based + on the Library, uncombined with any other library facilities, + conveyed under the terms of this License. + + b) Give prominent notice with the combined library that part of it + is a work based on the Library, and explaining where to find the + accompanying uncombined form of the same work. + + 6. Revised Versions of the GNU Lesser General Public License. + + The Free Software Foundation may publish revised and/or new versions +of the GNU Lesser General Public License from time to time. Such new +versions will be similar in spirit to the present version, but may +differ in detail to address new problems or concerns. + + Each version is given a distinguishing version number. If the +Library as you received it specifies that a certain numbered version +of the GNU Lesser General Public License "or any later version" +applies to it, you have the option of following the terms and +conditions either of that published version or of any later version +published by the Free Software Foundation. If the Library as you +received it does not specify a version number of the GNU Lesser +General Public License, you may choose any version of the GNU Lesser +General Public License ever published by the Free Software Foundation. 
+ + If the Library as you received it specifies that a proxy can decide +whether future versions of the GNU Lesser General Public License shall +apply, that proxy's public statement of acceptance of any version is +permanent authorization for you to choose that version for the +Library. + + +frozenlist +Apache-2.0 +https://github.com/aio-libs/frozenlist +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2013-2019 Nikolay Kim and Andrew Svetlov + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +fsspec +BSD-3-Clause +https://github.com/fsspec/filesystem_spec +BSD 3-Clause License + +Copyright (c) 2018, Martin Durant +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +ftfy +Apache-2.0 +https://ftfy.readthedocs.io/en/latest/ +Copyright 2023 Robyn Speer + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + + +future +MIT License +https://python-future.org +Copyright (c) 2013-2024 Python Charmers, Australia + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +futureproof +MIT License +https://github.com/yeraydiazdiaz/futureproof +MIT License + +Copyright © 2019, Yeray Díaz Díaz + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +fvcore +Apache 2.0 +https://github.com/facebookresearch/fvcore +UNKNOWN + +gast +BSD License +https://github.com/serge-sans-paille/gast/ +Copyright (c) 2016, Serge Guelton +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + Neither the name of HPCProject, Serge Guelton nor the names of its + contributors may be used to endorse or promote products derived from this + software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + + +gitdb +BSD License +https://github.com/gitpython-developers/gitdb +Copyright (C) 2010, 2011 Sebastian Thiel and contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +* Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. 
+ +* Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +* Neither the name of the GitDB project nor the names of +its contributors may be used to endorse or promote products derived +from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +Additional Licenses +------------------- +The files at +gitdb/test/fixtures/packs/pack-11fdfa9e156ab73caae3b6da867192221f2089c2.idx +and +gitdb/test/fixtures/packs/pack-11fdfa9e156ab73caae3b6da867192221f2089c2.pack +are licensed under GNU GPL as part of the git source repository, +see http://en.wikipedia.org/wiki/Git_%28software%29 for more information. + +They are not required for the actual operation, which is why they are not found +in the distribution package. 
+ + +glfw +MIT License +https://github.com/FlorianRhiem/pyGLFW +The MIT License (MIT) + +Copyright (c) 2013-2019 Florian Rhiem + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of +the Software, and to permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS +FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR +COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +google-api-core +Apache Software License +https://github.com/googleapis/google-cloud-python/tree/main/packages/google-api-core + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +google-auth +Apache Software License +https://github.com/googleapis/google-auth-library-python + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +google-cloud-core +Apache Software License +https://github.com/googleapis/python-cloud-core + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +google-cloud-storage +Apache Software License +https://github.com/googleapis/python-storage + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +google-crc32c +UNKNOWN +https://github.com/googleapis/python-crc32c + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +google-resumable-media +Apache Software License +https://github.com/googleapis/google-resumable-media-python + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +googleapis-common-protos +Apache Software License +https://github.com/googleapis/google-cloud-python/tree/main/packages/googleapis-common-protos + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +gradio +Apache-2.0 +https://github.com/gradio-app/gradio + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +gradio_client +Apache-2.0 +https://github.com/gradio-app/gradio +UNKNOWN + +groovy +MIT License +https://github.com/gradio-app/groovy + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +grpcio +Apache-2.0 +https://grpc.io + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +----------------------------------------------------------- + +BSD 3-Clause License + +Copyright 2016, Google Inc. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its +contributors may be used to endorse or promote products derived from this +software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +THE POSSIBILITY OF SUCH DAMAGE. + +----------------------------------------------------------- + +Mozilla Public License Version 2.0 +================================== + +1. Definitions +-------------- + +1.1. "Contributor" + means each individual or legal entity that creates, contributes to + the creation of, or owns Covered Software. + +1.2. "Contributor Version" + means the combination of the Contributions of others (if any) used + by a Contributor and that particular Contributor's Contribution. + +1.3. "Contribution" + means Covered Software of a particular Contributor. + +1.4. "Covered Software" + means Source Code Form to which the initial Contributor has attached + the notice in Exhibit A, the Executable Form of such Source Code + Form, and Modifications of such Source Code Form, in each case + including portions thereof. + +1.5. "Incompatible With Secondary Licenses" + means + + (a) that the initial Contributor has attached the notice described + in Exhibit B to the Covered Software; or + + (b) that the Covered Software was made available under the terms of + version 1.1 or earlier of the License, but not also under the + terms of a Secondary License. + +1.6. "Executable Form" + means any form of the work other than Source Code Form. + +1.7. "Larger Work" + means a work that combines Covered Software with other material, in + a separate file or files, that is not Covered Software. + +1.8. "License" + means this document. + +1.9. 
"Licensable" + means having the right to grant, to the maximum extent possible, + whether at the time of the initial grant or subsequently, any and + all of the rights conveyed by this License. + +1.10. "Modifications" + means any of the following: + + (a) any file in Source Code Form that results from an addition to, + deletion from, or modification of the contents of Covered + Software; or + + (b) any new file in Source Code Form that contains any Covered + Software. + +1.11. "Patent Claims" of a Contributor + means any patent claim(s), including without limitation, method, + process, and apparatus claims, in any patent Licensable by such + Contributor that would be infringed, but for the grant of the + License, by the making, using, selling, offering for sale, having + made, import, or transfer of either its Contributions or its + Contributor Version. + +1.12. "Secondary License" + means either the GNU General Public License, Version 2.0, the GNU + Lesser General Public License, Version 2.1, the GNU Affero General + Public License, Version 3.0, or any later versions of those + licenses. + +1.13. "Source Code Form" + means the form of the work preferred for making modifications. + +1.14. "You" (or "Your") + means an individual or a legal entity exercising rights under this + License. For legal entities, "You" includes any entity that + controls, is controlled by, or is under common control with You. For + purposes of this definition, "control" means (a) the power, direct + or indirect, to cause the direction or management of such entity, + whether by contract or otherwise, or (b) ownership of more than + fifty percent (50%) of the outstanding shares or beneficial + ownership of such entity. + +2. License Grants and Conditions +-------------------------------- + +2.1. 
Grants + +Each Contributor hereby grants You a world-wide, royalty-free, +non-exclusive license: + +(a) under intellectual property rights (other than patent or trademark) + Licensable by such Contributor to use, reproduce, make available, + modify, display, perform, distribute, and otherwise exploit its + Contributions, either on an unmodified basis, with Modifications, or + as part of a Larger Work; and + +(b) under Patent Claims of such Contributor to make, use, sell, offer + for sale, have made, import, and otherwise transfer either its + Contributions or its Contributor Version. + +2.2. Effective Date + +The licenses granted in Section 2.1 with respect to any Contribution +become effective for each Contribution on the date the Contributor first +distributes such Contribution. + +2.3. Limitations on Grant Scope + +The licenses granted in this Section 2 are the only rights granted under +this License. No additional rights or licenses will be implied from the +distribution or licensing of Covered Software under this License. +Notwithstanding Section 2.1(b) above, no patent license is granted by a +Contributor: + +(a) for any code that a Contributor has removed from Covered Software; + or + +(b) for infringements caused by: (i) Your and any other third party's + modifications of Covered Software, or (ii) the combination of its + Contributions with other software (except as part of its Contributor + Version); or + +(c) under Patent Claims infringed by Covered Software in the absence of + its Contributions. + +This License does not grant any rights in the trademarks, service marks, +or logos of any Contributor (except as may be necessary to comply with +the notice requirements in Section 3.4). + +2.4. 
Subsequent Licenses + +No Contributor makes additional grants as a result of Your choice to +distribute the Covered Software under a subsequent version of this +License (see Section 10.2) or under the terms of a Secondary License (if +permitted under the terms of Section 3.3). + +2.5. Representation + +Each Contributor represents that the Contributor believes its +Contributions are its original creation(s) or it has sufficient rights +to grant the rights to its Contributions conveyed by this License. + +2.6. Fair Use + +This License is not intended to limit any rights You have under +applicable copyright doctrines of fair use, fair dealing, or other +equivalents. + +2.7. Conditions + +Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted +in Section 2.1. + +3. Responsibilities +------------------- + +3.1. Distribution of Source Form + +All distribution of Covered Software in Source Code Form, including any +Modifications that You create or to which You contribute, must be under +the terms of this License. You must inform recipients that the Source +Code Form of the Covered Software is governed by the terms of this +License, and how they can obtain a copy of this License. You may not +attempt to alter or restrict the recipients' rights in the Source Code +Form. + +3.2. 
Distribution of Executable Form + +If You distribute Covered Software in Executable Form then: + +(a) such Covered Software must also be made available in Source Code + Form, as described in Section 3.1, and You must inform recipients of + the Executable Form how they can obtain a copy of such Source Code + Form by reasonable means in a timely manner, at a charge no more + than the cost of distribution to the recipient; and + +(b) You may distribute such Executable Form under the terms of this + License, or sublicense it under different terms, provided that the + license for the Executable Form does not attempt to limit or alter + the recipients' rights in the Source Code Form under this License. + +3.3. Distribution of a Larger Work + +You may create and distribute a Larger Work under terms of Your choice, +provided that You also comply with the requirements of this License for +the Covered Software. If the Larger Work is a combination of Covered +Software with a work governed by one or more Secondary Licenses, and the +Covered Software is not Incompatible With Secondary Licenses, this +License permits You to additionally distribute such Covered Software +under the terms of such Secondary License(s), so that the recipient of +the Larger Work may, at their option, further distribute the Covered +Software under the terms of either this License or such Secondary +License(s). + +3.4. Notices + +You may not remove or alter the substance of any license notices +(including copyright notices, patent notices, disclaimers of warranty, +or limitations of liability) contained within the Source Code Form of +the Covered Software, except that You may alter any license notices to +the extent required to remedy known factual inaccuracies. + +3.5. Application of Additional Terms + +You may choose to offer, and to charge a fee for, warranty, support, +indemnity or liability obligations to one or more recipients of Covered +Software. 
However, You may do so only on Your own behalf, and not on +behalf of any Contributor. You must make it absolutely clear that any +such warranty, support, indemnity, or liability obligation is offered by +You alone, and You hereby agree to indemnify every Contributor for any +liability incurred by such Contributor as a result of warranty, support, +indemnity or liability terms You offer. You may include additional +disclaimers of warranty and limitations of liability specific to any +jurisdiction. + +4. Inability to Comply Due to Statute or Regulation +--------------------------------------------------- + +If it is impossible for You to comply with any of the terms of this +License with respect to some or all of the Covered Software due to +statute, judicial order, or regulation then You must: (a) comply with +the terms of this License to the maximum extent possible; and (b) +describe the limitations and the code they affect. Such description must +be placed in a text file included with all distributions of the Covered +Software under this License. Except to the extent prohibited by statute +or regulation, such description must be sufficiently detailed for a +recipient of ordinary skill to be able to understand it. + +5. Termination +-------------- + +5.1. The rights granted under this License will terminate automatically +if You fail to comply with any of its terms. However, if You become +compliant, then the rights granted under this License from a particular +Contributor are reinstated (a) provisionally, unless and until such +Contributor explicitly and finally terminates Your grants, and (b) on an +ongoing basis, if such Contributor fails to notify You of the +non-compliance by some reasonable means prior to 60 days after You have +come back into compliance. 
Moreover, Your grants from a particular +Contributor are reinstated on an ongoing basis if such Contributor +notifies You of the non-compliance by some reasonable means, this is the +first time You have received notice of non-compliance with this License +from such Contributor, and You become compliant prior to 30 days after +Your receipt of the notice. + +5.2. If You initiate litigation against any entity by asserting a patent +infringement claim (excluding declaratory judgment actions, +counter-claims, and cross-claims) alleging that a Contributor Version +directly or indirectly infringes any patent, then the rights granted to +You by any and all Contributors for the Covered Software under Section +2.1 of this License shall terminate. + +5.3. In the event of termination under Sections 5.1 or 5.2 above, all +end user license agreements (excluding distributors and resellers) which +have been validly granted by You or Your distributors under this License +prior to termination shall survive termination. + +************************************************************************ +* * +* 6. Disclaimer of Warranty * +* ------------------------- * +* * +* Covered Software is provided under this License on an "as is" * +* basis, without warranty of any kind, either expressed, implied, or * +* statutory, including, without limitation, warranties that the * +* Covered Software is free of defects, merchantable, fit for a * +* particular purpose or non-infringing. The entire risk as to the * +* quality and performance of the Covered Software is with You. * +* Should any Covered Software prove defective in any respect, You * +* (not any Contributor) assume the cost of any necessary servicing, * +* repair, or correction. This disclaimer of warranty constitutes an * +* essential part of this License. No use of any Covered Software is * +* authorized under this License except under this disclaimer. 
* +* * +************************************************************************ + +************************************************************************ +* * +* 7. Limitation of Liability * +* -------------------------- * +* * +* Under no circumstances and under no legal theory, whether tort * +* (including negligence), contract, or otherwise, shall any * +* Contributor, or anyone who distributes Covered Software as * +* permitted above, be liable to You for any direct, indirect, * +* special, incidental, or consequential damages of any character * +* including, without limitation, damages for lost profits, loss of * +* goodwill, work stoppage, computer failure or malfunction, or any * +* and all other commercial damages or losses, even if such party * +* shall have been informed of the possibility of such damages. This * +* limitation of liability shall not apply to liability for death or * +* personal injury resulting from such party's negligence to the * +* extent applicable law prohibits such limitation. Some * +* jurisdictions do not allow the exclusion or limitation of * +* incidental or consequential damages, so this exclusion and * +* limitation may not apply to You. * +* * +************************************************************************ + +8. Litigation +------------- + +Any litigation relating to this License may be brought only in the +courts of a jurisdiction where the defendant maintains its principal +place of business and such litigation shall be governed by laws of that +jurisdiction, without reference to its conflict-of-law provisions. +Nothing in this Section shall prevent a party's ability to bring +cross-claims or counter-claims. + +9. Miscellaneous +---------------- + +This License represents the complete agreement concerning the subject +matter hereof. If any provision of this License is held to be +unenforceable, such provision shall be reformed only to the extent +necessary to make it enforceable. 
Any law or regulation which provides +that the language of a contract shall be construed against the drafter +shall not be used to construe this License against a Contributor. + +10. Versions of the License +--------------------------- + +10.1. New Versions + +Mozilla Foundation is the license steward. Except as provided in Section +10.3, no one other than the license steward has the right to modify or +publish new versions of this License. Each version will be given a +distinguishing version number. + +10.2. Effect of New Versions + +You may distribute the Covered Software under the terms of the version +of the License under which You originally received the Covered Software, +or under the terms of any subsequent version published by the license +steward. + +10.3. Modified Versions + +If you create software not governed by this License, and you want to +create a new license for such software, you may create and use a +modified version of this License if you rename the license and remove +any references to the name of the license steward (except to note that +such modified license differs from this License). + +10.4. Distributing Source Code Form that is Incompatible With Secondary +Licenses + +If You choose to distribute Source Code Form that is Incompatible With +Secondary Licenses under the terms of this version of the License, the +notice described in Exhibit B of this License must be attached. + +Exhibit A - Source Code Form License Notice +------------------------------------------- + + This Source Code Form is subject to the terms of the Mozilla Public + License, v. 2.0. If a copy of the MPL was not distributed with this + file, You can obtain one at http://mozilla.org/MPL/2.0/. + +If it is not possible or desirable to put the notice in a particular +file, then You may include the notice in a location (such as a LICENSE +file in a relevant directory) where a recipient would be likely to look +for such a notice. 
+ +You may add additional accurate notices of copyright ownership. + +Exhibit B - "Incompatible With Secondary Licenses" Notice +--------------------------------------------------------- + + This Source Code Form is "Incompatible With Secondary Licenses", as + defined by the Mozilla Public License, v. 2.0. + + +h11 +MIT License +https://github.com/python-hyper/h11 +The MIT License (MIT) + +Copyright (c) 2016 Nathaniel J. Smith and other contributors + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +h5py +BSD-3-Clause +https://www.h5py.org/ +Copyright (c) 2008 Andrew Collette and contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. 
Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the + distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +hatch +MIT +https://hatch.pypa.io/latest/ +MIT License + +Copyright (c) 2017-present Ofek Lev + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +hatchling +MIT +https://hatch.pypa.io/latest/ +MIT License + +Copyright (c) 2021-present Ofek Lev + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +hf-xet +Apache-2.0 +https://github.com/huggingface/xet-core + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. 
+ + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +httpcore +BSD-3-Clause +https://www.encode.io/httpcore/ +Copyright © 2020, [Encode OSS Ltd](https://www.encode.io/). +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +httptools +MIT +https://github.com/MagicStack/httptools +The MIT License + +Copyright (c) 2015 MagicStack Inc. http://magic.io + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +httpx +BSD-3-Clause +https://github.com/encode/httpx +Copyright © 2019, [Encode OSS Ltd](https://www.encode.io/). +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +huggingface_hub +Apache Software License +https://github.com/huggingface/huggingface_hub + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). 
+ + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. 
Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative 
Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +hvac +Apache Software License +https://github.com/hvac/hvac + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright {yyyy} {name of copyright owner} + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + + +hydra-core +MIT License +https://github.com/facebookresearch/hydra +MIT License + +Copyright (c) Facebook, Inc. and its affiliates. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +hyperlink +MIT License +https://github.com/python-hyper/hyperlink +Copyright (c) 2017 +Glyph Lefkowitz +Itamar Turner-Trauring +Jean Paul Calderone +Adi Roiban +Amber Hawkie Brown +Mahmoud Hashemi +Wilfredo Sanchez Vega + +and others that have contributed code to the public domain. + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +idna +BSD-3-Clause +https://github.com/kjd/idna +BSD 3-Clause License + +Copyright (c) 2013-2025, Kim Davies and contributors. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. 
Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +imagecodecs +BSD-3-Clause +https://www.cgohlke.com +BSD-3-Clause license + +Copyright (c) 2008-2026, Christoph Gohlke +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +imageio-ffmpeg +BSD License +https://github.com/imageio/imageio-ffmpeg +BSD 2-Clause License + +Copyright (c) 2019-2025, imageio +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +importlib_metadata +Apache-2.0 +https://github.com/python/importlib_metadata +Apache License +Version 2.0, January 2004 +http://www.apache.org/licenses/ + +TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + +1. Definitions. + +"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. + +"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. + +"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. + +"You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. + +"Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. + +"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. 
+ +"Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). + +"Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. + +"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." + +"Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. + +2. Grant of Copyright License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. + +3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. + +4. Redistribution. 
You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: + + (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. + +5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. + +6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. + +7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. + +8. Limitation of Liability. 
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. + +9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. + +END OF TERMS AND CONDITIONS + +APPENDIX: How to apply the Apache License to your work. + +To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. 
+ +Copyright 2025 [name of copyright owner] + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + + +iniconfig +MIT +https://github.com/pytest-dev/iniconfig +The MIT License (MIT) + +Copyright (c) 2010 - 2023 Holger Krekel and others + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +iopath +MIT licensed, as found in the LICENSE file +https://github.com/facebookresearch/iopath +MIT License + +Copyright (c) Facebook, Inc. and its affiliates. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +ipycanvas +BSD License +https://github.com/jupyter-widgets-contrib/ipycanvas +Copyright (c) 2019 Martin Renou +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +ipyevents +BSD License +https://github.com/mwcraig/ipyevents +Copyright (c) 2017, Matt Craig +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +ipykernel +BSD-3-Clause +https://ipython.org +BSD 3-Clause License + +Copyright (c) 2015, IPython Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +ipython +BSD-3-Clause +https://ipython.org +============================= + The IPython licensing terms +============================= + +IPython is licensed under the terms of the Modified BSD License (also known as +New or Revised or 3-Clause BSD). See the LICENSE file. + + +About the IPython Development Team +---------------------------------- + +Fernando Perez began IPython in 2001 based on code from Janko Hauser + and Nathaniel Gray . Fernando is still +the project lead. + +The IPython Development Team is the set of all contributors to the IPython +project. This includes all of the IPython subprojects. + +The core team that coordinates development on GitHub can be found here: +https://github.com/ipython/. + +Our Copyright Policy +-------------------- + +IPython uses a shared copyright model. Each contributor maintains copyright +over their contributions to IPython. But, it is important to note that these +contributions are typically only changes to the repositories. Thus, the IPython +source code, in its entirety is not the copyright of any single person or +institution. Instead, it is the collective copyright of the entire IPython +Development Team. If individual contributors want to maintain a record of what +changes/contributions they have specific copyright on, they should indicate +their copyright in the commit message of the change, when they commit the +change to one of the IPython repositories. 
+ +With this in mind, the following banner should be used in any source code file +to indicate the copyright and license terms: + +:: + + # Copyright (c) IPython Development Team. + # Distributed under the terms of the Modified BSD License. + + +ipython_pygments_lexers +BSD License +https://github.com/ipython/ipython-pygments-lexers +BSD 3-Clause License + +- Copyright (c) 2012-Present, IPython Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +ipywidgets +BSD License +http://jupyter.org +Copyright (c) 2015 Project Jupyter Contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +isoduration +ISC License (ISCL) +https://github.com/bolsote/isoduration +Copyright (c) 2020 Víctor Muñoz + +Permission to use, copy, modify, and distribute this software for any +purpose with or without fee is hereby granted, provided that the above +copyright notice and this permission notice appear in all copies. 
+ +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES +WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR +ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF +OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + + +itsdangerous +BSD License +https://github.com/pallets/itsdangerous/ +Copyright 2011 Pallets + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jaraco.classes +MIT License +https://github.com/jaraco/jaraco.classes +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to +deal in the Software without restriction, including without limitation the +rights to use, copy, modify, merge, publish, distribute, sublicense, and/or +sell copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +IN THE SOFTWARE. 
+ + +jaraco.context +MIT +https://github.com/jaraco/jaraco.context +MIT License + +Copyright (c) 2026 + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and +associated documentation files (the "Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the +following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial +portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT +LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO +EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE +USE OR OTHER DEALINGS IN THE SOFTWARE. + + +jaraco.functools +MIT +https://github.com/jaraco/jaraco.functools +MIT License + +Copyright (c) 2025 + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and +associated documentation files (the "Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the +following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial +portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT +LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO +EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE +USE OR OTHER DEALINGS IN THE SOFTWARE. + + +jedi +MIT License +https://github.com/davidhalter/jedi +All contributions towards Jedi are MIT licensed. + +------------------------------------------------------------------------------- +The MIT License (MIT) + +Copyright (c) <2013> + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +jeepney +MIT +https://gitlab.com/takluyver/jeepney +The MIT License (MIT) + +Copyright (c) 2017 Thomas Kluyver + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +jiter +MIT License +https://github.com/pydantic/jiter/ +The MIT License (MIT) + +Copyright (c) 2022 to present Samuel Colvin + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +jmespath +MIT License +https://github.com/jmespath/jmespath.py +MIT License + +Copyright (c) 2013 Amazon.com, Inc. or its affiliates. All Rights Reserved + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +joblib +BSD-3-Clause +https://joblib.readthedocs.io +BSD 3-Clause License + +Copyright (c) 2008-2021, The joblib developers. +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +json5 +Apache Software License +https://github.com/dpranke/pyjson5 +Files: Everything except for the benchmarks/*.json files. + +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. 
+ + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright {yyyy} {name of copyright owner} + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +--- + +File: benchmarks/64KB-min.json + +MIT License + +Copyright (c) Microsoft Corporation. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE + +--- + +File: benchmarks/bitly-usa-gov.json + +The MIT License (MIT) + +Copyright (c) 2017 Wes McKinney + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ +--- + +File: benchmarks/twitter.json + +The MIT License (MIT) + +Copyright (c) 2014 Milo Yip + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +jsonlines +BSD License +https://github.com/wbolster/jsonlines +*(This is the OSI approved 3-clause "New BSD License".)* + +Copyright © 2016, wouter bolsterlee + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, this + list of conditions and the following disclaimer in the documentation and/or + other materials provided with the distribution. 
+ +* Neither the name of the author nor the names of the contributors may be used + to endorse or promote products derived from this software without specific + prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jsonpointer +BSD License +https://github.com/stefankoegl/python-json-pointer +Copyright (c) 2011 Stefan Kögl +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. +3. The name of the author may not be used to endorse or promote products + derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR +IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES +OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT +NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF +THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + +jsonschema +MIT +https://github.com/python-jsonschema/jsonschema +Copyright (c) 2013 Julian Berman + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +jsonschema-specifications +MIT +https://github.com/python-jsonschema/jsonschema-specifications +Copyright (c) 2022 Julian Berman + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +jupyter-compare-view +MIT License +UNKNOWN +MIT License + +Copyright (c) 2022 Octoframes + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +jupyter-events +BSD License +http://jupyter.org +BSD 3-Clause License + +Copyright (c) 2022-, Jupyter Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jupyter-lsp +BSD License +https://github.com/jupyter-lsp/jupyterlab-lsp/issues +BSD 3-Clause License + +Copyright (c) 2022, jupyter-lsp contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jupyter_client +BSD License +https://jupyter.org +BSD 3-Clause License + +- Copyright (c) 2001-2015, IPython Development Team +- Copyright (c) 2015-, Jupyter Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jupyter_core +BSD-3-Clause +https://jupyter.org +BSD 3-Clause License + +- Copyright (c) 2015-, Jupyter Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jupyter_server +BSD License +https://jupyter-server.readthedocs.io +BSD 3-Clause License + +- Copyright (c) 2001-2015, IPython Development Team +- Copyright (c) 2015-, Jupyter Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jupyter_server_terminals +BSD License +https://jupyter.org +BSD 3-Clause License + +- Copyright (c) 2021-, Jupyter Development Team + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +All rights reserved. + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jupyterlab +BSD License +https://jupyter.org +Copyright (c) 2015-2025 Project Jupyter Contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Semver File License +=================== + +The semver.py file is from https://github.com/podhmo/python-semver +which is licensed under the "MIT" license. See the semver.py file for details. + + +jupyterlab_pygments +BSD License +https://github.com/jupyterlab/jupyterlab_pygments +Copyright (c) 2015 Project Jupyter Contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jupyterlab_server +BSD License +https://jupyterlab-server.readthedocs.io +Copyright (c) 2015-2017, Project Jupyter Contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +jupyterlab_widgets +BSD License +https://github.com/jupyter-widgets/ipywidgets +Copyright (c) 2015 Project Jupyter Contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +This package bundles several JavaScript npm packages in the +jupyterlab_widgets/static directory. Their licenses (as packaged in their +distributions in the node_modules package installation directory) are copied +below. + +------------------------------------------------------------------------------ +From css-loader/LICENSE: + +Copyright JS Foundation and other contributors + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +'Software'), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +------------------------------------------------------------------------------ +From style-loader/LICENSE: + +Copyright JS Foundation and other contributors + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +'Software'), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +------------------------------------------------------------------------------ +From backbone/backbone.js + +// (c) 2010-2015 Jeremy Ashkenas, DocumentCloud and Investigative Reporters & Editors +// Backbone may be freely distributed under the MIT license. 
+// For all details and documentation: +// http://backbonejs.org + +------------------------------------------------------------------------------ +From base-64/LICENSE + +The MIT License (MIT) + +Copyright (c) 2014 Jameson Little + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + +------------------------------------------------------------------------------ +From lodash/LICENSE + +Copyright OpenJS Foundation and other contributors + +Based on Underscore.js, copyright Jeremy Ashkenas, +DocumentCloud and Investigative Reporters & Editors + +This software consists of voluntary contributions made by many +individuals. 
For exact contribution history, see the revision history +available at https://github.com/lodash/lodash + +The following license applies to all parts of this software except as +documented below: + +==== + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +==== + +Copyright and related rights for sample code are waived via CC0. Sample +code is defined as all source code displayed within the prose of the +documentation. + +CC0: http://creativecommons.org/publicdomain/zero/1.0/ + +==== + +Files located in the node_modules and vendor directories are externally +maintained libraries used by this software which have their own +licenses; we recommend you read them, as their terms may differ from the +terms above. + +------------------------------------------------------------------------------ +From d3-format/LICENSE: + +Copyright 2010-2015 Mike Bostock +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the author nor the names of contributors may be used to + endorse or promote products derived from this software without specific prior + written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +------------------------------------------------------------------------------ +From noUISlider/LICENSE.md (https://github.com/leongersen/noUiSlider/blob/eca62f9e56aaf02f0841b36e7993adf8db3721d5/LICENSE.md) + +MIT License + +Copyright (c) 2019 Léon Gersen + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ +------------------------------------------------------------------ +From jquery/LICENSE.txt + +Copyright JS Foundation and other contributors, https://js.foundation/ + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +------------------------------------------------------------------ +From semver/LICENSE: + +The ISC License + +Copyright (c) Isaac Z. Schlueter and Contributors + +Permission to use, copy, modify, and/or distribute this software for any +purpose with or without fee is hereby granted, provided that the above +copyright notice and this permission notice appear in all copies. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES +WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR +ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR +IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + +------------------------------------------------------------------ +From underscore/LICENSE + +Copyright (c) 2009-2018 Jeremy Ashkenas, DocumentCloud and Investigative +Reporters & Editors + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation +files (the "Software"), to deal in the Software without +restriction, including without limitation the rights to use, +copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following +conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES +OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, +WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +OTHER DEALINGS IN THE SOFTWARE. 
+ + +keyring +MIT +https://github.com/jaraco/keyring +MIT License + +Copyright (c) 2025 + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and +associated documentation files (the "Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the +following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial +portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT +LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO +EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE +USE OR OTHER DEALINGS IN THE SOFTWARE. + + +kiwisolver +BSD License +https://github.com/nucleic/kiwi +========================= + The Kiwi licensing terms +========================= +Kiwi is licensed under the terms of the Modified BSD License (also known as +New or Revised BSD), as follows: + +Copyright (c) 2013-2025, Nucleic Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +Redistributions in binary form must reproduce the above copyright notice, this +list of conditions and the following disclaimer in the documentation and/or +other materials provided with the distribution. 
+ +Neither the name of the Nucleic Development Team nor the names of its +contributors may be used to endorse or promote products derived from this +software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +About Kiwi +---------- +Chris Colbert began the Kiwi project in December 2013 in an effort to +create a blisteringly fast UI constraint solver. Chris is still the +project lead. + +The Nucleic Development Team is the set of all contributors to the Nucleic +project and its subprojects. + +The core team that coordinates development on GitHub can be found here: +http://github.com/nucleic. The current team consists of: + +* Chris Colbert + +Our Copyright Policy +-------------------- +Nucleic uses a shared copyright model. Each contributor maintains copyright +over their contributions to Nucleic. But, it is important to note that these +contributions are typically only changes to the repositories. Thus, the Nucleic +source code, in its entirety is not the copyright of any single person or +institution. Instead, it is the collective copyright of the entire Nucleic +Development Team. 
If individual contributors want to maintain a record of what +changes/contributions they have specific copyright on, they should indicate +their copyright in the commit message of the change, when they commit the +change to one of the Nucleic repositories. + +With this in mind, the following banner should be used in any source code file +to indicate the copyright and license terms: + +#------------------------------------------------------------------------------ +# Copyright (c) 2013-2025, Nucleic Development Team. +# +# Distributed under the terms of the Modified BSD License. +# +# The full license is in the file LICENSE, distributed with this software. +#------------------------------------------------------------------------------ + + +kornia +Apache Software License +https://kornia.github.io/ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + +kornia_rs +Apache Software License +http://kornia.org + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +lark +MIT License +https://github.com/lark-parser/lark +Copyright © 2017 Erez Shinan + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of +the Software, and to permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS +FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR +COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +lazy_loader +BSD License +https://github.com/scientific-python/lazy_loader +BSD 3-Clause License + +Copyright (c) 2022--2023, Scientific Python project +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +lerobot +Apache Software License +https://huggingface.co/lerobot +Copyright 2024 The Hugging Face team. All rights reserved. + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +## Some of lerobot's code is derived from Diffusion Policy, which is subject to the following copyright notice: + +MIT License + +Copyright (c) 2023 Columbia Artificial Intelligence and Robotics Lab + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +## Some of lerobot's code is derived from FOWM, which is subject to the following copyright notice: + +MIT License + +Copyright (c) 2023 Yunhai Feng + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +## Some of lerobot's code is derived from simxarm, which is subject to the following copyright notice: + +MIT License + +Copyright (c) 2023 Nicklas Hansen & Yanjie Ze + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +## Some of lerobot's code is derived from ALOHA, which is subject to the following copyright notice: + +MIT License + +Copyright (c) 2023 Tony Z. Zhao + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +## Some of lerobot's code is derived from DETR, which is subject to the following copyright notice: + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. 
+ + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2020 - present, Facebook, Inc + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +lightning +Apache Software License +https://github.com/Lightning-AI/lightning + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2018-2021 William Falcon + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +lightning-utilities +Apache-2.0 +https://github.com/Lightning-AI/utilities + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2018-2021 William Falcon + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +loguru +MIT License +https://github.com/Delgan/loguru +UNKNOWN + +lpips +BSD License +https://github.com/richzhang/PerceptualSimilarity +Copyright (c) 2018, Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, Oliver Wang +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + +lxml +BSD-3-Clause +https://lxml.de/ +BSD 3-Clause License + +Copyright (c) 2004 Infrae. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + 1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + 3. Neither the name of Infrae nor the names of its contributors may + be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL INFRAE OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +lz4 +BSD License +https://github.com/python-lz4/python-lz4 +Copyright (c) 2012-2013, Steeve Morin +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of Steeve Morin nor the names of its contributors may be + used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +makefun +BSD License +https://github.com/smarie/python-makefun +BSD 3-Clause License + +Copyright (c) 2019-2022, Sylvain Marié, Schneider Electric Industries +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +markdown-it-py +MIT License +https://github.com/executablebooks/markdown-it-py +MIT License + +Copyright (c) 2020 ExecutableBookProject + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +marshmallow +MIT License +https://github.com/marshmallow-code/marshmallow +Copyright Steven Loria and contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +matplotlib +Python Software Foundation License +https://matplotlib.org +License agreement for matplotlib versions 1.3.0 and later +========================================================= + +1. This LICENSE AGREEMENT is between the Matplotlib Development Team +("MDT"), and the Individual or Organization ("Licensee") accessing and +otherwise using matplotlib software in source or binary form and its +associated documentation. + +2. 
Subject to the terms and conditions of this License Agreement, MDT +hereby grants Licensee a nonexclusive, royalty-free, world-wide license +to reproduce, analyze, test, perform and/or display publicly, prepare +derivative works, distribute, and otherwise use matplotlib +alone or in any derivative version, provided, however, that MDT's +License Agreement and MDT's notice of copyright, i.e., "Copyright (c) +2012- Matplotlib Development Team; All Rights Reserved" are retained in +matplotlib alone or in any derivative version prepared by +Licensee. + +3. In the event Licensee prepares a derivative work that is based on or +incorporates matplotlib or any part thereof, and wants to +make the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to matplotlib . + +4. MDT is making matplotlib available to Licensee on an "AS +IS" basis. MDT MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, MDT MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF MATPLOTLIB +WILL NOT INFRINGE ANY THIRD PARTY RIGHTS. + +5. MDT SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF MATPLOTLIB + FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR +LOSS AS A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING +MATPLOTLIB , OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF +THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between MDT and +Licensee. This License Agreement does not grant permission to use MDT +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. 
By copying, installing or otherwise using matplotlib , +Licensee agrees to be bound by the terms and conditions of this License +Agreement. + +License agreement for matplotlib versions prior to 1.3.0 +======================================================== + +1. This LICENSE AGREEMENT is between John D. Hunter ("JDH"), and the +Individual or Organization ("Licensee") accessing and otherwise using +matplotlib software in source or binary form and its associated +documentation. + +2. Subject to the terms and conditions of this License Agreement, JDH +hereby grants Licensee a nonexclusive, royalty-free, world-wide license +to reproduce, analyze, test, perform and/or display publicly, prepare +derivative works, distribute, and otherwise use matplotlib +alone or in any derivative version, provided, however, that JDH's +License Agreement and JDH's notice of copyright, i.e., "Copyright (c) +2002-2011 John D. Hunter; All Rights Reserved" are retained in +matplotlib alone or in any derivative version prepared by +Licensee. + +3. In the event Licensee prepares a derivative work that is based on or +incorporates matplotlib or any part thereof, and wants to +make the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to matplotlib. + +4. JDH is making matplotlib available to Licensee on an "AS +IS" basis. JDH MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, JDH MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF MATPLOTLIB +WILL NOT INFRINGE ANY THIRD PARTY RIGHTS. + +5. JDH SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF MATPLOTLIB + FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR +LOSS AS A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING +MATPLOTLIB , OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF +THE POSSIBILITY THEREOF. + +6. 
This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between JDH and +Licensee. This License Agreement does not grant permission to use JDH +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using matplotlib, +Licensee agrees to be bound by the terms and conditions of this License +Agreement. +---- + +This binary distrubution of Matplotlib can also bundle the following software +(depending on the build): + +Name: AMS Fonts +Files: matplotlib/tests/cmr10.pfb +Description: Type-1 version of one of Knuth's Computer Modern fonts +License: OFL-1.1 + The cmr10.pfb file is a Type-1 version of one of Knuth's Computer Modern fonts. + It is included here as test data only, but the following license applies. + + Copyright (c) 1997, 2009, American Mathematical Society (http://www.ams.org). + All Rights Reserved. + + "cmb10" is a Reserved Font Name for this Font Software. + "cmbsy10" is a Reserved Font Name for this Font Software. + "cmbsy5" is a Reserved Font Name for this Font Software. + "cmbsy6" is a Reserved Font Name for this Font Software. + "cmbsy7" is a Reserved Font Name for this Font Software. + "cmbsy8" is a Reserved Font Name for this Font Software. + "cmbsy9" is a Reserved Font Name for this Font Software. + "cmbx10" is a Reserved Font Name for this Font Software. + "cmbx12" is a Reserved Font Name for this Font Software. + "cmbx5" is a Reserved Font Name for this Font Software. + "cmbx6" is a Reserved Font Name for this Font Software. + "cmbx7" is a Reserved Font Name for this Font Software. + "cmbx8" is a Reserved Font Name for this Font Software. + "cmbx9" is a Reserved Font Name for this Font Software. + "cmbxsl10" is a Reserved Font Name for this Font Software. 
+ "cmbxti10" is a Reserved Font Name for this Font Software. + "cmcsc10" is a Reserved Font Name for this Font Software. + "cmcsc8" is a Reserved Font Name for this Font Software. + "cmcsc9" is a Reserved Font Name for this Font Software. + "cmdunh10" is a Reserved Font Name for this Font Software. + "cmex10" is a Reserved Font Name for this Font Software. + "cmex7" is a Reserved Font Name for this Font Software. + "cmex8" is a Reserved Font Name for this Font Software. + "cmex9" is a Reserved Font Name for this Font Software. + "cmff10" is a Reserved Font Name for this Font Software. + "cmfi10" is a Reserved Font Name for this Font Software. + "cmfib8" is a Reserved Font Name for this Font Software. + "cminch" is a Reserved Font Name for this Font Software. + "cmitt10" is a Reserved Font Name for this Font Software. + "cmmi10" is a Reserved Font Name for this Font Software. + "cmmi12" is a Reserved Font Name for this Font Software. + "cmmi5" is a Reserved Font Name for this Font Software. + "cmmi6" is a Reserved Font Name for this Font Software. + "cmmi7" is a Reserved Font Name for this Font Software. + "cmmi8" is a Reserved Font Name for this Font Software. + "cmmi9" is a Reserved Font Name for this Font Software. + "cmmib10" is a Reserved Font Name for this Font Software. + "cmmib5" is a Reserved Font Name for this Font Software. + "cmmib6" is a Reserved Font Name for this Font Software. + "cmmib7" is a Reserved Font Name for this Font Software. + "cmmib8" is a Reserved Font Name for this Font Software. + "cmmib9" is a Reserved Font Name for this Font Software. + "cmr10" is a Reserved Font Name for this Font Software. + "cmr12" is a Reserved Font Name for this Font Software. + "cmr17" is a Reserved Font Name for this Font Software. + "cmr5" is a Reserved Font Name for this Font Software. + "cmr6" is a Reserved Font Name for this Font Software. + "cmr7" is a Reserved Font Name for this Font Software. + "cmr8" is a Reserved Font Name for this Font Software. 
+ "cmr9" is a Reserved Font Name for this Font Software. + "cmsl10" is a Reserved Font Name for this Font Software. + "cmsl12" is a Reserved Font Name for this Font Software. + "cmsl8" is a Reserved Font Name for this Font Software. + "cmsl9" is a Reserved Font Name for this Font Software. + "cmsltt10" is a Reserved Font Name for this Font Software. + "cmss10" is a Reserved Font Name for this Font Software. + "cmss12" is a Reserved Font Name for this Font Software. + "cmss17" is a Reserved Font Name for this Font Software. + "cmss8" is a Reserved Font Name for this Font Software. + "cmss9" is a Reserved Font Name for this Font Software. + "cmssbx10" is a Reserved Font Name for this Font Software. + "cmssdc10" is a Reserved Font Name for this Font Software. + "cmssi10" is a Reserved Font Name for this Font Software. + "cmssi12" is a Reserved Font Name for this Font Software. + "cmssi17" is a Reserved Font Name for this Font Software. + "cmssi8" is a Reserved Font Name for this Font Software. + "cmssi9" is a Reserved Font Name for this Font Software. + "cmssq8" is a Reserved Font Name for this Font Software. + "cmssqi8" is a Reserved Font Name for this Font Software. + "cmsy10" is a Reserved Font Name for this Font Software. + "cmsy5" is a Reserved Font Name for this Font Software. + "cmsy6" is a Reserved Font Name for this Font Software. + "cmsy7" is a Reserved Font Name for this Font Software. + "cmsy8" is a Reserved Font Name for this Font Software. + "cmsy9" is a Reserved Font Name for this Font Software. + "cmtcsc10" is a Reserved Font Name for this Font Software. + "cmtex10" is a Reserved Font Name for this Font Software. + "cmtex8" is a Reserved Font Name for this Font Software. + "cmtex9" is a Reserved Font Name for this Font Software. + "cmti10" is a Reserved Font Name for this Font Software. + "cmti12" is a Reserved Font Name for this Font Software. + "cmti7" is a Reserved Font Name for this Font Software. 
+ "cmti8" is a Reserved Font Name for this Font Software. + "cmti9" is a Reserved Font Name for this Font Software. + "cmtt10" is a Reserved Font Name for this Font Software. + "cmtt12" is a Reserved Font Name for this Font Software. + "cmtt8" is a Reserved Font Name for this Font Software. + "cmtt9" is a Reserved Font Name for this Font Software. + "cmu10" is a Reserved Font Name for this Font Software. + "cmvtt10" is a Reserved Font Name for this Font Software. + "euex10" is a Reserved Font Name for this Font Software. + "euex7" is a Reserved Font Name for this Font Software. + "euex8" is a Reserved Font Name for this Font Software. + "euex9" is a Reserved Font Name for this Font Software. + "eufb10" is a Reserved Font Name for this Font Software. + "eufb5" is a Reserved Font Name for this Font Software. + "eufb7" is a Reserved Font Name for this Font Software. + "eufm10" is a Reserved Font Name for this Font Software. + "eufm5" is a Reserved Font Name for this Font Software. + "eufm7" is a Reserved Font Name for this Font Software. + "eurb10" is a Reserved Font Name for this Font Software. + "eurb5" is a Reserved Font Name for this Font Software. + "eurb7" is a Reserved Font Name for this Font Software. + "eurm10" is a Reserved Font Name for this Font Software. + "eurm5" is a Reserved Font Name for this Font Software. + "eurm7" is a Reserved Font Name for this Font Software. + "eusb10" is a Reserved Font Name for this Font Software. + "eusb5" is a Reserved Font Name for this Font Software. + "eusb7" is a Reserved Font Name for this Font Software. + "eusm10" is a Reserved Font Name for this Font Software. + "eusm5" is a Reserved Font Name for this Font Software. + "eusm7" is a Reserved Font Name for this Font Software. + "lasy10" is a Reserved Font Name for this Font Software. + "lasy5" is a Reserved Font Name for this Font Software. + "lasy6" is a Reserved Font Name for this Font Software. + "lasy7" is a Reserved Font Name for this Font Software. 
+ "lasy8" is a Reserved Font Name for this Font Software. + "lasy9" is a Reserved Font Name for this Font Software. + "lasyb10" is a Reserved Font Name for this Font Software. + "lcircle1" is a Reserved Font Name for this Font Software. + "lcirclew" is a Reserved Font Name for this Font Software. + "lcmss8" is a Reserved Font Name for this Font Software. + "lcmssb8" is a Reserved Font Name for this Font Software. + "lcmssi8" is a Reserved Font Name for this Font Software. + "line10" is a Reserved Font Name for this Font Software. + "linew10" is a Reserved Font Name for this Font Software. + "msam10" is a Reserved Font Name for this Font Software. + "msam5" is a Reserved Font Name for this Font Software. + "msam6" is a Reserved Font Name for this Font Software. + "msam7" is a Reserved Font Name for this Font Software. + "msam8" is a Reserved Font Name for this Font Software. + "msam9" is a Reserved Font Name for this Font Software. + "msbm10" is a Reserved Font Name for this Font Software. + "msbm5" is a Reserved Font Name for this Font Software. + "msbm6" is a Reserved Font Name for this Font Software. + "msbm7" is a Reserved Font Name for this Font Software. + "msbm8" is a Reserved Font Name for this Font Software. + "msbm9" is a Reserved Font Name for this Font Software. + "wncyb10" is a Reserved Font Name for this Font Software. + "wncyi10" is a Reserved Font Name for this Font Software. + "wncyr10" is a Reserved Font Name for this Font Software. + "wncysc10" is a Reserved Font Name for this Font Software. + "wncyss10" is a Reserved Font Name for this Font Software. + + This Font Software is licensed under the SIL Open Font License, Version 1.1. 
+ This license is copied below, and is also available with a FAQ at: + http://scripts.sil.org/OFL + + ----------------------------------------------------------- + SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007 + ----------------------------------------------------------- + + PREAMBLE + The goals of the Open Font License (OFL) are to stimulate worldwide + development of collaborative font projects, to support the font creation + efforts of academic and linguistic communities, and to provide a free and + open framework in which fonts may be shared and improved in partnership + with others. + + The OFL allows the licensed fonts to be used, studied, modified and + redistributed freely as long as they are not sold by themselves. The + fonts, including any derivative works, can be bundled, embedded, + redistributed and/or sold with any software provided that any reserved + names are not used by derivative works. The fonts and derivatives, + however, cannot be released under any other type of license. The + requirement for fonts to remain under this license does not apply + to any document created using the fonts or their derivatives. + + DEFINITIONS + "Font Software" refers to the set of files released by the Copyright + Holder(s) under this license and clearly marked as such. This may + include source files, build scripts and documentation. + + "Reserved Font Name" refers to any names specified as such after the + copyright statement(s). + + "Original Version" refers to the collection of Font Software components as + distributed by the Copyright Holder(s). + + "Modified Version" refers to any derivative made by adding to, deleting, + or substituting -- in part or in whole -- any of the components of the + Original Version, by changing formats or by porting the Font Software to a + new environment. + + "Author" refers to any designer, engineer, programmer, technical + writer or other person who contributed to the Font Software. 
+ + PERMISSION & CONDITIONS + Permission is hereby granted, free of charge, to any person obtaining + a copy of the Font Software, to use, study, copy, merge, embed, modify, + redistribute, and sell modified and unmodified copies of the Font + Software, subject to the following conditions: + + 1) Neither the Font Software nor any of its individual components, + in Original or Modified Versions, may be sold by itself. + + 2) Original or Modified Versions of the Font Software may be bundled, + redistributed and/or sold with any software, provided that each copy + contains the above copyright notice and this license. These can be + included either as stand-alone text files, human-readable headers or + in the appropriate machine-readable metadata fields within text or + binary files as long as those fields can be easily viewed by the user. + + 3) No Modified Version of the Font Software may use the Reserved Font + Name(s) unless explicit written permission is granted by the corresponding + Copyright Holder. This restriction only applies to the primary font name as + presented to the users. + + 4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font + Software shall not be used to promote, endorse or advertise any + Modified Version, except to acknowledge the contribution(s) of the + Copyright Holder(s) and the Author(s) or with their explicit written + permission. + + 5) The Font Software, modified or unmodified, in part or in whole, + must be distributed entirely under this license, and must not be + distributed under any other license. The requirement for fonts to + remain under this license does not apply to any document created + using the Font Software. + + TERMINATION + This license becomes null and void if any of the above conditions are + not met. 
+ + DISCLAIMER + THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT + OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE + COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, + INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL + DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM + OTHER DEALINGS IN THE FONT SOFTWARE. + + + +Name: BaKoMa Fonts +Files: matplotlib/mpl-data/fonts/ttf/cm*.ttf matplotlib/mpl-data/fonts/afm/cm*.afm +Description: Computer Modern Fonts in PostScript Type 1 and TrueType font formats. +License: BaKoMa Fonts Licence + BaKoMa Fonts Licence + -------------------- + + This licence covers two font packs (known as BaKoMa Fonts Collection, + which is available at `CTAN:fonts/cm/ps-type1/bakoma/'): + + 1) BaKoMa-CM (1.1/12-Nov-94) + Computer Modern Fonts in PostScript Type 1 and TrueType font formats. + + 2) BaKoMa-AMS (1.2/19-Jan-95) + AMS TeX fonts in PostScript Type 1 and TrueType font formats. + + Copyright (C) 1994, 1995, Basil K. Malyshev. All Rights Reserved. + + Permission to copy and distribute these fonts for any purpose is + hereby granted without fee, provided that the above copyright notice, + author statement and this permission notice appear in all copies of + these fonts and related documentation. + + Permission to modify and distribute modified fonts for any purpose is + hereby granted without fee, provided that the copyright notice, + author statement, this permission notice and location of original + fonts (http://www.ctan.org/tex-archive/fonts/cm/ps-type1/bakoma) + appear in all copies of modified fonts and related documentation. 
+ + Permission to use these fonts (embedding into PostScript, PDF, SVG + and printing by using any software) is hereby granted without fee. + It is not required to provide any notices about using these fonts. + + Basil K. Malyshev + INSTITUTE FOR HIGH ENERGY PHYSICS + IHEP, OMVT + Moscow Region + 142281 PROTVINO + RUSSIA + + E-Mail: bakoma@mail.ru + or malyshev@mail.ihep.ru + + + + +Name: ColorBrewer Color Schemes +Files: lib/matplotlib/_cm.py +Description: Color schemes from ColorBrewer +License: Apache-2.0 + Apache-Style Software License for ColorBrewer software and ColorBrewer Color Schemes + + Copyright (c) 2002 Cynthia Brewer, Mark Harrower, and The Pennsylvania State University. + + Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software distributed + under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR + CONDITIONS OF ANY KIND, either express or implied. See the License for the + specific language governing permissions and limitations under the License. + + +Name: Courier 10 +Files: matplotlib/tests/Courier10PitchBT-Bold.pfb +Description: Courier 10 font, used in tests. +License: Bitstream-Charter + The Courier10PitchBT-Bold.pfb file is a Type-1 version of + Courier 10 Pitch BT Bold by Bitstream, obtained from + . It is included + here as test data only, but the following license applies. + + + (c) Copyright 1989-1992, Bitstream Inc., Cambridge, MA. 
+ + You are hereby granted permission under all Bitstream propriety rights + to use, copy, modify, sublicense, sell, and redistribute the 4 Bitstream + Charter (r) Type 1 outline fonts and the 4 Courier Type 1 outline fonts + for any purpose and without restriction; provided, that this notice is + left intact on all copies of such fonts and that Bitstream's trademark + is acknowledged as shown below on all unmodified copies of the 4 Charter + Type 1 fonts. + + BITSTREAM CHARTER is a registered trademark of Bitstream Inc. + + + +Name: JSXTools resize observer +Files: +Description: Minimal polyfill for the ResizeObserver API +License: CC0-1.0 + # CC0 1.0 Universal + + ## Statement of Purpose + + The laws of most jurisdictions throughout the world automatically confer + exclusive Copyright and Related Rights (defined below) upon the creator and + subsequent owner(s) (each and all, an “owner”) of an original work of + authorship and/or a database (each, a “Work”). + + Certain owners wish to permanently relinquish those rights to a Work for the + purpose of contributing to a commons of creative, cultural and scientific works + (“Commons”) that the public can reliably and without fear of later claims of + infringement build upon, modify, incorporate in other works, reuse and + redistribute as freely as possible in any form whatsoever and for any purposes, + including without limitation commercial purposes. These owners may contribute + to the Commons to promote the ideal of a free culture and the further + production of creative, cultural and scientific works, or to gain reputation or + greater distribution for their Work in part through the use and efforts of + others. 
+ + For these and/or other purposes and motivations, and without any expectation of + additional consideration or compensation, the person associating CC0 with a + Work (the “Affirmer”), to the extent that he or she is an owner of Copyright + and Related Rights in the Work, voluntarily elects to apply CC0 to the Work and + publicly distribute the Work under its terms, with knowledge of his or her + Copyright and Related Rights in the Work and the meaning and intended legal + effect of CC0 on those rights. + + 1. Copyright and Related Rights. A Work made available under CC0 may be + protected by copyright and related or neighboring rights (“Copyright and + Related Rights”). Copyright and Related Rights include, but are not limited + to, the following: + 1. the right to reproduce, adapt, distribute, perform, display, communicate, + and translate a Work; + 2. moral rights retained by the original author(s) and/or performer(s); + 3. publicity and privacy rights pertaining to a person’s image or likeness + depicted in a Work; + 4. rights protecting against unfair competition in regards to a Work, + subject to the limitations in paragraph 4(i), below; + 5. rights protecting the extraction, dissemination, use and reuse of data in + a Work; + 6. database rights (such as those arising under Directive 96/9/EC of the + European Parliament and of the Council of 11 March 1996 on the legal + protection of databases, and under any national implementation thereof, + including any amended or successor version of such directive); and + 7. other similar, equivalent or corresponding rights throughout the world + based on applicable law or treaty, and any national implementations + thereof. + + 2. Waiver. 
To the greatest extent permitted by, but not in contravention of, + applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and + unconditionally waives, abandons, and surrenders all of Affirmer’s Copyright + and Related Rights and associated claims and causes of action, whether now + known or unknown (including existing as well as future claims and causes of + action), in the Work (i) in all territories worldwide, (ii) for the maximum + duration provided by applicable law or treaty (including future time + extensions), (iii) in any current or future medium and for any number of + copies, and (iv) for any purpose whatsoever, including without limitation + commercial, advertising or promotional purposes (the “Waiver”). Affirmer + makes the Waiver for the benefit of each member of the public at large and + to the detriment of Affirmer’s heirs and successors, fully intending that + such Waiver shall not be subject to revocation, rescission, cancellation, + termination, or any other legal or equitable action to disrupt the quiet + enjoyment of the Work by the public as contemplated by Affirmer’s express + Statement of Purpose. + + 3. Public License Fallback. Should any part of the Waiver for any reason be + judged legally invalid or ineffective under applicable law, then the Waiver + shall be preserved to the maximum extent permitted taking into account + Affirmer’s express Statement of Purpose. 
In addition, to the extent the + Waiver is so judged Affirmer hereby grants to each affected person a + royalty-free, non transferable, non sublicensable, non exclusive, + irrevocable and unconditional license to exercise Affirmer’s Copyright and + Related Rights in the Work (i) in all territories worldwide, (ii) for the + maximum duration provided by applicable law or treaty (including future time + extensions), (iii) in any current or future medium and for any number of + copies, and (iv) for any purpose whatsoever, including without limitation + commercial, advertising or promotional purposes (the “License”). The License + shall be deemed effective as of the date CC0 was applied by Affirmer to the + Work. Should any part of the License for any reason be judged legally + invalid or ineffective under applicable law, such partial invalidity or + ineffectiveness shall not invalidate the remainder of the License, and in + such case Affirmer hereby affirms that he or she will not (i) exercise any + of his or her remaining Copyright and Related Rights in the Work or (ii) + assert any associated claims and causes of action with respect to the Work, + in either case contrary to Affirmer’s express Statement of Purpose. + + 4. Limitations and Disclaimers. + 1. No trademark or patent rights held by Affirmer are waived, abandoned, + surrendered, licensed or otherwise affected by this document. + 2. Affirmer offers the Work as-is and makes no representations or warranties + of any kind concerning the Work, express, implied, statutory or + otherwise, including without limitation warranties of title, + merchantability, fitness for a particular purpose, non infringement, or + the absence of latent or other defects, accuracy, or the present or + absence of errors, whether or not discoverable, all to the greatest + extent permissible under applicable law. + 3. 
Affirmer disclaims responsibility for clearing rights of other persons + that may apply to the Work or any use thereof, including without + limitation any person’s Copyright and Related Rights in the Work. + Further, Affirmer disclaims responsibility for obtaining any necessary + consents, permissions or other rights required for any use of the Work. + 4. Affirmer understands and acknowledges that Creative Commons is not a + party to this document and has no duty or obligation with respect to this + CC0 or use of the Work. + + For more information, please see + http://creativecommons.org/publicdomain/zero/1.0/. + + +Name: QHull +Files: matplotlib/_qhull.*.so +Description: Convex hull, Delaunay triangulation, Voronoi diagrams, Halfspace intersection +License: Qhull + Qhull, Copyright (c) 1993-2020 + + C.B. Barber + Arlington, MA + + and + + The National Science and Technology Research Center for + Computation and Visualization of Geometric Structures + (The Geometry Center) + University of Minnesota + + email: qhull@qhull.org + + This software includes Qhull from C.B. Barber and The Geometry Center. + Files derived from Qhull 1.0 are copyrighted by the Geometry Center. The + remaining files are copyrighted by C.B. Barber. Qhull is free software + and may be obtained via http from www.qhull.org. It may be freely copied, + modified, and redistributed under the following conditions: + + 1. All copyright notices must remain intact in all files. + + 2. A copy of this text file must be distributed along with any copies + of Qhull that you redistribute; this includes copies that you have + modified, or copies of programs or other software products that + include Qhull. + + 3. If you modify Qhull, you must include a notice giving the + name of the person performing the modification, the date of + modification, and the reason for such modification. + + 4. 
When distributing modified versions of Qhull, or other software + products that include Qhull, you must provide notice that the original + source code may be obtained as noted above. + + 5. There is no warranty or other guarantee of fitness for Qhull, it is + provided solely "as is". Bug reports or fixes may be sent to + qhull_bug@qhull.org; the authors may or may not act on them as + they desire. + + +Name: Qt4 Editor +Files: matplotlib/backends/qt_editor +Description: Module creating PyQt4 form dialogs/layouts to edit various type of parameters +License: MIT + Module creating PyQt4 form dialogs/layouts to edit various type of parameters + + + formlayout License Agreement (MIT License) + ------------------------------------------ + + Copyright (c) 2009 Pierre Raybaut + + Permission is hereby granted, free of charge, to any person + obtaining a copy of this software and associated documentation + files (the "Software"), to deal in the Software without + restriction, including without limitation the rights to use, + copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the + Software is furnished to do so, subject to the following + conditions: + + The above copyright notice and this permission notice shall be + included in all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES + OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT + HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, + WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + OTHER DEALINGS IN THE SOFTWARE. 
+ """ + + +Name: Solarized +Files: matplotlib/mpl-data/stylelib/Solarize_Light2.mplstyle +Description: Solarized color scheme/style +License: MIT + https://github.com/altercation/solarized/blob/master/LICENSE + Copyright (c) 2011 Ethan Schoonover + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. + + +Name: Stix fonts +Files: matplotlib/mpl-data/fonts/ttf/STIX*.ttf +Description: STIX fonts +License: + TERMS AND CONDITIONS + + 1. Permission is hereby granted, free of charge, to any person + obtaining a copy of the STIX Fonts-TM set accompanying this license + (collectively, the "Fonts") and the associated documentation files + (collectively with the Fonts, the "Font Software"), to reproduce and + distribute the Font Software, including the rights to use, copy, merge + and publish copies of the Font Software, and to permit persons to whom + the Font Software is furnished to do so same, subject to the following + terms and conditions (the "License"). + + 2. 
The following copyright and trademark notice and these Terms and + Conditions shall be included in all copies of one or more of the Font + typefaces and any derivative work created as permitted under this + License: + + Copyright (c) 2001-2005 by the STI Pub Companies, consisting of + the American Institute of Physics, the American Chemical Society, the + American Mathematical Society, the American Physical Society, Elsevier, + Inc., and The Institute of Electrical and Electronic Engineers, Inc. + Portions copyright (c) 1998-2003 by MicroPress, Inc. Portions copyright + (c) 1990 by Elsevier, Inc. All rights reserved. STIX Fonts-TM is a + trademark of The Institute of Electrical and Electronics Engineers, Inc. + + 3. You may (a) convert the Fonts from one format to another (e.g., + from TrueType to PostScript), in which case the normal and reasonable + distortion that occurs during such conversion shall be permitted and (b) + embed or include a subset of the Fonts in a document for the purposes of + allowing users to read text in the document that utilizes the Fonts. In + each case, you may use the STIX Fonts-TM mark to designate the resulting + Fonts or subset of the Fonts. + + 4. You may also (a) add glyphs or characters to the Fonts, or modify + the shape of existing glyphs, so long as the base set of glyphs is not + removed and (b) delete glyphs or characters from the Fonts, provided + that the resulting font set is distributed with the following + disclaimer: "This [name] font does not include all the Unicode points + covered in the STIX Fonts-TM set but may include others." In each case, + the name used to denote the resulting font set shall not include the + term "STIX" or any similar term. + + 5. You may charge a fee in connection with the distribution of the + Font Software, provided that no copy of one or more of the individual + Font typefaces that form the STIX Fonts-TM set may be sold by itself. + + 6. 
THE FONT SOFTWARE IS PROVIDED "AS IS," WITHOUT WARRANTY OF ANY + KIND, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES + OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT + OF COPYRIGHT, PATENT, TRADEMARK OR OTHER RIGHT. IN NO EVENT SHALL + MICROPRESS OR ANY OF THE STI PUB COMPANIES BE LIABLE FOR ANY CLAIM, + DAMAGES OR OTHER LIABILITY, INCLUDING, BUT NOT LIMITED TO, ANY GENERAL, + SPECIAL, INDIRECT, INCIDENTAL OR CONSEQUENTIAL DAMAGES, WHETHER IN AN + ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM OR OUT OF THE USE OR + INABILITY TO USE THE FONT SOFTWARE OR FROM OTHER DEALINGS IN THE FONT + SOFTWARE. + + 7. Except as contained in the notice set forth in Section 2, the + names MicroPress Inc. and STI Pub Companies, as well as the names of the + companies/organizations that compose the STI Pub Companies, shall not be + used in advertising or otherwise to promote the sale, use or other + dealings in the Font Software without the prior written consent of the + respective company or organization. + + 8. This License shall become null and void in the event of any + material breach of the Terms and Conditions herein by licensee. + + 9. A substantial portion of the STIX Fonts set was developed by + MicroPress Inc. for the STI Pub Companies. To obtain additional + mathematical fonts, please contact MicroPress, Inc., 68-30 Harrow + Street, Forest Hills, NY 11375, USA - Phone: (718) 575-1816. + + +Name: Yorick Colormaps +Files: lib/matplotlib/_cm.py +Description: Gist/Yorick colormaps +License: + BSD-style license for gist/yorick colormaps. + + Copyright: + + Copyright (c) 1996. The Regents of the University of California. + All rights reserved. 
+ + Permission to use, copy, modify, and distribute this software for any + purpose without fee is hereby granted, provided that this entire + notice is included in all copies of any software which is or includes + a copy or modification of this software and in all copies of the + supporting documentation for such software. + + This work was produced at the University of California, Lawrence + Livermore National Laboratory under contract no. W-7405-ENG-48 between + the U.S. Department of Energy and The Regents of the University of + California for the operation of UC LLNL. + + + DISCLAIMER + + This software was prepared as an account of work sponsored by an + agency of the United States Government. Neither the United States + Government nor the University of California nor any of their + employees, makes any warranty, express or implied, or assumes any + liability or responsibility for the accuracy, completeness, or + usefulness of any information, apparatus, product, or process + disclosed, or represents that its use would not infringe + privately-owned rights. Reference herein to any specific commercial + products, process, or service by trade name, trademark, manufacturer, + or otherwise, does not necessarily constitute or imply its + endorsement, recommendation, or favoring by the United States + Government or the University of California. The views and opinions of + authors expressed herein do not necessarily state or reflect those of + the United States Government or the University of California, and + shall not be used for advertising or product endorsement purposes. + + + AUTHOR + + David H. Munro wrote Yorick and Gist. Berkeley Yacc (byacc) generated + the Yorick parser. The routines in Math are from LAPACK and FFTPACK; + MathC contains C translations by David H. Munro. The algorithms for + Yorick's random number generator and several special functions in + Yorick/include were taken from Numerical Recipes by Press, et. 
al., + although the Yorick implementations are unrelated to those in + Numerical Recipes. A small amount of code in Gist was adapted from + the X11R4 release, copyright M.I.T. -- the complete copyright notice + may be found in the (unused) file Gist/host.c. + + +matplotlib-inline +UNKNOWN +https://github.com/ipython/matplotlib-inline +BSD 3-Clause License + +Copyright (c) 2019-2022, IPython Development Team. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +mdurl +MIT License +https://github.com/executablebooks/mdurl +Copyright (c) 2015 Vitaly Puzrin, Alex Kocharin. +Copyright (c) 2021 Taneli Hukkinen + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation +files (the "Software"), to deal in the Software without +restriction, including without limitation the rights to use, +copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following +conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES +OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, +WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +OTHER DEALINGS IN THE SOFTWARE. + +-------------------------------------------------------------------------------- + +.parse() is based on Joyent's node.js `url` code: + +Copyright Joyent, Inc. and other Node contributors. All rights reserved. +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to +deal in the Software without restriction, including without limitation the +rights to use, copy, modify, merge, publish, distribute, sublicense, and/or +sell copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +IN THE SOFTWARE. + + +mediapy +Apache Software License +https://github.com/google/mediapy + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +megatron-core +BSD License +https://github.com/NVIDIA/Megatron-LM/megatron/core +The following applies to all files unless otherwise noted: + +# Copyright (c) 2019-2025, NVIDIA CORPORATION. All rights reserved. +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# * Neither the name of NVIDIA CORPORATION nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY +# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +# PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY +# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-- + +This repository also contains code from Hugging Face Inc., Google Research, +Facebook (from their Fairseq, Dino, and ParlAI projects), Microsoft (from their +Swin-Transformer project), Philip Popien, the Mamba project (Tri Dao and +Albert Gu), and the Triton language and compiler project (Philippe Tillet and +OpenAI). Files from these organizations have notices at the top of each file. +Below are licenses used in those files, as indicated. + + +-------------------------------------------------------------------------------------- +-- LICENSE FOR Facebook, huggingface, Google Research, LLaVA, Mamba, TinyZero and vLLM code -- + + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +-------------------------------------------------------------------------------- +LICENSE FOR +Facebook, Inc. and its affiliates, +Meta Platforms, Inc. and its affiliates, +Microsoft Corporation, +OpenGVLab/InternVL, +Triton language and compiler, +and DeepSeek. + +MIT License + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ +-------------------------------------------------------------------------------- +LICENSE FOR Thinking Machines Lab + +MIT License + +Copyright 2025 Thinking Machines Lab + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +-------------------------------------------------------------------------------- +LICENSE FOR +Meta Platforms, Inc. and affiliates. + +BSD License + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ + * Neither the name Meta nor the names of its contributors may be used to + endorse or promote products derived from this software without specific + prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +mistune +BSD License +https://github.com/lepture/mistune +Copyright (c) 2014, Hsiaoming Yang + +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. + +* Neither the name of the creator nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. + + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +ml_dtypes +Apache-2.0 +https://github.com/jax-ml/ml_dtypes + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +more-itertools +MIT +https://github.com/more-itertools/more-itertools +Copyright (c) 2012 Erik Rose + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +moviepy +MIT License +UNKNOWN +The MIT License (MIT) + +Copyright (c) 2015 Zulko + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + +mpmath +BSD License +http://mpmath.org/ +Copyright (c) 2005-2021 Fredrik Johansson and mpmath contributors + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + a. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + b. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + c. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. + + +msgpack +Apache-2.0 +https://msgpack.org/ +Copyright (C) 2008-2011 INADA Naoki + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + + +multi-storage-client +Apache-2.0 +https://github.com/NVIDIA/multi-storage-client +UNKNOWN + +multidict +Apache License 2.0 +https://github.com/aio-libs/multidict + Copyright 2016 Andrew Svetlov and aio-libs contributors + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. 
+ You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +multiprocess +BSD License +https://github.com/uqfoundation/multiprocess +Copyright (c) 2006-2008, R Oudkerk + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. +3. Neither the name of author nor the names of any contributors may be + used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +SUCH DAMAGE. 
+ + +mypy_extensions +MIT +https://github.com/python/mypy_extensions +Mypy extensions are licensed under the terms of the MIT license, reproduced below. + += = = = = + +The MIT License + +Copyright (c) 2016-2017 Jukka Lehtosalo and contributors + +Permission is hereby granted, free of charge, to any person obtaining a +copy of this software and associated documentation files (the "Software"), +to deal in the Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, sublicense, +and/or sell copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + += = = = = + + +nbclient +BSD License +https://jupyter.org +BSD 3-Clause License + +Copyright (c) 2020-, Jupyter Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. 
Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +nbconvert +BSD License +https://jupyter.org +BSD 3-Clause License + +- Copyright (c) 2001-2015, IPython Development Team +- Copyright (c) 2015-, Jupyter Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +nbformat +BSD License +https://jupyter.org +BSD 3-Clause License + +- Copyright (c) 2001-2015, IPython Development Team +- Copyright (c) 2015-, Jupyter Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +nest-asyncio +BSD License +https://github.com/erdewit/nest_asyncio +BSD 2-Clause License + +Copyright (c) 2018-2020, Ewald de Wit +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +networkx +BSD-3-Clause +https://networkx.org/ +NetworkX is distributed with the 3-clause BSD license. + +:: + + Copyright (c) 2004-2025, NetworkX Developers + Aric Hagberg + Dan Schult + Pieter Swart + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are + met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + + * Neither the name of the NetworkX Developers nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +ninja +Apache Software License; BSD License +http://ninja-build.org/ +Apache License +Version 2.0, January 2004 +http://www.apache.org/licenses/ + +TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + +1. Definitions. 
+ +"License" shall mean the terms and conditions for use, reproduction, and +distribution as defined by Sections 1 through 9 of this document. + +"Licensor" shall mean the copyright owner or entity authorized by the copyright +owner that is granting the License. + +"Legal Entity" shall mean the union of the acting entity and all other entities +that control, are controlled by, or are under common control with that entity. +For the purposes of this definition, "control" means (i) the power, direct or +indirect, to cause the direction or management of such entity, whether by +contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the +outstanding shares, or (iii) beneficial ownership of such entity. + +"You" (or "Your") shall mean an individual or Legal Entity exercising +permissions granted by this License. + +"Source" form shall mean the preferred form for making modifications, including +but not limited to software source code, documentation source, and configuration +files. + +"Object" form shall mean any form resulting from mechanical transformation or +translation of a Source form, including but not limited to compiled object code, +generated documentation, and conversions to other media types. + +"Work" shall mean the work of authorship, whether in Source or Object form, made +available under the License, as indicated by a copyright notice that is included +in or attached to the work (an example is provided in the Appendix below). + +"Derivative Works" shall mean any work, whether in Source or Object form, that +is based on (or derived from) the Work and for which the editorial revisions, +annotations, elaborations, or other modifications represent, as a whole, an +original work of authorship. For the purposes of this License, Derivative Works +shall not include works that remain separable from, or merely link (or bind by +name) to the interfaces of, the Work and Derivative Works thereof. 
+ +"Contribution" shall mean any work of authorship, including the original version +of the Work and any modifications or additions to that Work or Derivative Works +thereof, that is intentionally submitted to Licensor for inclusion in the Work +by the copyright owner or by an individual or Legal Entity authorized to submit +on behalf of the copyright owner. For the purposes of this definition, +"submitted" means any form of electronic, verbal, or written communication sent +to the Licensor or its representatives, including but not limited to +communication on electronic mailing lists, source code control systems, and +issue tracking systems that are managed by, or on behalf of, the Licensor for +the purpose of discussing and improving the Work, but excluding communication +that is conspicuously marked or otherwise designated in writing by the copyright +owner as "Not a Contribution." + +"Contributor" shall mean Licensor and any individual or Legal Entity on behalf +of whom a Contribution has been received by Licensor and subsequently +incorporated within the Work. + +2. Grant of Copyright License. + +Subject to the terms and conditions of this License, each Contributor hereby +grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, +irrevocable copyright license to reproduce, prepare Derivative Works of, +publicly display, publicly perform, sublicense, and distribute the Work and such +Derivative Works in Source or Object form. + +3. Grant of Patent License. 
+ +Subject to the terms and conditions of this License, each Contributor hereby +grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, +irrevocable (except as stated in this section) patent license to make, have +made, use, offer to sell, sell, import, and otherwise transfer the Work, where +such license applies only to those patent claims licensable by such Contributor +that are necessarily infringed by their Contribution(s) alone or by combination +of their Contribution(s) with the Work to which such Contribution(s) was +submitted. If You institute patent litigation against any entity (including a +cross-claim or counterclaim in a lawsuit) alleging that the Work or a +Contribution incorporated within the Work constitutes direct or contributory +patent infringement, then any patent licenses granted to You under this License +for that Work shall terminate as of the date such litigation is filed. + +4. Redistribution. + +You may reproduce and distribute copies of the Work or Derivative Works thereof +in any medium, with or without modifications, and in Source or Object form, +provided that You meet the following conditions: + +You must give any other recipients of the Work or Derivative Works a copy of +this License; and +You must cause any modified files to carry prominent notices stating that You +changed the files; and +You must retain, in the Source form of any Derivative Works that You distribute, +all copyright, patent, trademark, and attribution notices from the Source form +of the Work, excluding those notices that do not pertain to any part of the +Derivative Works; and +If the Work includes a "NOTICE" text file as part of its distribution, then any +Derivative Works that You distribute must include a readable copy of the +attribution notices contained within such NOTICE file, excluding those notices +that do not pertain to any part of the Derivative Works, in at least one of the +following places: within a NOTICE text file 
distributed as part of the +Derivative Works; within the Source form or documentation, if provided along +with the Derivative Works; or, within a display generated by the Derivative +Works, if and wherever such third-party notices normally appear. The contents of +the NOTICE file are for informational purposes only and do not modify the +License. You may add Your own attribution notices within Derivative Works that +You distribute, alongside or as an addendum to the NOTICE text from the Work, +provided that such additional attribution notices cannot be construed as +modifying the License. +You may add Your own copyright statement to Your modifications and may provide +additional or different license terms and conditions for use, reproduction, or +distribution of Your modifications, or for any such Derivative Works as a whole, +provided Your use, reproduction, and distribution of the Work otherwise complies +with the conditions stated in this License. + +5. Submission of Contributions. + +Unless You explicitly state otherwise, any Contribution intentionally submitted +for inclusion in the Work by You to the Licensor shall be under the terms and +conditions of this License, without any additional terms or conditions. +Notwithstanding the above, nothing herein shall supersede or modify the terms of +any separate license agreement you may have executed with Licensor regarding +such Contributions. + +6. Trademarks. + +This License does not grant permission to use the trade names, trademarks, +service marks, or product names of the Licensor, except as required for +reasonable and customary use in describing the origin of the Work and +reproducing the content of the NOTICE file. + +7. Disclaimer of Warranty. 
+ +Unless required by applicable law or agreed to in writing, Licensor provides the +Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, +including, without limitation, any warranties or conditions of TITLE, +NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are +solely responsible for determining the appropriateness of using or +redistributing the Work and assume any risks associated with Your exercise of +permissions under this License. + +8. Limitation of Liability. + +In no event and under no legal theory, whether in tort (including negligence), +contract, or otherwise, unless required by applicable law (such as deliberate +and grossly negligent acts) or agreed to in writing, shall any Contributor be +liable to You for damages, including any direct, indirect, special, incidental, +or consequential damages of any character arising as a result of this License or +out of the use or inability to use the Work (including but not limited to +damages for loss of goodwill, work stoppage, computer failure or malfunction, or +any and all other commercial damages or losses), even if such Contributor has +been advised of the possibility of such damages. + +9. Accepting Warranty or Additional Liability. + +While redistributing the Work or Derivative Works thereof, You may choose to +offer, and charge a fee for, acceptance of support, warranty, indemnity, or +other liability obligations and/or rights consistent with this License. However, +in accepting such obligations, You may act only on Your own behalf and on Your +sole responsibility, not on behalf of any other Contributor, and only if You +agree to indemnify, defend, and hold each Contributor harmless for any liability +incurred by, or claims asserted against, such Contributor by reason of your +accepting any such warranty or additional liability. 
+ +END OF TERMS AND CONDITIONS + +APPENDIX: How to apply the Apache License to your work + +To apply the Apache License to your work, attach the following boilerplate +notice, with the fields enclosed by brackets "[]" replaced with your own +identifying information. (Don't include the brackets!) The text should be +enclosed in the appropriate comment syntax for the file format. We also +recommend that a file or class name and description of purpose be included on +the same "printed page" as the copyright notice for easier identification within +third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +nltk +Apache Software License +https://www.nltk.org/ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +notebook_shim +BSD License +UNKNOWN +BSD 3-Clause License + +Copyright (c) 2022 Project Jupyter Contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +numcodecs +MIT +https://github.com/zarr-developers/numcodecs +The MIT License (MIT) + +Copyright (c) 2015-2018 Zarr Developers + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +numpy +BSD-3-Clause AND 0BSD AND MIT AND Zlib AND CC0-1.0 +https://numpy.org +Copyright (c) 2005-2025, NumPy Developers. +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + + * Neither the name of the NumPy Developers nor the names of any + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +---- + + +---- + +This binary distribution of NumPy also bundles the following software: + + +Name: OpenBLAS +Files: numpy.libs/libscipy_openblas*.so +Description: bundled as a dynamically linked library +Availability: https://github.com/OpenMathLib/OpenBLAS/ +License: BSD-3-Clause + Copyright (c) 2011-2014, The OpenBLAS Project + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are + met: + + 1. 
Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + 3. Neither the name of the OpenBLAS project nor the names of + its contributors may be used to endorse or promote products + derived from this software without specific prior written + permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +Name: LAPACK +Files: numpy.libs/libscipy_openblas*.so +Description: bundled in OpenBLAS +Availability: https://github.com/OpenMathLib/OpenBLAS/ +License: BSD-3-Clause-Open-MPI + Copyright (c) 1992-2013 The University of Tennessee and The University + of Tennessee Research Foundation. All rights + reserved. + Copyright (c) 2000-2013 The University of California Berkeley. All + rights reserved. + Copyright (c) 2006-2013 The University of Colorado Denver. All rights + reserved. 
+ + $COPYRIGHT$ + + Additional copyrights may follow + + $HEADER$ + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are + met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer listed + in this license in the documentation and/or other materials + provided with the distribution. + + - Neither the name of the copyright holders nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + + The copyright holders provide no reassurances that the source code + provided does not infringe any patent, copyright, or any other + intellectual property rights of third parties. The copyright holders + disclaim any liability to any recipient for claims brought against + recipient by any third party for infringement of that parties + intellectual property rights. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +Name: GCC runtime library +Files: numpy.libs/libgfortran*.so +Description: dynamically linked to files compiled with gcc +Availability: https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=libgfortran +License: GPL-3.0-or-later WITH GCC-exception-3.1 + Copyright (C) 2002-2017 Free Software Foundation, Inc. + + Libgfortran is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + Libgfortran is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . + +---- + +Full text of license texts referred to above follows (that they are +listed below does not necessarily imply the conditions apply to the +present binary release): + +---- + +GCC RUNTIME LIBRARY EXCEPTION + +Version 3.1, 31 March 2009 + +Copyright (C) 2009 Free Software Foundation, Inc. + +Everyone is permitted to copy and distribute verbatim copies of this +license document, but changing it is not allowed. + +This GCC Runtime Library Exception ("Exception") is an additional +permission under section 7 of the GNU General Public License, version +3 ("GPLv3"). It applies to a given file (the "Runtime Library") that +bears a notice placed by the copyright holder of the file stating that +the file is governed by GPLv3 along with this Exception. 
+ +When you use GCC to compile a program, GCC may combine portions of +certain GCC header files and runtime libraries with the compiled +program. The purpose of this Exception is to allow compilation of +non-GPL (including proprietary) programs to use, in this way, the +header files and runtime libraries covered by this Exception. + +0. Definitions. + +A file is an "Independent Module" if it either requires the Runtime +Library for execution after a Compilation Process, or makes use of an +interface provided by the Runtime Library, but is not otherwise based +on the Runtime Library. + +"GCC" means a version of the GNU Compiler Collection, with or without +modifications, governed by version 3 (or a specified later version) of +the GNU General Public License (GPL) with the option of using any +subsequent versions published by the FSF. + +"GPL-compatible Software" is software whose conditions of propagation, +modification and use would permit combination with GCC in accord with +the license of GCC. + +"Target Code" refers to output from any compiler for a real or virtual +target processor architecture, in executable form or suitable for +input to an assembler, loader, linker and/or execution +phase. Notwithstanding that, Target Code does not include data in any +format that is used as a compiler intermediate representation, or used +for producing a compiler intermediate representation. + +The "Compilation Process" transforms code entirely represented in +non-intermediate languages designed for human-written code, and/or in +Java Virtual Machine byte code, into Target Code. Thus, for example, +use of source code generators and preprocessors need not be considered +part of the Compilation Process, since the Compilation Process can be +understood as starting with the output of the generators or +preprocessors. 
+ +A Compilation Process is "Eligible" if it is done using GCC, alone or +with other GPL-compatible software, or if it is done without using any +work based on GCC. For example, using non-GPL-compatible Software to +optimize any GCC intermediate representations would not qualify as an +Eligible Compilation Process. + +1. Grant of Additional Permission. + +You have permission to propagate a work of Target Code formed by +combining the Runtime Library with Independent Modules, even if such +propagation would otherwise violate the terms of GPLv3, provided that +all Target Code was generated by Eligible Compilation Processes. You +may then convey such a combination under terms of your choice, +consistent with the licensing of the Independent Modules. + +2. No Weakening of GCC Copyleft. + +The availability of this Exception does not imply any general +presumption that third-party software is unaffected by the copyleft +requirements of the license of GCC. + +---- + + GNU GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU General Public License is a free, copyleft license for +software and other kinds of works. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +the GNU General Public License is intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. We, the Free Software Foundation, use the +GNU General Public License for most of our software; it applies also to +any other work released this way by its authors. You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. 
Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. + + To protect your rights, we need to prevent others from denying you +these rights or asking you to surrender the rights. Therefore, you have +certain responsibilities if you distribute copies of the software, or if +you modify it: responsibilities to respect the freedom of others. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must pass on to the recipients the same +freedoms that you received. You must make sure that they, too, receive +or can get the source code. And you must show them these terms so they +know their rights. + + Developers that use the GNU GPL protect your rights with two steps: +(1) assert copyright on the software, and (2) offer you this License +giving you legal permission to copy, distribute and/or modify it. + + For the developers' and authors' protection, the GPL clearly explains +that there is no warranty for this free software. For both users' and +authors' sake, the GPL requires that modified versions be marked as +changed, so that their problems will not be attributed erroneously to +authors of previous versions. + + Some devices are designed to deny users access to install or run +modified versions of the software inside them, although the manufacturer +can do so. This is fundamentally incompatible with the aim of +protecting users' freedom to change the software. The systematic +pattern of such abuse occurs in the area of products for individuals to +use, which is precisely where it is most unacceptable. Therefore, we +have designed this version of the GPL to prohibit the practice for those +products. 
If such problems arise substantially in other domains, we +stand ready to extend this provision to those domains in future versions +of the GPL, as needed to protect the freedom of users. + + Finally, every program is threatened constantly by software patents. +States should not allow patents to restrict development and use of +software on general-purpose computers, but in those that do, we wish to +avoid the special danger that patents applied to a free program could +make it effectively proprietary. To prevent this, the GPL assures that +patents cannot be used to render the program non-free. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU General Public License. + + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". "Licensees" and +"recipients" may be individuals or organizations. + + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. 
Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. + + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. + + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. 
+ + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. + + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. 
Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. + + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. + + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. 
+ + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. + + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. 
+ + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. + + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. 
If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. + + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. 
+ + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. + + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. 
+ + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. +Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. + + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the 
material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. + + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). + + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. 
+ + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. + + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. 
If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. + + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". + + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. 
+ + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. + + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. 
You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. + + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Use with the GNU Affero General Public License. + + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU Affero General Public License into a single +combined work, and to convey the resulting work. 
The terms of this +License will continue to apply to the part which is the covered work, +but the special requirements of the GNU Affero General Public License, +section 13, concerning interaction through a network will apply to the +combination as such. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU General Public License, you may choose any version ever published +by the Free Software Foundation. + + If the Program specifies that a proxy can decide which future +versions of the GNU General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. 
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. + + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. 
+ + <one line to give the program's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <https://www.gnu.org/licenses/>. + +Also add information on how to contact you by electronic and paper mail. + + If the program does terminal interaction, make it output a short +notice like this when it starts in an interactive mode: + + <program> Copyright (C) <year> <name of author> + This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, your program's commands +might be different; for a GUI interface, you would use an "about box". + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU GPL, see +<https://www.gnu.org/licenses/>. + + The GNU General Public License does not permit incorporating your program +into proprietary programs. If your program is a subroutine library, you +may consider it more useful to permit linking proprietary applications with +the library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. But first, please read +<https://www.gnu.org/licenses/why-not-lgpl.html>.
+ +Name: libquadmath +Files: numpy.libs/libquadmath*.so +Description: dynamically linked to files compiled with gcc +Availability: https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=libquadmath +License: LGPL-2.1-or-later + + GCC Quad-Precision Math Library + Copyright (C) 2010-2019 Free Software Foundation, Inc. + Written by Francois-Xavier Coudert + + This file is part of the libquadmath library. + Libquadmath is free software; you can redistribute it and/or + modify it under the terms of the GNU Library General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + Libquadmath is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + https://www.gnu.org/licenses/old-licenses/lgpl-2.1.html + + +nvdlfw_inspect +Apache2 +https://github.com/NVIDIA/nvidia-dlfw-inspect + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. 
+ + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +nvidia-cublas +LicenseRef-NVIDIA-Proprietary +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-cuda-cupti +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-cuda-nvrtc +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-cuda-runtime +LicenseRef-NVIDIA-Proprietary +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-cudnn-cu13 +LicenseRef-NVIDIA-Proprietary +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-cudnn-frontend +NVIDIA Proprietary Software +https://github.com/nvidia/cudnn-frontend +MIT License + +Copyright (c) 2013-2022 Niels Lohmann + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +nvidia-cufft +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-cufile +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-curand +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-cusolver +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-cusparse +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-cusparselt-cu13 +NVIDIA Proprietary Software +https://developer.nvidia.com/cusparselt +LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS + +This license agreement, including exhibits attached ("Agreement”) is a legal agreement between you and NVIDIA Corporation ("NVIDIA") and governs your use of a NVIDIA software development kit (“SDK”). + +Each SDK has its own set of software and materials, but here is a description of the types of items that may be included in a SDK: source code, header files, APIs, data sets and assets (examples include images, textures, models, scenes, videos, native API input/output files), binary software, sample code, libraries, utility programs, programming code and documentation. + +This Agreement can be accepted only by an adult of legal age of majority in the country in which the SDK is used. 
+ +If you are entering into this Agreement on behalf of a company or other legal entity, you represent that you have the legal authority to bind the entity to this Agreement, in which case “you” will mean the entity you represent. + +If you don’t have the required age or authority to accept this Agreement, or if you don’t accept all the terms and conditions of this Agreement, do not download, install or use the SDK. + +You agree to use the SDK only for purposes that are permitted by (a) this Agreement, and (b) any applicable law, regulation or generally accepted practices or guidelines in the relevant jurisdictions. + +1. License. + +1.1 Grant + +Subject to the terms of this Agreement, NVIDIA hereby grants you a non-exclusive, non-transferable license, without the right to sublicense (except as expressly provided in this Agreement) to: + +(i) Install and use the SDK, + +(ii) Modify and create derivative works of sample source code delivered in the SDK, and + +(iii) Distribute those portions of the SDK that are identified in this Agreement as distributable, as incorporated in object code format into a software application that meets the distribution requirements indicated in this Agreement. + +1.2 Distribution Requirements + +These are the distribution requirements for you to exercise the distribution grant: + +(i) Your application must have material additional functionality, beyond the included portions of the SDK. + +(ii) The distributable portions of the SDK shall only be accessed by your application. + +(iii) The following notice shall be included in modifications and derivative works of sample source code distributed: “This software contains source code provided by NVIDIA Corporation.” + +(iv) Unless a developer tool is identified in this Agreement as distributable, it is delivered for your internal use only. 
+ +(v) The terms under which you distribute your application must be consistent with the terms of this Agreement, including (without limitation) terms relating to the license grant and license restrictions and protection of NVIDIA’s intellectual property rights. Additionally, you agree that you will protect the privacy, security and legal rights of your application users. + +(vi) You agree to notify NVIDIA in writing of any known or suspected distribution or use of the SDK not in compliance with the requirements of this Agreement, and to enforce the terms of your agreements with respect to distributed SDK. + +1.3 Authorized Users + +You may allow employees and contractors of your entity or of your subsidiary(ies) to access and use the SDK from your secure network to perform work on your behalf. + +If you are an academic institution you may allow users enrolled or employed by the academic institution to access and use the SDK from your secure network. + +You are responsible for the compliance with the terms of this Agreement by your authorized users. If you become aware that your authorized users didn’t follow the terms of this Agreement, you agree to take reasonable steps to resolve the non-compliance and prevent new occurrences. + +1.4 Pre-Release SDK +The SDK versions identified as alpha, beta, preview or otherwise as pre-release, may not be fully functional, may contain errors or design flaws, and may have reduced or different security, privacy, accessibility, availability, and reliability standards relative to commercial versions of NVIDIA software and materials. Use of a pre-release SDK may result in unexpected results, loss of data, project delays or other unpredictable damage or loss. +You may use a pre-release SDK at your own risk, understanding that pre-release SDKs are not intended for use in production or business-critical systems. +NVIDIA may choose not to make available a commercial version of any pre-release SDK. 
NVIDIA may also choose to abandon development and terminate the availability of a pre-release SDK at any time without liability. +1.5 Updates + +NVIDIA may, at its option, make available patches, workarounds or other updates to this SDK. Unless the updates are provided with their separate governing terms, they are deemed part of the SDK licensed to you as provided in this Agreement. + +You agree that the form and content of the SDK that NVIDIA provides may change without prior notice to you. While NVIDIA generally maintains compatibility between versions, NVIDIA may in some cases make changes that introduce incompatibilities in future versions of the SDK. + +1.6 Third Party Licenses + +The SDK may come bundled with, or otherwise include or be distributed with, third-party software licensed by a NVIDIA supplier and/or open source software provided under an open source license. Use of third-party software is subject to the third-party license terms, or in the absence of third-party terms, the terms of this Agreement. Copyright to third party software is held by the copyright holders indicated in the third-party software or license. + +1.7 Reservation of Rights + +NVIDIA reserves all rights, title and interest in and to the SDK not expressly granted to you under this Agreement. + +2. Limitations. + +The following license limitations apply to your use of the SDK: + +2.1 You may not reverse engineer, decompile or disassemble, or remove copyright or other proprietary notices from any portion of the SDK or copies of the SDK. + +2.2 Except as expressly provided in this Agreement, you may not copy, sell, rent, sublicense, transfer, distribute, modify, or create derivative works of any portion of the SDK. For clarity, you may not distribute or sublicense the SDK as a stand-alone product. + +2.3 Unless you have an agreement with NVIDIA for this purpose, you may not indicate that an application created with the SDK is sponsored or endorsed by NVIDIA. 
+ +2.4 You may not bypass, disable, or circumvent any encryption, security, digital rights management or authentication mechanism in the SDK. + +2.5 You may not use the SDK in any manner that would cause it to become subject to an open source software license. As examples, licenses that require as a condition of use, modification, and/or distribution that the SDK be (i) disclosed or distributed in source code form; (ii) licensed for the purpose of making derivative works; or (iii) redistributable at no charge. + +2.6 Unless you have an agreement with NVIDIA for this purpose, you may not use the SDK with any system or application where the use or failure of the system or application can reasonably be expected to threaten or result in personal injury, death, or catastrophic loss. Examples include use in avionics, navigation, military, medical, life support or other life critical applications. NVIDIA does not design, test or manufacture the SDK for these critical uses and NVIDIA shall not be liable to you or any third party, in whole or in part, for any claims or damages arising from such uses. + +2.7 You agree to defend, indemnify and hold harmless NVIDIA and its affiliates, and their respective employees, contractors, agents, officers and directors, from and against any and all claims, damages, obligations, losses, liabilities, costs or debt, fines, restitutions and expenses (including but not limited to attorney’s fees and costs incident to establishing the right of indemnification) arising out of or related to your use of the SDK outside of the scope of this Agreement, or not in compliance with its terms. + +3. Ownership. + +3.1 NVIDIA or its licensors hold all rights, title and interest in and to the SDK and its modifications and derivative works, including their respective intellectual property rights, subject to your rights under Section 3.2. 
This SDK may include software and materials from NVIDIA’s licensors, and these licensors are intended third party beneficiaries that may enforce this Agreement with respect to their intellectual property rights. + +3.2 You hold all rights, title and interest in and to your applications and your derivative works of the sample source code delivered in the SDK, including their respective intellectual property rights, subject to NVIDIA’s rights under section 3.1. + +3.3 You may, but don’t have to, provide to NVIDIA suggestions, feature requests or other feedback regarding the SDK, including possible enhancements or modifications to the SDK. For any feedback that you voluntarily provide, you hereby grant NVIDIA and its affiliates a perpetual, non-exclusive, worldwide, irrevocable license to use, reproduce, modify, license, sublicense (through multiple tiers of sublicensees), and distribute (through multiple tiers of distributors) it without the payment of any royalties or fees to you. NVIDIA will use feedback at its choice. NVIDIA is constantly looking for ways to improve its products, so you may send feedback to NVIDIA through the developer portal at https://developer.nvidia.com. + +4. No Warranties. + +THE SDK IS PROVIDED BY NVIDIA “AS IS” AND “WITH ALL FAULTS.” TO THE MAXIMUM EXTENT PERMITTED BY LAW, NVIDIA AND ITS AFFILIATES EXPRESSLY DISCLAIM ALL WARRANTIES OF ANY KIND OR NATURE, WHETHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, OR THE ABSENCE OF ANY DEFECTS THEREIN, WHETHER LATENT OR PATENT. NO WARRANTY IS MADE ON THE BASIS OF TRADE USAGE, COURSE OF DEALING OR COURSE OF TRADE. + +5. Limitations of Liability. 
+ +TO THE MAXIMUM EXTENT PERMITTED BY LAW, NVIDIA AND ITS AFFILIATES SHALL NOT BE LIABLE FOR ANY SPECIAL, INCIDENTAL, PUNITIVE OR CONSEQUENTIAL DAMAGES, OR ANY LOST PROFITS, LOSS OF USE, LOSS OF DATA OR LOSS OF GOODWILL, OR THE COSTS OF PROCURING SUBSTITUTE PRODUCTS, ARISING OUT OF OR IN CONNECTION WITH THIS AGREEMENT OR THE USE OR PERFORMANCE OF THE SDK, WHETHER SUCH LIABILITY ARISES FROM ANY CLAIM BASED UPON BREACH OF CONTRACT, BREACH OF WARRANTY, TORT (INCLUDING NEGLIGENCE), PRODUCT LIABILITY OR ANY OTHER CAUSE OF ACTION OR THEORY OF LIABILITY. IN NO EVENT WILL NVIDIA’S AND ITS AFFILIATES TOTAL CUMULATIVE LIABILITY UNDER OR ARISING OUT OF THIS AGREEMENT EXCEED US$10.00. THE NATURE OF THE LIABILITY OR THE NUMBER OF CLAIMS OR SUITS SHALL NOT ENLARGE OR EXTEND THIS LIMIT. + +These exclusions and limitations of liability shall apply regardless if NVIDIA or its affiliates have been advised of the possibility of such damages, and regardless of whether a remedy fails its essential purpose. These exclusions and limitations of liability form an essential basis of the bargain between the parties, and, absent any of these exclusions or limitations of liability, the provisions of this Agreement, including, without limitation, the economic terms, would be substantially different. + +6. Termination. + +6.1 This Agreement will continue to apply until terminated by either you or NVIDIA as described below. + +6.2 If you want to terminate this Agreement, you may do so by stopping to use the SDK. 
+ +6.3 NVIDIA may, at any time, terminate this Agreement if: (i) you fail to comply with any term of this Agreement and the non-compliance is not fixed within thirty (30) days following notice from NVIDIA (or immediately if you violate NVIDIA’s intellectual property rights); (ii) you commence or participate in any legal proceeding against NVIDIA with respect to the SDK; or (iii) NVIDIA decides to no longer provide the SDK in a country or, in NVIDIA’s sole discretion, the continued use of it is no longer commercially viable. + +6.4 Upon any termination of this Agreement, you agree to promptly discontinue use of the SDK and destroy all copies in your possession or control. Your prior distributions in accordance with this Agreement are not affected by the termination of this Agreement. Upon written request, you will certify in writing that you have complied with your commitments under this section. Upon any termination of this Agreement all provisions survive except for the licenses granted to you. + +7. General. + +If you wish to assign this Agreement or your rights and obligations, including by merger, consolidation, dissolution or operation of law, contact NVIDIA to ask for permission. Any attempted assignment not approved by NVIDIA in writing shall be void and of no effect. NVIDIA may assign, delegate or transfer this Agreement and its rights and obligations, and if to a non-affiliate you will be notified. + +You agree to cooperate with NVIDIA and provide reasonably requested information to verify your compliance with this Agreement. + +This Agreement will be governed in all respects by the laws of the United States and of the State of Delaware as those laws are applied to contracts entered into and performed entirely within Delaware by Delaware residents, without regard to the conflicts of laws principles. The United Nations Convention on Contracts for the International Sale of Goods is specifically disclaimed. 
You agree to all terms of this Agreement in the English language. + +The state or federal courts residing in Santa Clara County, California shall have exclusive jurisdiction over any dispute or claim arising out of this Agreement. Notwithstanding this, you agree that NVIDIA shall still be allowed to apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction. + +If any court of competent jurisdiction determines that any provision of this Agreement is illegal, invalid or unenforceable, such provision will be construed as limited to the extent necessary to be consistent with and fully enforceable under the law and the remaining provisions will remain in full force and effect. Unless otherwise specified, remedies are cumulative. + +Each party acknowledges and agrees that the other is an independent contractor in the performance of this Agreement. + +The SDK has been developed entirely at private expense and is “commercial items” consisting of “commercial computer software” and “commercial computer software documentation” provided with RESTRICTED RIGHTS. Use, duplication or disclosure by the U.S. Government or a U.S. Government subcontractor is subject to the restrictions in this Agreement pursuant to DFARS 227.7202-3(a) or as set forth in subparagraphs (b)(1) and (2) of the Commercial Computer Software - Restricted Rights clause at FAR 52.227-19, as applicable. Contractor/manufacturer is NVIDIA, 2788 San Tomas Expressway, Santa Clara, CA 95051. + +The SDK is subject to United States export laws and regulations. You agree that you will not ship, transfer or export the SDK into any country, or use the SDK in any manner, prohibited by the United States Bureau of Industry and Security or economic sanctions regulations administered by the U.S. Department of Treasury’s Office of Foreign Assets Control (OFAC), or any applicable export laws, restrictions or regulations. These laws include restrictions on destinations, end users and end use. 
By accepting this Agreement, you confirm that you are not a resident or citizen of any country currently embargoed by the U.S. and that you are not otherwise prohibited from receiving the SDK. + +Any notice delivered by NVIDIA to you under this Agreement will be delivered via mail, email or fax. You agree that any notices that NVIDIA sends you electronically will satisfy any legal communication requirements. Please direct your legal notices or other correspondence to NVIDIA Corporation, 2788 San Tomas Expressway, Santa Clara, California 95051, United States of America, Attention: Legal Department. + +This Agreement and any exhibits incorporated into this Agreement constitute the entire agreement of the parties with respect to the subject matter of this Agreement and supersede all prior negotiations or documentation exchanged between the parties relating to this subject matter. Any additional and/or conflicting terms on documents issued by you are null, void, and invalid. Any amendment or waiver under this Agreement shall be in writing and signed by representatives of both parties. + +(v. October 12, 2020) + + + + + + + + + + + + + + + +cuSPARSELt SUPPLEMENT TO SOFTWARE LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS + +The terms in this supplement govern your use of the NVIDIA cuSPARSELt SDK under the terms of your license agreement (“Agreement”) as modified by this supplement. Capitalized terms used but not defined below have the meaning assigned to them in the Agreement. + +This supplement is an exhibit to the Agreement and is incorporated as an integral part of the Agreement. In the event of conflict between the terms in this supplement and the terms in the Agreement, the terms in this supplement govern. + +1. License Scope. The SDK is licensed for you to develop applications only for use in systems with NVIDIA GPUs. + +2. Distribution. 
The following portions of the SDK are distributable under the Agreement: the runtimes files ending with .so and .h as part of your application. + +3. Licensing. If the distribution terms in this Agreement are not suitable for your organization, or for any questions regarding this Agreement, please contact NVIDIA at nvidia-compute-license-questions@nvidia.com + +(v. October 12, 2020) + + +nvidia-dali-cuda120 +Apache License 2.0 +https://github.com/NVIDIA/dali + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + + +nvidia-libnvcomp-cu12 +LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS +https://developer.nvidia.com/nvcomp +LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS + +This license agreement("Agreement") is a legal agreement between you and NVIDIA Corporation ("NVIDIA") and governs your use of the NVIDIA nvCOMP software development kit as available at NVIDIA's discretion (each, a "SDK"). 
+ +Each SDK has its own set of software and materials, but here is a description of the types of items that may be included in a SDK: source code, header files, APIs, data sets and assets (examples include images, textures, models, scenes, videos, native API input/output files), binary software, sample code, libraries, utility programs, programming code and documentation. + +This Agreement can be accepted only by an adult of legal age of majority in the country in which the SDK is used. + +If you are entering into this Agreement on behalf of a company or other legal entity, you represent that you have the legal authority to bind the entity to this Agreement, in which case "you" will mean the entity you represent. + +If you don't have the required age or authority to accept this Agreement, or if you don't accept all the terms and conditions of this Agreement, do not download, install or use the SDK. + +You agree to use the SDK only for purposes that are permitted by (a) this Agreement, and (b) any applicable law, regulation or generally accepted practices or guidelines in the relevant jurisdictions. + +1. License. + +1.1 Grant + +Subject to the terms of this Agreement, NVIDIA hereby grants you a non-exclusive, non-transferable license, without the right to sublicense (except as expressly provided in this Agreement) to: + +(i) Install and use the SDK, + +(ii) Modify and create derivative works of sample source code delivered in the SDK, and + +(ii) Distribute the binary files, files identified as samples, and headers as incorporated into a software application that meets the distribution requirements indicated in this Agreement. + +1.2 Distribution Requirements + +These are the distribution requirements for you to exercise the distribution grant: + +(i) Your application must have material additional functionality, beyond the included portions of the SDK. + +(ii) The distributable portions of the SDK shall only be accessed by your application. 
+ +(iii) The following notice shall be included in modifications and derivative works of sample source code distributed: "This software contains source code provided by NVIDIA Corporation." + +(iv) Unless a developer tool is identified in this Agreement as distributable, it is delivered for your internal use only. + +(v) The terms under which you distribute your application must be consistent with the terms of this Agreement, including (without limitation) terms relating to the license grant and license restrictions and protection of NVIDIA's intellectual property rights. Additionally, you agree that you will protect the privacy, security and legal rights of your application users. + +(vi) You agree to notify NVIDIA in writing of any known or suspected distribution or use of the SDK not in compliance with the requirements of this Agreement, and to enforce the terms of your agreements with respect to distributed SDK. + +1.3 Authorized Users + +You may allow employees and contractors of your entity or of your subsidiary(ies) to access and use the SDK from your secure network to perform work on your behalf. + +If you are an academic institution you may allow users enrolled or employed by the academic institution to access and use the SDK from your secure network. + +You are responsible for the compliance with the terms of this Agreement by your authorized users. If you become aware that your authorized users didn't follow the terms of this Agreement, you agree to take reasonable steps to resolve the non-compliance and prevent new occurrences. + +1.4 Pre-Release SDK + +The SDK versions identified as alpha, beta, preview or otherwise as pre-release, may not be fully functional, may contain errors or design flaws, and may have reduced or different security, privacy, accessibility, availability, and reliability standards relative to commercial versions of NVIDIA software and materials. 
Use of a pre-release SDK may result in unexpected results, loss of data, project delays or other unpredictable damage or loss. + +You may use a pre-release SDK at your own risk, understanding that pre-release SDKs are not intended for use in production or business-critical systems. + +NVIDIA may choose not to make available a commercial version of any pre-release SDK. NVIDIA may also choose to abandon development and terminate the availability of a pre-release SDK at any time without liability. + +1.5 Updates + +NVIDIA may, at its option, make available patches, workarounds or other updates to this SDK. Unless the updates are provided with their separate governing terms, they are deemed part of the SDK licensed to you as provided in this Agreement. + +You agree that the form and content of the SDK that NVIDIA provides may change without prior notice to you. While NVIDIA generally maintains compatibility between versions, NVIDIA may in some cases make changes that introduce incompatibilities in future versions of the SDK. + +1.6 Components Under Other Licenses. + +The SDK may include NVIDIA or third-party components with separate legal notices or terms as may be described in proprietary notices accompanying the SDK, such as components governed by open source software licenses. If and to the extent there is a conflict between the terms in this license and the license terms associated with a component, the license terms associated with the components control only to the extent necessary to resolve the conflict. + +1.7 Reservation of Rights + +NVIDIA reserves all rights, title and interest in and to the SDK not expressly granted to you under this Agreement. + +2. Limitations. + +The following license limitations apply to your use of the SDK: + +2.1 The SDK is licensed for you to develop applications only for use in systems with NVIDIA GPUs. 
+ +2.2 You may not reverse engineer, decompile or disassemble, or remove copyright or other proprietary notices from any portion of the SDK or copies of the SDK. + +2.3 Except as expressly provided in this Agreement, you may not copy, sell, rent, sublicense, transfer, distribute, modify, or create derivative works of any portion of the SDK. + +2.4 Unless you have an agreement with NVIDIA for this purpose, you may not indicate that an application created with the SDK is sponsored or endorsed by NVIDIA. + +2.5 You may not bypass, disable, or circumvent any encryption, security, digital rights management or authentication mechanism in the SDK. + +2.6 You may not use the SDK in any manner that would cause it to become subject to an open source software license. As examples, licenses that require as a condition of use, modification, and/or distribution that the SDK be (i) disclosed or distributed in source code form; (ii) licensed for the purpose of making derivative works; or (iii) redistributable at no charge. + +2.7 You acknowledge that the SDK as delivered is not tested or certified by NVIDIA for use in connection with the design, construction, maintenance, and/or operation of any system where the use or failure of such system could result in a situation that threatens the safety of human life or results in catastrophic damages (each, a "Critical Application"). Examples of Critical Applications include use in avionics, navigation, autonomous vehicle applications, ai solutions for automotive products, military, medical, life support or other life critical applications. NVIDIA shall not be liable to you or any third party, in whole or in part, for any claims or damages arising from such uses. You are solely responsible for ensuring that any product or service developed with the SDK as a whole includes sufficient features to comply with all applicable legal and regulatory standards and requirements. 
+ +2.8 You agree to defend, indemnify and hold harmless NVIDIA and its affiliates, and their respective employees, contractors, agents, officers and directors, from and against any and all claims, damages, obligations, losses, liabilities, costs or debt, fines, restitutions and expenses (including but not limited to attorney's fees and costs incident to establishing the right of indemnification) arising out of or related to products or services that use the SDK in or for Critical Applications, and for use of the SDK outside of the scope of this Agreement, or not in compliance with its terms. + +3. Ownership. + +3.1 NVIDIA or its licensors hold all rights, title and interest in and to the SDK and its modifications and derivative works, including their respective intellectual property rights, subject to your rights under Section 3.2. This SDK may include software and materials from NVIDIA's licensors, and these licensors are intended third party beneficiaries that may enforce this Agreement with respect to their intellectual property rights. + +3.2 You hold all rights, title and interest in and to your applications and your derivative works of the sample source code delivered in the SDK, including their respective intellectual property rights, subject to NVIDIA's rights under section 3.1. + +3.3 You may, but don't have to, provide to NVIDIA suggestions, feature requests or other feedback regarding the SDK, including possible enhancements or modifications to the SDK. For any feedback that you voluntarily provide, you hereby grant NVIDIA and its affiliates a perpetual, non-exclusive, worldwide, irrevocable license to use, reproduce, modify, license, sublicense (through multiple tiers of sublicensees), and distribute (through multiple tiers of distributors) it without the payment of any royalties or fees to you. NVIDIA will use feedback at its choice. + +4. No Warranties. + +THE SDK IS PROVIDED BY NVIDIA "AS IS" AND "WITH ALL FAULTS." 
TO THE MAXIMUM EXTENT PERMITTED BY LAW, NVIDIA AND ITS AFFILIATES EXPRESSLY DISCLAIM ALL WARRANTIES OF ANY KIND OR NATURE, WHETHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, OR THE ABSENCE OF ANY DEFECTS THEREIN, WHETHER LATENT OR PATENT. NO WARRANTY IS MADE ON THE BASIS OF TRADE USAGE, COURSE OF DEALING OR COURSE OF TRADE. + +5. Limitations of Liability. + +TO THE MAXIMUM EXTENT PERMITTED BY LAW, NVIDIA AND ITS AFFILIATES SHALL NOT BE LIABLE FOR ANY SPECIAL, INCIDENTAL, PUNITIVE OR CONSEQUENTIAL DAMAGES, OR ANY LOST PROFITS, LOSS OF USE, LOSS OF DATA OR LOSS OF GOODWILL, OR THE COSTS OF PROCURING SUBSTITUTE PRODUCTS, ARISING OUT OF OR IN CONNECTION WITH THIS AGREEMENT OR THE USE OR PERFORMANCE OF THE SDK, WHETHER SUCH LIABILITY ARISES FROM ANY CLAIM BASED UPON BREACH OF CONTRACT, BREACH OF WARRANTY, TORT (INCLUDING NEGLIGENCE), PRODUCT LIABILITY OR ANY OTHER CAUSE OF ACTION OR THEORY OF LIABILITY. IN NO EVENT WILL NVIDIA'S AND ITS AFFILIATES TOTAL CUMULATIVE LIABILITY UNDER OR ARISING OUT OF THIS AGREEMENT EXCEED US$10.00. THE NATURE OF THE LIABILITY OR THE NUMBER OF CLAIMS OR SUITS SHALL NOT ENLARGE OR EXTEND THIS LIMIT. + +These exclusions and limitations of liability shall apply regardless if NVIDIA or its affiliates have been advised of the possibility of such damages, and regardless of whether a remedy fails its essential purpose. These exclusions and limitations of liability form an essential basis of the bargain between the parties, and, absent any of these exclusions or limitations of liability, the provisions of this Agreement, including, without limitation, the economic terms, would be substantially different. + +6. Termination. + +6.1 This Agreement will continue to apply until terminated by either you or NVIDIA as described below. + +6.2 If you want to terminate this Agreement, you may do so by stopping to use the SDK. 
+ +6.3 NVIDIA may, at any time, terminate this Agreement if: (i) you fail to comply with any term of this Agreement and the non-compliance is not fixed within thirty (30) days following notice from NVIDIA (or immediately if you violate NVIDIA's intellectual property rights); (ii) you commence or participate in any legal proceeding against NVIDIA with respect to the SDK; or (iii) NVIDIA decides to no longer provide the SDK in a country or, in NVIDIA's sole discretion, the continued use of it is no longer commercially viable. + +6.4 Upon any termination of this Agreement, you agree to promptly discontinue use of the SDK and destroy all copies in your possession or control. Your prior distributions in accordance with this Agreement are not affected by the termination of this Agreement. Upon written request, you will certify in writing that you have complied with your commitments under this section. Upon any termination of this Agreement all provisions survive except for the licenses granted to you. + +7. General. + +If you wish to assign this Agreement or your rights and obligations, including by merger, consolidation, dissolution or operation of law, contact NVIDIA to ask for permission. Any attempted assignment not approved by NVIDIA in writing shall be void and of no effect. NVIDIA may assign, delegate or transfer this Agreement and its rights and obligations. + +You agree to cooperate with NVIDIA and provide reasonably requested information to verify your compliance with this Agreement. + +This Agreement will be governed in all respects by the laws of the United States and of the State of Delaware as those laws are applied to contracts entered into and performed entirely within Delaware, without regard to the conflicts of laws principles. The United Nations Convention on Contracts for the International Sale of Goods is specifically disclaimed. You agree to all terms of this Agreement in the English language. 
+ +The state or federal courts residing in Santa Clara County, California shall have exclusive jurisdiction over any dispute or claim arising out of this Agreement. Notwithstanding this, you agree that NVIDIA shall still be allowed to apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction. + +If any court of competent jurisdiction determines that any provision of this Agreement is illegal, invalid or unenforceable, such provision will be construed as limited to the extent necessary to be consistent with and fully enforceable under the law and the remaining provisions will remain in full force and effect. Unless otherwise specified, remedies are cumulative. + +Each party acknowledges and agrees that the other is an independent contractor in the performance of this Agreement. + +The SDK has been developed entirely at private expense and is "commercial items" consisting of "commercial computer software" and "commercial computer software documentation" provided with RESTRICTED RIGHTS. Use, duplication or disclosure by the U.S. Government or a U.S. Government subcontractor is subject to the restrictions in this Agreement pursuant to DFARS 227.7202-3(a) or as set forth in subparagraphs (b)(1) and (2) of the Commercial Computer Software - Restricted Rights clause at FAR 52.227-19, as applicable. Contractor/manufacturer is NVIDIA, 2788 San Tomas Expressway, Santa Clara, CA 95051. + +The SDK is subject to United States export laws and regulations. You agree that you will not ship, transfer or export the SDK into any country, or use the SDK in any manner, prohibited by the United States Bureau of Industry and Security or economic sanctions regulations administered by the U.S. Department of Treasury's Office of Foreign Assets Control (OFAC), or any applicable export laws, restrictions or regulations. These laws include restrictions on destinations, end users and end use. 
By accepting this Agreement, you confirm that you are not a resident or citizen of any country currently embargoed by the U.S. and that you are not otherwise prohibited from receiving the SDK. + +Any notice delivered by NVIDIA to you under this Agreement will be delivered via mail, email or fax. You agree that any notices that NVIDIA sends you electronically will satisfy any legal communication requirements. Please direct your legal notices or other correspondence to NVIDIA Corporation, 2788 San Tomas Expressway, Santa Clara, California 95051, United States of America, Attention: Legal Department. + +This Agreement constitutes the entire agreement of the parties with respect to the subject matter of this Agreement and supersedes all prior negotiations or documentation exchanged between the parties relating to this subject matter. Any additional and/or conflicting terms on documents issued by you are null, void, and invalid. Any amendment or waiver under this Agreement shall be in writing and signed by representatives of both parties. + +If the distribution terms in this Agreement are not suitable for your organization, or for any questions regarding this Agreement, please contact NVIDIA at nvidia-compute-license-questions@nvidia.com. + +(v. April 26, 2022) + + +nvidia-ml-py +BSD License +https://forums.developer.nvidia.com +UNKNOWN + +nvidia-nccl-cu13 +LicenseRef-NVIDIA-Proprietary +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-nvimgcodec-cu12 +Apache License 2.0 +https://github.com/NVIDIA/nvImageCodec +NVIDIA Software License Agreement + +IMPORTANT NOTICE – PLEASE READ AND AGREE BEFORE USING THE SOFTWARE. + +This license agreement (“Agreement”) is a legal agreement between you, whether an individual or entity (“you”) and NVIDIA Corporation (“NVIDIA”) and governs the use of the NVIDIA Image Codec and any additional software and materials provided under this Agreement (“Software”). 
+ +This Agreement can be accepted only by an adult of legal age of majority in the country in which the Software is used. + +If you don’t have the required age or authority to accept this Agreement, or if you don’t accept all the terms and conditions of this Agreement, do not use the Software. + +You agree to use the Software only for purposes that are permitted by this Agreement and any applicable law or regulation in the relevant jurisdictions. + +1. License Grant. Subject to the terms of this Agreement, NVIDIA grants you a non-exclusive, revocable, non-transferable, non-sublicensable (except as expressly granted in this Agreement), license to: + +1.1 install and use copies of the Software, + +1.2 modify and create derivative works of sample or example Software provided by NVIDIA in source code format, and + +1.3 distribute the Software in binary format as incorporated into a software application subject to the following distribution requirements: + +(a) Your application must have material additional functionality, beyond the included portions of the Software. + +(b) The distributable portions of the Software shall only be accessed by your application. + +(c) The following notice shall be included in modifications and derivative works of sample source code distributed: “This software contains source code provided by NVIDIA Corporation.” + +(d) Unless a developer tool is identified in this Agreement as distributable, it is delivered for your internal use only. + +(e) The terms under which you distribute your application must be consistent with the terms of this Agreement, including (without limitation) terms relating to the license grant and license restrictions and protection of NVIDIA’s intellectual property rights. Additionally, you agree that you will protect the privacy, security and legal rights of your application users. 
+ +(f) You agree to notify NVIDIA in writing of any known or suspected distribution or use of the Software not in compliance with the requirements of this Agreement, and to enforce the terms of your agreements with respect to distributed Software. + +2. Limitations. Your license to use the Software is restricted as follows: + +2.1 The Software is licensed for you to develop applications only for use in systems with NVIDIA GPUs and NVIDIA CPUs (if and when available). + +2.2 You may not reverse engineer, decompile or disassemble the Software components provided in binary form, nor attempt in any other manner to obtain source code of the Software. + +2.3 You may not change or remove copyright or other proprietary notices in the Software. + +2.4 Except as expressly granted in this Agreement, you may not copy, sell, rent, sublicense, transfer, distribute, modify or create derivative works of the Software, or make its functionality available to others. + +2.5 You may not bypass, disable or circumvent any technical limitation, encryption, security, digital rights management or authentication mechanism in the Software. + +2.6 You may not use the Software in any manner that would cause it to become subject to an open source software license; subject to the terms in the “Components Under Other Licenses” section below . + +2.7 You may not use the Software for the purpose of developing competing products or technologies or assist a third party in such activities. + +2.8 You may not indicate that a product or service developed with the Software is sponsored or endorsed by NVIDIA. + +2.9 You may not replace any NVIDIA software components in the Software that are governed by this Agreement with other software that implements NVIDIA APIs. + +2.10 You may not reverse engineer, decompile or disassemble any portion of the output generated using Software elements for the purpose of translating such output artifacts to target a non-NVIDIA platform. 
+ +2.11 You may not distribute or disclose to third parties the output of the Software where the output reveals functionality or performance data pertinent to NVIDIA hardware or software products, results of benchmarking, competitive analysis, or regression or performance data relating to the Software or NVIDIA GPUs without the prior written permission from NVIDIA. + +2.12 You acknowledge that the Software provided under this Agreement is not designed or tested by NVIDIA for use in any system or application where the use or failure of such system or application developed with NVIDIA’s Software could result in injury, death or catastrophic damage (each, a “Mission Critical Application”). Examples of Mission Critical Applications include use in avionics, navigation, autonomous vehicle applications, AI solutions for automotive products, military, medical, life support or other mission-critical or life-critical applications. NVIDIA will not be liable to you or any third party, in whole or in part, for any claims or damages arising from these uses. You are solely responsible for ensuring that systems and applications developed with the Software include sufficient safety and redundancy features and comply with all applicable legal and regulatory standards and requirements. + +2.13 You agree to defend, indemnify and hold harmless NVIDIA and its affiliates, and their respective employees, contractors, agents, officers and directors, from and against any and all claims, damages, obligations, losses, liabilities, costs or debt, fines, restitutions and expenses (including but not limited to attorney’s fees and costs incident to establishing the right of indemnification) arising out of or related to (i) products or services that have been developed or deployed with or use the Software, or claims that they violate laws, or infringe, violate, or misappropriate any third party right; or (ii) a violation of the terms and conditions of this Agreement. + +3. Authorized Users. 
You may allow employees and contractors of your entity or of your subsidiary(ies) to access and use the Software from your secure network to perform the work authorized by this Agreement on your behalf. If you are an academic institution, you may allow users enrolled or employed by the academic institution to access and use the Software as authorized by this Agreement from your secure network. You are responsible for the compliance with the terms of this Agreement by your authorized users. Any act or omission that if committed by you would constitute a breach of this Agreement will be deemed to constitute a breach of this Agreement if committed by your authorized users. + +4. Pre-Release Versions. Software versions or specific features identified as alpha, beta, preview, early access or otherwise as pre-release may not be fully functional, may contain errors or design flaws, and may have reduced or different security, privacy, availability and reliability standards relative to commercial versions of NVIDIA offerings. You may use pre-release Software at your own risk, understanding that such versions are not intended for use in production or business-critical systems. NVIDIA may choose not to make available a commercial version of any pre-release Software. NVIDIA may also choose to abandon development and terminate the availability of pre-release Software at any time without liability. + +5. Updates. NVIDIA may, at its option, make available patches, workarounds or other updates to the Software. Unless the updates are provided with their separate governing terms, they are deemed part of the Software licensed to you as provided in this Agreement. + +6. Components Under Other Licenses. The Software may include or be distributed with components provided with separate legal notices or terms that accompany the components, such as open source software licenses and other license. 
The components are subject to the applicable other licenses, including any proprietary notices, disclaimers, requirements and extended use rights; except that this Agreement will prevail regarding the use of third-party open source software, unless a third-party open source software license requires its license terms to prevail. Open source software license means any software, data or documentation subject to any license identified as an open source license by the Open Source Initiative (http://opensource.org), Free Software Foundation (http://www.fsf.org) or other similar open source organization or listed by the Software Package Data Exchange (SPDX) Workgroup under the Linux Foundation (http://www.spdx.org). + +7. Termination . This Agreement will automatically terminate without notice from NVIDIA if you fail to comply with any of the terms in this Agreement or if you commence or participate in any legal proceeding against NVIDIA with respect to the Software. Additionally, NVIDIA may terminate this Agreement with prior written notice to you if, in NVIDIA’s sole discretion, the continued use of the Software is no longer commercially viable or creates liabilities for NVIDIA. You agree to cooperate with NVIDIA and provide reasonably requested information to verify your compliance with this Agreement. Upon any termination, you must stop using and destroy all copies of the Software. Upon written request, you will certify in writing that you have complied with your commitments under this section. All provisions will survive termination, except for the licenses granted to you. + +8. Ownership. + +8.1 NVIDIA Ownership. The Software, including all intellectual property rights, is and will remain the sole and exclusive property of NVIDIA or its licensors. 
Except as expressly granted in this Agreement, (i) NVIDIA reserves all rights, interests and remedies in connection with the Software and (ii) no other license or right is granted to you by implication, estoppel or otherwise. + +8.2 Your Ownership. Subject to the rights of NVIDIA and its suppliers in the Software, you hold all rights, title and interest in and to your services, applications and derivative works of samples or examples you develop as permitted in this Agreement including their respective intellectual property rights. + +8.3 Non-Assert. You agree that you will not, and will not assist or enable any other party to, assert or threaten to assert any intellectual property rights against NVIDIA or its affiliates with respect to new software samples or examples that NVIDIA or its affiliates may develop and make available in the future. + +9. Feedback. You may, but are not obligated to, provide suggestions, requests, fixes, modifications, enhancements or other feedback regarding or in connection with your use of the Software (collectively, “Feedback”). Feedback, even if designated as confidential by you, will not create any confidentiality obligation for NVIDIA or its affiliates. If you provide Feedback, you hereby grant NVIDIA, its affiliates and its designees a non-exclusive, perpetual, irrevocable, sublicensable, worldwide, royalty-free, fully paid-up and transferable license, under your intellectual property rights, to publicly perform, publicly display, reproduce, use, make, have made, sell, offer for sale, distribute (through multiple tiers of distribution), import, create derivative works of and otherwise commercialize and exploit the Feedback at NVIDIA’s discretion. 
You will not give Feedback (i) that you have reason to believe is subject to any restriction that impairs the exercise of the grant stated in this section, such as third-party intellectual property rights or (ii) subject to license terms which seek to require any product incorporating or developed using such Feedback, or other intellectual property of NVIDIA or its affiliates, to be licensed to or otherwise shared with any third party. + +10. Disclaimer of Warranties. THE SOFTWARE IS PROVIDED BY NVIDIA AS-IS AND WITH ALL FAULTS. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, NVIDIA DISCLAIMS ALL WARRANTIES AND REPRESENTATIONS OF ANY KIND, WHETHER EXPRESS, IMPLIED OR STATUTORY, RELATING TO OR ARISING UNDER THIS AGREEMENT, INCLUDING, WITHOUT LIMITATION, THE WARRANTIES OF TITLE, NONINFRINGEMENT, MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, USAGE OF TRADE AND COURSE OF DEALING. WITHOUT LIMITING THE FOREGOING, NVIDIA DOES NOT WARRANT THAT THE SOFTWARE WILL MEET YOUR REQUIREMENTS; THAT ANY DEFECTS OR ERRORS WILL BE CORRECTED; THAT ANY CERTAIN CONTENT WILL BE AVAILABLE; OR THAT THE SOFTWARE IS FREE OF VIRUSES OR OTHER HARMFUL COMPONENTS. NO INFORMATION OR ADVICE GIVEN BY NVIDIA WILL IN ANY WAY INCREASE THE SCOPE OF ANY WARRANTY EXPRESSLY PROVIDED IN THIS AGREEMENT. NVIDIA does not warrant or assume responsibility for the accuracy or completeness of any third-party information, text, graphics or links contained in the Software. + +11. Limitations of Liability. + +11.1 DISCLAIMERS. 
TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT WILL NVIDIA BE LIABLE FOR ANY (I) INDIRECT, PUNITIVE, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES, OR (II) DAMAGES FOR THE (A) COST OF PROCURING SUBSTITUTE GOODS OR (B) LOSS OF PROFITS, REVENUES, USE, DATA OR GOODWILL ARISING OUT OF OR RELATED TO THIS AGREEMENT, WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), STRICT LIABILITY, OR OTHERWISE, AND EVEN IF NVIDIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES AND EVEN IF A PARTY’S REMEDIES FAIL THEIR ESSENTIAL PURPOSE. + +11.2 DAMAGES CAP. ADDITIONALLY, TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, NVIDIA’S TOTAL CUMULATIVE AGGREGATE LIABILITY FOR ANY AND ALL LIABILITIES, OBLIGATIONS OR CLAIMS ARISING OUT OF OR RELATED TO THIS AGREEMENT WILL NOT EXCEED FIVE U.S. DOLLARS (US$5). + +12. Governing Law and Jurisdiction. This Agreement will be governed in all respects by the laws of the United States and the laws of the State of Delaware, without regard to conflict of laws principles or the United Nations Convention on Contracts for the International Sale of Goods. The state and federal courts residing in Santa Clara County, California will have exclusive jurisdiction over any dispute or claim arising out of or related to this Agreement, and the parties irrevocably consent to personal jurisdiction and venue in those courts; except that either party may apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction. + +13. General. + +13.1 No Assignment. NVIDIA may assign, delegate or transfer its rights or obligations under this Agreement by any means or operation of law. You may not, without NVIDIA’s prior written consent, assign, delegate or transfer any of your rights or obligations under this Agreement by any means or operation of law, and any attempt to do so is null and void. + +13.2 No Waiver. 
No waiver of any term of the Agreement will be deemed a further or continuing waiver of such term or any other term, and NVIDIA’s failure to assert any right or provision under the Agreement will not constitute a waiver of such right or provision. + +13.3 Trade and Compliance. You agree to comply with all applicable export, import, trade and economic sanctions laws and regulations, including U.S. Export Administration Regulations and Office of Foreign Assets Control regulations. You confirm that you will not export or reexport any products or technology, directly or indirectly, without first obtaining any required license or other approval from appropriate authorities, (i) to any countries that are subject to any U.S. or local export restrictions (currently including, but not necessarily limited to, Cuba, Iran, North Korea, Syria, the Region of Crimea, Donetsk People’s Republic Region and Luhansk People’s Republic Region); (ii) to any end user who you know or have reason to know will utilize them in the design, development or production of nuclear, chemical or biological weapons, missiles, rocket systems, unmanned air vehicles, or any weapons of mass destruction; (iii) to any end-user who has been prohibited from participating in the U.S. or local export transactions by any governing authority; or (iv) to any known military or military-intelligence end-user or for any known military or military-intelligence end-use in accordance with U.S. trade compliance laws and regulations. Use of the Software under this Agreement must be consistent with NVIDIA’s HumanRightsPolicy.pdf (nvidia.com). + +13.4 Government Rights. The Software, documentation and technology (“Protected Items”) are “Commercial products” as this term is defined at 48 C.F.R. 2.101, consisting of “commercial computer software” and “commercial computer software documentation” as such terms are used in, respectively, 48 C.F.R. 12.212 and 48 C.F.R. 227.7202 & 252.227-7014(a)(1). 
Before any Protected Items are supplied to the U.S. Government, you will (i) inform the U.S. Government in writing that the Protected Items are and must be treated as commercial computer software and commercial computer software documentation developed at private expense; (ii) inform the U.S. Government that the Protected Items are provided subject to the terms of the Agreement; and (iii) mark the Protected Items as commercial computer software and commercial computer software documentation developed at private expense. In no event will you permit the U.S. Government to acquire rights in Protected Items beyond those specified in 48 C.F.R. 52.227-19(b)(1)-(2) or 252.227-7013(c) except as expressly approved by NVIDIA in writing. + +13.5 Notices. Please direct your legal notices or other correspondence to NVIDIA Corporation, 2788 San Tomas Expressway, Santa Clara, California 95051, United States of America, Attention: Legal Department, with a copy emailed to legalnotices@nvidia.com. If NVIDIA needs to contact you about the Software, you consent to receive the notices by email and agree that such notices will satisfy any legal communication requirements. + +13.6 Force Majeure. Neither party will be liable during any period where an event or circumstance prevents or delays that party from performing its obligations under this Agreement and that event or circumstance: (i) is not within the reasonable control of that party and is not the result of that party’s negligence, and (ii) cannot be overcome or avoided by that party using reasonably diligent efforts. + +13.7 Severability and Amendment. If a court of competent jurisdiction rules that a provision of this Agreement is unenforceable, that provision will be deemed modified to the extent necessary to make it enforceable and the remainder of this Agreement will continue in full force and effect. Any amendment to this Agreement must be in writing and signed by authorized representatives of both parties. 
+ +13.8 Construction. The headings in the Agreement are included solely for convenience and are not intended to affect the meaning or interpretation of the Agreement. As required by the context of the Agreement, the singular of a term includes the plural and vice versa. + +13.9 Entire Agreement. Regarding the subject matter of this Agreement, the parties agree that (i) this Agreement constitutes the entire and exclusive agreement between the parties and supersedes all prior and contemporaneous communications and (ii) any additional or different terms or conditions, whether contained in purchase orders, order acknowledgments, invoices or otherwise, will not be binding and are null and void. + + + +(v. November 28, 2023) + + + + + +nvidia-nvjitlink +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-nvjpeg-cu12 +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-nvjpeg2k-cu12 +Other/Proprietary License +https://developer.nvidia.com/nvjpeg +LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS + +This license agreement, including exhibits attached ("Agreement”) is a legal agreement between you and NVIDIA Corporation ("NVIDIA") and governs your use of a NVIDIA software development kit (“SDK”). + +Each SDK has its own set of software and materials, but here is a description of the types of items that may be included in a SDK: source code, header files, APIs, data sets and assets (examples include images, textures, models, scenes, videos, native API input/output files), binary software, sample code, libraries, utility programs, programming code and documentation. + +This Agreement can be accepted only by an adult of legal age of majority in the country in which the SDK is used. 
+ +If you are entering into this Agreement on behalf of a company or other legal entity, you represent that you have the legal authority to bind the entity to this Agreement, in which case “you” will mean the entity you represent. + +If you don’t have the required age or authority to accept this Agreement, or if you don’t accept all the terms and conditions of this Agreement, do not download, install or use the SDK. + +You agree to use the SDK only for purposes that are permitted by (a) this Agreement, and (b) any applicable law, regulation or generally accepted practices or guidelines in the relevant jurisdictions. + +1. License. + +1.1 Grant + +Subject to the terms of this Agreement, NVIDIA hereby grants you a non-exclusive, non-transferable license, without the right to sublicense (except as expressly provided in this Agreement) to: + +(i) Install and use the SDK, + +(ii) Modify and create derivative works of sample source code delivered in the SDK, and + +(iii) Distribute those portions of the SDK that are identified in this Agreement as distributable, as incorporated in object code format into a software application that meets the distribution requirements indicated in this Agreement. + +1.2 Distribution Requirements + +These are the distribution requirements for you to exercise the distribution grant: + +(i) Your application must have material additional functionality, beyond the included portions of the SDK. + +(ii) The distributable portions of the SDK shall only be accessed by your application. + +(iii) The following notice shall be included in modifications and derivative works of sample source code distributed: “This software contains source code provided by NVIDIA Corporation.” + +(iv) Unless a developer tool is identified in this Agreement as distributable, it is delivered for your internal use only. 
+ +(v) The terms under which you distribute your application must be consistent with the terms of this Agreement, including (without limitation) terms relating to the license grant and license restrictions and protection of NVIDIA’s intellectual property rights. Additionally, you agree that you will protect the privacy, security and legal rights of your application users. + +(vi) You agree to notify NVIDIA in writing of any known or suspected distribution or use of the SDK not in compliance with the requirements of this Agreement, and to enforce the terms of your agreements with respect to distributed SDK. + +1.3 Authorized Users + +You may allow employees and contractors of your entity or of your subsidiary(ies) to access and use the SDK from your secure network to perform work on your behalf. + +If you are an academic institution you may allow users enrolled or employed by the academic institution to access and use the SDK from your secure network. + +You are responsible for the compliance with the terms of this Agreement by your authorized users. If you become aware that your authorized users didn’t follow the terms of this Agreement, you agree to take reasonable steps to resolve the non-compliance and prevent new occurrences. + +1.4 Pre-Release SDK +The SDK versions identified as alpha, beta, preview or otherwise as pre-release, may not be fully functional, may contain errors or design flaws, and may have reduced or different security, privacy, accessibility, availability, and reliability standards relative to commercial versions of NVIDIA software and materials. Use of a pre-release SDK may result in unexpected results, loss of data, project delays or other unpredictable damage or loss. +You may use a pre-release SDK at your own risk, understanding that pre-release SDKs are not intended for use in production or business-critical systems. +NVIDIA may choose not to make available a commercial version of any pre-release SDK. 
NVIDIA may also choose to abandon development and terminate the availability of a pre-release SDK at any time without liability. +1.5 Updates + +NVIDIA may, at its option, make available patches, workarounds or other updates to this SDK. Unless the updates are provided with their separate governing terms, they are deemed part of the SDK licensed to you as provided in this Agreement. + +You agree that the form and content of the SDK that NVIDIA provides may change without prior notice to you. While NVIDIA generally maintains compatibility between versions, NVIDIA may in some cases make changes that introduce incompatibilities in future versions of the SDK. + +1.6 Third Party Licenses + +The SDK may come bundled with, or otherwise include or be distributed with, third party software licensed by a NVIDIA supplier and/or open source software provided under an open source license. Use of third party software is subject to the third-party license terms, or in the absence of third party terms, the terms of this Agreement. Copyright to third party software is held by the copyright holders indicated in the third-party software or license. + +1.7 Reservation of Rights + +NVIDIA reserves all rights, title and interest in and to the SDK not expressly granted to you under this Agreement. + +2. Limitations. + +The following license limitations apply to your use of the SDK: + +2.1 You may not reverse engineer, decompile or disassemble, or remove copyright or other proprietary notices from any portion of the SDK or copies of the SDK. + +2.2 Except as expressly provided in this Agreement, you may not copy, sell, rent, sublicense, transfer, distribute, modify, or create derivative works of any portion of the SDK. + +2.3 Unless you have an agreement with NVIDIA for this purpose, you may not indicate that an application created with the SDK is sponsored or endorsed by NVIDIA. 
+ +2.4 You may not bypass, disable, or circumvent any encryption, security, digital rights management or authentication mechanism in the SDK. + +2.5 You may not use the SDK in any manner that would cause it to become subject to an open source software license. As examples, licenses that require as a condition of use, modification, and/or distribution that the SDK be (i) disclosed or distributed in source code form; (ii) licensed for the purpose of making derivative works; or (iii) redistributable at no charge. + +2.6 Unless you have an agreement with NVIDIA for this purpose, you may not use the SDK with any system or application where the use or failure of the system or application can reasonably be expected to threaten or result in personal injury, death, or catastrophic loss. Examples include use in avionics, navigation, military, medical, life support or other life critical applications. NVIDIA does not design, test or manufacture the SDK for these critical uses and NVIDIA shall not be liable to you or any third party, in whole or in part, for any claims or damages arising from such uses. + +2.7 You agree to defend, indemnify and hold harmless NVIDIA and its affiliates, and their respective employees, contractors, agents, officers and directors, from and against any and all claims, damages, obligations, losses, liabilities, costs or debt, fines, restitutions and expenses (including but not limited to attorney’s fees and costs incident to establishing the right of indemnification) arising out of or related to your use of the SDK outside of the scope of this Agreement, or not in compliance with its terms. + +3. Ownership. + +3.1 NVIDIA or its licensors hold all rights, title and interest in and to the SDK and its modifications and derivative works, including their respective intellectual property rights, subject to your rights under Section 3.2. 
This SDK may include software and materials from NVIDIA’s licensors, and these licensors are intended third party beneficiaries that may enforce this Agreement with respect to their intellectual property rights. + +3.2 You hold all rights, title and interest in and to your applications and your derivative works of the sample source code delivered in the SDK, including their respective intellectual property rights, subject to NVIDIA’s rights under section 3.1. + +3.3 You may, but don’t have to, provide to NVIDIA suggestions, feature requests or other feedback regarding the SDK, including possible enhancements or modifications to the SDK. For any feedback that you voluntarily provide, you hereby grant NVIDIA and its affiliates a perpetual, non-exclusive, worldwide, irrevocable license to use, reproduce, modify, license, sublicense (through multiple tiers of sublicensees), and distribute (through multiple tiers of distributors) it without the payment of any royalties or fees to you. NVIDIA will use feedback at its choice. NVIDIA is constantly looking for ways to improve its products, so you may send feedback to NVIDIA through the developer portal at https://developer.nvidia.com. + +4. No Warranties. + +THE SDK IS PROVIDED BY NVIDIA “AS IS” AND “WITH ALL FAULTS.” TO THE MAXIMUM EXTENT PERMITTED BY LAW, NVIDIA AND ITS AFFILIATES EXPRESSLY DISCLAIM ALL WARRANTIES OF ANY KIND OR NATURE, WHETHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, OR THE ABSENCE OF ANY DEFECTS THEREIN, WHETHER LATENT OR PATENT. NO WARRANTY IS MADE ON THE BASIS OF TRADE USAGE, COURSE OF DEALING OR COURSE OF TRADE. + +5. Limitations of Liability. 
+ +TO THE MAXIMUM EXTENT PERMITTED BY LAW, NVIDIA AND ITS AFFILIATES SHALL NOT BE LIABLE FOR ANY SPECIAL, INCIDENTAL, PUNITIVE OR CONSEQUENTIAL DAMAGES, OR ANY LOST PROFITS, LOSS OF USE, LOSS OF DATA OR LOSS OF GOODWILL, OR THE COSTS OF PROCURING SUBSTITUTE PRODUCTS, ARISING OUT OF OR IN CONNECTION WITH THIS AGREEMENT OR THE USE OR PERFORMANCE OF THE SDK, WHETHER SUCH LIABILITY ARISES FROM ANY CLAIM BASED UPON BREACH OF CONTRACT, BREACH OF WARRANTY, TORT (INCLUDING NEGLIGENCE), PRODUCT LIABILITY OR ANY OTHER CAUSE OF ACTION OR THEORY OF LIABILITY. IN NO EVENT WILL NVIDIA’S AND ITS AFFILIATES TOTAL CUMULATIVE LIABILITY UNDER OR ARISING OUT OF THIS AGREEMENT EXCEED US$10.00. THE NATURE OF THE LIABILITY OR THE NUMBER OF CLAIMS OR SUITS SHALL NOT ENLARGE OR EXTEND THIS LIMIT. + +These exclusions and limitations of liability shall apply regardless if NVIDIA or its affiliates have been advised of the possibility of such damages, and regardless of whether a remedy fails its essential purpose. These exclusions and limitations of liability form an essential basis of the bargain between the parties, and, absent any of these exclusions or limitations of liability, the provisions of this Agreement, including, without limitation, the economic terms, would be substantially different. + +6. Termination. + +6.1 This Agreement will continue to apply until terminated by either you or NVIDIA as described below. + +6.2 If you want to terminate this Agreement, you may do so by stopping to use the SDK. 
+ +6.3 NVIDIA may, at any time, terminate this Agreement if: (i) you fail to comply with any term of this Agreement and the non-compliance is not fixed within thirty (30) days following notice from NVIDIA (or immediately if you violate NVIDIA’s intellectual property rights); (ii) you commence or participate in any legal proceeding against NVIDIA with respect to the SDK; or (iii) NVIDIA decides to no longer provide the SDK in a country or, in NVIDIA’s sole discretion, the continued use of it is no longer commercially viable. + +6.4 Upon any termination of this Agreement, you agree to promptly discontinue use of the SDK and destroy all copies in your possession or control. Your prior distributions in accordance with this Agreement are not affected by the termination of this Agreement. Upon written request, you will certify in writing that you have complied with your commitments under this section. Upon any termination of this Agreement all provisions survive except for the licenses granted to you. + +7. General. + +If you wish to assign this Agreement or your rights and obligations, including by merger, consolidation, dissolution or operation of law, contact NVIDIA to ask for permission. Any attempted assignment not approved by NVIDIA in writing shall be void and of no effect. NVIDIA may assign, delegate or transfer this Agreement and its rights and obligations, and if to a non-affiliate you will be notified. + +You agree to cooperate with NVIDIA and provide reasonably requested information to verify your compliance with this Agreement. + +This Agreement will be governed in all respects by the laws of the United States and of the State of Delaware as those laws are applied to contracts entered into and performed entirely within Delaware by Delaware residents, without regard to the conflicts of laws principles. The United Nations Convention on Contracts for the International Sale of Goods is specifically disclaimed. 
You agree to all terms of this Agreement in the English language. + +The state or federal courts residing in Santa Clara County, California shall have exclusive jurisdiction over any dispute or claim arising out of this Agreement. Notwithstanding this, you agree that NVIDIA shall still be allowed to apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction. + +If any court of competent jurisdiction determines that any provision of this Agreement is illegal, invalid or unenforceable, such provision will be construed as limited to the extent necessary to be consistent with and fully enforceable under the law and the remaining provisions will remain in full force and effect. Unless otherwise specified, remedies are cumulative. + +Each party acknowledges and agrees that the other is an independent contractor in the performance of this Agreement. + +The SDK has been developed entirely at private expense and is “commercial items” consisting of “commercial computer software” and “commercial computer software documentation” provided with RESTRICTED RIGHTS. Use, duplication or disclosure by the U.S. Government or a U.S. Government subcontractor is subject to the restrictions in this Agreement pursuant to DFARS 227.7202-3(a) or as set forth in subparagraphs (b)(1) and (2) of the Commercial Computer Software - Restricted Rights clause at FAR 52.227-19, as applicable. Contractor/manufacturer is NVIDIA, 2788 San Tomas Expressway, Santa Clara, CA 95051. + +The SDK is subject to United States export laws and regulations. You agree that you will not ship, transfer or export the SDK into any country, or use the SDK in any manner, prohibited by the United States Bureau of Industry and Security or economic sanctions regulations administered by the U.S. Department of Treasury’s Office of Foreign Assets Control (OFAC), or any applicable export laws, restrictions or regulations. These laws include restrictions on destinations, end users and end use. 
By accepting this Agreement, you confirm that you are not a resident or citizen of any country currently embargoed by the U.S. and that you are not otherwise prohibited from receiving the SDK. + +Any notice delivered by NVIDIA to you under this Agreement will be delivered via mail, email or fax. You agree that any notices that NVIDIA sends you electronically will satisfy any legal communication requirements. Please direct your legal notices or other correspondence to NVIDIA Corporation, 2788 San Tomas Expressway, Santa Clara, California 95051, United States of America, Attention: Legal Department. + +This Agreement and any exhibits incorporated into this Agreement constitute the entire agreement of the parties with respect to the subject matter of this Agreement and supersede all prior negotiations or documentation exchanged between the parties relating to this SDK license. Any additional and/or conflicting terms on documents issued by you are null, void, and invalid. Any amendment or waiver under this Agreement shall be in writing and signed by representatives of both parties. + +(v. January 28, 2020) + + + + + + + + + + + + + + + +nvJPEG2K SUPPLEMENT TO SOFTWARE LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS + +The terms in this supplement govern your use of the NVIDIA nvJPEG2K SDK under the terms of your license agreement (“Agreement”) as modified by this supplement. Capitalized terms used but not defined below have the meaning assigned to them in the Agreement. + +This supplement is an exhibit to the Agreement and is incorporated as an integral part of the Agreement. In the event of conflict between the terms in this supplement and the terms in the Agreement, the terms in this supplement govern. + +4.1 License Scope. The SDK is licensed for you to develop applications only for use in systems with NVIDIA GPUs. + +2. Distribution. 
The following portions of the SDK are distributable under the Agreement: the runtime files .so and .h, nvjpeg2k_0.dll, nvjpeg2k.lib and libnvjpeg2k_static.a. + +3. Licensing. If the distribution terms in this Agreement are not suitable for your organization, or for any questions regarding this Agreement, please contact NVIDIA at nvidia-compute-license-questions@nvidia.com. + (v. November 14, 2020) + + +OpenJPEG LICENSE + +/* + * The copyright in this software is being made available under the 2-clauses + * BSD License, included below. This software may be subject to other third + * party and contributor rights, including patent rights, and no such rights + * are granted under this license. + * + * Copyright (c) 2002-2014, Universite catholique de Louvain (UCL), Belgium + * Copyright (c) 2002-2014, Professor Benoit Macq + * Copyright (c) 2003-2014, Antonin Descampe + * Copyright (c) 2003-2009, Francois-Olivier Devaux + * Copyright (c) 2005, Herve Drolon, FreeImage Team + * Copyright (c) 2002-2003, Yannick Verschueren + * Copyright (c) 2001-2003, David Janssens + * Copyright (c) 2011-2012, Centre National d'Etudes Spatiales (CNES), France + * Copyright (c) 2012, CS Systemes d'Information, France + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS `AS IS' + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + */ + + + + + + +nvidia-nvshmem-cu13 +LicenseRef-NVIDIA-Proprietary +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvidia-nvtiff-cu12 +Other/Proprietary License +https://developer.nvidia.com/nvtiff +NVIDIA nvTIFF +Software License Agreement | NVIDIA Docs + +Table of Contents +LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS iii +Chapter 1. License. 1 +1.1. Grant 1 +1.2. Distribution Requirements 1 +1.3. Authorized Users 2 +1.4. Pre-Release SDK 2 +1.5. Updates 2 +1.6. Third Party Licenses 2 +1.7. Reservation of Rights 3 +Chapter 2. Limitations. 4 +Chapter 3. Ownership. 5 +Chapter 4. No Warranties. 6 +Chapter 5. Limitations of Liability. 7 +Chapter 6. Termination. 8 +Chapter 7. General. 9 + + + +LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS + + + +This license agreement, including exhibits attached ("Agreement") is a legal agreement between you and NVIDIA Corporation ("NVIDIA") and governs your use of a NVIDIA software development kit ("SDK"). 
+Each SDK has its own set of software and materials, but here is a description of the types of items that may be included in a SDK: source code, header files, APIs, data sets and assets (examples include images, textures, models, scenes, videos, native API input/output files), binary software, sample code, libraries, utility programs, programming code and documentation. +This Agreement can be accepted only by an adult of legal age of majority in the country in which the SDK is used. +If you are entering into this Agreement on behalf of a company or other legal entity, you represent that you have the legal authority to bind the entity to this Agreement, in which case "you" will mean the entity you represent. +If you don't have the required age or authority to accept this Agreement, or if you don't accept all the terms and conditions of this Agreement, do not download, install or use the SDK. +You agree to use the SDK only for purposes that are permitted by (a) this Agreement, and (b) any applicable law, regulation or generally accepted practices or guidelines in the relevant jurisdictions. + +Chapter 1. License. + +1.1. Grant +Subject to the terms of this Agreement, NVIDIA hereby grants you a non-exclusive, non- transferable license, without the right to sublicense (except as expressly provided in this Agreement) to: +1. Install and use the SDK, +2. Modify and create derivative works of sample source code delivered in the SDK, and +3. Distribute those portions of the SDK that are identified in this Agreement as distributable, as incorporated in object code format into a software application that meets the distribution requirements indicated in this Agreement. + +1.2. Distribution Requirements +These are the distribution requirements for you to exercise the distribution grant: +1. Your application must have material additional functionality, beyond the included portions of the SDK. +2. The distributable portions of the SDK shall only be accessed by your application. 
+3. The following notice shall be included in modifications and derivative works of sample source code distributed: "This software contains source code provided by NVIDIA Corporation." +4. Unless a developer tool is identified in this Agreement as distributable, it is delivered for your internal use only. +5. The terms under which you distribute your application must be consistent with the terms of this Agreement, including (without limitation) terms relating to the license grant and license restrictions and protection of NVIDIA's intellectual property rights. Additionally, you agree that you will protect the privacy, security and legal rights of your application users. +6. You agree to notify NVIDIA in writing of any known or suspected distribution or use of the SDK not in compliance with the requirements of this Agreement, and to enforce the terms of your agreements with respect to distributed SDK. + + +1.3. Authorized Users +You may allow employees and contractors of your entity or of your subsidiary(ies) to access and use the SDK from your secure network to perform work on your behalf. +If you are an academic institution you may allow users enrolled or employed by the academic institution to access and use the SDK from your secure network. +You are responsible for the compliance with the terms of this Agreement by your authorized users. If you become aware that your authorized users didn't follow the terms of this Agreement, you agree to take reasonable steps to resolve the non-compliance and prevent new occurrences. + +1.4. Pre-Release SDK +The SDK versions identified as alpha, beta, preview or otherwise as pre-release, may not be fully functional, may contain errors or design flaws, and may have reduced or different security, privacy, accessibility, availability, and reliability standards relative to commercial +versions of NVIDIA software and materials. 
Use of a pre-release SDK may result in unexpected results, loss of data, project delays or other unpredictable damage or loss. +You may use a pre-release SDK at your own risk, understanding that pre-release SDKs are not intended for use in production or business-critical systems. +NVIDIA may choose not to make available a commercial version of any pre-release SDK. NVIDIA may also choose to abandon development and terminate the availability of a pre-release SDK at any time without liability. + +1.5. Updates +NVIDIA may, at its option, make available patches, workarounds or other updates to this SDK. Unless the updates are provided with their separate governing terms, they are deemed part of the SDK licensed to you as provided in this Agreement. +You agree that the form and content of the SDK that NVIDIA provides may change without prior notice to you. While NVIDIA generally maintains compatibility between versions, NVIDIA may in some cases make changes that introduce incompatibilities in future versions of the SDK. + +1.6. Components Under Other Licenses +The SDK may come bundled with, or otherwise include or be distributed with, NVIDIA or third party software licensed with separate legal notices or terms as may be described in proprietary notices accompanying the SDK. If and to the extent there is a conflict between the terms in this Agreement and the license terms associated with the component, the license terms associated with the components control only to the extent necessary to resolve the conflict. + +1.7. Reservation of Rights +NVIDIA reserves all rights, title and interest in and to the SDK not expressly granted to you under this Agreement. + + +Chapter 2. Limitations. + +The following license limitations apply to your use of the SDK: +2.1 You may not reverse engineer, decompile or disassemble, or remove copyright or other proprietary notices from any portion of the SDK or copies of the SDK.
+2.2 Except as expressly provided in this Agreement, you may not copy, sell, rent, sublicense, transfer, distribute, modify, or create derivative works of any portion of the SDK. +2.3 Unless you have an agreement with NVIDIA for this purpose, you may not indicate that an application created with the SDK is sponsored or endorsed by NVIDIA. +2.4 You may not bypass, disable, or circumvent any encryption, security, digital rights management or authentication mechanism in the SDK. +2.5 You may not use the SDK in any manner that would cause it to become subject to an open source software license. As examples, licenses that require as a condition of use, modification, and/or distribution that the SDK be (i) disclosed or distributed in source code form; (ii) licensed for the purpose of making derivative works; or (iii) redistributable at no charge. +2.6 You acknowledge that the SDK as delivered is not tested or certified by NVIDIA for use in connection with the design, construction, maintenance, and/or operation of any system where the use or failure of such system could result in a situation that threatens the safety of human life or results in catastrophic damages (each, a "Critical Application"). Examples of Critical Applications include use in avionics, navigation, autonomous vehicle applications, ai solutions for automotive products, military, medical, life support or other life critical applications. NVIDIA shall not be liable to you or any third party, in whole or in part, for any claims or damages arising from such uses. You are solely responsible for ensuring that any product +or service developed with the SDK as a whole includes sufficient features to comply with all applicable legal and regulatory standards and requirements. 
+2.7 You agree to defend, indemnify and hold harmless NVIDIA and its affiliates, and their respective employees, contractors, agents, officers and directors, from and against any and all claims, damages, obligations, losses, liabilities, costs or debt, fines, restitutions and expenses (including but not limited to attorney's fees and costs incident to establishing the right of indemnification) arising out of or related to products or services that use the SDK in or for Critical Applications, and for use of the SDK outside of the scope of this Agreement or not in compliance with its terms. + + +Chapter 3. Ownership. + +3.1 NVIDIA or its licensors hold all rights, title and interest in and to the SDK and its modifications and derivative works, including their respective intellectual property rights, subject to your rights under Section 3.2. This SDK may include software and materials from NVIDIA's licensors, and these licensors are intended third party beneficiaries that may enforce this Agreement with respect to their intellectual property rights. +3.2 You hold all rights, title and interest in and to your applications and your derivative works of the sample source code delivered in the SDK, including their respective intellectual property rights, subject to NVIDIA's rights under section 3.1. +3.3 You may, but don't have to, provide to NVIDIA suggestions, feature requests or other feedback regarding the SDK, including possible enhancements or modifications to the SDK. For any feedback that you voluntarily provide, you hereby grant NVIDIA and its affiliates a perpetual, non-exclusive, worldwide, irrevocable license to use, reproduce, modify, license, sublicense (through multiple tiers of sublicensees), and distribute (through multiple tiers of distributors) it without the payment of any royalties or fees to you. NVIDIA will use feedback at its choice. 
NVIDIA is constantly looking for ways to improve its products, so you may send feedback to NVIDIA through the developer portal at https://developer.nvidia.com. + + +Chapter 4. No Warranties. + + +THE SDK IS PROVIDED BY NVIDIA "AS IS" AND "WITH ALL FAULTS." TO THE MAXIMUM EXTENT PERMITTED BY LAW, NVIDIA AND ITS AFFILIATES EXPRESSLY DISCLAIM ALL WARRANTIES OF ANY KIND OR NATURE, WHETHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, OR THE ABSENCE OF ANY DEFECTS THEREIN, WHETHER LATENT OR PATENT. NO WARRANTY IS MADE ON THE BASIS OF TRADE USAGE, COURSE OF DEALING OR COURSE OF TRADE. + + +Chapter 5. Limitations of Liability. + + +TO THE MAXIMUM EXTENT PERMITTED BY LAW, NVIDIA AND ITS AFFILIATES SHALL NOT BE LIABLE FOR ANY SPECIAL, INCIDENTAL, PUNITIVE OR CONSEQUENTIAL DAMAGES, OR ANY LOST PROFITS, LOSS OF USE, LOSS OF DATA OR LOSS OF GOODWILL, OR THE COSTS OF PROCURING SUBSTITUTE PRODUCTS, ARISING OUT OF OR IN CONNECTION WITH THIS AGREEMENT OR THE USE OR PERFORMANCE OF THE SDK, WHETHER SUCH LIABILITY ARISES FROM ANY CLAIM BASED UPON BREACH OF CONTRACT, BREACH OF WARRANTY, +TORT (INCLUDING NEGLIGENCE), PRODUCT LIABILITY OR ANY OTHER CAUSE OF ACTION OR THEORY OF LIABILITY. IN NO EVENT WILL NVIDIA'S AND ITS AFFILIATES TOTAL CUMULATIVE LIABILITY UNDER OR ARISING OUT OF THIS AGREEMENT EXCEED US$10.00. THE NATURE OF THE LIABILITY OR THE NUMBER OF CLAIMS OR SUITS SHALL NOT ENLARGE OR EXTEND THIS LIMIT. +These exclusions and limitations of liability shall apply regardless if NVIDIA or its affiliates have been advised of the possibility of such damages, and regardless of whether a remedy fails its essential purpose. 
These exclusions and limitations of liability form an essential basis of the bargain between the parties, and, absent any of these exclusions or limitations of liability, the provisions of this Agreement, including, without limitation, the economic terms, would be substantially different. + + +Chapter 6. Termination. + + + +6.1 This Agreement will continue to apply until terminated by either you or NVIDIA as described below. +6.2 If you want to terminate this Agreement, you may do so by stopping to use the SDK. +6.3 NVIDIA may, at any time, terminate this Agreement if: (i) you fail to comply with any term of this Agreement and the non-compliance is not fixed within thirty (30) days following notice from NVIDIA (or immediately if you violate NVIDIA's intellectual property rights); (ii) you commence or participate in any legal proceeding against NVIDIA with respect to the SDK; or +(iii) NVIDIA decides to no longer provide the SDK in a country or, in NVIDIA's sole discretion, the continued use of it is no longer commercially viable. +6.4 Upon any termination of this Agreement, you agree to promptly discontinue use of the SDK and destroy all copies in your possession or control. Your prior distributions in accordance with this Agreement are not affected by the termination of this Agreement. Upon written request, you will certify in writing that you have complied with your commitments under this section. Upon any termination of this Agreement all provisions survive except for the licenses granted to you. + + +Chapter 7. General. + + +If you wish to assign this Agreement or your rights and obligations, including by merger, consolidation, dissolution or operation of law, contact NVIDIA to ask for permission. Any attempted assignment not approved by NVIDIA in writing shall be void and of no effect. NVIDIA may assign, delegate or transfer this Agreement and its rights and obligations, and if to a non-affiliate you will be notified.
+You agree to cooperate with NVIDIA and provide reasonably requested information to verify your compliance with this Agreement. +This Agreement will be governed in all respects by the laws of the United States and of the State of Delaware as those laws are applied to contracts entered into and performed entirely within Delaware by Delaware residents, without regard to the conflicts of laws principles. +The United Nations Convention on Contracts for the International Sale of Goods is specifically disclaimed. You agree to all terms of this Agreement in the English language. +The state or federal courts residing in Santa Clara County, California shall have exclusive jurisdiction over any dispute or claim arising out of this Agreement. Notwithstanding this, you agree that NVIDIA shall still be allowed to apply for injunctive remedies or an equivalent type of urgent legal relief in any jurisdiction. +If any court of competent jurisdiction determines that any provision of this Agreement is illegal, invalid or unenforceable, such provision will be construed as limited to the extent necessary to be consistent with and fully enforceable under the law and the remaining provisions will remain in full force and effect. Unless otherwise specified, remedies are cumulative. +Each party acknowledges and agrees that the other is an independent contractor in the performance of this Agreement. +The SDK has been developed entirely at private expense and is "commercial items" consisting of "commercial computer software" and "commercial computer software documentation" provided with RESTRICTED RIGHTS. Use, duplication or disclosure by the U.S. Government +or a U.S. Government subcontractor is subject to the restrictions in this Agreement pursuant to DFARS 227.7202-3(a) or as set forth in subparagraphs (b)(1) and (2) of the Commercial Computer Software - Restricted Rights clause at FAR 52.227-19, as applicable. 
Contractor/manufacturer is NVIDIA, 2788 San Tomas Expressway, Santa Clara, CA 95051. +The SDK is subject to United States export laws and regulations. You agree that you will not ship, transfer or export the SDK into any country, or use the SDK in any manner, prohibited by the United States Bureau of Industry and Security or economic sanctions regulations administered by the U.S. Department of Treasury's Office of Foreign Assets Control (OFAC), +or any applicable export laws, restrictions or regulations. These laws include restrictions on destinations, end users and end use. By accepting this Agreement, you confirm that you are not a resident or citizen of any country currently embargoed by the U.S. and that you are not otherwise prohibited from receiving the SDK. +Any notice delivered by NVIDIA to you under this Agreement will be delivered via mail, email or fax. You agree that any notices that NVIDIA sends you electronically will satisfy any legal communication requirements. Please direct your legal notices or other correspondence to +NVIDIA Corporation, 2788 San Tomas Expressway, Santa Clara, California 95051, United States of America, Attention: Legal Department. +This Agreement and any exhibits incorporated into this Agreement constitute the entire agreement of the parties with respect to the subject matter of this Agreement and supersede all prior negotiations or documentation exchanged between the parties relating to this SDK license. Any additional and/or conflicting terms on documents issued by you are null, void, and invalid. Any amendment or waiver under this Agreement shall be in writing and signed by representatives of both parties. +(v. March 31, 2022) + + +Chapter 8. nvTIFF SUPPLEMENT +TO SOFTWARE LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS + + +The terms in this supplement govern your use of the NVIDIA nvTIFF SDK under the terms of your license agreement ("Agreement") as modified by this supplement.
Capitalized terms used but not defined below have the meaning assigned to them in the Agreement. +This supplement is an exhibit to the Agreement and is incorporated as an integral part of the Agreement. In the event of conflict between the terms in this supplement and the terms in the Agreement, the terms in this supplement govern. +1. License Scope. The SDK is licensed for you to develop applications only for use in systems with NVIDIA GPUs. +2. Distribution. The following portions of the SDK are distributable under the Agreement: the runtime files .so and .h, nvTIFF_0.dll, nvTIFF.lib and libnvTIFF_static.a. +3. Licensing. If the distribution terms in this Agreement are not suitable for your organization, or for any questions regarding this Agreement, please contact NVIDIA at nvidia-compute-license-questions@nvidia.com +(v. March 31, 2022) + + + + + + + + + + + +nvidia-nvtx +Other/Proprietary License +https://developer.nvidia.com/cuda-zone +UNKNOWN + +nvtx +Apache Software License +https://github.com/NVIDIA/NVTX +============================================================================== +NVTX is under the Apache License v2.0 with LLVM Exceptions: +============================================================================== + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity.
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +---- LLVM Exceptions to the Apache 2.0 License ---- + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into an Object form of such source code, you +may redistribute such embedded portions in such Object form without complying +with the conditions of Sections 4(a), 4(b) and 4(d) of the License. + +In addition, if you combine or link compiled forms of this Software with +software that is licensed under the GPLv2 ("Combined Software") and if a +court of competent jurisdiction determines that the patent provision (Section +3), the indemnity provision (Section 9) or other Section of the License +conflicts with the conditions of the GPLv2, you may retroactively and +prospectively choose to deem waived or otherwise exclude such Section(s) of +the License, but only in their entirety and only with respect to the Combined +Software. + + + +obstore +MIT License +https://developmentseed.org/obstore +UNKNOWN + +omegaconf +BSD License +https://github.com/omry/omegaconf +BSD 3-Clause License + +Copyright (c) 2018, Omry Yadan +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. 
Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +onnx +Apache-2.0 +https://onnx.ai/ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +onnx-ir +Apache-2.0 +https://onnx.ai/ir-py + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +onnxscript +MIT License +https://microsoft.github.io/onnxscript/ +MIT License + +Copyright (c) Microsoft Corporation + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +open_clip_torch +MIT License +https://github.com/mlfoundations/open_clip +Copyright (c) 2012-2021 Gabriel Ilharco, Mitchell Wortsman, +Nicholas Carlini, Rohan Taori, Achal Dave, Vaishaal Shankar, +John Miller, Hongseok Namkoong, Hannaneh Hajishirzi, Ali Farhadi, +Ludwig Schmidt + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +openai +Apache Software License +https://github.com/openai/openai-python + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2026 OpenAI + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +opencensus +Apache Software License +https://github.com/census-instrumentation/opencensus-python + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +opencensus-context +Apache Software License +https://github.com/census-instrumentation/opencensus-python/tree/master/context/opencensus-context +UNKNOWN + +opencv-contrib-python +Apache Software License +https://github.com/opencv/opencv-python +OpenCV library is redistributed within opencv-python package. +This license applies to OpenCV binary in the directory cv2/. + + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +------------------------------------------------------------------------------ +libvpx is redistributed within all opencv-python Linux packages. +This license applies to libvpx binary in the directory cv2/. + +Copyright (c) 2010, The WebM Project authors. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + * Neither the name of Google, nor the WebM Project, nor the names + of its contributors may be used to endorse or promote products + derived from this software without specific prior written + permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +FFmpeg is redistributed within all opencv-python packages. + +Libbluray, libgnutls, libnettle, libhogweed, libintl, libmp3lame, libp11, +librtmp, libsoxr and libtasn1 are redistributed within all opencv-python macOS packages. + +This license applies to the above library binaries in the directory cv2/. + + GNU LESSER GENERAL PUBLIC LICENSE + Version 2.1, February 1999 + + Copyright (C) 1991, 1999 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +[This is the first released version of the Lesser GPL. It also counts + as the successor of the GNU Library Public License, version 2, hence + the version number 2.1.] + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software--to make sure the software is free for all its users. + + This license, the Lesser General Public License, applies to some +specially designated software packages--typically libraries--of the +Free Software Foundation and other authors who decide to use it. 
You +can use it too, but we suggest you first think carefully about whether +this license or the ordinary General Public License is the better +strategy to use in any particular case, based on the explanations below. + + When we speak of free software, we are referring to freedom of use, +not price. Our General Public Licenses are designed to make sure that +you have the freedom to distribute copies of free software (and charge +for this service if you wish); that you receive source code or can get +it if you want it; that you can change the software and use pieces of +it in new free programs; and that you are informed that you can do +these things. + + To protect your rights, we need to make restrictions that forbid +distributors to deny you these rights or to ask you to surrender these +rights. These restrictions translate to certain responsibilities for +you if you distribute copies of the library or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link other code with the library, you must provide +complete object files to the recipients, so that they can relink them +with the library after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + We protect your rights with a two-step method: (1) we copyright the +library, and (2) we offer you this license, which gives you legal +permission to copy, distribute and/or modify the library. + + To protect each distributor, we want to make it very clear that +there is no warranty for the free library. Also, if the library is +modified by someone else and passed on, the recipients should know +that what they have is not the original version, so that the original +author's reputation will not be affected by problems that might be +introduced by others. 
+ + Finally, software patents pose a constant threat to the existence of +any free program. We wish to make sure that a company cannot +effectively restrict the users of a free program by obtaining a +restrictive license from a patent holder. Therefore, we insist that +any patent license obtained for a version of the library must be +consistent with the full freedom of use specified in this license. + + Most GNU software, including some libraries, is covered by the +ordinary GNU General Public License. This license, the GNU Lesser +General Public License, applies to certain designated libraries, and +is quite different from the ordinary General Public License. We use +this license for certain libraries in order to permit linking those +libraries into non-free programs. + + When a program is linked with a library, whether statically or using +a shared library, the combination of the two is legally speaking a +combined work, a derivative of the original library. The ordinary +General Public License therefore permits such linking only if the +entire combination fits its criteria of freedom. The Lesser General +Public License permits more lax criteria for linking other code with +the library. + + We call this license the "Lesser" General Public License because it +does Less to protect the user's freedom than the ordinary General +Public License. It also provides other free software developers Less +of an advantage over competing non-free programs. These disadvantages +are the reason we use the ordinary General Public License for many +libraries. However, the Lesser license provides advantages in certain +special circumstances. + + For example, on rare occasions, there may be a special need to +encourage the widest possible use of a certain library, so that it becomes +a de-facto standard. To achieve this, non-free programs must be +allowed to use the library. A more frequent case is that a free +library does the same job as widely used non-free libraries. 
In this +case, there is little to gain by limiting the free library to free +software only, so we use the Lesser General Public License. + + In other cases, permission to use a particular library in non-free +programs enables a greater number of people to use a large body of +free software. For example, permission to use the GNU C Library in +non-free programs enables many more people to use the whole GNU +operating system, as well as its variant, the GNU/Linux operating +system. + + Although the Lesser General Public License is Less protective of the +users' freedom, it does ensure that the user of a program that is +linked with the Library has the freedom and the wherewithal to run +that program using a modified version of the Library. + + The precise terms and conditions for copying, distribution and +modification follow. Pay close attention to the difference between a +"work based on the library" and a "work that uses the library". The +former contains code derived from the library, whereas the latter must +be combined with the library in order to run. + + GNU LESSER GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License Agreement applies to any software library or other +program which contains a notice placed by the copyright holder or +other authorized party saying it may be distributed under the terms of +this Lesser General Public License (also called "this License"). +Each licensee is addressed as "you". + + A "library" means a collection of software functions and/or data +prepared so as to be conveniently linked with application programs +(which use some of those functions and data) to form executables. + + The "Library", below, refers to any such software library or work +which has been distributed under these terms. 
A "work based on the +Library" means either the Library or any derivative work under +copyright law: that is to say, a work containing the Library or a +portion of it, either verbatim or with modifications and/or translated +straightforwardly into another language. (Hereinafter, translation is +included without limitation in the term "modification".) + + "Source code" for a work means the preferred form of the work for +making modifications to it. For a library, complete source code means +all the source code for all modules it contains, plus any associated +interface definition files, plus the scripts used to control compilation +and installation of the library. + + Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running a program using the Library is not restricted, and output from +such a program is covered only if its contents constitute a work based +on the Library (independent of the use of the Library in a tool for +writing it). Whether that is true depends on what the Library does +and what the program that uses the Library does. + + 1. You may copy and distribute verbatim copies of the Library's +complete source code as you receive it, in any medium, provided that +you conspicuously and appropriately publish on each copy an +appropriate copyright notice and disclaimer of warranty; keep intact +all the notices that refer to this License and to the absence of any +warranty; and distribute a copy of this License along with the +Library. + + You may charge a fee for the physical act of transferring a copy, +and you may at your option offer warranty protection in exchange for a +fee. + + 2. 
You may modify your copy or copies of the Library or any portion +of it, thus forming a work based on the Library, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) The modified work must itself be a software library. + + b) You must cause the files modified to carry prominent notices + stating that you changed the files and the date of any change. + + c) You must cause the whole of the work to be licensed at no + charge to all third parties under the terms of this License. + + d) If a facility in the modified Library refers to a function or a + table of data to be supplied by an application program that uses + the facility, other than as an argument passed when the facility + is invoked, then you must make a good faith effort to ensure that, + in the event an application does not supply such function or + table, the facility still operates, and performs whatever part of + its purpose remains meaningful. + + (For example, a function in a library to compute square roots has + a purpose that is entirely well-defined independent of the + application. Therefore, Subsection 2d requires that any + application-supplied function or table used by this function must + be optional: if the application does not supply it, the square + root function must still compute square roots.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Library, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. 
But when you +distribute the same sections as part of a whole which is a work based +on the Library, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote +it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library +with the Library (or with a work based on the Library) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may opt to apply the terms of the ordinary GNU General Public +License instead of this License to a given copy of the Library. To do +this, you must alter all the notices that refer to this License, so +that they refer to the ordinary GNU General Public License, version 2, +instead of to this License. (If a newer version than version 2 of the +ordinary GNU General Public License has appeared, then you can specify +that version instead if you wish.) Do not make any other change in +these notices. + + Once this change is made in a given copy, it is irreversible for +that copy, so the ordinary GNU General Public License applies to all +subsequent copies and derivative works made from that copy. + + This option is useful when you wish to copy part of the code of +the Library into a program that is not a library. + + 4. 
You may copy and distribute the Library (or a portion or +derivative of it, under Section 2) in object code or executable form +under the terms of Sections 1 and 2 above provided that you accompany +it with the complete corresponding machine-readable source code, which +must be distributed under the terms of Sections 1 and 2 above on a +medium customarily used for software interchange. + + If distribution of object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the +source code from the same place satisfies the requirement to +distribute the source code, even though third parties are not +compelled to copy the source along with the object code. + + 5. A program that contains no derivative of any portion of the +Library, but is designed to work with the Library by being compiled or +linked with it, is called a "work that uses the Library". Such a +work, in isolation, is not a derivative work of the Library, and +therefore falls outside the scope of this License. + + However, linking a "work that uses the Library" with the Library +creates an executable that is a derivative of the Library (because it +contains portions of the Library), rather than a "work that uses the +library". The executable is therefore covered by this License. +Section 6 states terms for distribution of such executables. + + When a "work that uses the Library" uses material from a header file +that is part of the Library, the object code for the work may be a +derivative work of the Library even though the source code is not. +Whether this is true is especially significant if the work can be +linked without the Library, or if the work is itself a library. The +threshold for this to be true is not precisely defined by law. 
+ + If such an object file uses only numerical parameters, data +structure layouts and accessors, and small macros and small inline +functions (ten lines or less in length), then the use of the object +file is unrestricted, regardless of whether it is legally a derivative +work. (Executables containing this object code plus portions of the +Library will still fall under Section 6.) + + Otherwise, if the work is a derivative of the Library, you may +distribute the object code for the work under the terms of Section 6. +Any executables containing that work also fall under Section 6, +whether or not they are linked directly with the Library itself. + + 6. As an exception to the Sections above, you may also combine or +link a "work that uses the Library" with the Library to produce a +work containing portions of the Library, and distribute that work +under terms of your choice, provided that the terms permit +modification of the work for the customer's own use and reverse +engineering for debugging such modifications. + + You must give prominent notice with each copy of the work that the +Library is used in it and that the Library and its use are covered by +this License. You must supply a copy of this License. If the work +during execution displays copyright notices, you must include the +copyright notice for the Library among them, as well as a reference +directing the user to the copy of this License. Also, you must do one +of these things: + + a) Accompany the work with the complete corresponding + machine-readable source code for the Library including whatever + changes were used in the work (which must be distributed under + Sections 1 and 2 above); and, if the work is an executable linked + with the Library, with the complete machine-readable "work that + uses the Library", as object code and/or source code, so that the + user can modify the Library and then relink to produce a modified + executable containing the modified Library. 
(It is understood + that the user who changes the contents of definitions files in the + Library will not necessarily be able to recompile the application + to use the modified definitions.) + + b) Use a suitable shared library mechanism for linking with the + Library. A suitable mechanism is one that (1) uses at run time a + copy of the library already present on the user's computer system, + rather than copying library functions into the executable, and (2) + will operate properly with a modified version of the library, if + the user installs one, as long as the modified version is + interface-compatible with the version that the work was made with. + + c) Accompany the work with a written offer, valid for at + least three years, to give the same user the materials + specified in Subsection 6a, above, for a charge no more + than the cost of performing this distribution. + + d) If distribution of the work is made by offering access to copy + from a designated place, offer equivalent access to copy the above + specified materials from the same place. + + e) Verify that the user has already received a copy of these + materials or that you have already sent this user a copy. + + For an executable, the required form of the "work that uses the +Library" must include any data and utility programs needed for +reproducing the executable from it. However, as a special exception, +the materials to be distributed need not include anything that is +normally distributed (in either source or binary form) with the major +components (compiler, kernel, and so on) of the operating system on +which the executable runs, unless that component itself accompanies +the executable. + + It may happen that this requirement contradicts the license +restrictions of other proprietary libraries that do not normally +accompany the operating system. Such a contradiction means you cannot +use both them and the Library together in an executable that you +distribute. + + 7. 
You may place library facilities that are a work based on the +Library side-by-side in a single library together with other library +facilities not covered by this License, and distribute such a combined +library, provided that the separate distribution of the work based on +the Library and of the other library facilities is otherwise +permitted, and provided that you do these two things: + + a) Accompany the combined library with a copy of the same work + based on the Library, uncombined with any other library + facilities. This must be distributed under the terms of the + Sections above. + + b) Give prominent notice with the combined library of the fact + that part of it is a work based on the Library, and explaining + where to find the accompanying uncombined form of the same work. + + 8. You may not copy, modify, sublicense, link with, or distribute +the Library except as expressly provided under this License. Any +attempt otherwise to copy, modify, sublicense, link with, or +distribute the Library is void, and will automatically terminate your +rights under this License. However, parties who have received copies, +or rights, from you under this License will not have their licenses +terminated so long as such parties remain in full compliance. + + 9. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Library or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Library (or any work based on the +Library), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Library or works based on it. + + 10. 
Each time you redistribute the Library (or any work based on the +Library), the recipient automatically receives a license from the +original licensor to copy, distribute, link with or modify the Library +subject to these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties with +this License. + + 11. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Library at all. For example, if a patent +license would not permit royalty-free redistribution of the Library by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply, +and the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system which is +implemented by public license practices. 
Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 12. If the distribution and/or use of the Library is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Library under this License may add +an explicit geographical distribution limitation excluding those countries, +so that distribution is permitted only in or among countries not thus +excluded. In such case, this License incorporates the limitation as if +written in the body of this License. + + 13. The Free Software Foundation may publish revised and/or new +versions of the Lesser General Public License from time to time. +Such new versions will be similar in spirit to the present version, +but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library +specifies a version number of this License which applies to it and +"any later version", you have the option of following the terms and +conditions either of that version or of any later version published by +the Free Software Foundation. If the Library does not specify a +license version number, you may choose any version ever published by +the Free Software Foundation. + + 14. If you wish to incorporate parts of the Library into other free +programs whose distribution conditions are incompatible with these, +write to the author to ask for permission. For software which is +copyrighted by the Free Software Foundation, write to the Free +Software Foundation; we sometimes make exceptions for this. 
Our +decision will be guided by the two goals of preserving the free status +of all derivatives of our free software and of promoting the sharing +and reuse of software generally. + + NO WARRANTY + + 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO +WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. +EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR +OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY +KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE +LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME +THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN +WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY +AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU +FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE +LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING +RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF +SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGES. + + END OF TERMS AND CONDITIONS + +------------------------------------------------------------------------------ +Qt 5 is redistributed within non-headless opencv-python Linux and macOS packages. +libgmp is redistributed within opencv-python macOS packages. +libidn2 is redistributed within opencv-python macOS packages. +libunistring is redistributed within opencv-python macOS packages. +This license applies to the above binaries in the directory cv2/. 
+ + GNU LESSER GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + This version of the GNU Lesser General Public License incorporates +the terms and conditions of version 3 of the GNU General Public +License, supplemented by the additional permissions listed below. + + 0. Additional Definitions. + + As used herein, "this License" refers to version 3 of the GNU Lesser +General Public License, and the "GNU GPL" refers to version 3 of the GNU +General Public License. + + "The Library" refers to a covered work governed by this License, +other than an Application or a Combined Work as defined below. + + An "Application" is any work that makes use of an interface provided +by the Library, but which is not otherwise based on the Library. +Defining a subclass of a class defined by the Library is deemed a mode +of using an interface provided by the Library. + + A "Combined Work" is a work produced by combining or linking an +Application with the Library. The particular version of the Library +with which the Combined Work was made is also called the "Linked +Version". + + The "Minimal Corresponding Source" for a Combined Work means the +Corresponding Source for the Combined Work, excluding any source code +for portions of the Combined Work that, considered in isolation, are +based on the Application, and not on the Linked Version. + + The "Corresponding Application Code" for a Combined Work means the +object code and/or source code for the Application, including any data +and utility programs needed for reproducing the Combined Work from the +Application, but excluding the System Libraries of the Combined Work. + + 1. Exception to Section 3 of the GNU GPL. + + You may convey a covered work under sections 3 and 4 of this License +without being bound by section 3 of the GNU GPL. + + 2. 
Conveying Modified Versions. + + If you modify a copy of the Library, and, in your modifications, a +facility refers to a function or data to be supplied by an Application +that uses the facility (other than as an argument passed when the +facility is invoked), then you may convey a copy of the modified +version: + + a) under this License, provided that you make a good faith effort to + ensure that, in the event an Application does not supply the + function or data, the facility still operates, and performs + whatever part of its purpose remains meaningful, or + + b) under the GNU GPL, with none of the additional permissions of + this License applicable to that copy. + + 3. Object Code Incorporating Material from Library Header Files. + + The object code form of an Application may incorporate material from +a header file that is part of the Library. You may convey such object +code under terms of your choice, provided that, if the incorporated +material is not limited to numerical parameters, data structure +layouts and accessors, or small macros, inline functions and templates +(ten or fewer lines in length), you do both of the following: + + a) Give prominent notice with each copy of the object code that the + Library is used in it and that the Library and its use are + covered by this License. + + b) Accompany the object code with a copy of the GNU GPL and this license + document. + + 4. Combined Works. + + You may convey a Combined Work under terms of your choice that, +taken together, effectively do not restrict modification of the +portions of the Library contained in the Combined Work and reverse +engineering for debugging such modifications, if you also do each of +the following: + + a) Give prominent notice with each copy of the Combined Work that + the Library is used in it and that the Library and its use are + covered by this License. + + b) Accompany the Combined Work with a copy of the GNU GPL and this license + document. 
+ + c) For a Combined Work that displays copyright notices during + execution, include the copyright notice for the Library among + these notices, as well as a reference directing the user to the + copies of the GNU GPL and this license document. + + d) Do one of the following: + + 0) Convey the Minimal Corresponding Source under the terms of this + License, and the Corresponding Application Code in a form + suitable for, and under terms that permit, the user to + recombine or relink the Application with a modified version of + the Linked Version to produce a modified Combined Work, in the + manner specified by section 6 of the GNU GPL for conveying + Corresponding Source. + + 1) Use a suitable shared library mechanism for linking with the + Library. A suitable mechanism is one that (a) uses at run time + a copy of the Library already present on the user's computer + system, and (b) will operate properly with a modified version + of the Library that is interface-compatible with the Linked + Version. + + e) Provide Installation Information, but only if you would otherwise + be required to provide such information under section 6 of the + GNU GPL, and only to the extent that such information is + necessary to install and execute a modified version of the + Combined Work produced by recombining or relinking the + Application with a modified version of the Linked Version. (If + you use option 4d0, the Installation Information must accompany + the Minimal Corresponding Source and Corresponding Application + Code. If you use option 4d1, you must provide the Installation + Information in the manner specified by section 6 of the GNU GPL + for conveying Corresponding Source.) + + 5. Combined Libraries. 
+ + You may place library facilities that are a work based on the +Library side by side in a single library together with other library +facilities that are not Applications and are not covered by this +License, and convey such a combined library under terms of your +choice, if you do both of the following: + + a) Accompany the combined library with a copy of the same work based + on the Library, uncombined with any other library facilities, + conveyed under the terms of this License. + + b) Give prominent notice with the combined library that part of it + is a work based on the Library, and explaining where to find the + accompanying uncombined form of the same work. + + 6. Revised Versions of the GNU Lesser General Public License. + + The Free Software Foundation may publish revised and/or new versions +of the GNU Lesser General Public License from time to time. Such new +versions will be similar in spirit to the present version, but may +differ in detail to address new problems or concerns. + + Each version is given a distinguishing version number. If the +Library as you received it specifies that a certain numbered version +of the GNU Lesser General Public License "or any later version" +applies to it, you have the option of following the terms and +conditions either of that published version or of any later version +published by the Free Software Foundation. If the Library as you +received it does not specify a version number of the GNU Lesser +General Public License, you may choose any version of the GNU Lesser +General Public License ever published by the Free Software Foundation. + + If the Library as you received it specifies that a proxy can decide +whether future versions of the GNU Lesser General Public License shall +apply, that proxy's public statement of acceptance of any version is +permanent authorization for you to choose that version for the +Library. 
+ +------------------------------------------------------------------------------ +bzip2 is redistributed within all opencv-python Linux packages. +This license applies to libbz2 binary in the directory cv2/. + +This program, "bzip2", the associated library "libbzip2", and all +documentation, are copyright (C) 1996-2010 Julian R Seward. All +rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. The origin of this software must not be misrepresented; you must + not claim that you wrote the original software. If you use this + software in a product, an acknowledgment in the product + documentation would be appreciated but is not required. + +3. Altered source versions must be plainly marked as such, and must + not be misrepresented as being the original software. + +4. The name of the author may not be used to endorse or promote + products derived from this software without specific prior written + permission. + +THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS +OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY +DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +Julian Seward, jseward@bzip.org +bzip2/libbzip2 version 1.0.6 of 6 September 2010 + +------------------------------------------------------------------------------ +libcrypto and libssl are redistributed within all opencv-python Linux and macOS packages. +libopencore-amrnb and libopencore-amrwb are redistributed within all opencv-python Linux and macOS packages. +This license applies to above binaries in the directory cv2/. + + LICENSE ISSUES + ============== + + The OpenSSL toolkit stays under a double license, i.e. both the conditions of + the OpenSSL License and the original SSLeay license apply to the toolkit. + See below for the actual license texts. + + OpenSSL License + --------------- + +/* ==================================================================== + * Copyright (c) 1998-2019 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. 
Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ + + Original SSLeay License + ----------------------- + +/* Copyright (C) 1995-1998 Eric Young (eay@cryptsoft.com) + * All rights reserved. + * + * This package is an SSL implementation written + * by Eric Young (eay@cryptsoft.com). + * The implementation was written so as to conform with Netscapes SSL. + * + * This library is free for commercial and non-commercial use as long as + * the following conditions are adhered to. 
The following conditions + * apply to all code found in this distribution, be it the RC4, RSA, + * lhash, DES, etc., code; not just the SSL code. The SSL documentation + * included with this distribution is covered by the same copyright terms + * except that the holder is Tim Hudson (tjh@cryptsoft.com). + * + * Copyright remains Eric Young's, and as such any Copyright notices in + * the code are not to be removed. + * If this package is used in a product, Eric Young should be given attribution + * as the author of the parts of the library used. + * This can be in the form of a textual message at program startup or + * in documentation (online or textual) provided with the package. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. All advertising materials mentioning features or use of this software + * must display the following acknowledgement: + * "This product includes cryptographic software written by + * Eric Young (eay@cryptsoft.com)" + * The word 'cryptographic' can be left out if the routines from the library + * being used are not cryptographic related :-). + * 4. 
If you include any Windows specific code (or a derivative thereof) from + * the apps directory (application code) you must include an acknowledgement: + * "This product includes software written by Tim Hudson (tjh@cryptsoft.com)" + * + * THIS SOFTWARE IS PROVIDED BY ERIC YOUNG ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * The licence and distribution terms for any publicly available version or + * derivative of this code cannot be changed. i.e. this code cannot simply be + * copied and put under another distribution licence + * [including the GNU Public Licence.] + */ + +------------------------------------------------------------------------------ +libfontconfig is redistributed within all opencv-python macOS packages. +This license applies to libfontconfig binary in the directory cv2/. + +Copyright © 2000,2001,2002,2003,2004,2006,2007 Keith Packard +Copyright © 2005 Patrick Lam +Copyright © 2009 Roozbeh Pournader +Copyright © 2008,2009 Red Hat, Inc. +Copyright © 2008 Danilo Šegan +Copyright © 2012 Google, Inc. 
+ + +Permission to use, copy, modify, distribute, and sell this software and its +documentation for any purpose is hereby granted without fee, provided that +the above copyright notice appear in all copies and that both that +copyright notice and this permission notice appear in supporting +documentation, and that the name of the author(s) not be used in +advertising or publicity pertaining to distribution of the software without +specific, written prior permission. The authors make no +representations about the suitability of this software for any purpose. It +is provided "as is" without express or implied warranty. + +THE AUTHOR(S) DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, +INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO +EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY SPECIAL, INDIRECT OR +CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, +DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER +TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + +------------------------------------------------------------------------------ +libfreetype is redistributed within opencv-python Linux and macOS packages. +This license applies to libfreetype binary in the directory cv2/. + + The FreeType Project LICENSE + ---------------------------- + + 2006-Jan-27 + + Copyright 1996-2002, 2006 by + David Turner, Robert Wilhelm, and Werner Lemberg + + + +Introduction +============ + + The FreeType Project is distributed in several archive packages; + some of them may contain, in addition to the FreeType font engine, + various tools and contributions which rely on, or relate to, the + FreeType Project. + + This license applies to all files found in such packages, and + which do not fall under their own explicit license. The license + affects thus the FreeType font engine, the test programs, + documentation and makefiles, at the very least. 
+ + This license was inspired by the BSD, Artistic, and IJG + (Independent JPEG Group) licenses, which all encourage inclusion + and use of free software in commercial and freeware products + alike. As a consequence, its main points are that: + + o We don't promise that this software works. However, we will be + interested in any kind of bug reports. (`as is' distribution) + + o You can use this software for whatever you want, in parts or + full form, without having to pay us. (`royalty-free' usage) + + o You may not pretend that you wrote this software. If you use + it, or only parts of it, in a program, you must acknowledge + somewhere in your documentation that you have used the + FreeType code. (`credits') + + We specifically permit and encourage the inclusion of this + software, with or without modifications, in commercial products. + We disclaim all warranties covering The FreeType Project and + assume no liability related to The FreeType Project. + + + Finally, many people asked us for a preferred form for a + credit/disclaimer to use in compliance with this license. We thus + encourage you to use the following text: + + """ + Portions of this software are copyright © <year> The FreeType + Project (www.freetype.org). All rights reserved. + """ + + Please replace <year> with the value from the FreeType version you + actually use. + + +Legal Terms +=========== + +0. Definitions +-------------- + + Throughout this license, the terms `package', `FreeType Project', + and `FreeType archive' refer to the set of files originally + distributed by the authors (David Turner, Robert Wilhelm, and + Werner Lemberg) as the `FreeType Project', be they named as alpha, + beta or final release. + + `You' refers to the licensee, or person using the project, where + `using' is a generic term including compiling the project's source + code as well as linking it to form a `program' or `executable'. + This program is referred to as `a program using the FreeType + engine'.
+ + This license applies to all files distributed in the original + FreeType Project, including all source code, binaries and + documentation, unless otherwise stated in the file in its + original, unmodified form as distributed in the original archive. + If you are unsure whether or not a particular file is covered by + this license, you must contact us to verify this. + + The FreeType Project is copyright (C) 1996-2000 by David Turner, + Robert Wilhelm, and Werner Lemberg. All rights reserved except as + specified below. + +1. No Warranty +-------------- + + THE FREETYPE PROJECT IS PROVIDED `AS IS' WITHOUT WARRANTY OF ANY + KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + PURPOSE. IN NO EVENT WILL ANY OF THE AUTHORS OR COPYRIGHT HOLDERS + BE LIABLE FOR ANY DAMAGES CAUSED BY THE USE OR THE INABILITY TO + USE, OF THE FREETYPE PROJECT. + +2. Redistribution +----------------- + + This license grants a worldwide, royalty-free, perpetual and + irrevocable right and license to use, execute, perform, compile, + display, copy, create derivative works of, distribute and + sublicense the FreeType Project (in both source and object code + forms) and derivative works thereof for any purpose; and to + authorize others to exercise some or all of the rights granted + herein, subject to the following conditions: + + o Redistribution of source code must retain this license file + (`FTL.TXT') unaltered; any additions, deletions or changes to + the original files must be clearly indicated in accompanying + documentation. The copyright notices of the unaltered, + original files must be preserved in all copies of source + files. + + o Redistribution in binary form must provide a disclaimer that + states that the software is based in part of the work of the + FreeType Team, in the distribution documentation. 
We also + encourage you to put an URL to the FreeType web page in your + documentation, though this isn't mandatory. + + These conditions apply to any software derived from or based on + the FreeType Project, not just the unmodified files. If you use + our work, you must acknowledge us. However, no fee need be paid + to us. + +3. Advertising +-------------- + + Neither the FreeType authors and contributors nor you shall use + the name of the other for commercial, advertising, or promotional + purposes without specific prior written permission. + + We suggest, but do not require, that you use one or more of the + following phrases to refer to this software in your documentation + or advertising materials: `FreeType Project', `FreeType Engine', + `FreeType library', or `FreeType Distribution'. + + As you have not signed this license, you are not required to + accept it. However, as the FreeType Project is copyrighted + material, only this license, or another one contracted with the + authors, grants you the right to use, distribute, and modify it. + Therefore, by using, distributing, or modifying the FreeType + Project, you indicate that you understand and accept all the terms + of this license. + +4. Contacts +----------- + + There are two mailing lists related to FreeType: + + o freetype@nongnu.org + + Discusses general use and applications of FreeType, as well as + future and wanted additions to the library and distribution. + If you are looking for support, start in this list if you + haven't found anything to help you in the documentation. + + o freetype-devel@nongnu.org + + Discusses bugs, as well as engine internals, design issues, + specific licenses, porting, etc. + + Our home page can be found at + + https://www.freetype.org + +------------------------------------------------------------------------------ +libpng is redistributed within all opencv-python Linux and macOS packages. +This license applies to libpng binary in the directory cv2/. 
+ +PNG Reference Library License version 2 +--------------------------------------- + + * Copyright (c) 1995-2019 The PNG Reference Library Authors. + * Copyright (c) 2018-2019 Cosmin Truta. + * Copyright (c) 2000-2002, 2004, 2006-2018 Glenn Randers-Pehrson. + * Copyright (c) 1996-1997 Andreas Dilger. + * Copyright (c) 1995-1996 Guy Eric Schalnat, Group 42, Inc. + +The software is supplied "as is", without warranty of any kind, +express or implied, including, without limitation, the warranties +of merchantability, fitness for a particular purpose, title, and +non-infringement. In no event shall the Copyright owners, or +anyone distributing the software, be liable for any damages or +other liability, whether in contract, tort or otherwise, arising +from, out of, or in connection with the software, or the use or +other dealings in the software, even if advised of the possibility +of such damage. + +Permission is hereby granted to use, copy, modify, and distribute +this software, or portions hereof, for any purpose, without fee, +subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you + must not claim that you wrote the original software. If you + use this software in a product, an acknowledgment in the product + documentation would be appreciated, but is not required. + + 2. Altered source versions must be plainly marked as such, and must + not be misrepresented as being the original software. + + 3. This Copyright notice may not be removed or altered from any + source or altered source distribution. 
+ + +PNG Reference Library License version 1 (for libpng 0.5 through 1.6.35) +----------------------------------------------------------------------- + +libpng versions 1.0.7, July 1, 2000, through 1.6.35, July 15, 2018 are +Copyright (c) 2000-2002, 2004, 2006-2018 Glenn Randers-Pehrson, are +derived from libpng-1.0.6, and are distributed according to the same +disclaimer and license as libpng-1.0.6 with the following individuals +added to the list of Contributing Authors: + + Simon-Pierre Cadieux + Eric S. Raymond + Mans Rullgard + Cosmin Truta + Gilles Vollant + James Yu + Mandar Sahastrabuddhe + Google Inc. + Vadim Barkov + +and with the following additions to the disclaimer: + + There is no warranty against interference with your enjoyment of + the library or against infringement. There is no warranty that our + efforts or the library will fulfill any of your particular purposes + or needs. This library is provided with all faults, and the entire + risk of satisfactory quality, performance, accuracy, and effort is + with the user. + +Some files in the "contrib" directory and some configure-generated +files that are distributed with libpng have other copyright owners, and +are released under other open source licenses. 
+ +libpng versions 0.97, January 1998, through 1.0.6, March 20, 2000, are +Copyright (c) 1998-2000 Glenn Randers-Pehrson, are derived from +libpng-0.96, and are distributed according to the same disclaimer and +license as libpng-0.96, with the following individuals added to the +list of Contributing Authors: + + Tom Lane + Glenn Randers-Pehrson + Willem van Schaik + +libpng versions 0.89, June 1996, through 0.96, May 1997, are +Copyright (c) 1996-1997 Andreas Dilger, are derived from libpng-0.88, +and are distributed according to the same disclaimer and license as +libpng-0.88, with the following individuals added to the list of +Contributing Authors: + + John Bowler + Kevin Bracey + Sam Bushell + Magnus Holmgren + Greg Roelofs + Tom Tanner + +Some files in the "scripts" directory have other copyright owners, +but are released under this license. + +libpng versions 0.5, May 1995, through 0.88, January 1996, are +Copyright (c) 1995-1996 Guy Eric Schalnat, Group 42, Inc. + +For the purposes of this copyright and license, "Contributing Authors" +is defined as the following set of individuals: + + Andreas Dilger + Dave Martindale + Guy Eric Schalnat + Paul Schmidt + Tim Wegner + +The PNG Reference Library is supplied "AS IS". The Contributing +Authors and Group 42, Inc. disclaim all warranties, expressed or +implied, including, without limitation, the warranties of +merchantability and of fitness for any purpose. The Contributing +Authors and Group 42, Inc. assume no liability for direct, indirect, +incidental, special, exemplary, or consequential damages, which may +result from the use of the PNG Reference Library, even if advised of +the possibility of such damage. + +Permission is hereby granted to use, copy, modify, and distribute this +source code, or portions hereof, for any purpose, without fee, subject +to the following restrictions: + + 1. The origin of this source code must not be misrepresented. + + 2. 
Altered versions must be plainly marked as such and must not + be misrepresented as being the original source. + + 3. This Copyright notice may not be removed or altered from any + source or altered source distribution. + +The Contributing Authors and Group 42, Inc. specifically permit, +without fee, and encourage the use of this source code as a component +to supporting the PNG file format in commercial products. If you use +this source code in a product, acknowledgment is not required but would +be appreciated. + +------------------------------------------------------------------------------ +libz is redistributed within all opencv-python Linux packages. +This license applies to libz binary in the directory cv2/. + + Copyright (C) 1995-2017 Jean-loup Gailly and Mark Adler + + This software is provided 'as-is', without any express or implied + warranty. In no event will the authors be held liable for any damages + arising from the use of this software. + + Permission is granted to anyone to use this software for any purpose, + including commercial applications, and to alter it and redistribute it + freely, subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this software + in a product, an acknowledgment in the product documentation would be + appreciated but is not required. + 2. Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original software. + 3. This notice may not be removed or altered from any source distribution. + + Jean-loup Gailly Mark Adler + jloup@gzip.org madler@alumni.caltech.edu + +------------------------------------------------------------------------------ +libdav1d is redistributed within opencv-python macOS packages. +This license applies to libdav1d binary in the directory cv2/. + +Copyright © 2018-2019, VideoLAN and dav1d authors +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libffi is redistributed within opencv-python macOS packages. +This license applies to libffi binary in the directory cv2/. + +libffi - Copyright (c) 1996-2020 Anthony Green, Red Hat, Inc and others. +See source files for details. 
+ +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +``Software''), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +------------------------------------------------------------------------------ +libogg is redistributed within opencv-python macOS packages. +This license applies to libogg binary in the directory cv2/. + +Copyright (c) 2002, Xiph.org Foundation + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +- Neither the name of the Xiph.org Foundation nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION +OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libopenjp2 is redistributed within opencv-python macOS packages. +This license applies to libopenjp2 binary in the directory cv2/. + +The copyright in this software is being made available under the 2-clauses +BSD License, included below. This software may be subject to other third +party and contributor rights, including patent rights, and no such rights +are granted under this license. + +Copyright (c) 2002-2014, Universite catholique de Louvain (UCL), Belgium +Copyright (c) 2002-2014, Professor Benoit Macq +Copyright (c) 2003-2014, Antonin Descampe +Copyright (c) 2003-2009, Francois-Olivier Devaux +Copyright (c) 2005, Herve Drolon, FreeImage Team +Copyright (c) 2002-2003, Yannick Verschueren +Copyright (c) 2001-2003, David Janssens +Copyright (c) 2011-2012, Centre National d'Etudes Spatiales (CNES), France +Copyright (c) 2012, CS Systemes d'Information, France + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +1. 
Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS `AS IS' +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libopus is redistributed within opencv-python macOS packages. +This license applies to libopus binary in the directory cv2/. + +Copyright 2001-2011 Xiph.Org, Skype Limited, Octasic, + Jean-Marc Valin, Timothy B. Terriberry, + CSIRO, Gregory Maxwell, Mark Borgerding, + Erik de Castro Lopo + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. 
+ +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER +OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Opus is subject to the royalty-free patent licenses which are +specified at: + +Xiph.Org Foundation: +https://datatracker.ietf.org/ipr/1524/ + +Microsoft Corporation: +https://datatracker.ietf.org/ipr/1914/ + +Broadcom Corporation: +https://datatracker.ietf.org/ipr/1526/ + +------------------------------------------------------------------------------ +librav1e is redistributed within opencv-python macOS packages. +This license applies to librav1e binary in the directory cv2/. + +BSD 2-Clause License + +Copyright (c) 2017-2020, the rav1e contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. 
+ +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libsnappy is redistributed within opencv-python macOS packages. +This license applies to libsnappy binary in the directory cv2/. + +Copyright 2011, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libspeex is redistributed within opencv-python macOS packages. +This license applies to libspeex binary in the directory cv2/. + +Copyright 2002-2008 Xiph.org Foundation +Copyright 2002-2008 Jean-Marc Valin +Copyright 2005-2007 Analog Devices Inc. +Copyright 2005-2008 Commonwealth Scientific and Industrial Research + Organisation (CSIRO) +Copyright 1993, 2002, 2006 David Rowe +Copyright 2003 EpicGames +Copyright 1992-1994 Jutta Degener, Carsten Bormann + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +- Neither the name of the Xiph.org Foundation nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libsrt is redistributed within opencv-python macOS packages. +This license applies to libsrt binary in the directory cv2/. + +/* + * + * Copyright (c) 2001-2017 Cisco Systems, Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * Neither the name of the Cisco Systems, Inc. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS + * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE + * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * + */ + + + Mozilla Public License Version 2.0 +================================== + +1. Definitions +-------------- + +1.1. "Contributor" + means each individual or legal entity that creates, contributes to + the creation of, or owns Covered Software. + +1.2. "Contributor Version" + means the combination of the Contributions of others (if any) used + by a Contributor and that particular Contributor's Contribution. + +1.3. "Contribution" + means Covered Software of a particular Contributor. + +1.4. "Covered Software" + means Source Code Form to which the initial Contributor has attached + the notice in Exhibit A, the Executable Form of such Source Code + Form, and Modifications of such Source Code Form, in each case + including portions thereof. + +1.5. "Incompatible With Secondary Licenses" + means + + (a) that the initial Contributor has attached the notice described + in Exhibit B to the Covered Software; or + + (b) that the Covered Software was made available under the terms of + version 1.1 or earlier of the License, but not also under the + terms of a Secondary License. + +1.6. "Executable Form" + means any form of the work other than Source Code Form. + +1.7. 
"Larger Work" + means a work that combines Covered Software with other material, in + a separate file or files, that is not Covered Software. + +1.8. "License" + means this document. + +1.9. "Licensable" + means having the right to grant, to the maximum extent possible, + whether at the time of the initial grant or subsequently, any and + all of the rights conveyed by this License. + +1.10. "Modifications" + means any of the following: + + (a) any file in Source Code Form that results from an addition to, + deletion from, or modification of the contents of Covered + Software; or + + (b) any new file in Source Code Form that contains any Covered + Software. + +1.11. "Patent Claims" of a Contributor + means any patent claim(s), including without limitation, method, + process, and apparatus claims, in any patent Licensable by such + Contributor that would be infringed, but for the grant of the + License, by the making, using, selling, offering for sale, having + made, import, or transfer of either its Contributions or its + Contributor Version. + +1.12. "Secondary License" + means either the GNU General Public License, Version 2.0, the GNU + Lesser General Public License, Version 2.1, the GNU Affero General + Public License, Version 3.0, or any later versions of those + licenses. + +1.13. "Source Code Form" + means the form of the work preferred for making modifications. + +1.14. "You" (or "Your") + means an individual or a legal entity exercising rights under this + License. For legal entities, "You" includes any entity that + controls, is controlled by, or is under common control with You. For + purposes of this definition, "control" means (a) the power, direct + or indirect, to cause the direction or management of such entity, + whether by contract or otherwise, or (b) ownership of more than + fifty percent (50%) of the outstanding shares or beneficial + ownership of such entity. + +2. License Grants and Conditions +-------------------------------- + +2.1. 
Grants + +Each Contributor hereby grants You a world-wide, royalty-free, +non-exclusive license: + +(a) under intellectual property rights (other than patent or trademark) + Licensable by such Contributor to use, reproduce, make available, + modify, display, perform, distribute, and otherwise exploit its + Contributions, either on an unmodified basis, with Modifications, or + as part of a Larger Work; and + +(b) under Patent Claims of such Contributor to make, use, sell, offer + for sale, have made, import, and otherwise transfer either its + Contributions or its Contributor Version. + +2.2. Effective Date + +The licenses granted in Section 2.1 with respect to any Contribution +become effective for each Contribution on the date the Contributor first +distributes such Contribution. + +2.3. Limitations on Grant Scope + +The licenses granted in this Section 2 are the only rights granted under +this License. No additional rights or licenses will be implied from the +distribution or licensing of Covered Software under this License. +Notwithstanding Section 2.1(b) above, no patent license is granted by a +Contributor: + +(a) for any code that a Contributor has removed from Covered Software; + or + +(b) for infringements caused by: (i) Your and any other third party's + modifications of Covered Software, or (ii) the combination of its + Contributions with other software (except as part of its Contributor + Version); or + +(c) under Patent Claims infringed by Covered Software in the absence of + its Contributions. + +This License does not grant any rights in the trademarks, service marks, +or logos of any Contributor (except as may be necessary to comply with +the notice requirements in Section 3.4). + +2.4. 
Subsequent Licenses + +No Contributor makes additional grants as a result of Your choice to +distribute the Covered Software under a subsequent version of this +License (see Section 10.2) or under the terms of a Secondary License (if +permitted under the terms of Section 3.3). + +2.5. Representation + +Each Contributor represents that the Contributor believes its +Contributions are its original creation(s) or it has sufficient rights +to grant the rights to its Contributions conveyed by this License. + +2.6. Fair Use + +This License is not intended to limit any rights You have under +applicable copyright doctrines of fair use, fair dealing, or other +equivalents. + +2.7. Conditions + +Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted +in Section 2.1. + +3. Responsibilities +------------------- + +3.1. Distribution of Source Form + +All distribution of Covered Software in Source Code Form, including any +Modifications that You create or to which You contribute, must be under +the terms of this License. You must inform recipients that the Source +Code Form of the Covered Software is governed by the terms of this +License, and how they can obtain a copy of this License. You may not +attempt to alter or restrict the recipients' rights in the Source Code +Form. + +3.2. 
Distribution of Executable Form + +If You distribute Covered Software in Executable Form then: + +(a) such Covered Software must also be made available in Source Code + Form, as described in Section 3.1, and You must inform recipients of + the Executable Form how they can obtain a copy of such Source Code + Form by reasonable means in a timely manner, at a charge no more + than the cost of distribution to the recipient; and + +(b) You may distribute such Executable Form under the terms of this + License, or sublicense it under different terms, provided that the + license for the Executable Form does not attempt to limit or alter + the recipients' rights in the Source Code Form under this License. + +3.3. Distribution of a Larger Work + +You may create and distribute a Larger Work under terms of Your choice, +provided that You also comply with the requirements of this License for +the Covered Software. If the Larger Work is a combination of Covered +Software with a work governed by one or more Secondary Licenses, and the +Covered Software is not Incompatible With Secondary Licenses, this +License permits You to additionally distribute such Covered Software +under the terms of such Secondary License(s), so that the recipient of +the Larger Work may, at their option, further distribute the Covered +Software under the terms of either this License or such Secondary +License(s). + +3.4. Notices + +You may not remove or alter the substance of any license notices +(including copyright notices, patent notices, disclaimers of warranty, +or limitations of liability) contained within the Source Code Form of +the Covered Software, except that You may alter any license notices to +the extent required to remedy known factual inaccuracies. + +3.5. Application of Additional Terms + +You may choose to offer, and to charge a fee for, warranty, support, +indemnity or liability obligations to one or more recipients of Covered +Software. 
However, You may do so only on Your own behalf, and not on +behalf of any Contributor. You must make it absolutely clear that any +such warranty, support, indemnity, or liability obligation is offered by +You alone, and You hereby agree to indemnify every Contributor for any +liability incurred by such Contributor as a result of warranty, support, +indemnity or liability terms You offer. You may include additional +disclaimers of warranty and limitations of liability specific to any +jurisdiction. + +4. Inability to Comply Due to Statute or Regulation +--------------------------------------------------- + +If it is impossible for You to comply with any of the terms of this +License with respect to some or all of the Covered Software due to +statute, judicial order, or regulation then You must: (a) comply with +the terms of this License to the maximum extent possible; and (b) +describe the limitations and the code they affect. Such description must +be placed in a text file included with all distributions of the Covered +Software under this License. Except to the extent prohibited by statute +or regulation, such description must be sufficiently detailed for a +recipient of ordinary skill to be able to understand it. + +5. Termination +-------------- + +5.1. The rights granted under this License will terminate automatically +if You fail to comply with any of its terms. However, if You become +compliant, then the rights granted under this License from a particular +Contributor are reinstated (a) provisionally, unless and until such +Contributor explicitly and finally terminates Your grants, and (b) on an +ongoing basis, if such Contributor fails to notify You of the +non-compliance by some reasonable means prior to 60 days after You have +come back into compliance. 
Moreover, Your grants from a particular +Contributor are reinstated on an ongoing basis if such Contributor +notifies You of the non-compliance by some reasonable means, this is the +first time You have received notice of non-compliance with this License +from such Contributor, and You become compliant prior to 30 days after +Your receipt of the notice. + +5.2. If You initiate litigation against any entity by asserting a patent +infringement claim (excluding declaratory judgment actions, +counter-claims, and cross-claims) alleging that a Contributor Version +directly or indirectly infringes any patent, then the rights granted to +You by any and all Contributors for the Covered Software under Section +2.1 of this License shall terminate. + +5.3. In the event of termination under Sections 5.1 or 5.2 above, all +end user license agreements (excluding distributors and resellers) which +have been validly granted by You or Your distributors under this License +prior to termination shall survive termination. + +************************************************************************ +* * +* 6. Disclaimer of Warranty * +* ------------------------- * +* * +* Covered Software is provided under this License on an "as is" * +* basis, without warranty of any kind, either expressed, implied, or * +* statutory, including, without limitation, warranties that the * +* Covered Software is free of defects, merchantable, fit for a * +* particular purpose or non-infringing. The entire risk as to the * +* quality and performance of the Covered Software is with You. * +* Should any Covered Software prove defective in any respect, You * +* (not any Contributor) assume the cost of any necessary servicing, * +* repair, or correction. This disclaimer of warranty constitutes an * +* essential part of this License. No use of any Covered Software is * +* authorized under this License except under this disclaimer. 
* +* * +************************************************************************ + +************************************************************************ +* * +* 7. Limitation of Liability * +* -------------------------- * +* * +* Under no circumstances and under no legal theory, whether tort * +* (including negligence), contract, or otherwise, shall any * +* Contributor, or anyone who distributes Covered Software as * +* permitted above, be liable to You for any direct, indirect, * +* special, incidental, or consequential damages of any character * +* including, without limitation, damages for lost profits, loss of * +* goodwill, work stoppage, computer failure or malfunction, or any * +* and all other commercial damages or losses, even if such party * +* shall have been informed of the possibility of such damages. This * +* limitation of liability shall not apply to liability for death or * +* personal injury resulting from such party's negligence to the * +* extent applicable law prohibits such limitation. Some * +* jurisdictions do not allow the exclusion or limitation of * +* incidental or consequential damages, so this exclusion and * +* limitation may not apply to You. * +* * +************************************************************************ + +8. Litigation +------------- + +Any litigation relating to this License may be brought only in the +courts of a jurisdiction where the defendant maintains its principal +place of business and such litigation shall be governed by laws of that +jurisdiction, without reference to its conflict-of-law provisions. +Nothing in this Section shall prevent a party's ability to bring +cross-claims or counter-claims. + +9. Miscellaneous +---------------- + +This License represents the complete agreement concerning the subject +matter hereof. If any provision of this License is held to be +unenforceable, such provision shall be reformed only to the extent +necessary to make it enforceable. 
Any law or regulation which provides +that the language of a contract shall be construed against the drafter +shall not be used to construe this License against a Contributor. + +10. Versions of the License +--------------------------- + +10.1. New Versions + +Mozilla Foundation is the license steward. Except as provided in Section +10.3, no one other than the license steward has the right to modify or +publish new versions of this License. Each version will be given a +distinguishing version number. + +10.2. Effect of New Versions + +You may distribute the Covered Software under the terms of the version +of the License under which You originally received the Covered Software, +or under the terms of any subsequent version published by the license +steward. + +10.3. Modified Versions + +If you create software not governed by this License, and you want to +create a new license for such software, you may create and use a +modified version of this License if you rename the license and remove +any references to the name of the license steward (except to note that +such modified license differs from this License). + +10.4. Distributing Source Code Form that is Incompatible With Secondary +Licenses + +If You choose to distribute Source Code Form that is Incompatible With +Secondary Licenses under the terms of this version of the License, the +notice described in Exhibit B of this License must be attached. + +Exhibit A - Source Code Form License Notice +------------------------------------------- + + This Source Code Form is subject to the terms of the Mozilla Public + License, v. 2.0. If a copy of the MPL was not distributed with this + file, You can obtain one at http://mozilla.org/MPL/2.0/. + +If it is not possible or desirable to put the notice in a particular +file, then You may include the notice in a location (such as a LICENSE +file in a relevant directory) where a recipient would be likely to look +for such a notice. 
+ +You may add additional accurate notices of copyright ownership. + +Exhibit B - "Incompatible With Secondary Licenses" Notice +--------------------------------------------------------- + + This Source Code Form is "Incompatible With Secondary Licenses", as + defined by the Mozilla Public License, v. 2.0. + +------------------------------------------------------------------------------ +libtheoradec and libtheoraenc are redistributed within opencv-python macOS packages. +This license applies to libtheoradec and libtheoraenc binaries in the directory cv2/. + + Copyright (C) 2002-2009 Xiph.org Foundation + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +- Neither the name of the Xiph.org Foundation nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE FOUNDATION +OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libwebp and libwebpmux are redistributed within all opencv-python packages. +This license applies to libwebp and libwebpmux binaries in the directory cv2/. + +Copyright (c) 2010, Google Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + * Neither the name of Google nor the names of its contributors may + be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libvorbis and libvorbisenc are redistributed within opencv-python macOS packages. +This license applies to libvorbis and libvorbisenc binaries in the directory cv2/. + +Copyright (c) 2002-2020 Xiph.org Foundation + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +- Neither the name of the Xiph.org Foundation nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE FOUNDATION +OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +Libxcb utility libraries are redistributed within opencv-python non-headless Linux packages. +This license applies to libxcb related binaries in the directory cv2/. + +Copyright (C) 2001-2006 Bart Massey, Jamey Sharp, and Josh Triplett. +All Rights Reserved. + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated +documentation files (the "Software"), to deal in the +Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, +sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall +be included in all copies or substantial portions of the +Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY +KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE +WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR +PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS +BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +OTHER DEALINGS IN THE SOFTWARE. 
+ +Except as contained in this notice, the names of the authors +or their institutions shall not be used in advertising or +otherwise to promote the sale, use or other dealings in this +Software without prior written authorization from the +authors. + +------------------------------------------------------------------------------ +Libxcb-image is redistributed within opencv-python non-headless Linux packages. +This license applies to libxcb-image binary in the directory cv2/. + +Copyright © 2007-2008 Bart Massey +Copyright © 2008 Julien Danjou +Copyright © 2008 Keith Packard + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation +files (the "Software"), to deal in the Software without +restriction, including without limitation the rights to use, copy, +modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF +CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or +their institutions shall not be used in advertising or otherwise to +promote the sale, use or other dealings in this Software without +prior written authorization from the authors. 
+ +------------------------------------------------------------------------------ +Libxcb-util is redistributed within opencv-python non-headless Linux packages. +This license applies to libxcb-util binary in the directory cv2/. + +Copyright © 2008 Bart Massey +Copyright © 2008 Ian Osgood +Copyright © 2008 Jamey Sharp +Copyright © 2008 Josh Triplett +Copyright © 2008-2009 Julien Danjou + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation +files (the "Software"), to deal in the Software without +restriction, including without limitation the rights to use, copy, +modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF +CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or +their institutions shall not be used in advertising or otherwise to +promote the sale, use or other dealings in this Software without +prior written authorization from the authors. + +------------------------------------------------------------------------------ +Libxcb-render-util is redistributed within opencv-python non-headless Linux packages. +This license applies to libxcb-render-util binary in the directory cv2/. 
+ +Copyright © 2000 Keith Packard + +Permission to use, copy, modify, distribute, and sell this software and its +documentation for any purpose is hereby granted without fee, provided that +the above copyright notice appear in all copies and that both that +copyright notice and this permission notice appear in supporting +documentation, and that the name of Keith Packard not be used in +advertising or publicity pertaining to distribution of the software without +specific, written prior permission. Keith Packard makes no +representations about the suitability of this software for any purpose. It +is provided "as is" without express or implied warranty. + +KEITH PACKARD DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, +INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO +EVENT SHALL KEITH PACKARD BE LIABLE FOR ANY SPECIAL, INDIRECT OR +CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, +DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER +TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + +Copyright © 2006 Jamey Sharp. + +Permission is hereby granted, free of charge, to any person obtaining a +copy of this software and associated documentation files (the "Software"), +to deal in the Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, sublicense, +and/or sell copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or their +institutions shall not be used in advertising or otherwise to promote the +sale, use or other dealings in this Software without prior written +authorization from the authors. + +Copyright © 2006 Ian Osgood + +Permission is hereby granted, free of charge, to any person obtaining a +copy of this software and associated documentation files (the "Software"), +to deal in the Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, sublicense, +and/or sell copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or their +institutions shall not be used in advertising or otherwise to promote the +sale, use or other dealings in this Software without prior written +authorization from the authors. + +------------------------------------------------------------------------------ +Libxcb-icccm is redistributed within opencv-python non-headless Linux packages. 
+This license applies to Libxcb-icccm binary in the directory cv2/. + +Copyright © 2008-2011 Arnaud Fontaine +Copyright © 2007-2008 Vincent Torri + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation +files (the "Software"), to deal in the Software without +restriction, including without limitation the rights to use, copy, +modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF +CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or +their institutions shall not be used in advertising or otherwise to +promote the sale, use or other dealings in this Software without +prior written authorization from the authors. + +------------------------------------------------------------------------------ +libXau is redistributed within opencv-python non-headless Linux packages. +This license applies to libXau binary in the directory cv2/. + +Copyright 1988, 1993, 1994, 1998 The Open Group + +Permission to use, copy, modify, distribute, and sell this software and its +documentation for any purpose is hereby granted without fee, provided that +the above copyright notice appear in all copies and that both that +copyright notice and this permission notice appear in supporting +documentation. 
+ +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +OPEN GROUP BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN +AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the name of The Open Group shall not be +used in advertising or otherwise to promote the sale, use or other dealings +in this Software without prior written authorization from The Open Group. + +------------------------------------------------------------------------------ +Vulkan headers are redistributed within all opencv-python packages. +This license applies to Vulkan headers in the directory 3rdparty/include/vulkan. + +Copyright (c) 2015-2018 The Khronos Group Inc. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +------------------------------------------------------------------------------ +Libjpeg-turbo is redistributed within all opencv-python packages as build option. 
+ +libjpeg-turbo Licenses +====================== + +libjpeg-turbo is covered by three compatible BSD-style open source licenses: + +- The IJG (Independent JPEG Group) License, which is listed in + [README.ijg](README.ijg) + + This license applies to the libjpeg API library and associated programs + (any code inherited from libjpeg, and any modifications to that code.) + +- The Modified (3-clause) BSD License, which is listed below + + This license covers the TurboJPEG API library and associated programs, as + well as the build system. + +- The [zlib License](https://opensource.org/licenses/Zlib) + + This license is a subset of the other two, and it covers the libjpeg-turbo + SIMD extensions. + + +Complying with the libjpeg-turbo Licenses +========================================= + +This section provides a roll-up of the libjpeg-turbo licensing terms, to the +best of our understanding. + +1. If you are distributing a modified version of the libjpeg-turbo source, + then: + + 1. You cannot alter or remove any existing copyright or license notices + from the source. + + **Origin** + - Clause 1 of the IJG License + - Clause 1 of the Modified BSD License + - Clauses 1 and 3 of the zlib License + + 2. You must add your own copyright notice to the header of each source + file you modified, so others can tell that you modified that file (if + there is not an existing copyright header in that file, then you can + simply add a notice stating that you modified the file.) + + **Origin** + - Clause 1 of the IJG License + - Clause 2 of the zlib License + + 3. You must include the IJG README file, and you must not alter any of the + copyright or license text in that file. + + **Origin** + - Clause 1 of the IJG License + +2. If you are distributing only libjpeg-turbo binaries without the source, or + if you are distributing an application that statically links with + libjpeg-turbo, then: + + 1. 
Your product documentation must include a message stating: + + This software is based in part on the work of the Independent JPEG + Group. + + **Origin** + - Clause 2 of the IJG license + + 2. If your binary distribution includes or uses the TurboJPEG API, then + your product documentation must include the text of the Modified BSD + License (see below.) + + **Origin** + - Clause 2 of the Modified BSD License + +3. You cannot use the name of the IJG or The libjpeg-turbo Project or the + contributors thereof in advertising, publicity, etc. + + **Origin** + - IJG License + - Clause 3 of the Modified BSD License + +4. The IJG and The libjpeg-turbo Project do not warrant libjpeg-turbo to be + free of defects, nor do we accept any liability for undesirable + consequences resulting from your use of the software. + + **Origin** + - IJG License + - Modified BSD License + - zlib License + + +The Modified (3-clause) BSD License +=================================== + +Copyright (C)2009-2022 D. R. Commander. All Rights Reserved.
+Copyright (C)2015 Viktor Szathmáry. All Rights Reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +- Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. +- Neither the name of the libjpeg-turbo Project nor the names of its + contributors may be used to endorse or promote products derived from this + software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS", +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +Why Three Licenses? +=================== + +The zlib License could have been used instead of the Modified (3-clause) BSD +License, and since the IJG License effectively subsumes the distribution +conditions of the zlib License, this would have effectively placed +libjpeg-turbo binary distributions under the IJG License. 
However, the IJG +License specifically refers to the Independent JPEG Group and does not extend +attribution and endorsement protections to other entities. Thus, it was +desirable to choose a license that granted us the same protections for new code +that were granted to the IJG for code derived from their software. + +------------------------------------------------------------------------------ +Libspng is redistributed within all opencv-python packages as build option. + +BSD 2-Clause License + +Copyright (c) 2018-2022, Randy +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +QUIRC library is redistributed within all opencv-python packages. 
+ +quirc -- QR-code recognition library +Copyright (C) 2010-2012 Daniel Beer + +Permission to use, copy, modify, and/or distribute this software for +any purpose with or without fee is hereby granted, provided that the +above copyright notice and this permission notice appear in all +copies. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL +WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE +AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL +DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR +PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER +TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + +------------------------------------------------------------------------------ +Flatbuffers library is redistributed within all opencv-python packages. + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +------------------------------------------------------------------------------ +Protobuf library is redistributed within all opencv-python packages. + +Copyright 2008 Google Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Code generated by the Protocol Buffer compiler is owned by the owner +of the input file used when generating it. This code is not +standalone and requires a support library to be linked with it. This +support library is itself covered by the above license. + +------------------------------------------------------------------------------ +OpenJPEG library is redistributed within all opencv-python packages. + +/* + * The copyright in this software is being made available under the 2-clauses + * BSD License, included below. This software may be subject to other third + * party and contributor rights, including patent rights, and no such rights + * are granted under this license. + * + * Copyright (c) 2002-2014, Universite catholique de Louvain (UCL), Belgium + * Copyright (c) 2002-2014, Professor Benoit Macq + * Copyright (c) 2003-2014, Antonin Descampe + * Copyright (c) 2003-2009, Francois-Olivier Devaux + * Copyright (c) 2005, Herve Drolon, FreeImage Team + * Copyright (c) 2002-2003, Yannick Verschueren + * Copyright (c) 2001-2003, David Janssens + * Copyright (c) 2011-2012, Centre National d'Etudes Spatiales (CNES), France + * Copyright (c) 2012, CS Systemes d'Information, France + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS `AS IS' + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +------------------------------------------------------------------------------ +TIFF library is redistributed within all opencv-python packages. + +Copyright (c) 1988-1997 Sam Leffler +Copyright (c) 1991-1997 Silicon Graphics, Inc. + +Permission to use, copy, modify, distribute, and sell this software and +its documentation for any purpose is hereby granted without fee, provided +that (i) the above copyright notices and this permission notice appear in +all copies of the software and related documentation, and (ii) the names of +Sam Leffler and Silicon Graphics may not be used in any advertising or +publicity relating to the software without the specific, prior written +permission of Sam Leffler and Silicon Graphics. 
+ +THE SOFTWARE IS PROVIDED "AS-IS" AND WITHOUT WARRANTY OF ANY KIND, +EXPRESS, IMPLIED OR OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY +WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +IN NO EVENT SHALL SAM LEFFLER OR SILICON GRAPHICS BE LIABLE FOR +ANY SPECIAL, INCIDENTAL, INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND, +OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, +WHETHER OR NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY OF +LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE +OF THIS SOFTWARE. + +------------------------------------------------------------------------------ +OpenEXR library is redistributed within all opencv-python packages. + +Copyright (c) 2006, Industrial Light & Magic, a division of Lucasfilm +Entertainment Company Ltd. Portions contributed and copyright held by +others as indicated. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above + copyright notice, this list of conditions and the following + disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided with + the distribution. + + * Neither the name of Industrial Light & Magic nor the names of + any other contributors to this software may be used to endorse or + promote products derived from this software without specific prior + written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS +IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +Intel(R) IPP ICV library statically linked within x86 and x86_64 opencv-python packages. + +Intel(R) Integrated Performance Primitives 2021 Update 10 + +Intel Simplified Software License (Version October 2022) + +Intel(R) Integrated Performance Primitives (Intel(R) IPP) : Copyright (C) 1997 Intel Corporation + +Use and Redistribution. You may use and redistribute the software, which is +provided in binary form only, (the "Software"), without modification, +provided the following conditions are met: + +* Redistributions must reproduce the above copyright notice and these + terms of use in the Software and in the documentation and/or other materials + provided with the distribution. +* Neither the name of Intel nor the names of its suppliers may be used to + endorse or promote products derived from this Software without specific + prior written permission. +* No reverse engineering, decompilation, or disassembly of the Software is + permitted, nor any modification or alteration of the Software or its operation + at any time, including during execution. + +No other licenses. Except as provided in the preceding section, Intel grants no +licenses or other rights by implication, estoppel or otherwise to, patent, +copyright, trademark, trade name, service mark or other intellectual property +licenses or rights of Intel. + +Third party software. 
"Third Party Software" means the files (if any) listed +in the "third-party-software.txt" or other similarly-named text file that may +be included with the Software. Third Party Software, even if included with the +distribution of the Software, may be governed by separate license terms, including +without limitation, third party license terms, open source software notices and +terms, and/or other Intel software license terms. These separate license terms +solely govern Your use of the Third Party Software. + +DISCLAIMER. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED +WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT ARE +DISCLAIMED. THIS SOFTWARE IS NOT INTENDED FOR USE IN SYSTEMS OR APPLICATIONS +WHERE FAILURE OF THE SOFTWARE MAY CAUSE PERSONAL INJURY OR DEATH AND YOU AGREE +THAT YOU ARE FULLY RESPONSIBLE FOR ANY CLAIMS, COSTS, DAMAGES, EXPENSES, AND +ATTORNEYS' FEES ARISING OUT OF ANY SUCH USE, EVEN IF ANY CLAIM ALLEGES THAT +INTEL WAS NEGLIGENT REGARDING THE DESIGN OR MANUFACTURE OF THE SOFTWARE. + +LIMITATION OF LIABILITY. IN NO EVENT WILL INTEL BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +No support. Intel may make changes to the Software, at any time without notice, +and is not obligated to support, update or provide training for the Software. + +Termination. Your right to use the Software is terminated in the event of your +breach of this license. + +Feedback. 
Should you provide Intel with comments, modifications, corrections, +enhancements or other input ("Feedback") related to the Software, Intel will be +free to use, disclose, reproduce, license or otherwise distribute or exploit the +Feedback in its sole discretion without any obligations or restrictions of any +kind, including without limitation, intellectual property rights or licensing +obligations. + +Compliance with laws. You agree to comply with all relevant laws and regulations +governing your use, transfer, import or export (or prohibition thereof) of the +Software. + +Governing law. All disputes will be governed by the laws of the United States of +America and the State of Delaware without reference to conflict of law +principles and subject to the exclusive jurisdiction of the state or federal +courts sitting in the State of Delaware, and each party agrees that it submits +to the personal jurisdiction and venue of those courts and waives any +objections. THE UNITED NATIONS CONVENTION ON CONTRACTS FOR THE INTERNATIONAL +SALE OF GOODS (1980) IS SPECIFICALLY EXCLUDED AND WILL NOT APPLY TO THE SOFTWARE. + +------------------------------------------------------------------------------ +Orbbec SDK distributed with arm64 MacOS packages. + +MIT License + +Copyright (c) 2023 OrbbecDeveloper + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +------------------------------------------------------------------------------ + +libavif library and its dependencies are redistributed within all opencv-python packages. + +Copyright 2019 Joe Drago. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ +------------------------------------------------------------------------------ + +Files: src/obu.c + +Copyright © 2018-2019, VideoLAN and dav1d authors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +Files: third_party/iccjpeg/* + +In plain English: + +1. We don't promise that this software works. (But if you find any bugs, + please let us know!) +2. You can use this software for whatever you want. You don't have to pay us. +3. You may not pretend that you wrote this software. If you use it in a + program, you must acknowledge somewhere in your documentation that + you've used the IJG code. 
+ +In legalese: + +The authors make NO WARRANTY or representation, either express or implied, +with respect to this software, its quality, accuracy, merchantability, or +fitness for a particular purpose. This software is provided "AS IS", and you, +its user, assume the entire risk as to its quality and accuracy. + +This software is copyright (C) 1991-2013, Thomas G. Lane, Guido Vollbeding. +All Rights Reserved except as specified below. + +Permission is hereby granted to use, copy, modify, and distribute this +software (or portions thereof) for any purpose, without fee, subject to these +conditions: +(1) If any part of the source code for this software is distributed, then this +README file must be included, with this copyright and no-warranty notice +unaltered; and any additions, deletions, or changes to the original files +must be clearly indicated in accompanying documentation. +(2) If only executable code is distributed, then the accompanying +documentation must state that "this software is based in part on the work of +the Independent JPEG Group". +(3) Permission for use of this software is granted only if the user accepts +full responsibility for any undesirable consequences; the authors accept +NO LIABILITY for damages of any kind. + +These conditions apply to any software derived from or based on the IJG code, +not just to the unmodified library. If you use our work, you ought to +acknowledge us. + +Permission is NOT granted for the use of any IJG author's name or company name +in advertising or publicity relating to this software or products derived from +it. This software may be referred to only as "the Independent JPEG Group's +software". + +We specifically permit and encourage the use of this software as the basis of +commercial products, provided that all warranty or liability claims are +assumed by the product vendor. + + +The Unix configuration script "configure" was produced with GNU Autoconf. 
+It is copyright by the Free Software Foundation but is freely distributable. +The same holds for its supporting scripts (config.guess, config.sub, +ltmain.sh). Another support script, install-sh, is copyright by X Consortium +but is also freely distributable. + +The IJG distribution formerly included code to read and write GIF files. +To avoid entanglement with the Unisys LZW patent, GIF reading support has +been removed altogether, and the GIF writer has been simplified to produce +"uncompressed GIFs". This technique does not use the LZW algorithm; the +resulting GIF files are larger than usual, but are readable by all standard +GIF decoders. + +We are required to state that + "The Graphics Interchange Format(c) is the Copyright property of + CompuServe Incorporated. GIF(sm) is a Service Mark property of + CompuServe Incorporated." + +------------------------------------------------------------------------------ + +Files: contrib/gdk-pixbuf/* + +Copyright 2020 Emmanuel Gil Peyrot. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +Files: android_jni/gradlew* + + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +------------------------------------------------------------------------------ + +Files: third_party/libyuv/* + +Copyright 2011 The LibYuv Project Authors. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + * Neither the name of Google nor the names of its contributors may + be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +aom library and its dependencies are redistributed within all opencv-python packages. + +Copyright (c) 2016, Alliance for Open Media. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS +FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE +COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +KAZE Features library is redistributed within all opencv-python packages. + +Copyright (c) 2012, Pablo Fernández Alcantarilla +All Rights Reserved + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name of the copyright holders nor the names of its contributors + may be used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY +EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES +OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT +SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR +BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY +WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +AKAZE Features library is redistributed within all opencv-python packages. + +Copyright (c) 2014, Pablo Fernandez Alcantarilla, Jesus Nuevo +All Rights Reserved + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name of the copyright holders nor the names of its contributors + may be used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY +EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES +OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT +SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR +BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY +WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +opencv-python +Apache Software License +https://github.com/opencv/opencv-python +OpenCV library is redistributed within opencv-python package. +This license applies to OpenCV binary in the directory cv2/. + + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +------------------------------------------------------------------------------ +libvpx is redistributed within all opencv-python Linux packages. +This license applies to libvpx binary in the directory cv2/. + +Copyright (c) 2010, The WebM Project authors. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + * Neither the name of Google, nor the WebM Project, nor the names + of its contributors may be used to endorse or promote products + derived from this software without specific prior written + permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +FFmpeg is redistributed within all opencv-python packages. + +Libbluray, libgnutls, libnettle, libhogweed, libintl, libmp3lame, libp11, +librtmp, libsoxr and libtasn1 are redistributed within all opencv-python macOS packages. + +This license applies to the above library binaries in the directory cv2/. + + GNU LESSER GENERAL PUBLIC LICENSE + Version 2.1, February 1999 + + Copyright (C) 1991, 1999 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +[This is the first released version of the Lesser GPL. It also counts + as the successor of the GNU Library Public License, version 2, hence + the version number 2.1.] + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software--to make sure the software is free for all its users. + + This license, the Lesser General Public License, applies to some +specially designated software packages--typically libraries--of the +Free Software Foundation and other authors who decide to use it. 
You +can use it too, but we suggest you first think carefully about whether +this license or the ordinary General Public License is the better +strategy to use in any particular case, based on the explanations below. + + When we speak of free software, we are referring to freedom of use, +not price. Our General Public Licenses are designed to make sure that +you have the freedom to distribute copies of free software (and charge +for this service if you wish); that you receive source code or can get +it if you want it; that you can change the software and use pieces of +it in new free programs; and that you are informed that you can do +these things. + + To protect your rights, we need to make restrictions that forbid +distributors to deny you these rights or to ask you to surrender these +rights. These restrictions translate to certain responsibilities for +you if you distribute copies of the library or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link other code with the library, you must provide +complete object files to the recipients, so that they can relink them +with the library after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + We protect your rights with a two-step method: (1) we copyright the +library, and (2) we offer you this license, which gives you legal +permission to copy, distribute and/or modify the library. + + To protect each distributor, we want to make it very clear that +there is no warranty for the free library. Also, if the library is +modified by someone else and passed on, the recipients should know +that what they have is not the original version, so that the original +author's reputation will not be affected by problems that might be +introduced by others. 
+ + Finally, software patents pose a constant threat to the existence of +any free program. We wish to make sure that a company cannot +effectively restrict the users of a free program by obtaining a +restrictive license from a patent holder. Therefore, we insist that +any patent license obtained for a version of the library must be +consistent with the full freedom of use specified in this license. + + Most GNU software, including some libraries, is covered by the +ordinary GNU General Public License. This license, the GNU Lesser +General Public License, applies to certain designated libraries, and +is quite different from the ordinary General Public License. We use +this license for certain libraries in order to permit linking those +libraries into non-free programs. + + When a program is linked with a library, whether statically or using +a shared library, the combination of the two is legally speaking a +combined work, a derivative of the original library. The ordinary +General Public License therefore permits such linking only if the +entire combination fits its criteria of freedom. The Lesser General +Public License permits more lax criteria for linking other code with +the library. + + We call this license the "Lesser" General Public License because it +does Less to protect the user's freedom than the ordinary General +Public License. It also provides other free software developers Less +of an advantage over competing non-free programs. These disadvantages +are the reason we use the ordinary General Public License for many +libraries. However, the Lesser license provides advantages in certain +special circumstances. + + For example, on rare occasions, there may be a special need to +encourage the widest possible use of a certain library, so that it becomes +a de-facto standard. To achieve this, non-free programs must be +allowed to use the library. A more frequent case is that a free +library does the same job as widely used non-free libraries. 
In this +case, there is little to gain by limiting the free library to free +software only, so we use the Lesser General Public License. + + In other cases, permission to use a particular library in non-free +programs enables a greater number of people to use a large body of +free software. For example, permission to use the GNU C Library in +non-free programs enables many more people to use the whole GNU +operating system, as well as its variant, the GNU/Linux operating +system. + + Although the Lesser General Public License is Less protective of the +users' freedom, it does ensure that the user of a program that is +linked with the Library has the freedom and the wherewithal to run +that program using a modified version of the Library. + + The precise terms and conditions for copying, distribution and +modification follow. Pay close attention to the difference between a +"work based on the library" and a "work that uses the library". The +former contains code derived from the library, whereas the latter must +be combined with the library in order to run. + + GNU LESSER GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License Agreement applies to any software library or other +program which contains a notice placed by the copyright holder or +other authorized party saying it may be distributed under the terms of +this Lesser General Public License (also called "this License"). +Each licensee is addressed as "you". + + A "library" means a collection of software functions and/or data +prepared so as to be conveniently linked with application programs +(which use some of those functions and data) to form executables. + + The "Library", below, refers to any such software library or work +which has been distributed under these terms. 
A "work based on the +Library" means either the Library or any derivative work under +copyright law: that is to say, a work containing the Library or a +portion of it, either verbatim or with modifications and/or translated +straightforwardly into another language. (Hereinafter, translation is +included without limitation in the term "modification".) + + "Source code" for a work means the preferred form of the work for +making modifications to it. For a library, complete source code means +all the source code for all modules it contains, plus any associated +interface definition files, plus the scripts used to control compilation +and installation of the library. + + Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running a program using the Library is not restricted, and output from +such a program is covered only if its contents constitute a work based +on the Library (independent of the use of the Library in a tool for +writing it). Whether that is true depends on what the Library does +and what the program that uses the Library does. + + 1. You may copy and distribute verbatim copies of the Library's +complete source code as you receive it, in any medium, provided that +you conspicuously and appropriately publish on each copy an +appropriate copyright notice and disclaimer of warranty; keep intact +all the notices that refer to this License and to the absence of any +warranty; and distribute a copy of this License along with the +Library. + + You may charge a fee for the physical act of transferring a copy, +and you may at your option offer warranty protection in exchange for a +fee. + + 2. 
You may modify your copy or copies of the Library or any portion +of it, thus forming a work based on the Library, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) The modified work must itself be a software library. + + b) You must cause the files modified to carry prominent notices + stating that you changed the files and the date of any change. + + c) You must cause the whole of the work to be licensed at no + charge to all third parties under the terms of this License. + + d) If a facility in the modified Library refers to a function or a + table of data to be supplied by an application program that uses + the facility, other than as an argument passed when the facility + is invoked, then you must make a good faith effort to ensure that, + in the event an application does not supply such function or + table, the facility still operates, and performs whatever part of + its purpose remains meaningful. + + (For example, a function in a library to compute square roots has + a purpose that is entirely well-defined independent of the + application. Therefore, Subsection 2d requires that any + application-supplied function or table used by this function must + be optional: if the application does not supply it, the square + root function must still compute square roots.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Library, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. 
But when you +distribute the same sections as part of a whole which is a work based +on the Library, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote +it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library +with the Library (or with a work based on the Library) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may opt to apply the terms of the ordinary GNU General Public +License instead of this License to a given copy of the Library. To do +this, you must alter all the notices that refer to this License, so +that they refer to the ordinary GNU General Public License, version 2, +instead of to this License. (If a newer version than version 2 of the +ordinary GNU General Public License has appeared, then you can specify +that version instead if you wish.) Do not make any other change in +these notices. + + Once this change is made in a given copy, it is irreversible for +that copy, so the ordinary GNU General Public License applies to all +subsequent copies and derivative works made from that copy. + + This option is useful when you wish to copy part of the code of +the Library into a program that is not a library. + + 4. 
You may copy and distribute the Library (or a portion or +derivative of it, under Section 2) in object code or executable form +under the terms of Sections 1 and 2 above provided that you accompany +it with the complete corresponding machine-readable source code, which +must be distributed under the terms of Sections 1 and 2 above on a +medium customarily used for software interchange. + + If distribution of object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the +source code from the same place satisfies the requirement to +distribute the source code, even though third parties are not +compelled to copy the source along with the object code. + + 5. A program that contains no derivative of any portion of the +Library, but is designed to work with the Library by being compiled or +linked with it, is called a "work that uses the Library". Such a +work, in isolation, is not a derivative work of the Library, and +therefore falls outside the scope of this License. + + However, linking a "work that uses the Library" with the Library +creates an executable that is a derivative of the Library (because it +contains portions of the Library), rather than a "work that uses the +library". The executable is therefore covered by this License. +Section 6 states terms for distribution of such executables. + + When a "work that uses the Library" uses material from a header file +that is part of the Library, the object code for the work may be a +derivative work of the Library even though the source code is not. +Whether this is true is especially significant if the work can be +linked without the Library, or if the work is itself a library. The +threshold for this to be true is not precisely defined by law. 
+ + If such an object file uses only numerical parameters, data +structure layouts and accessors, and small macros and small inline +functions (ten lines or less in length), then the use of the object +file is unrestricted, regardless of whether it is legally a derivative +work. (Executables containing this object code plus portions of the +Library will still fall under Section 6.) + + Otherwise, if the work is a derivative of the Library, you may +distribute the object code for the work under the terms of Section 6. +Any executables containing that work also fall under Section 6, +whether or not they are linked directly with the Library itself. + + 6. As an exception to the Sections above, you may also combine or +link a "work that uses the Library" with the Library to produce a +work containing portions of the Library, and distribute that work +under terms of your choice, provided that the terms permit +modification of the work for the customer's own use and reverse +engineering for debugging such modifications. + + You must give prominent notice with each copy of the work that the +Library is used in it and that the Library and its use are covered by +this License. You must supply a copy of this License. If the work +during execution displays copyright notices, you must include the +copyright notice for the Library among them, as well as a reference +directing the user to the copy of this License. Also, you must do one +of these things: + + a) Accompany the work with the complete corresponding + machine-readable source code for the Library including whatever + changes were used in the work (which must be distributed under + Sections 1 and 2 above); and, if the work is an executable linked + with the Library, with the complete machine-readable "work that + uses the Library", as object code and/or source code, so that the + user can modify the Library and then relink to produce a modified + executable containing the modified Library. 
(It is understood + that the user who changes the contents of definitions files in the + Library will not necessarily be able to recompile the application + to use the modified definitions.) + + b) Use a suitable shared library mechanism for linking with the + Library. A suitable mechanism is one that (1) uses at run time a + copy of the library already present on the user's computer system, + rather than copying library functions into the executable, and (2) + will operate properly with a modified version of the library, if + the user installs one, as long as the modified version is + interface-compatible with the version that the work was made with. + + c) Accompany the work with a written offer, valid for at + least three years, to give the same user the materials + specified in Subsection 6a, above, for a charge no more + than the cost of performing this distribution. + + d) If distribution of the work is made by offering access to copy + from a designated place, offer equivalent access to copy the above + specified materials from the same place. + + e) Verify that the user has already received a copy of these + materials or that you have already sent this user a copy. + + For an executable, the required form of the "work that uses the +Library" must include any data and utility programs needed for +reproducing the executable from it. However, as a special exception, +the materials to be distributed need not include anything that is +normally distributed (in either source or binary form) with the major +components (compiler, kernel, and so on) of the operating system on +which the executable runs, unless that component itself accompanies +the executable. + + It may happen that this requirement contradicts the license +restrictions of other proprietary libraries that do not normally +accompany the operating system. Such a contradiction means you cannot +use both them and the Library together in an executable that you +distribute. + + 7. 
You may place library facilities that are a work based on the +Library side-by-side in a single library together with other library +facilities not covered by this License, and distribute such a combined +library, provided that the separate distribution of the work based on +the Library and of the other library facilities is otherwise +permitted, and provided that you do these two things: + + a) Accompany the combined library with a copy of the same work + based on the Library, uncombined with any other library + facilities. This must be distributed under the terms of the + Sections above. + + b) Give prominent notice with the combined library of the fact + that part of it is a work based on the Library, and explaining + where to find the accompanying uncombined form of the same work. + + 8. You may not copy, modify, sublicense, link with, or distribute +the Library except as expressly provided under this License. Any +attempt otherwise to copy, modify, sublicense, link with, or +distribute the Library is void, and will automatically terminate your +rights under this License. However, parties who have received copies, +or rights, from you under this License will not have their licenses +terminated so long as such parties remain in full compliance. + + 9. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Library or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Library (or any work based on the +Library), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Library or works based on it. + + 10. 
Each time you redistribute the Library (or any work based on the +Library), the recipient automatically receives a license from the +original licensor to copy, distribute, link with or modify the Library +subject to these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties with +this License. + + 11. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Library at all. For example, if a patent +license would not permit royalty-free redistribution of the Library by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply, +and the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system which is +implemented by public license practices. 
Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 12. If the distribution and/or use of the Library is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Library under this License may add +an explicit geographical distribution limitation excluding those countries, +so that distribution is permitted only in or among countries not thus +excluded. In such case, this License incorporates the limitation as if +written in the body of this License. + + 13. The Free Software Foundation may publish revised and/or new +versions of the Lesser General Public License from time to time. +Such new versions will be similar in spirit to the present version, +but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library +specifies a version number of this License which applies to it and +"any later version", you have the option of following the terms and +conditions either of that version or of any later version published by +the Free Software Foundation. If the Library does not specify a +license version number, you may choose any version ever published by +the Free Software Foundation. + + 14. If you wish to incorporate parts of the Library into other free +programs whose distribution conditions are incompatible with these, +write to the author to ask for permission. For software which is +copyrighted by the Free Software Foundation, write to the Free +Software Foundation; we sometimes make exceptions for this. 
Our +decision will be guided by the two goals of preserving the free status +of all derivatives of our free software and of promoting the sharing +and reuse of software generally. + + NO WARRANTY + + 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO +WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. +EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR +OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY +KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE +LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME +THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN +WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY +AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU +FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE +LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING +RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF +SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGES. + + END OF TERMS AND CONDITIONS + +------------------------------------------------------------------------------ +Qt 5 is redistributed within non-headless opencv-python Linux and macOS packages. +libgmp is redistributed within opencv-python macOS packages. +libidn2 is redistributed within opencv-python macOS packages. +libunistring is redistributed within opencv-python macOS packages. +This license applies to the above binaries in the directory cv2/. 
+ + GNU LESSER GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + This version of the GNU Lesser General Public License incorporates +the terms and conditions of version 3 of the GNU General Public +License, supplemented by the additional permissions listed below. + + 0. Additional Definitions. + + As used herein, "this License" refers to version 3 of the GNU Lesser +General Public License, and the "GNU GPL" refers to version 3 of the GNU +General Public License. + + "The Library" refers to a covered work governed by this License, +other than an Application or a Combined Work as defined below. + + An "Application" is any work that makes use of an interface provided +by the Library, but which is not otherwise based on the Library. +Defining a subclass of a class defined by the Library is deemed a mode +of using an interface provided by the Library. + + A "Combined Work" is a work produced by combining or linking an +Application with the Library. The particular version of the Library +with which the Combined Work was made is also called the "Linked +Version". + + The "Minimal Corresponding Source" for a Combined Work means the +Corresponding Source for the Combined Work, excluding any source code +for portions of the Combined Work that, considered in isolation, are +based on the Application, and not on the Linked Version. + + The "Corresponding Application Code" for a Combined Work means the +object code and/or source code for the Application, including any data +and utility programs needed for reproducing the Combined Work from the +Application, but excluding the System Libraries of the Combined Work. + + 1. Exception to Section 3 of the GNU GPL. + + You may convey a covered work under sections 3 and 4 of this License +without being bound by section 3 of the GNU GPL. + + 2. 
Conveying Modified Versions. + + If you modify a copy of the Library, and, in your modifications, a +facility refers to a function or data to be supplied by an Application +that uses the facility (other than as an argument passed when the +facility is invoked), then you may convey a copy of the modified +version: + + a) under this License, provided that you make a good faith effort to + ensure that, in the event an Application does not supply the + function or data, the facility still operates, and performs + whatever part of its purpose remains meaningful, or + + b) under the GNU GPL, with none of the additional permissions of + this License applicable to that copy. + + 3. Object Code Incorporating Material from Library Header Files. + + The object code form of an Application may incorporate material from +a header file that is part of the Library. You may convey such object +code under terms of your choice, provided that, if the incorporated +material is not limited to numerical parameters, data structure +layouts and accessors, or small macros, inline functions and templates +(ten or fewer lines in length), you do both of the following: + + a) Give prominent notice with each copy of the object code that the + Library is used in it and that the Library and its use are + covered by this License. + + b) Accompany the object code with a copy of the GNU GPL and this license + document. + + 4. Combined Works. + + You may convey a Combined Work under terms of your choice that, +taken together, effectively do not restrict modification of the +portions of the Library contained in the Combined Work and reverse +engineering for debugging such modifications, if you also do each of +the following: + + a) Give prominent notice with each copy of the Combined Work that + the Library is used in it and that the Library and its use are + covered by this License. + + b) Accompany the Combined Work with a copy of the GNU GPL and this license + document. 
+ + c) For a Combined Work that displays copyright notices during + execution, include the copyright notice for the Library among + these notices, as well as a reference directing the user to the + copies of the GNU GPL and this license document. + + d) Do one of the following: + + 0) Convey the Minimal Corresponding Source under the terms of this + License, and the Corresponding Application Code in a form + suitable for, and under terms that permit, the user to + recombine or relink the Application with a modified version of + the Linked Version to produce a modified Combined Work, in the + manner specified by section 6 of the GNU GPL for conveying + Corresponding Source. + + 1) Use a suitable shared library mechanism for linking with the + Library. A suitable mechanism is one that (a) uses at run time + a copy of the Library already present on the user's computer + system, and (b) will operate properly with a modified version + of the Library that is interface-compatible with the Linked + Version. + + e) Provide Installation Information, but only if you would otherwise + be required to provide such information under section 6 of the + GNU GPL, and only to the extent that such information is + necessary to install and execute a modified version of the + Combined Work produced by recombining or relinking the + Application with a modified version of the Linked Version. (If + you use option 4d0, the Installation Information must accompany + the Minimal Corresponding Source and Corresponding Application + Code. If you use option 4d1, you must provide the Installation + Information in the manner specified by section 6 of the GNU GPL + for conveying Corresponding Source.) + + 5. Combined Libraries. 
+ + You may place library facilities that are a work based on the +Library side by side in a single library together with other library +facilities that are not Applications and are not covered by this +License, and convey such a combined library under terms of your +choice, if you do both of the following: + + a) Accompany the combined library with a copy of the same work based + on the Library, uncombined with any other library facilities, + conveyed under the terms of this License. + + b) Give prominent notice with the combined library that part of it + is a work based on the Library, and explaining where to find the + accompanying uncombined form of the same work. + + 6. Revised Versions of the GNU Lesser General Public License. + + The Free Software Foundation may publish revised and/or new versions +of the GNU Lesser General Public License from time to time. Such new +versions will be similar in spirit to the present version, but may +differ in detail to address new problems or concerns. + + Each version is given a distinguishing version number. If the +Library as you received it specifies that a certain numbered version +of the GNU Lesser General Public License "or any later version" +applies to it, you have the option of following the terms and +conditions either of that published version or of any later version +published by the Free Software Foundation. If the Library as you +received it does not specify a version number of the GNU Lesser +General Public License, you may choose any version of the GNU Lesser +General Public License ever published by the Free Software Foundation. + + If the Library as you received it specifies that a proxy can decide +whether future versions of the GNU Lesser General Public License shall +apply, that proxy's public statement of acceptance of any version is +permanent authorization for you to choose that version for the +Library. 
+ +------------------------------------------------------------------------------ +bzip2 is redistributed within all opencv-python Linux packages. +This license applies to libbz2 binary in the directory cv2/. + +This program, "bzip2", the associated library "libbzip2", and all +documentation, are copyright (C) 1996-2010 Julian R Seward. All +rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. The origin of this software must not be misrepresented; you must + not claim that you wrote the original software. If you use this + software in a product, an acknowledgment in the product + documentation would be appreciated but is not required. + +3. Altered source versions must be plainly marked as such, and must + not be misrepresented as being the original software. + +4. The name of the author may not be used to endorse or promote + products derived from this software without specific prior written + permission. + +THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS +OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY +DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +Julian Seward, jseward@bzip.org +bzip2/libbzip2 version 1.0.6 of 6 September 2010 + +------------------------------------------------------------------------------ +libcrypto and libssl are redistributed within all opencv-python Linux and macOS packages. +libopencore-amrnb and libopencore-amrwb are redistributed within all opencv-python Linux and macOS packages. +This license applies to above binaries in the directory cv2/. + + LICENSE ISSUES + ============== + + The OpenSSL toolkit stays under a double license, i.e. both the conditions of + the OpenSSL License and the original SSLeay license apply to the toolkit. + See below for the actual license texts. + + OpenSSL License + --------------- + +/* ==================================================================== + * Copyright (c) 1998-2019 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. 
Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ + + Original SSLeay License + ----------------------- + +/* Copyright (C) 1995-1998 Eric Young (eay@cryptsoft.com) + * All rights reserved. + * + * This package is an SSL implementation written + * by Eric Young (eay@cryptsoft.com). + * The implementation was written so as to conform with Netscapes SSL. + * + * This library is free for commercial and non-commercial use as long as + * the following conditions are adhered to. 
The following conditions + * apply to all code found in this distribution, be it the RC4, RSA, + * lhash, DES, etc., code; not just the SSL code. The SSL documentation + * included with this distribution is covered by the same copyright terms + * except that the holder is Tim Hudson (tjh@cryptsoft.com). + * + * Copyright remains Eric Young's, and as such any Copyright notices in + * the code are not to be removed. + * If this package is used in a product, Eric Young should be given attribution + * as the author of the parts of the library used. + * This can be in the form of a textual message at program startup or + * in documentation (online or textual) provided with the package. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. All advertising materials mentioning features or use of this software + * must display the following acknowledgement: + * "This product includes cryptographic software written by + * Eric Young (eay@cryptsoft.com)" + * The word 'cryptographic' can be left out if the routines from the library + * being used are not cryptographic related :-). + * 4. 
If you include any Windows specific code (or a derivative thereof) from + * the apps directory (application code) you must include an acknowledgement: + * "This product includes software written by Tim Hudson (tjh@cryptsoft.com)" + * + * THIS SOFTWARE IS PROVIDED BY ERIC YOUNG ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * The licence and distribution terms for any publicly available version or + * derivative of this code cannot be changed. i.e. this code cannot simply be + * copied and put under another distribution licence + * [including the GNU Public Licence.] + */ + +------------------------------------------------------------------------------ +libfontconfig is redistributed within all opencv-python macOS packages. +This license applies to libfontconfig binary in the directory cv2/. + +Copyright © 2000,2001,2002,2003,2004,2006,2007 Keith Packard +Copyright © 2005 Patrick Lam +Copyright © 2009 Roozbeh Pournader +Copyright © 2008,2009 Red Hat, Inc. +Copyright © 2008 Danilo Šegan +Copyright © 2012 Google, Inc. 
+ + +Permission to use, copy, modify, distribute, and sell this software and its +documentation for any purpose is hereby granted without fee, provided that +the above copyright notice appear in all copies and that both that +copyright notice and this permission notice appear in supporting +documentation, and that the name of the author(s) not be used in +advertising or publicity pertaining to distribution of the software without +specific, written prior permission. The authors make no +representations about the suitability of this software for any purpose. It +is provided "as is" without express or implied warranty. + +THE AUTHOR(S) DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, +INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO +EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY SPECIAL, INDIRECT OR +CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, +DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER +TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + +------------------------------------------------------------------------------ +libfreetype is redistributed within opencv-python Linux and macOS packages. +This license applies to libfreetype binary in the directory cv2/. + + The FreeType Project LICENSE + ---------------------------- + + 2006-Jan-27 + + Copyright 1996-2002, 2006 by + David Turner, Robert Wilhelm, and Werner Lemberg + + + +Introduction +============ + + The FreeType Project is distributed in several archive packages; + some of them may contain, in addition to the FreeType font engine, + various tools and contributions which rely on, or relate to, the + FreeType Project. + + This license applies to all files found in such packages, and + which do not fall under their own explicit license. The license + affects thus the FreeType font engine, the test programs, + documentation and makefiles, at the very least. 
+ + This license was inspired by the BSD, Artistic, and IJG + (Independent JPEG Group) licenses, which all encourage inclusion + and use of free software in commercial and freeware products + alike. As a consequence, its main points are that: + + o We don't promise that this software works. However, we will be + interested in any kind of bug reports. (`as is' distribution) + + o You can use this software for whatever you want, in parts or + full form, without having to pay us. (`royalty-free' usage) + + o You may not pretend that you wrote this software. If you use + it, or only parts of it, in a program, you must acknowledge + somewhere in your documentation that you have used the + FreeType code. (`credits') + + We specifically permit and encourage the inclusion of this + software, with or without modifications, in commercial products. + We disclaim all warranties covering The FreeType Project and + assume no liability related to The FreeType Project. + + + Finally, many people asked us for a preferred form for a + credit/disclaimer to use in compliance with this license. We thus + encourage you to use the following text: + + """ + Portions of this software are copyright © <year> The FreeType + Project (www.freetype.org). All rights reserved. + """ + + Please replace <year> with the value from the FreeType version you + actually use. + + +Legal Terms +=========== + +0. Definitions +-------------- + + Throughout this license, the terms `package', `FreeType Project', + and `FreeType archive' refer to the set of files originally + distributed by the authors (David Turner, Robert Wilhelm, and + Werner Lemberg) as the `FreeType Project', be they named as alpha, + beta or final release. + + `You' refers to the licensee, or person using the project, where + `using' is a generic term including compiling the project's source + code as well as linking it to form a `program' or `executable'. + This program is referred to as `a program using the FreeType + engine'.
+ + This license applies to all files distributed in the original + FreeType Project, including all source code, binaries and + documentation, unless otherwise stated in the file in its + original, unmodified form as distributed in the original archive. + If you are unsure whether or not a particular file is covered by + this license, you must contact us to verify this. + + The FreeType Project is copyright (C) 1996-2000 by David Turner, + Robert Wilhelm, and Werner Lemberg. All rights reserved except as + specified below. + +1. No Warranty +-------------- + + THE FREETYPE PROJECT IS PROVIDED `AS IS' WITHOUT WARRANTY OF ANY + KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + PURPOSE. IN NO EVENT WILL ANY OF THE AUTHORS OR COPYRIGHT HOLDERS + BE LIABLE FOR ANY DAMAGES CAUSED BY THE USE OR THE INABILITY TO + USE, OF THE FREETYPE PROJECT. + +2. Redistribution +----------------- + + This license grants a worldwide, royalty-free, perpetual and + irrevocable right and license to use, execute, perform, compile, + display, copy, create derivative works of, distribute and + sublicense the FreeType Project (in both source and object code + forms) and derivative works thereof for any purpose; and to + authorize others to exercise some or all of the rights granted + herein, subject to the following conditions: + + o Redistribution of source code must retain this license file + (`FTL.TXT') unaltered; any additions, deletions or changes to + the original files must be clearly indicated in accompanying + documentation. The copyright notices of the unaltered, + original files must be preserved in all copies of source + files. + + o Redistribution in binary form must provide a disclaimer that + states that the software is based in part of the work of the + FreeType Team, in the distribution documentation. 
We also + encourage you to put an URL to the FreeType web page in your + documentation, though this isn't mandatory. + + These conditions apply to any software derived from or based on + the FreeType Project, not just the unmodified files. If you use + our work, you must acknowledge us. However, no fee need be paid + to us. + +3. Advertising +-------------- + + Neither the FreeType authors and contributors nor you shall use + the name of the other for commercial, advertising, or promotional + purposes without specific prior written permission. + + We suggest, but do not require, that you use one or more of the + following phrases to refer to this software in your documentation + or advertising materials: `FreeType Project', `FreeType Engine', + `FreeType library', or `FreeType Distribution'. + + As you have not signed this license, you are not required to + accept it. However, as the FreeType Project is copyrighted + material, only this license, or another one contracted with the + authors, grants you the right to use, distribute, and modify it. + Therefore, by using, distributing, or modifying the FreeType + Project, you indicate that you understand and accept all the terms + of this license. + +4. Contacts +----------- + + There are two mailing lists related to FreeType: + + o freetype@nongnu.org + + Discusses general use and applications of FreeType, as well as + future and wanted additions to the library and distribution. + If you are looking for support, start in this list if you + haven't found anything to help you in the documentation. + + o freetype-devel@nongnu.org + + Discusses bugs, as well as engine internals, design issues, + specific licenses, porting, etc. + + Our home page can be found at + + https://www.freetype.org + +------------------------------------------------------------------------------ +libpng is redistributed within all opencv-python Linux and macOS packages. +This license applies to libpng binary in the directory cv2/. 
+ +PNG Reference Library License version 2 +--------------------------------------- + + * Copyright (c) 1995-2019 The PNG Reference Library Authors. + * Copyright (c) 2018-2019 Cosmin Truta. + * Copyright (c) 2000-2002, 2004, 2006-2018 Glenn Randers-Pehrson. + * Copyright (c) 1996-1997 Andreas Dilger. + * Copyright (c) 1995-1996 Guy Eric Schalnat, Group 42, Inc. + +The software is supplied "as is", without warranty of any kind, +express or implied, including, without limitation, the warranties +of merchantability, fitness for a particular purpose, title, and +non-infringement. In no event shall the Copyright owners, or +anyone distributing the software, be liable for any damages or +other liability, whether in contract, tort or otherwise, arising +from, out of, or in connection with the software, or the use or +other dealings in the software, even if advised of the possibility +of such damage. + +Permission is hereby granted to use, copy, modify, and distribute +this software, or portions hereof, for any purpose, without fee, +subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you + must not claim that you wrote the original software. If you + use this software in a product, an acknowledgment in the product + documentation would be appreciated, but is not required. + + 2. Altered source versions must be plainly marked as such, and must + not be misrepresented as being the original software. + + 3. This Copyright notice may not be removed or altered from any + source or altered source distribution. 
+ + +PNG Reference Library License version 1 (for libpng 0.5 through 1.6.35) +----------------------------------------------------------------------- + +libpng versions 1.0.7, July 1, 2000, through 1.6.35, July 15, 2018 are +Copyright (c) 2000-2002, 2004, 2006-2018 Glenn Randers-Pehrson, are +derived from libpng-1.0.6, and are distributed according to the same +disclaimer and license as libpng-1.0.6 with the following individuals +added to the list of Contributing Authors: + + Simon-Pierre Cadieux + Eric S. Raymond + Mans Rullgard + Cosmin Truta + Gilles Vollant + James Yu + Mandar Sahastrabuddhe + Google Inc. + Vadim Barkov + +and with the following additions to the disclaimer: + + There is no warranty against interference with your enjoyment of + the library or against infringement. There is no warranty that our + efforts or the library will fulfill any of your particular purposes + or needs. This library is provided with all faults, and the entire + risk of satisfactory quality, performance, accuracy, and effort is + with the user. + +Some files in the "contrib" directory and some configure-generated +files that are distributed with libpng have other copyright owners, and +are released under other open source licenses. 
+ +libpng versions 0.97, January 1998, through 1.0.6, March 20, 2000, are +Copyright (c) 1998-2000 Glenn Randers-Pehrson, are derived from +libpng-0.96, and are distributed according to the same disclaimer and +license as libpng-0.96, with the following individuals added to the +list of Contributing Authors: + + Tom Lane + Glenn Randers-Pehrson + Willem van Schaik + +libpng versions 0.89, June 1996, through 0.96, May 1997, are +Copyright (c) 1996-1997 Andreas Dilger, are derived from libpng-0.88, +and are distributed according to the same disclaimer and license as +libpng-0.88, with the following individuals added to the list of +Contributing Authors: + + John Bowler + Kevin Bracey + Sam Bushell + Magnus Holmgren + Greg Roelofs + Tom Tanner + +Some files in the "scripts" directory have other copyright owners, +but are released under this license. + +libpng versions 0.5, May 1995, through 0.88, January 1996, are +Copyright (c) 1995-1996 Guy Eric Schalnat, Group 42, Inc. + +For the purposes of this copyright and license, "Contributing Authors" +is defined as the following set of individuals: + + Andreas Dilger + Dave Martindale + Guy Eric Schalnat + Paul Schmidt + Tim Wegner + +The PNG Reference Library is supplied "AS IS". The Contributing +Authors and Group 42, Inc. disclaim all warranties, expressed or +implied, including, without limitation, the warranties of +merchantability and of fitness for any purpose. The Contributing +Authors and Group 42, Inc. assume no liability for direct, indirect, +incidental, special, exemplary, or consequential damages, which may +result from the use of the PNG Reference Library, even if advised of +the possibility of such damage. + +Permission is hereby granted to use, copy, modify, and distribute this +source code, or portions hereof, for any purpose, without fee, subject +to the following restrictions: + + 1. The origin of this source code must not be misrepresented. + + 2. 
Altered versions must be plainly marked as such and must not + be misrepresented as being the original source. + + 3. This Copyright notice may not be removed or altered from any + source or altered source distribution. + +The Contributing Authors and Group 42, Inc. specifically permit, +without fee, and encourage the use of this source code as a component +to supporting the PNG file format in commercial products. If you use +this source code in a product, acknowledgment is not required but would +be appreciated. + +------------------------------------------------------------------------------ +libz is redistributed within all opencv-python Linux packages. +This license applies to libz binary in the directory cv2/. + + Copyright (C) 1995-2017 Jean-loup Gailly and Mark Adler + + This software is provided 'as-is', without any express or implied + warranty. In no event will the authors be held liable for any damages + arising from the use of this software. + + Permission is granted to anyone to use this software for any purpose, + including commercial applications, and to alter it and redistribute it + freely, subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this software + in a product, an acknowledgment in the product documentation would be + appreciated but is not required. + 2. Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original software. + 3. This notice may not be removed or altered from any source distribution. + + Jean-loup Gailly Mark Adler + jloup@gzip.org madler@alumni.caltech.edu + +------------------------------------------------------------------------------ +libdav1d is redistributed within opencv-python macOS packages. +This license applies to libdav1d binary in the directory cv2/. + +Copyright © 2018-2019, VideoLAN and dav1d authors +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libffi is redistributed within opencv-python macOS packages. +This license applies to libffi binary in the directory cv2/. + +libffi - Copyright (c) 1996-2020 Anthony Green, Red Hat, Inc and others. +See source files for details. 
+ +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +``Software''), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +------------------------------------------------------------------------------ +libogg is redistributed within opencv-python macOS packages. +This license applies to libogg binary in the directory cv2/. + +Copyright (c) 2002, Xiph.org Foundation + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +- Neither the name of the Xiph.org Foundation nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION +OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libopenjp2 is redistributed within opencv-python macOS packages. +This license applies to libopenjp2 binary in the directory cv2/. + +The copyright in this software is being made available under the 2-clauses +BSD License, included below. This software may be subject to other third +party and contributor rights, including patent rights, and no such rights +are granted under this license. + +Copyright (c) 2002-2014, Universite catholique de Louvain (UCL), Belgium +Copyright (c) 2002-2014, Professor Benoit Macq +Copyright (c) 2003-2014, Antonin Descampe +Copyright (c) 2003-2009, Francois-Olivier Devaux +Copyright (c) 2005, Herve Drolon, FreeImage Team +Copyright (c) 2002-2003, Yannick Verschueren +Copyright (c) 2001-2003, David Janssens +Copyright (c) 2011-2012, Centre National d'Etudes Spatiales (CNES), France +Copyright (c) 2012, CS Systemes d'Information, France + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +1. 
Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS `AS IS' +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libopus is redistributed within opencv-python macOS packages. +This license applies to libopus binary in the directory cv2/. + +Copyright 2001-2011 Xiph.Org, Skype Limited, Octasic, + Jean-Marc Valin, Timothy B. Terriberry, + CSIRO, Gregory Maxwell, Mark Borgerding, + Erik de Castro Lopo + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. 
+ +- Neither the name of Internet Society, IETF or IETF Trust, nor the +names of specific contributors, may be used to endorse or promote +products derived from this software without specific prior written +permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER +OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Opus is subject to the royalty-free patent licenses which are +specified at: + +Xiph.Org Foundation: +https://datatracker.ietf.org/ipr/1524/ + +Microsoft Corporation: +https://datatracker.ietf.org/ipr/1914/ + +Broadcom Corporation: +https://datatracker.ietf.org/ipr/1526/ + +------------------------------------------------------------------------------ +librav1e is redistributed within opencv-python macOS packages. +This license applies to librav1e binary in the directory cv2/. + +BSD 2-Clause License + +Copyright (c) 2017-2020, the rav1e contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. 
+ +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libsnappy is redistributed within opencv-python macOS packages. +This license applies to libsnappy binary in the directory cv2/. + +Copyright 2011, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libspeex is redistributed within opencv-python macOS packages. +This license applies to libspeex binary in the directory cv2/. + +Copyright 2002-2008 Xiph.org Foundation +Copyright 2002-2008 Jean-Marc Valin +Copyright 2005-2007 Analog Devices Inc. +Copyright 2005-2008 Commonwealth Scientific and Industrial Research + Organisation (CSIRO) +Copyright 1993, 2002, 2006 David Rowe +Copyright 2003 EpicGames +Copyright 1992-1994 Jutta Degener, Carsten Bormann + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +- Neither the name of the Xiph.org Foundation nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libsrt is redistributed within opencv-python macOS packages. +This license applies to libsrt binary in the directory cv2/. + +/* + * + * Copyright (c) 2001-2017 Cisco Systems, Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * Neither the name of the Cisco Systems, Inc. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS + * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE + * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * + */ + + + Mozilla Public License Version 2.0 +================================== + +1. Definitions +-------------- + +1.1. "Contributor" + means each individual or legal entity that creates, contributes to + the creation of, or owns Covered Software. + +1.2. "Contributor Version" + means the combination of the Contributions of others (if any) used + by a Contributor and that particular Contributor's Contribution. + +1.3. "Contribution" + means Covered Software of a particular Contributor. + +1.4. "Covered Software" + means Source Code Form to which the initial Contributor has attached + the notice in Exhibit A, the Executable Form of such Source Code + Form, and Modifications of such Source Code Form, in each case + including portions thereof. + +1.5. "Incompatible With Secondary Licenses" + means + + (a) that the initial Contributor has attached the notice described + in Exhibit B to the Covered Software; or + + (b) that the Covered Software was made available under the terms of + version 1.1 or earlier of the License, but not also under the + terms of a Secondary License. + +1.6. "Executable Form" + means any form of the work other than Source Code Form. + +1.7. 
"Larger Work" + means a work that combines Covered Software with other material, in + a separate file or files, that is not Covered Software. + +1.8. "License" + means this document. + +1.9. "Licensable" + means having the right to grant, to the maximum extent possible, + whether at the time of the initial grant or subsequently, any and + all of the rights conveyed by this License. + +1.10. "Modifications" + means any of the following: + + (a) any file in Source Code Form that results from an addition to, + deletion from, or modification of the contents of Covered + Software; or + + (b) any new file in Source Code Form that contains any Covered + Software. + +1.11. "Patent Claims" of a Contributor + means any patent claim(s), including without limitation, method, + process, and apparatus claims, in any patent Licensable by such + Contributor that would be infringed, but for the grant of the + License, by the making, using, selling, offering for sale, having + made, import, or transfer of either its Contributions or its + Contributor Version. + +1.12. "Secondary License" + means either the GNU General Public License, Version 2.0, the GNU + Lesser General Public License, Version 2.1, the GNU Affero General + Public License, Version 3.0, or any later versions of those + licenses. + +1.13. "Source Code Form" + means the form of the work preferred for making modifications. + +1.14. "You" (or "Your") + means an individual or a legal entity exercising rights under this + License. For legal entities, "You" includes any entity that + controls, is controlled by, or is under common control with You. For + purposes of this definition, "control" means (a) the power, direct + or indirect, to cause the direction or management of such entity, + whether by contract or otherwise, or (b) ownership of more than + fifty percent (50%) of the outstanding shares or beneficial + ownership of such entity. + +2. License Grants and Conditions +-------------------------------- + +2.1. 
Grants + +Each Contributor hereby grants You a world-wide, royalty-free, +non-exclusive license: + +(a) under intellectual property rights (other than patent or trademark) + Licensable by such Contributor to use, reproduce, make available, + modify, display, perform, distribute, and otherwise exploit its + Contributions, either on an unmodified basis, with Modifications, or + as part of a Larger Work; and + +(b) under Patent Claims of such Contributor to make, use, sell, offer + for sale, have made, import, and otherwise transfer either its + Contributions or its Contributor Version. + +2.2. Effective Date + +The licenses granted in Section 2.1 with respect to any Contribution +become effective for each Contribution on the date the Contributor first +distributes such Contribution. + +2.3. Limitations on Grant Scope + +The licenses granted in this Section 2 are the only rights granted under +this License. No additional rights or licenses will be implied from the +distribution or licensing of Covered Software under this License. +Notwithstanding Section 2.1(b) above, no patent license is granted by a +Contributor: + +(a) for any code that a Contributor has removed from Covered Software; + or + +(b) for infringements caused by: (i) Your and any other third party's + modifications of Covered Software, or (ii) the combination of its + Contributions with other software (except as part of its Contributor + Version); or + +(c) under Patent Claims infringed by Covered Software in the absence of + its Contributions. + +This License does not grant any rights in the trademarks, service marks, +or logos of any Contributor (except as may be necessary to comply with +the notice requirements in Section 3.4). + +2.4. 
Subsequent Licenses + +No Contributor makes additional grants as a result of Your choice to +distribute the Covered Software under a subsequent version of this +License (see Section 10.2) or under the terms of a Secondary License (if +permitted under the terms of Section 3.3). + +2.5. Representation + +Each Contributor represents that the Contributor believes its +Contributions are its original creation(s) or it has sufficient rights +to grant the rights to its Contributions conveyed by this License. + +2.6. Fair Use + +This License is not intended to limit any rights You have under +applicable copyright doctrines of fair use, fair dealing, or other +equivalents. + +2.7. Conditions + +Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted +in Section 2.1. + +3. Responsibilities +------------------- + +3.1. Distribution of Source Form + +All distribution of Covered Software in Source Code Form, including any +Modifications that You create or to which You contribute, must be under +the terms of this License. You must inform recipients that the Source +Code Form of the Covered Software is governed by the terms of this +License, and how they can obtain a copy of this License. You may not +attempt to alter or restrict the recipients' rights in the Source Code +Form. + +3.2. 
Distribution of Executable Form + +If You distribute Covered Software in Executable Form then: + +(a) such Covered Software must also be made available in Source Code + Form, as described in Section 3.1, and You must inform recipients of + the Executable Form how they can obtain a copy of such Source Code + Form by reasonable means in a timely manner, at a charge no more + than the cost of distribution to the recipient; and + +(b) You may distribute such Executable Form under the terms of this + License, or sublicense it under different terms, provided that the + license for the Executable Form does not attempt to limit or alter + the recipients' rights in the Source Code Form under this License. + +3.3. Distribution of a Larger Work + +You may create and distribute a Larger Work under terms of Your choice, +provided that You also comply with the requirements of this License for +the Covered Software. If the Larger Work is a combination of Covered +Software with a work governed by one or more Secondary Licenses, and the +Covered Software is not Incompatible With Secondary Licenses, this +License permits You to additionally distribute such Covered Software +under the terms of such Secondary License(s), so that the recipient of +the Larger Work may, at their option, further distribute the Covered +Software under the terms of either this License or such Secondary +License(s). + +3.4. Notices + +You may not remove or alter the substance of any license notices +(including copyright notices, patent notices, disclaimers of warranty, +or limitations of liability) contained within the Source Code Form of +the Covered Software, except that You may alter any license notices to +the extent required to remedy known factual inaccuracies. + +3.5. Application of Additional Terms + +You may choose to offer, and to charge a fee for, warranty, support, +indemnity or liability obligations to one or more recipients of Covered +Software. 
However, You may do so only on Your own behalf, and not on +behalf of any Contributor. You must make it absolutely clear that any +such warranty, support, indemnity, or liability obligation is offered by +You alone, and You hereby agree to indemnify every Contributor for any +liability incurred by such Contributor as a result of warranty, support, +indemnity or liability terms You offer. You may include additional +disclaimers of warranty and limitations of liability specific to any +jurisdiction. + +4. Inability to Comply Due to Statute or Regulation +--------------------------------------------------- + +If it is impossible for You to comply with any of the terms of this +License with respect to some or all of the Covered Software due to +statute, judicial order, or regulation then You must: (a) comply with +the terms of this License to the maximum extent possible; and (b) +describe the limitations and the code they affect. Such description must +be placed in a text file included with all distributions of the Covered +Software under this License. Except to the extent prohibited by statute +or regulation, such description must be sufficiently detailed for a +recipient of ordinary skill to be able to understand it. + +5. Termination +-------------- + +5.1. The rights granted under this License will terminate automatically +if You fail to comply with any of its terms. However, if You become +compliant, then the rights granted under this License from a particular +Contributor are reinstated (a) provisionally, unless and until such +Contributor explicitly and finally terminates Your grants, and (b) on an +ongoing basis, if such Contributor fails to notify You of the +non-compliance by some reasonable means prior to 60 days after You have +come back into compliance. 
Moreover, Your grants from a particular +Contributor are reinstated on an ongoing basis if such Contributor +notifies You of the non-compliance by some reasonable means, this is the +first time You have received notice of non-compliance with this License +from such Contributor, and You become compliant prior to 30 days after +Your receipt of the notice. + +5.2. If You initiate litigation against any entity by asserting a patent +infringement claim (excluding declaratory judgment actions, +counter-claims, and cross-claims) alleging that a Contributor Version +directly or indirectly infringes any patent, then the rights granted to +You by any and all Contributors for the Covered Software under Section +2.1 of this License shall terminate. + +5.3. In the event of termination under Sections 5.1 or 5.2 above, all +end user license agreements (excluding distributors and resellers) which +have been validly granted by You or Your distributors under this License +prior to termination shall survive termination. + +************************************************************************ +* * +* 6. Disclaimer of Warranty * +* ------------------------- * +* * +* Covered Software is provided under this License on an "as is" * +* basis, without warranty of any kind, either expressed, implied, or * +* statutory, including, without limitation, warranties that the * +* Covered Software is free of defects, merchantable, fit for a * +* particular purpose or non-infringing. The entire risk as to the * +* quality and performance of the Covered Software is with You. * +* Should any Covered Software prove defective in any respect, You * +* (not any Contributor) assume the cost of any necessary servicing, * +* repair, or correction. This disclaimer of warranty constitutes an * +* essential part of this License. No use of any Covered Software is * +* authorized under this License except under this disclaimer. 
* +* * +************************************************************************ + +************************************************************************ +* * +* 7. Limitation of Liability * +* -------------------------- * +* * +* Under no circumstances and under no legal theory, whether tort * +* (including negligence), contract, or otherwise, shall any * +* Contributor, or anyone who distributes Covered Software as * +* permitted above, be liable to You for any direct, indirect, * +* special, incidental, or consequential damages of any character * +* including, without limitation, damages for lost profits, loss of * +* goodwill, work stoppage, computer failure or malfunction, or any * +* and all other commercial damages or losses, even if such party * +* shall have been informed of the possibility of such damages. This * +* limitation of liability shall not apply to liability for death or * +* personal injury resulting from such party's negligence to the * +* extent applicable law prohibits such limitation. Some * +* jurisdictions do not allow the exclusion or limitation of * +* incidental or consequential damages, so this exclusion and * +* limitation may not apply to You. * +* * +************************************************************************ + +8. Litigation +------------- + +Any litigation relating to this License may be brought only in the +courts of a jurisdiction where the defendant maintains its principal +place of business and such litigation shall be governed by laws of that +jurisdiction, without reference to its conflict-of-law provisions. +Nothing in this Section shall prevent a party's ability to bring +cross-claims or counter-claims. + +9. Miscellaneous +---------------- + +This License represents the complete agreement concerning the subject +matter hereof. If any provision of this License is held to be +unenforceable, such provision shall be reformed only to the extent +necessary to make it enforceable. 
Any law or regulation which provides +that the language of a contract shall be construed against the drafter +shall not be used to construe this License against a Contributor. + +10. Versions of the License +--------------------------- + +10.1. New Versions + +Mozilla Foundation is the license steward. Except as provided in Section +10.3, no one other than the license steward has the right to modify or +publish new versions of this License. Each version will be given a +distinguishing version number. + +10.2. Effect of New Versions + +You may distribute the Covered Software under the terms of the version +of the License under which You originally received the Covered Software, +or under the terms of any subsequent version published by the license +steward. + +10.3. Modified Versions + +If you create software not governed by this License, and you want to +create a new license for such software, you may create and use a +modified version of this License if you rename the license and remove +any references to the name of the license steward (except to note that +such modified license differs from this License). + +10.4. Distributing Source Code Form that is Incompatible With Secondary +Licenses + +If You choose to distribute Source Code Form that is Incompatible With +Secondary Licenses under the terms of this version of the License, the +notice described in Exhibit B of this License must be attached. + +Exhibit A - Source Code Form License Notice +------------------------------------------- + + This Source Code Form is subject to the terms of the Mozilla Public + License, v. 2.0. If a copy of the MPL was not distributed with this + file, You can obtain one at http://mozilla.org/MPL/2.0/. + +If it is not possible or desirable to put the notice in a particular +file, then You may include the notice in a location (such as a LICENSE +file in a relevant directory) where a recipient would be likely to look +for such a notice. 
+ +You may add additional accurate notices of copyright ownership. + +Exhibit B - "Incompatible With Secondary Licenses" Notice +--------------------------------------------------------- + + This Source Code Form is "Incompatible With Secondary Licenses", as + defined by the Mozilla Public License, v. 2.0. + +------------------------------------------------------------------------------ +libtheoradec and libtheoraenc are redistributed within opencv-python macOS packages. +This license applies to libtheoradec and libtheoraenc binaries in the directory cv2/. + + Copyright (C) 2002-2009 Xiph.org Foundation + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +- Neither the name of the Xiph.org Foundation nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE FOUNDATION +OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libwebp and libwebpmux are redistributed within all opencv-python packages. +This license applies to libwebp and libwebpmux binaries in the directory cv2/. + +Copyright (c) 2010, Google Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + * Neither the name of Google nor the names of its contributors may + be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +libvorbis and libvorbisenc are redistributed within opencv-python macOS packages. +This license applies to libvorbis and libvorbisenc binaries in the directory cv2/. + +Copyright (c) 2002-2020 Xiph.org Foundation + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +- Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +- Neither the name of the Xiph.org Foundation nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE FOUNDATION +OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +Libxcb utility libraries are redistributed within opencv-python non-headless Linux packages. +This license applies to libxcb related binaries in the directory cv2/. + +Copyright (C) 2001-2006 Bart Massey, Jamey Sharp, and Josh Triplett. +All Rights Reserved. + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated +documentation files (the "Software"), to deal in the +Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, +sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall +be included in all copies or substantial portions of the +Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY +KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE +WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR +PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS +BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +OTHER DEALINGS IN THE SOFTWARE. 
+ +Except as contained in this notice, the names of the authors +or their institutions shall not be used in advertising or +otherwise to promote the sale, use or other dealings in this +Software without prior written authorization from the +authors. + +------------------------------------------------------------------------------ +Libxcb-image is redistributed within opencv-python non-headless Linux packages. +This license applies to libxcb-image binary in the directory cv2/. + +Copyright © 2007-2008 Bart Massey +Copyright © 2008 Julien Danjou +Copyright © 2008 Keith Packard + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation +files (the "Software"), to deal in the Software without +restriction, including without limitation the rights to use, copy, +modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF +CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or +their institutions shall not be used in advertising or otherwise to +promote the sale, use or other dealings in this Software without +prior written authorization from the authors. 
+ +------------------------------------------------------------------------------ +Libxcb-util is redistributed within opencv-python non-headless Linux packages. +This license applies to libxcb-util binary in the directory cv2/. + +Copyright © 2008 Bart Massey +Copyright © 2008 Ian Osgood +Copyright © 2008 Jamey Sharp +Copyright © 2008 Josh Triplett +Copyright © 2008-2009 Julien Danjou + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation +files (the "Software"), to deal in the Software without +restriction, including without limitation the rights to use, copy, +modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF +CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or +their institutions shall not be used in advertising or otherwise to +promote the sale, use or other dealings in this Software without +prior written authorization from the authors. + +------------------------------------------------------------------------------ +Libxcb-render-util is redistributed within opencv-python non-headless Linux packages. +This license applies to libxcb-render-util binary in the directory cv2/. 
+ +Copyright © 2000 Keith Packard + +Permission to use, copy, modify, distribute, and sell this software and its +documentation for any purpose is hereby granted without fee, provided that +the above copyright notice appear in all copies and that both that +copyright notice and this permission notice appear in supporting +documentation, and that the name of Keith Packard not be used in +advertising or publicity pertaining to distribution of the software without +specific, written prior permission. Keith Packard makes no +representations about the suitability of this software for any purpose. It +is provided "as is" without express or implied warranty. + +KEITH PACKARD DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, +INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO +EVENT SHALL KEITH PACKARD BE LIABLE FOR ANY SPECIAL, INDIRECT OR +CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, +DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER +TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + +Copyright © 2006 Jamey Sharp. + +Permission is hereby granted, free of charge, to any person obtaining a +copy of this software and associated documentation files (the "Software"), +to deal in the Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, sublicense, +and/or sell copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or their +institutions shall not be used in advertising or otherwise to promote the +sale, use or other dealings in this Software without prior written +authorization from the authors. + +Copyright © 2006 Ian Osgood + +Permission is hereby granted, free of charge, to any person obtaining a +copy of this software and associated documentation files (the "Software"), +to deal in the Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, sublicense, +and/or sell copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or their +institutions shall not be used in advertising or otherwise to promote the +sale, use or other dealings in this Software without prior written +authorization from the authors. + +------------------------------------------------------------------------------ +Libxcb-icccm is redistributed within opencv-python non-headless Linux packages. 
+This license applies to Libxcb-icccm binary in the directory cv2/. + +Copyright © 2008-2011 Arnaud Fontaine +Copyright © 2007-2008 Vincent Torri + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation +files (the "Software"), to deal in the Software without +restriction, including without limitation the rights to use, copy, +modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF +CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors or +their institutions shall not be used in advertising or otherwise to +promote the sale, use or other dealings in this Software without +prior written authorization from the authors. + +------------------------------------------------------------------------------ +libXau is redistributed within opencv-python non-headless Linux packages. +This license applies to libXau binary in the directory cv2/. + +Copyright 1988, 1993, 1994, 1998 The Open Group + +Permission to use, copy, modify, distribute, and sell this software and its +documentation for any purpose is hereby granted without fee, provided that +the above copyright notice appear in all copies and that both that +copyright notice and this permission notice appear in supporting +documentation. 
+ +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +OPEN GROUP BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN +AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the name of The Open Group shall not be +used in advertising or otherwise to promote the sale, use or other dealings +in this Software without prior written authorization from The Open Group. + +------------------------------------------------------------------------------ +Vulkan headers are redistributed within all opencv-python packages. +This license applies to Vulkan headers in the directory 3rdparty/include/vulkan. + +Copyright (c) 2015-2018 The Khronos Group Inc. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +------------------------------------------------------------------------------ +Libjpeg-turbo is redistributed within all opencv-python packages as build option. 
+ +libjpeg-turbo Licenses +====================== + +libjpeg-turbo is covered by three compatible BSD-style open source licenses: + +- The IJG (Independent JPEG Group) License, which is listed in + [README.ijg](README.ijg) + + This license applies to the libjpeg API library and associated programs + (any code inherited from libjpeg, and any modifications to that code.) + +- The Modified (3-clause) BSD License, which is listed below + + This license covers the TurboJPEG API library and associated programs, as + well as the build system. + +- The [zlib License](https://opensource.org/licenses/Zlib) + + This license is a subset of the other two, and it covers the libjpeg-turbo + SIMD extensions. + + +Complying with the libjpeg-turbo Licenses +========================================= + +This section provides a roll-up of the libjpeg-turbo licensing terms, to the +best of our understanding. + +1. If you are distributing a modified version of the libjpeg-turbo source, + then: + + 1. You cannot alter or remove any existing copyright or license notices + from the source. + + **Origin** + - Clause 1 of the IJG License + - Clause 1 of the Modified BSD License + - Clauses 1 and 3 of the zlib License + + 2. You must add your own copyright notice to the header of each source + file you modified, so others can tell that you modified that file (if + there is not an existing copyright header in that file, then you can + simply add a notice stating that you modified the file.) + + **Origin** + - Clause 1 of the IJG License + - Clause 2 of the zlib License + + 3. You must include the IJG README file, and you must not alter any of the + copyright or license text in that file. + + **Origin** + - Clause 1 of the IJG License + +2. If you are distributing only libjpeg-turbo binaries without the source, or + if you are distributing an application that statically links with + libjpeg-turbo, then: + + 1. 
Your product documentation must include a message stating: + + This software is based in part on the work of the Independent JPEG + Group. + + **Origin** + - Clause 2 of the IJG license + + 2. If your binary distribution includes or uses the TurboJPEG API, then + your product documentation must include the text of the Modified BSD + License (see below.) + + **Origin** + - Clause 2 of the Modified BSD License + +3. You cannot use the name of the IJG or The libjpeg-turbo Project or the + contributors thereof in advertising, publicity, etc. + + **Origin** + - IJG License + - Clause 3 of the Modified BSD License + +4. The IJG and The libjpeg-turbo Project do not warrant libjpeg-turbo to be + free of defects, nor do we accept any liability for undesirable + consequences resulting from your use of the software. + + **Origin** + - IJG License + - Modified BSD License + - zlib License + + +The Modified (3-clause) BSD License +=================================== + +Copyright (C)2009-2022 D. R. Commander. All Rights Reserved.
+Copyright (C)2015 Viktor Szathmáry. All Rights Reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +- Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. +- Neither the name of the libjpeg-turbo Project nor the names of its + contributors may be used to endorse or promote products derived from this + software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS", +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +Why Three Licenses? +=================== + +The zlib License could have been used instead of the Modified (3-clause) BSD +License, and since the IJG License effectively subsumes the distribution +conditions of the zlib License, this would have effectively placed +libjpeg-turbo binary distributions under the IJG License. 
However, the IJG +License specifically refers to the Independent JPEG Group and does not extend +attribution and endorsement protections to other entities. Thus, it was +desirable to choose a license that granted us the same protections for new code +that were granted to the IJG for code derived from their software. + +------------------------------------------------------------------------------ +Libspng is redistributed within all opencv-python packages as build option. + +BSD 2-Clause License + +Copyright (c) 2018-2022, Randy +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +QUIRC library is redistributed within all opencv-python packages. 
+ +quirc -- QR-code recognition library +Copyright (C) 2010-2012 Daniel Beer + +Permission to use, copy, modify, and/or distribute this software for +any purpose with or without fee is hereby granted, provided that the +above copyright notice and this permission notice appear in all +copies. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL +WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE +AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL +DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR +PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER +TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + +------------------------------------------------------------------------------ +Flatbuffers library is redistributed within all opencv-python packages. + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +------------------------------------------------------------------------------ +Protobuf library is redistributed within all opencv-python packages. + +Copyright 2008 Google Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Code generated by the Protocol Buffer compiler is owned by the owner +of the input file used when generating it. This code is not +standalone and requires a support library to be linked with it. This +support library is itself covered by the above license. + +------------------------------------------------------------------------------ +OpenJPEG library is redistributed within all opencv-python packages. + +/* + * The copyright in this software is being made available under the 2-clauses + * BSD License, included below. This software may be subject to other third + * party and contributor rights, including patent rights, and no such rights + * are granted under this license. + * + * Copyright (c) 2002-2014, Universite catholique de Louvain (UCL), Belgium + * Copyright (c) 2002-2014, Professor Benoit Macq + * Copyright (c) 2003-2014, Antonin Descampe + * Copyright (c) 2003-2009, Francois-Olivier Devaux + * Copyright (c) 2005, Herve Drolon, FreeImage Team + * Copyright (c) 2002-2003, Yannick Verschueren + * Copyright (c) 2001-2003, David Janssens + * Copyright (c) 2011-2012, Centre National d'Etudes Spatiales (CNES), France + * Copyright (c) 2012, CS Systemes d'Information, France + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS `AS IS' + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +------------------------------------------------------------------------------ +TIFF library is redistributed within all opencv-python packages. + +Copyright (c) 1988-1997 Sam Leffler +Copyright (c) 1991-1997 Silicon Graphics, Inc. + +Permission to use, copy, modify, distribute, and sell this software and +its documentation for any purpose is hereby granted without fee, provided +that (i) the above copyright notices and this permission notice appear in +all copies of the software and related documentation, and (ii) the names of +Sam Leffler and Silicon Graphics may not be used in any advertising or +publicity relating to the software without the specific, prior written +permission of Sam Leffler and Silicon Graphics. 
+ +THE SOFTWARE IS PROVIDED "AS-IS" AND WITHOUT WARRANTY OF ANY KIND, +EXPRESS, IMPLIED OR OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY +WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +IN NO EVENT SHALL SAM LEFFLER OR SILICON GRAPHICS BE LIABLE FOR +ANY SPECIAL, INCIDENTAL, INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND, +OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, +WHETHER OR NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY OF +LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE +OF THIS SOFTWARE. + +------------------------------------------------------------------------------ +OpenEXR library is redistributed within all opencv-python packages. + +Copyright (c) 2006, Industrial Light & Magic, a division of Lucasfilm +Entertainment Company Ltd. Portions contributed and copyright held by +others as indicated. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above + copyright notice, this list of conditions and the following + disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided with + the distribution. + + * Neither the name of Industrial Light & Magic nor the names of + any other contributors to this software may be used to endorse or + promote products derived from this software without specific prior + written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS +IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ +Intel(R) IPP ICV library statically linked within x86 and x86_64 opencv-python packages. + +Intel(R) Integrated Performance Primitives 2021 Update 10 + +Intel Simplified Software License (Version October 2022) + +Intel(R) Integrated Performance Primitives (Intel(R) IPP) : Copyright (C) 1997 Intel Corporation + +Use and Redistribution. You may use and redistribute the software, which is +provided in binary form only, (the "Software"), without modification, +provided the following conditions are met: + +* Redistributions must reproduce the above copyright notice and these + terms of use in the Software and in the documentation and/or other materials + provided with the distribution. +* Neither the name of Intel nor the names of its suppliers may be used to + endorse or promote products derived from this Software without specific + prior written permission. +* No reverse engineering, decompilation, or disassembly of the Software is + permitted, nor any modification or alteration of the Software or its operation + at any time, including during execution. + +No other licenses. Except as provided in the preceding section, Intel grants no +licenses or other rights by implication, estoppel or otherwise to, patent, +copyright, trademark, trade name, service mark or other intellectual property +licenses or rights of Intel. + +Third party software. 
"Third Party Software" means the files (if any) listed +in the "third-party-software.txt" or other similarly-named text file that may +be included with the Software. Third Party Software, even if included with the +distribution of the Software, may be governed by separate license terms, including +without limitation, third party license terms, open source software notices and +terms, and/or other Intel software license terms. These separate license terms +solely govern Your use of the Third Party Software. + +DISCLAIMER. THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED +WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT ARE +DISCLAIMED. THIS SOFTWARE IS NOT INTENDED FOR USE IN SYSTEMS OR APPLICATIONS +WHERE FAILURE OF THE SOFTWARE MAY CAUSE PERSONAL INJURY OR DEATH AND YOU AGREE +THAT YOU ARE FULLY RESPONSIBLE FOR ANY CLAIMS, COSTS, DAMAGES, EXPENSES, AND +ATTORNEYS' FEES ARISING OUT OF ANY SUCH USE, EVEN IF ANY CLAIM ALLEGES THAT +INTEL WAS NEGLIGENT REGARDING THE DESIGN OR MANUFACTURE OF THE SOFTWARE. + +LIMITATION OF LIABILITY. IN NO EVENT WILL INTEL BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +No support. Intel may make changes to the Software, at any time without notice, +and is not obligated to support, update or provide training for the Software. + +Termination. Your right to use the Software is terminated in the event of your +breach of this license. + +Feedback. 
Should you provide Intel with comments, modifications, corrections, +enhancements or other input ("Feedback") related to the Software, Intel will be +free to use, disclose, reproduce, license or otherwise distribute or exploit the +Feedback in its sole discretion without any obligations or restrictions of any +kind, including without limitation, intellectual property rights or licensing +obligations. + +Compliance with laws. You agree to comply with all relevant laws and regulations +governing your use, transfer, import or export (or prohibition thereof) of the +Software. + +Governing law. All disputes will be governed by the laws of the United States of +America and the State of Delaware without reference to conflict of law +principles and subject to the exclusive jurisdiction of the state or federal +courts sitting in the State of Delaware, and each party agrees that it submits +to the personal jurisdiction and venue of those courts and waives any +objections. THE UNITED NATIONS CONVENTION ON CONTRACTS FOR THE INTERNATIONAL +SALE OF GOODS (1980) IS SPECIFICALLY EXCLUDED AND WILL NOT APPLY TO THE SOFTWARE. + +------------------------------------------------------------------------------ +Orbbec SDK distributed with arm64 MacOS packages. + +MIT License + +Copyright (c) 2023 OrbbecDeveloper + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +------------------------------------------------------------------------------ + +libavif library and its dependencies are redistributed within all opencv-python packages. + +Copyright 2019 Joe Drago. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ +------------------------------------------------------------------------------ + +Files: src/obu.c + +Copyright © 2018-2019, VideoLAN and dav1d authors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +Files: third_party/iccjpeg/* + +In plain English: + +1. We don't promise that this software works. (But if you find any bugs, + please let us know!) +2. You can use this software for whatever you want. You don't have to pay us. +3. You may not pretend that you wrote this software. If you use it in a + program, you must acknowledge somewhere in your documentation that + you've used the IJG code. 
+ +In legalese: + +The authors make NO WARRANTY or representation, either express or implied, +with respect to this software, its quality, accuracy, merchantability, or +fitness for a particular purpose. This software is provided "AS IS", and you, +its user, assume the entire risk as to its quality and accuracy. + +This software is copyright (C) 1991-2013, Thomas G. Lane, Guido Vollbeding. +All Rights Reserved except as specified below. + +Permission is hereby granted to use, copy, modify, and distribute this +software (or portions thereof) for any purpose, without fee, subject to these +conditions: +(1) If any part of the source code for this software is distributed, then this +README file must be included, with this copyright and no-warranty notice +unaltered; and any additions, deletions, or changes to the original files +must be clearly indicated in accompanying documentation. +(2) If only executable code is distributed, then the accompanying +documentation must state that "this software is based in part on the work of +the Independent JPEG Group". +(3) Permission for use of this software is granted only if the user accepts +full responsibility for any undesirable consequences; the authors accept +NO LIABILITY for damages of any kind. + +These conditions apply to any software derived from or based on the IJG code, +not just to the unmodified library. If you use our work, you ought to +acknowledge us. + +Permission is NOT granted for the use of any IJG author's name or company name +in advertising or publicity relating to this software or products derived from +it. This software may be referred to only as "the Independent JPEG Group's +software". + +We specifically permit and encourage the use of this software as the basis of +commercial products, provided that all warranty or liability claims are +assumed by the product vendor. + + +The Unix configuration script "configure" was produced with GNU Autoconf. 
+It is copyright by the Free Software Foundation but is freely distributable. +The same holds for its supporting scripts (config.guess, config.sub, +ltmain.sh). Another support script, install-sh, is copyright by X Consortium +but is also freely distributable. + +The IJG distribution formerly included code to read and write GIF files. +To avoid entanglement with the Unisys LZW patent, GIF reading support has +been removed altogether, and the GIF writer has been simplified to produce +"uncompressed GIFs". This technique does not use the LZW algorithm; the +resulting GIF files are larger than usual, but are readable by all standard +GIF decoders. + +We are required to state that + "The Graphics Interchange Format(c) is the Copyright property of + CompuServe Incorporated. GIF(sm) is a Service Mark property of + CompuServe Incorporated." + +------------------------------------------------------------------------------ + +Files: contrib/gdk-pixbuf/* + +Copyright 2020 Emmanuel Gil Peyrot. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +Files: android_jni/gradlew* + + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +------------------------------------------------------------------------------ + +Files: third_party/libyuv/* + +Copyright 2011 The LibYuv Project Authors. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + * Neither the name of Google nor the names of its contributors may + be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +aom library and its dependencies are redistributed within all opencv-python packages. + +Copyright (c) 2016, Alliance for Open Media. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS +FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE +COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +KAZE Features library is redistributed within all opencv-python packages. + +Copyright (c) 2012, Pablo Fernández Alcantarilla +All Rights Reserved + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name of the copyright holders nor the names of its contributors + may be used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY +EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES +OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT +SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR +BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY +WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +AKAZE Features library is redistributed within all opencv-python packages. + +Copyright (c) 2014, Pablo Fernandez Alcantarilla, Jesus Nuevo +All Rights Reserved + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name of the copyright holders nor the names of its contributors + may be used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY +EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES +OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT +SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR +BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY +WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +opentelemetry-api +Apache-2.0 +https://github.com/open-telemetry/opentelemetry-python/tree/main/opentelemetry-api + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +opentelemetry-exporter-otlp-proto-common +Apache-2.0 +https://github.com/open-telemetry/opentelemetry-python/tree/main/exporter/opentelemetry-exporter-otlp-proto-common + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +opentelemetry-exporter-otlp-proto-http +Apache-2.0 +https://github.com/open-telemetry/opentelemetry-python/tree/main/exporter/opentelemetry-exporter-otlp-proto-http + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +opentelemetry-proto +Apache-2.0 +https://github.com/open-telemetry/opentelemetry-python/tree/main/opentelemetry-proto + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +opentelemetry-sdk +Apache-2.0 +https://github.com/open-telemetry/opentelemetry-python/tree/main/opentelemetry-sdk + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +opentelemetry-semantic-conventions +Apache-2.0 +https://github.com/open-telemetry/opentelemetry-python/tree/main/opentelemetry-semantic-conventions + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +orderly-set +MIT License +https://github.com/seperman/orderly-set +UNKNOWN + +orjson +MPL-2.0 AND (Apache-2.0 OR MIT) +https://github.com/ijl/orjson + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + +TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + +1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + +2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + +3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + +4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + +5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + +6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + +7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + +8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + +9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + +END OF TERMS AND CONDITIONS + +APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ +Copyright [yyyy] [name of copyright owner] + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + + +packaging +Apache Software License; BSD License +https://github.com/pypa/packaging +This software is made available under the terms of *either* of the licenses +found in LICENSE.APACHE or LICENSE.BSD. Contributions to this software is made +under the terms of *both* these licenses. + + +pandas +BSD License +https://pandas.pydata.org +BSD 3-Clause License + +Copyright (c) 2008-2011, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team +All rights reserved. + +Copyright (c) 2011-2023, Open source contributors. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +Copyright (c) 2010-2019 Keith Goodman +Copyright (c) 2019 Bottleneck Developers +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE.Copyright 2017- Paul Ganssle +Copyright 2017- dateutil contributors (see AUTHORS file) + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +The above license applies to all contributions after 2017-12-01, as well as +all contributions that have been re-licensed (see AUTHORS file for the list of +contributors who have re-licensed their code). +-------------------------------------------------------------------------------- +dateutil - Extensions to the standard Python datetime module. + +Copyright (c) 2003-2011 - Gustavo Niemeyer +Copyright (c) 2012-2014 - Tomi Pieviläinen +Copyright (c) 2014-2016 - Yaron de Leeuw +Copyright (c) 2015- - Paul Ganssle +Copyright (c) 2015- - dateutil contributors (see AUTHORS file) + +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + * Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +The above BSD License Applies to all code, even that also covered by Apache 2.0.# MIT License + +Copyright (c) 2019 Hadley Wickham; RStudio; and Evan Miller + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. +Based on http://opensource.org/licenses/MIT + +This is a template. 
Complete and ship as file LICENSE the following 2 +lines (only) + +YEAR: +COPYRIGHT HOLDER: + +and specify as + +License: MIT + file LICENSE + +Copyright (c) , + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +The MIT License + +Copyright (c) 2008- Attractive Chaos + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS +BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE.musl as a whole is licensed under the following standard MIT license: + +---------------------------------------------------------------------- +Copyright © 2005-2020 Rich Felker, et al. + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +---------------------------------------------------------------------- + +Authors/contributors include: + +A. Wilcox +Ada Worcester +Alex Dowad +Alex Suykov +Alexander Monakov +Andre McCurdy +Andrew Kelley +Anthony G. 
Basile +Aric Belsito +Arvid Picciani +Bartosz Brachaczek +Benjamin Peterson +Bobby Bingham +Boris Brezillon +Brent Cook +Chris Spiegel +Clément Vasseur +Daniel Micay +Daniel Sabogal +Daurnimator +David Carlier +David Edelsohn +Denys Vlasenko +Dmitry Ivanov +Dmitry V. Levin +Drew DeVault +Emil Renner Berthing +Fangrui Song +Felix Fietkau +Felix Janda +Gianluca Anzolin +Hauke Mehrtens +He X +Hiltjo Posthuma +Isaac Dunham +Jaydeep Patil +Jens Gustedt +Jeremy Huntwork +Jo-Philipp Wich +Joakim Sindholt +John Spencer +Julien Ramseier +Justin Cormack +Kaarle Ritvanen +Khem Raj +Kylie McClain +Leah Neukirchen +Luca Barbato +Luka Perkov +M Farkas-Dyck (Strake) +Mahesh Bodapati +Markus Wichmann +Masanori Ogino +Michael Clark +Michael Forney +Mikhail Kremnyov +Natanael Copa +Nicholas J. Kain +orc +Pascal Cuoq +Patrick Oppenlander +Petr Hosek +Petr Skocik +Pierre Carrier +Reini Urban +Rich Felker +Richard Pennington +Ryan Fairfax +Samuel Holland +Segev Finer +Shiz +sin +Solar Designer +Stefan Kristiansson +Stefan O'Rear +Szabolcs Nagy +Timo Teräs +Trutz Behn +Valentin Ochs +Will Dietz +William Haddon +William Pitcock + +Portions of this software are derived from third-party works licensed +under terms compatible with the above MIT license: + +The TRE regular expression implementation (src/regex/reg* and +src/regex/tre*) is Copyright © 2001-2008 Ville Laurikari and licensed +under a 2-clause BSD license (license text in the source files). The +included version has been heavily modified by Rich Felker in 2012, in +the interests of size, simplicity, and namespace cleanliness. + +Much of the math library code (src/math/* and src/complex/*) is +Copyright © 1993,2004 Sun Microsystems or +Copyright © 2003-2011 David Schultz or +Copyright © 2003-2009 Steven G. Kargl or +Copyright © 2003-2009 Bruce D. Evans or +Copyright © 2008 Stephen L. Moshier or +Copyright © 2017-2018 Arm Limited +and labelled as such in comments in the individual source files. 
All +have been licensed under extremely permissive terms. + +The ARM memcpy code (src/string/arm/memcpy.S) is Copyright © 2008 +The Android Open Source Project and is licensed under a two-clause BSD +license. It was taken from Bionic libc, used on Android. + +The AArch64 memcpy and memset code (src/string/aarch64/*) are +Copyright © 1999-2019, Arm Limited. + +The implementation of DES for crypt (src/crypt/crypt_des.c) is +Copyright © 1994 David Burren. It is licensed under a BSD license. + +The implementation of blowfish crypt (src/crypt/crypt_blowfish.c) was +originally written by Solar Designer and placed into the public +domain. The code also comes with a fallback permissive license for use +in jurisdictions that may not recognize the public domain. + +The smoothsort implementation (src/stdlib/qsort.c) is Copyright © 2011 +Valentin Ochs and is licensed under an MIT-style license. + +The x86_64 port was written by Nicholas J. Kain and is licensed under +the standard MIT terms. + +The mips and microblaze ports were originally written by Richard +Pennington for use in the ellcc project. The original code was adapted +by Rich Felker for build system and code conventions during upstream +integration. It is licensed under the standard MIT terms. + +The mips64 port was contributed by Imagination Technologies and is +licensed under the standard MIT terms. + +The powerpc port was also originally written by Richard Pennington, +and later supplemented and integrated by John Spencer. It is licensed +under the standard MIT terms. + +All other files which have no copyright comments are original works +produced specifically for use as part of this library, written either +by Rich Felker, the main author of the library, or by one or more +contibutors listed above. Details on authorship of individual files +can be found in the git version control history of the project. The +omission of copyright and license comments in each file is in the +interest of source tree size. 
+ +In addition, permission is hereby granted for all public header files +(include/* and arch/*/bits/*) and crt files intended to be linked into +applications (crt/*, ldso/dlstart.c, and arch/*/crt_arch.h) to omit +the copyright notice and permission notice otherwise required by the +license, and to use these files without any requirement of +attribution. These files include substantial contributions from: + +Bobby Bingham +John Spencer +Nicholas J. Kain +Rich Felker +Richard Pennington +Stefan Kristiansson +Szabolcs Nagy + +all of whom have explicitly granted such permission. + +This file previously contained text expressing a belief that most of +the files covered by the above exception were sufficiently trivial not +to be subject to copyright, resulting in confusion over whether it +negated the permissions granted in the license. In the spirit of +permissive licensing, and of not having licensing issues being an +obstacle to adoption, that text has been removed.Copyright (c) 2005-2023, NumPy Developers. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + + * Neither the name of the NumPy Developers nor the names of any + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + +Copyright (c) Donald Stufft and individual contributors. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + 1. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +A. HISTORY OF THE SOFTWARE +========================== + +Python was created in the early 1990s by Guido van Rossum at Stichting +Mathematisch Centrum (CWI, see https://www.cwi.nl) in the Netherlands +as a successor of a language called ABC. Guido remains Python's +principal author, although it includes many contributions from others. + +In 1995, Guido continued his work on Python at the Corporation for +National Research Initiatives (CNRI, see https://www.cnri.reston.va.us) +in Reston, Virginia where he released several versions of the +software. + +In May 2000, Guido and the Python core development team moved to +BeOpen.com to form the BeOpen PythonLabs team. In October of the same +year, the PythonLabs team moved to Digital Creations, which became +Zope Corporation. In 2001, the Python Software Foundation (PSF, see +https://www.python.org/psf/) was formed, a non-profit organization +created specifically to own Python-related Intellectual Property. +Zope Corporation was a sponsoring member of the PSF. + +All Python releases are Open Source (see https://opensource.org for +the Open Source Definition).
Historically, most, but not all, Python +releases have also been GPL-compatible; the table below summarizes +the various releases. + + Release Derived Year Owner GPL- + from compatible? (1) + + 0.9.0 thru 1.2 1991-1995 CWI yes + 1.3 thru 1.5.2 1.2 1995-1999 CNRI yes + 1.6 1.5.2 2000 CNRI no + 2.0 1.6 2000 BeOpen.com no + 1.6.1 1.6 2001 CNRI yes (2) + 2.1 2.0+1.6.1 2001 PSF no + 2.0.1 2.0+1.6.1 2001 PSF yes + 2.1.1 2.1+2.0.1 2001 PSF yes + 2.1.2 2.1.1 2002 PSF yes + 2.1.3 2.1.2 2002 PSF yes + 2.2 and above 2.1.1 2001-now PSF yes + +Footnotes: + +(1) GPL-compatible doesn't mean that we're distributing Python under + the GPL. All Python licenses, unlike the GPL, let you distribute + a modified version without making your changes open source. The + GPL-compatible licenses make it possible to combine Python with + other software that is released under the GPL; the others don't. + +(2) According to Richard Stallman, 1.6.1 is not GPL-compatible, + because its license has a choice of law clause. According to + CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1 + is "not incompatible" with the GPL. + +Thanks to the many outside volunteers who have worked under Guido's +direction to make these releases possible. + + +B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON +=============================================================== + +Python software and documentation are licensed under the +Python Software Foundation License Version 2. + +Starting with Python 3.8.6, examples, recipes, and other code in +the documentation are dual licensed under the PSF License Version 2 +and the Zero-Clause BSD license. + +Some software incorporated into Python is under different licenses. +The licenses are listed with code falling under that license. + + +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. 
This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, PSF hereby +grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, +analyze, test, perform and/or display publicly, prepare derivative works, +distribute, and otherwise use Python alone or in any derivative version, +provided, however, that PSF's License Agreement and PSF's notice of copyright, +i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, +2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023 Python Software Foundation; +All Rights Reserved" are retained in Python alone or in any derivative version +prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. 
Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0 +------------------------------------------- + +BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1 + +1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an +office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the +Individual or Organization ("Licensee") accessing and otherwise using +this software in source or binary form and its associated +documentation ("the Software"). + +2. Subject to the terms and conditions of this BeOpen Python License +Agreement, BeOpen hereby grants Licensee a non-exclusive, +royalty-free, world-wide license to reproduce, analyze, test, perform +and/or display publicly, prepare derivative works, distribute, and +otherwise use the Software alone or in any derivative version, +provided, however, that the BeOpen Python License is retained in the +Software, alone or in any derivative version prepared by Licensee. + +3. BeOpen is making the Software available to Licensee on an "AS IS" +basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +4. 
BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE +SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS +AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY +DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +5. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +6. This License Agreement shall be governed by and interpreted in all +respects by the law of the State of California, excluding conflict of +law provisions. Nothing in this License Agreement shall be deemed to +create any relationship of agency, partnership, or joint venture +between BeOpen and Licensee. This License Agreement does not grant +permission to use BeOpen trademarks or trade names in a trademark +sense to endorse or promote products or services of Licensee, or any +third party. As an exception, the "BeOpen Python" logos available at +http://www.pythonlabs.com/logos.html may be used according to the +permissions granted on that web page. + +7. By copying, installing or otherwise using the software, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1 +--------------------------------------- + +1. This LICENSE AGREEMENT is between the Corporation for National +Research Initiatives, having an office at 1895 Preston White Drive, +Reston, VA 20191 ("CNRI"), and the Individual or Organization +("Licensee") accessing and otherwise using Python 1.6.1 software in +source or binary form and its associated documentation. + +2. 
Subject to the terms and conditions of this License Agreement, CNRI +hereby grants Licensee a nonexclusive, royalty-free, world-wide +license to reproduce, analyze, test, perform and/or display publicly, +prepare derivative works, distribute, and otherwise use Python 1.6.1 +alone or in any derivative version, provided, however, that CNRI's +License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) +1995-2001 Corporation for National Research Initiatives; All Rights +Reserved" are retained in Python 1.6.1 alone or in any derivative +version prepared by Licensee. Alternately, in lieu of CNRI's License +Agreement, Licensee may substitute the following text (omitting the +quotes): "Python 1.6.1 is made available subject to the terms and +conditions in CNRI's License Agreement. This Agreement together with +Python 1.6.1 may be located on the internet using the following +unique, persistent identifier (known as a handle): 1895.22/1013. This +Agreement may also be obtained from a proxy server on the internet +using the following URL: http://hdl.handle.net/1895.22/1013". + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python 1.6.1 or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python 1.6.1. + +4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS" +basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. 
CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. This License Agreement shall be governed by the federal +intellectual property law of the United States, including without +limitation the federal copyright law, and, to the extent such +U.S. federal law does not apply, by the law of the Commonwealth of +Virginia, excluding Virginia's conflict of law provisions. +Notwithstanding the foregoing, with regard to derivative works based +on Python 1.6.1 that incorporate non-separable material that was +previously distributed under the GNU General Public License (GPL), the +law of the Commonwealth of Virginia shall govern this License +Agreement only as to issues arising under or with respect to +Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this +License Agreement shall be deemed to create any relationship of +agency, partnership, or joint venture between CNRI and Licensee. This +License Agreement does not grant permission to use CNRI trademarks or +trade name in a trademark sense to endorse or promote products or +services of Licensee, or any third party. + +8. By clicking on the "ACCEPT" button where indicated, or by copying, +installing or otherwise using Python 1.6.1, Licensee agrees to be +bound by the terms and conditions of this License Agreement. + + ACCEPT + + +CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2 +-------------------------------------------------- + +Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, +The Netherlands. All rights reserved. 
+ +Permission to use, copy, modify, and distribute this software and its +documentation for any purpose and without fee is hereby granted, +provided that the above copyright notice appear in all copies and that +both that copyright notice and this permission notice appear in +supporting documentation, and that the name of Stichting Mathematisch +Centrum or CWI not be used in advertising or publicity pertaining to +distribution of the software without specific, written prior +permission. + +STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO +THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE +FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT +OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + +ZERO-CLAUSE BSD LICENSE FOR CODE IN THE PYTHON DOCUMENTATION +---------------------------------------------------------------------- + +Permission to use, copy, modify, and/or distribute this software for any +purpose with or without fee is hereby granted. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH +REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY +AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, +INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM +LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR +OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. +Copyright (c) 2014, Al Sweigart +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the {organization} nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Copyright (c) 2017 Anthony Sottile + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be
included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + +Copyright (c) 2015-2019 Jared Hobbs + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +Developed by ESN, an Electronic Arts Inc. studio. +Copyright (c) 2014, Electronic Arts Inc. +All rights reserved.
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: +* Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. +* Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +* Neither the name of ESN, Electronic Arts Inc. nor the +names of its contributors may be used to endorse or promote products +derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL ELECTRONIC ARTS INC. BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +---- + +Portions of code from MODP_ASCII - Ascii transformations (upper/lower, etc) +https://github.com/client9/stringencoders + + Copyright 2005, 2006, 2007 + Nick Galbreath -- nickg [at] modp [dot] com + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are + met: + + Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. 
+ + Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + Neither the name of the modp.com nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + This is the standard "new" BSD license: + http://www.opensource.org/licenses/bsd-license.php + +https://github.com/client9/stringencoders/blob/cfd5c1507325ae497ea9bacdacba12c0ffd79d30/COPYING + +---- + +Numeric decoder derived from from TCL library +https://opensource.apple.com/source/tcl/tcl-14/tcl/license.terms + * Copyright (c) 1988-1993 The Regents of the University of California. + * Copyright (c) 1994 Sun Microsystems, Inc. + + This software is copyrighted by the Regents of the University of + California, Sun Microsystems, Inc., Scriptics Corporation, ActiveState + Corporation and other parties. The following terms apply to all files + associated with the software unless explicitly disclaimed in + individual files. 
+ + The authors hereby grant permission to use, copy, modify, distribute, + and license this software and its documentation for any purpose, provided + that existing copyright notices are retained in all copies and that this + notice is included verbatim in any distributions. No written agreement, + license, or royalty fee is required for any of the authorized uses. + Modifications to this software may be copyrighted by their authors + and need not follow the licensing terms described here, provided that + the new terms are clearly indicated on the first page of each file where + they apply. + + IN NO EVENT SHALL THE AUTHORS OR DISTRIBUTORS BE LIABLE TO ANY PARTY + FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES + ARISING OUT OF THE USE OF THIS SOFTWARE, ITS DOCUMENTATION, OR ANY + DERIVATIVES THEREOF, EVEN IF THE AUTHORS HAVE BEEN ADVISED OF THE + POSSIBILITY OF SUCH DAMAGE. + + THE AUTHORS AND DISTRIBUTORS SPECIFICALLY DISCLAIM ANY WARRANTIES, + INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT. THIS SOFTWARE + IS PROVIDED ON AN "AS IS" BASIS, AND THE AUTHORS AND DISTRIBUTORS HAVE + NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR + MODIFICATIONS. + + GOVERNMENT USE: If you are acquiring this software on behalf of the + U.S. government, the Government shall have only "Restricted Rights" + in the software and related documentation as defined in the Federal + Acquisition Regulations (FARs) in Clause 52.227.19 (c) (2). If you + are acquiring the software on behalf of the Department of Defense, the + software shall be classified as "Commercial Computer Software" and the + Government shall have only "Restricted Rights" as defined in Clause + 252.227-7013 (c) (1) of DFARs. Notwithstanding the foregoing, the + authors grant the U.S. 
Government and others acting in its behalf + permission to use and distribute the software in accordance with the + terms specified in this license. + +Apache License +Version 2.0, January 2004 +http://www.apache.org/licenses/ + +TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + +1. Definitions. + +"License" shall mean the terms and conditions for use, reproduction, and +distribution as defined by Sections 1 through 9 of this document. + +"Licensor" shall mean the copyright owner or entity authorized by the copyright +owner that is granting the License. + +"Legal Entity" shall mean the union of the acting entity and all other entities +that control, are controlled by, or are under common control with that entity. +For the purposes of this definition, "control" means (i) the power, direct or +indirect, to cause the direction or management of such entity, whether by +contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the +outstanding shares, or (iii) beneficial ownership of such entity. + +"You" (or "Your") shall mean an individual or Legal Entity exercising +permissions granted by this License. + +"Source" form shall mean the preferred form for making modifications, including +but not limited to software source code, documentation source, and configuration +files. + +"Object" form shall mean any form resulting from mechanical transformation or +translation of a Source form, including but not limited to compiled object code, +generated documentation, and conversions to other media types. + +"Work" shall mean the work of authorship, whether in Source or Object form, made +available under the License, as indicated by a copyright notice that is included +in or attached to the work (an example is provided in the Appendix below).
+ +"Derivative Works" shall mean any work, whether in Source or Object form, that +is based on (or derived from) the Work and for which the editorial revisions, +annotations, elaborations, or other modifications represent, as a whole, an +original work of authorship. For the purposes of this License, Derivative Works +shall not include works that remain separable from, or merely link (or bind by +name) to the interfaces of, the Work and Derivative Works thereof. + +"Contribution" shall mean any work of authorship, including the original version +of the Work and any modifications or additions to that Work or Derivative Works +thereof, that is intentionally submitted to Licensor for inclusion in the Work +by the copyright owner or by an individual or Legal Entity authorized to submit +on behalf of the copyright owner. For the purposes of this definition, +"submitted" means any form of electronic, verbal, or written communication sent +to the Licensor or its representatives, including but not limited to +communication on electronic mailing lists, source code control systems, and +issue tracking systems that are managed by, or on behalf of, the Licensor for +the purpose of discussing and improving the Work, but excluding communication +that is conspicuously marked or otherwise designated in writing by the copyright +owner as "Not a Contribution." + +"Contributor" shall mean Licensor and any individual or Legal Entity on behalf +of whom a Contribution has been received by Licensor and subsequently +incorporated within the Work. + +2. Grant of Copyright License. + +Subject to the terms and conditions of this License, each Contributor hereby +grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, +irrevocable copyright license to reproduce, prepare Derivative Works of, +publicly display, publicly perform, sublicense, and distribute the Work and such +Derivative Works in Source or Object form. + +3. Grant of Patent License. 
+ +Subject to the terms and conditions of this License, each Contributor hereby +grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, +irrevocable (except as stated in this section) patent license to make, have +made, use, offer to sell, sell, import, and otherwise transfer the Work, where +such license applies only to those patent claims licensable by such Contributor +that are necessarily infringed by their Contribution(s) alone or by combination +of their Contribution(s) with the Work to which such Contribution(s) was +submitted. If You institute patent litigation against any entity (including a +cross-claim or counterclaim in a lawsuit) alleging that the Work or a +Contribution incorporated within the Work constitutes direct or contributory +patent infringement, then any patent licenses granted to You under this License +for that Work shall terminate as of the date such litigation is filed. + +4. Redistribution. + +You may reproduce and distribute copies of the Work or Derivative Works thereof +in any medium, with or without modifications, and in Source or Object form, +provided that You meet the following conditions: + +You must give any other recipients of the Work or Derivative Works a copy of +this License; and +You must cause any modified files to carry prominent notices stating that You +changed the files; and +You must retain, in the Source form of any Derivative Works that You distribute, +all copyright, patent, trademark, and attribution notices from the Source form +of the Work, excluding those notices that do not pertain to any part of the +Derivative Works; and +If the Work includes a "NOTICE" text file as part of its distribution, then any +Derivative Works that You distribute must include a readable copy of the +attribution notices contained within such NOTICE file, excluding those notices +that do not pertain to any part of the Derivative Works, in at least one of the +following places: within a NOTICE text file 
distributed as part of the +Derivative Works; within the Source form or documentation, if provided along +with the Derivative Works; or, within a display generated by the Derivative +Works, if and wherever such third-party notices normally appear. The contents of +the NOTICE file are for informational purposes only and do not modify the +License. You may add Your own attribution notices within Derivative Works that +You distribute, alongside or as an addendum to the NOTICE text from the Work, +provided that such additional attribution notices cannot be construed as +modifying the License. +You may add Your own copyright statement to Your modifications and may provide +additional or different license terms and conditions for use, reproduction, or +distribution of Your modifications, or for any such Derivative Works as a whole, +provided Your use, reproduction, and distribution of the Work otherwise complies +with the conditions stated in this License. + +5. Submission of Contributions. + +Unless You explicitly state otherwise, any Contribution intentionally submitted +for inclusion in the Work by You to the Licensor shall be under the terms and +conditions of this License, without any additional terms or conditions. +Notwithstanding the above, nothing herein shall supersede or modify the terms of +any separate license agreement you may have executed with Licensor regarding +such Contributions. + +6. Trademarks. + +This License does not grant permission to use the trade names, trademarks, +service marks, or product names of the Licensor, except as required for +reasonable and customary use in describing the origin of the Work and +reproducing the content of the NOTICE file. + +7. Disclaimer of Warranty. 
+ +Unless required by applicable law or agreed to in writing, Licensor provides the +Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, +including, without limitation, any warranties or conditions of TITLE, +NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are +solely responsible for determining the appropriateness of using or +redistributing the Work and assume any risks associated with Your exercise of +permissions under this License. + +8. Limitation of Liability. + +In no event and under no legal theory, whether in tort (including negligence), +contract, or otherwise, unless required by applicable law (such as deliberate +and grossly negligent acts) or agreed to in writing, shall any Contributor be +liable to You for damages, including any direct, indirect, special, incidental, +or consequential damages of any character arising as a result of this License or +out of the use or inability to use the Work (including but not limited to +damages for loss of goodwill, work stoppage, computer failure or malfunction, or +any and all other commercial damages or losses), even if such Contributor has +been advised of the possibility of such damages. + +9. Accepting Warranty or Additional Liability. + +While redistributing the Work or Derivative Works thereof, You may choose to +offer, and charge a fee for, acceptance of support, warranty, indemnity, or +other liability obligations and/or rights consistent with this License. However, +in accepting such obligations, You may act only on Your own behalf and on Your +sole responsibility, not on behalf of any other Contributor, and only if You +agree to indemnify, defend, and hold each Contributor harmless for any liability +incurred by, or claims asserted against, such Contributor by reason of your +accepting any such warranty or additional liability. 
+ +END OF TERMS AND CONDITIONS + +APPENDIX: How to apply the Apache License to your work + +To apply the Apache License to your work, attach the following boilerplate +notice, with the fields enclosed by brackets "[]" replaced with your own +identifying information. (Don't include the brackets!) The text should be +enclosed in the appropriate comment syntax for the file format. We also +recommend that a file or class name and description of purpose be included on +the same "printed page" as the copyright notice for easier identification within +third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +pandocfilters +BSD License +http://github.com/jgm/pandocfilters +Copyright (c) 2013, John MacFarlane +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + - Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + - Neither the name of John Macfarlane nor the names of its contributors may + be used to endorse or promote products derived from this software without + specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +parse +MIT +https://github.com/r1chardj0n3s/parse +Copyright (c) 2012-2019 Richard Jones + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +parso +MIT License +https://github.com/davidhalter/parso +All contributions towards parso are MIT licensed. + +Some Python files have been taken from the standard library and are therefore +PSF licensed. Modifications on these files are dual licensed (both MIT and +PSF). These files are: + +- parso/pgen2/* +- parso/tokenize.py +- parso/token.py +- test/test_pgen2.py + +Also some test files under test/normalizer_issue_files have been copied from +https://github.com/PyCQA/pycodestyle (Expat License == MIT License). + +------------------------------------------------------------------------------- +The MIT License (MIT) + +Copyright (c) <2013-2017> + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + +------------------------------------------------------------------------------- + +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. 
This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, PSF hereby +grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, +analyze, test, perform and/or display publicly, prepare derivative works, +distribute, and otherwise use Python alone or in any derivative version, +provided, however, that PSF's License Agreement and PSF's notice of copyright, +i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, +2011, 2012, 2013, 2014, 2015 Python Software Foundation; All Rights Reserved" +are retained in Python alone or in any derivative version prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. 
Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +pathspec +Mozilla Public License 2.0 (MPL 2.0) +UNKNOWN +Mozilla Public License Version 2.0 +================================== + +1. Definitions +-------------- + +1.1. "Contributor" + means each individual or legal entity that creates, contributes to + the creation of, or owns Covered Software. + +1.2. "Contributor Version" + means the combination of the Contributions of others (if any) used + by a Contributor and that particular Contributor's Contribution. + +1.3. "Contribution" + means Covered Software of a particular Contributor. + +1.4. "Covered Software" + means Source Code Form to which the initial Contributor has attached + the notice in Exhibit A, the Executable Form of such Source Code + Form, and Modifications of such Source Code Form, in each case + including portions thereof. + +1.5. "Incompatible With Secondary Licenses" + means + + (a) that the initial Contributor has attached the notice described + in Exhibit B to the Covered Software; or + + (b) that the Covered Software was made available under the terms of + version 1.1 or earlier of the License, but not also under the + terms of a Secondary License. + +1.6. "Executable Form" + means any form of the work other than Source Code Form. + +1.7. "Larger Work" + means a work that combines Covered Software with other material, in + a separate file or files, that is not Covered Software. + +1.8. "License" + means this document. + +1.9. 
"Licensable" + means having the right to grant, to the maximum extent possible, + whether at the time of the initial grant or subsequently, any and + all of the rights conveyed by this License. + +1.10. "Modifications" + means any of the following: + + (a) any file in Source Code Form that results from an addition to, + deletion from, or modification of the contents of Covered + Software; or + + (b) any new file in Source Code Form that contains any Covered + Software. + +1.11. "Patent Claims" of a Contributor + means any patent claim(s), including without limitation, method, + process, and apparatus claims, in any patent Licensable by such + Contributor that would be infringed, but for the grant of the + License, by the making, using, selling, offering for sale, having + made, import, or transfer of either its Contributions or its + Contributor Version. + +1.12. "Secondary License" + means either the GNU General Public License, Version 2.0, the GNU + Lesser General Public License, Version 2.1, the GNU Affero General + Public License, Version 3.0, or any later versions of those + licenses. + +1.13. "Source Code Form" + means the form of the work preferred for making modifications. + +1.14. "You" (or "Your") + means an individual or a legal entity exercising rights under this + License. For legal entities, "You" includes any entity that + controls, is controlled by, or is under common control with You. For + purposes of this definition, "control" means (a) the power, direct + or indirect, to cause the direction or management of such entity, + whether by contract or otherwise, or (b) ownership of more than + fifty percent (50%) of the outstanding shares or beneficial + ownership of such entity. + +2. License Grants and Conditions +-------------------------------- + +2.1. 
Grants + +Each Contributor hereby grants You a world-wide, royalty-free, +non-exclusive license: + +(a) under intellectual property rights (other than patent or trademark) + Licensable by such Contributor to use, reproduce, make available, + modify, display, perform, distribute, and otherwise exploit its + Contributions, either on an unmodified basis, with Modifications, or + as part of a Larger Work; and + +(b) under Patent Claims of such Contributor to make, use, sell, offer + for sale, have made, import, and otherwise transfer either its + Contributions or its Contributor Version. + +2.2. Effective Date + +The licenses granted in Section 2.1 with respect to any Contribution +become effective for each Contribution on the date the Contributor first +distributes such Contribution. + +2.3. Limitations on Grant Scope + +The licenses granted in this Section 2 are the only rights granted under +this License. No additional rights or licenses will be implied from the +distribution or licensing of Covered Software under this License. +Notwithstanding Section 2.1(b) above, no patent license is granted by a +Contributor: + +(a) for any code that a Contributor has removed from Covered Software; + or + +(b) for infringements caused by: (i) Your and any other third party's + modifications of Covered Software, or (ii) the combination of its + Contributions with other software (except as part of its Contributor + Version); or + +(c) under Patent Claims infringed by Covered Software in the absence of + its Contributions. + +This License does not grant any rights in the trademarks, service marks, +or logos of any Contributor (except as may be necessary to comply with +the notice requirements in Section 3.4). + +2.4. 
Subsequent Licenses + +No Contributor makes additional grants as a result of Your choice to +distribute the Covered Software under a subsequent version of this +License (see Section 10.2) or under the terms of a Secondary License (if +permitted under the terms of Section 3.3). + +2.5. Representation + +Each Contributor represents that the Contributor believes its +Contributions are its original creation(s) or it has sufficient rights +to grant the rights to its Contributions conveyed by this License. + +2.6. Fair Use + +This License is not intended to limit any rights You have under +applicable copyright doctrines of fair use, fair dealing, or other +equivalents. + +2.7. Conditions + +Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted +in Section 2.1. + +3. Responsibilities +------------------- + +3.1. Distribution of Source Form + +All distribution of Covered Software in Source Code Form, including any +Modifications that You create or to which You contribute, must be under +the terms of this License. You must inform recipients that the Source +Code Form of the Covered Software is governed by the terms of this +License, and how they can obtain a copy of this License. You may not +attempt to alter or restrict the recipients' rights in the Source Code +Form. + +3.2. 
Distribution of Executable Form + +If You distribute Covered Software in Executable Form then: + +(a) such Covered Software must also be made available in Source Code + Form, as described in Section 3.1, and You must inform recipients of + the Executable Form how they can obtain a copy of such Source Code + Form by reasonable means in a timely manner, at a charge no more + than the cost of distribution to the recipient; and + +(b) You may distribute such Executable Form under the terms of this + License, or sublicense it under different terms, provided that the + license for the Executable Form does not attempt to limit or alter + the recipients' rights in the Source Code Form under this License. + +3.3. Distribution of a Larger Work + +You may create and distribute a Larger Work under terms of Your choice, +provided that You also comply with the requirements of this License for +the Covered Software. If the Larger Work is a combination of Covered +Software with a work governed by one or more Secondary Licenses, and the +Covered Software is not Incompatible With Secondary Licenses, this +License permits You to additionally distribute such Covered Software +under the terms of such Secondary License(s), so that the recipient of +the Larger Work may, at their option, further distribute the Covered +Software under the terms of either this License or such Secondary +License(s). + +3.4. Notices + +You may not remove or alter the substance of any license notices +(including copyright notices, patent notices, disclaimers of warranty, +or limitations of liability) contained within the Source Code Form of +the Covered Software, except that You may alter any license notices to +the extent required to remedy known factual inaccuracies. + +3.5. Application of Additional Terms + +You may choose to offer, and to charge a fee for, warranty, support, +indemnity or liability obligations to one or more recipients of Covered +Software. 
However, You may do so only on Your own behalf, and not on +behalf of any Contributor. You must make it absolutely clear that any +such warranty, support, indemnity, or liability obligation is offered by +You alone, and You hereby agree to indemnify every Contributor for any +liability incurred by such Contributor as a result of warranty, support, +indemnity or liability terms You offer. You may include additional +disclaimers of warranty and limitations of liability specific to any +jurisdiction. + +4. Inability to Comply Due to Statute or Regulation +--------------------------------------------------- + +If it is impossible for You to comply with any of the terms of this +License with respect to some or all of the Covered Software due to +statute, judicial order, or regulation then You must: (a) comply with +the terms of this License to the maximum extent possible; and (b) +describe the limitations and the code they affect. Such description must +be placed in a text file included with all distributions of the Covered +Software under this License. Except to the extent prohibited by statute +or regulation, such description must be sufficiently detailed for a +recipient of ordinary skill to be able to understand it. + +5. Termination +-------------- + +5.1. The rights granted under this License will terminate automatically +if You fail to comply with any of its terms. However, if You become +compliant, then the rights granted under this License from a particular +Contributor are reinstated (a) provisionally, unless and until such +Contributor explicitly and finally terminates Your grants, and (b) on an +ongoing basis, if such Contributor fails to notify You of the +non-compliance by some reasonable means prior to 60 days after You have +come back into compliance. 
Moreover, Your grants from a particular +Contributor are reinstated on an ongoing basis if such Contributor +notifies You of the non-compliance by some reasonable means, this is the +first time You have received notice of non-compliance with this License +from such Contributor, and You become compliant prior to 30 days after +Your receipt of the notice. + +5.2. If You initiate litigation against any entity by asserting a patent +infringement claim (excluding declaratory judgment actions, +counter-claims, and cross-claims) alleging that a Contributor Version +directly or indirectly infringes any patent, then the rights granted to +You by any and all Contributors for the Covered Software under Section +2.1 of this License shall terminate. + +5.3. In the event of termination under Sections 5.1 or 5.2 above, all +end user license agreements (excluding distributors and resellers) which +have been validly granted by You or Your distributors under this License +prior to termination shall survive termination. + +************************************************************************ +* * +* 6. Disclaimer of Warranty * +* ------------------------- * +* * +* Covered Software is provided under this License on an "as is" * +* basis, without warranty of any kind, either expressed, implied, or * +* statutory, including, without limitation, warranties that the * +* Covered Software is free of defects, merchantable, fit for a * +* particular purpose or non-infringing. The entire risk as to the * +* quality and performance of the Covered Software is with You. * +* Should any Covered Software prove defective in any respect, You * +* (not any Contributor) assume the cost of any necessary servicing, * +* repair, or correction. This disclaimer of warranty constitutes an * +* essential part of this License. No use of any Covered Software is * +* authorized under this License except under this disclaimer. 
* +* * +************************************************************************ + +************************************************************************ +* * +* 7. Limitation of Liability * +* -------------------------- * +* * +* Under no circumstances and under no legal theory, whether tort * +* (including negligence), contract, or otherwise, shall any * +* Contributor, or anyone who distributes Covered Software as * +* permitted above, be liable to You for any direct, indirect, * +* special, incidental, or consequential damages of any character * +* including, without limitation, damages for lost profits, loss of * +* goodwill, work stoppage, computer failure or malfunction, or any * +* and all other commercial damages or losses, even if such party * +* shall have been informed of the possibility of such damages. This * +* limitation of liability shall not apply to liability for death or * +* personal injury resulting from such party's negligence to the * +* extent applicable law prohibits such limitation. Some * +* jurisdictions do not allow the exclusion or limitation of * +* incidental or consequential damages, so this exclusion and * +* limitation may not apply to You. * +* * +************************************************************************ + +8. Litigation +------------- + +Any litigation relating to this License may be brought only in the +courts of a jurisdiction where the defendant maintains its principal +place of business and such litigation shall be governed by laws of that +jurisdiction, without reference to its conflict-of-law provisions. +Nothing in this Section shall prevent a party's ability to bring +cross-claims or counter-claims. + +9. Miscellaneous +---------------- + +This License represents the complete agreement concerning the subject +matter hereof. If any provision of this License is held to be +unenforceable, such provision shall be reformed only to the extent +necessary to make it enforceable. 
Any law or regulation which provides +that the language of a contract shall be construed against the drafter +shall not be used to construe this License against a Contributor. + +10. Versions of the License +--------------------------- + +10.1. New Versions + +Mozilla Foundation is the license steward. Except as provided in Section +10.3, no one other than the license steward has the right to modify or +publish new versions of this License. Each version will be given a +distinguishing version number. + +10.2. Effect of New Versions + +You may distribute the Covered Software under the terms of the version +of the License under which You originally received the Covered Software, +or under the terms of any subsequent version published by the license +steward. + +10.3. Modified Versions + +If you create software not governed by this License, and you want to +create a new license for such software, you may create and use a +modified version of this License if you rename the license and remove +any references to the name of the license steward (except to note that +such modified license differs from this License). + +10.4. Distributing Source Code Form that is Incompatible With Secondary +Licenses + +If You choose to distribute Source Code Form that is Incompatible With +Secondary Licenses under the terms of this version of the License, the +notice described in Exhibit B of this License must be attached. + +Exhibit A - Source Code Form License Notice +------------------------------------------- + + This Source Code Form is subject to the terms of the Mozilla Public + License, v. 2.0. If a copy of the MPL was not distributed with this + file, You can obtain one at http://mozilla.org/MPL/2.0/. + +If it is not possible or desirable to put the notice in a particular +file, then You may include the notice in a location (such as a LICENSE +file in a relevant directory) where a recipient would be likely to look +for such a notice. 
+ +You may add additional accurate notices of copyright ownership. + +Exhibit B - "Incompatible With Secondary Licenses" Notice +--------------------------------------------------------- + + This Source Code Form is "Incompatible With Secondary Licenses", as + defined by the Mozilla Public License, v. 2.0. + + +peft +Apache Software License +https://github.com/huggingface/peft + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. 
+ + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +pexpect +ISC License (ISCL) +https://pexpect.readthedocs.io/ +ISC LICENSE + + This license is approved by the OSI and FSF as GPL-compatible. + http://opensource.org/licenses/isc-license.txt + + Copyright (c) 2013-2014, Pexpect development team + Copyright (c) 2012, Noah Spurrier + + Permission to use, copy, modify, and/or distribute this software for any + purpose with or without fee is hereby granted, provided that the above + copyright notice and this permission notice appear in all copies. + + THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + + + +pillow +MIT-CMU +https://python-pillow.github.io +The Python Imaging Library (PIL) is + + Copyright © 1997-2011 by Secret Labs AB + Copyright © 1995-2011 by Fredrik Lundh and contributors + +Pillow is the friendly PIL fork. It is + + Copyright © 2010 by Jeffrey A. 
Clark and contributors + +Like PIL, Pillow is licensed under the open source MIT-CMU License: + +By obtaining, using, and/or copying this software and/or its associated +documentation, you agree that you have read, understood, and will comply +with the following terms and conditions: + +Permission to use, copy, modify and distribute this software and its +documentation for any purpose and without fee is hereby granted, +provided that the above copyright notice appears in all copies, and that +both that copyright notice and this permission notice appear in supporting +documentation, and that the name of Secret Labs AB or the author not be +used in advertising or publicity pertaining to distribution of the software +without specific, written prior permission. + +SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS +SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. +IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR ANY SPECIAL, +INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM +LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE +OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + + +---- + +AOM + +Copyright (c) 2016, Alliance for Open Media. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS +FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE +COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +---- + +BROTLI + +Copyright (c) 2009, 2010, 2013-2016 by the Brotli Authors. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +---- + +BZIP2 + + +-------------------------------------------------------------------------- + +This program, "bzip2", the associated library "libbzip2", and all +documentation, are copyright (C) 1996-2019 Julian R Seward. All +rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. The origin of this software must not be misrepresented; you must + not claim that you wrote the original software. If you use this + software in a product, an acknowledgment in the product + documentation would be appreciated but is not required. + +3. Altered source versions must be plainly marked as such, and must + not be misrepresented as being the original software. + +4. The name of the author may not be used to endorse or promote + products derived from this software without specific prior written + permission. + +THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS +OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY +DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +Julian Seward, jseward@acm.org +bzip2/libbzip2 version 1.0.8 of 13 July 2019 + +-------------------------------------------------------------------------- + + +---- + +DAV1D + +Copyright © 2018-2019, VideoLAN and dav1d authors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +---- + +FREETYPE2 + +The FreeType 2 font engine is copyrighted work and cannot be used +legally without a software license. In order to make this project +usable to a vast majority of developers, we distribute it under two +mutually exclusive open-source licenses. + +This means that *you* must choose *one* of the two licenses described +below, then obey all its terms and conditions when using FreeType 2 in +any of your projects or products. 
+ + - The FreeType License, found in the file `docs/FTL.TXT`, which is + similar to the original BSD license *with* an advertising clause + that forces you to explicitly cite the FreeType project in your + product's documentation. All details are in the license file. + This license is suited to products which don't use the GNU General + Public License. + + Note that this license is compatible to the GNU General Public + License version 3, but not version 2. + + - The GNU General Public License version 2, found in + `docs/GPLv2.TXT` (any later version can be used also), for + programs which already use the GPL. Note that the FTL is + incompatible with GPLv2 due to its advertisement clause. + +The contributed BDF and PCF drivers come with a license similar to +that of the X Window System. It is compatible to the above two +licenses (see files `src/bdf/README` and `src/pcf/README`). The same +holds for the source code files `src/base/fthash.c` and +`include/freetype/internal/fthash.h`; they were part of the BDF driver +in earlier FreeType versions. + +The gzip module uses the zlib license (see `src/gzip/zlib.h`) which +too is compatible to the above two licenses. + +The files `src/autofit/ft-hb.c` and `src/autofit/ft-hb.h` contain code +taken almost verbatim from the HarfBuzz file `hb-ft.cc`, which uses +the 'Old MIT' license, compatible to the above two licenses. + +The MD5 checksum support (only used for debugging in development +builds) is in the public domain. + +-------------------------------------------------------------------------- + + The FreeType Project LICENSE + ---------------------------- + + 2006-Jan-27 + + Copyright 1996-2002, 2006 by + David Turner, Robert Wilhelm, and Werner Lemberg + + + +Introduction +============ + + The FreeType Project is distributed in several archive packages; + some of them may contain, in addition to the FreeType font engine, + various tools and contributions which rely on, or relate to, the + FreeType Project. 
+ + This license applies to all files found in such packages, and + which do not fall under their own explicit license. The license + affects thus the FreeType font engine, the test programs, + documentation and makefiles, at the very least. + + This license was inspired by the BSD, Artistic, and IJG + (Independent JPEG Group) licenses, which all encourage inclusion + and use of free software in commercial and freeware products + alike. As a consequence, its main points are that: + + o We don't promise that this software works. However, we will be + interested in any kind of bug reports. (`as is' distribution) + + o You can use this software for whatever you want, in parts or + full form, without having to pay us. (`royalty-free' usage) + + o You may not pretend that you wrote this software. If you use + it, or only parts of it, in a program, you must acknowledge + somewhere in your documentation that you have used the + FreeType code. (`credits') + + We specifically permit and encourage the inclusion of this + software, with or without modifications, in commercial products. + We disclaim all warranties covering The FreeType Project and + assume no liability related to The FreeType Project. + + + Finally, many people asked us for a preferred form for a + credit/disclaimer to use in compliance with this license. We thus + encourage you to use the following text: + + """ + Portions of this software are copyright © <year> The FreeType + Project (www.freetype.org). All rights reserved. + """ + + Please replace <year> with the value from the FreeType version you + actually use. + + +Legal Terms +=========== + +0. Definitions +-------------- + + Throughout this license, the terms `package', `FreeType Project', + and `FreeType archive' refer to the set of files originally + distributed by the authors (David Turner, Robert Wilhelm, and + Werner Lemberg) as the `FreeType Project', be they named as alpha, + beta or final release.
+ + `You' refers to the licensee, or person using the project, where + `using' is a generic term including compiling the project's source + code as well as linking it to form a `program' or `executable'. + This program is referred to as `a program using the FreeType + engine'. + + This license applies to all files distributed in the original + FreeType Project, including all source code, binaries and + documentation, unless otherwise stated in the file in its + original, unmodified form as distributed in the original archive. + If you are unsure whether or not a particular file is covered by + this license, you must contact us to verify this. + + The FreeType Project is copyright (C) 1996-2000 by David Turner, + Robert Wilhelm, and Werner Lemberg. All rights reserved except as + specified below. + +1. No Warranty +-------------- + + THE FREETYPE PROJECT IS PROVIDED `AS IS' WITHOUT WARRANTY OF ANY + KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + PURPOSE. IN NO EVENT WILL ANY OF THE AUTHORS OR COPYRIGHT HOLDERS + BE LIABLE FOR ANY DAMAGES CAUSED BY THE USE OR THE INABILITY TO + USE, OF THE FREETYPE PROJECT. + +2. Redistribution +----------------- + + This license grants a worldwide, royalty-free, perpetual and + irrevocable right and license to use, execute, perform, compile, + display, copy, create derivative works of, distribute and + sublicense the FreeType Project (in both source and object code + forms) and derivative works thereof for any purpose; and to + authorize others to exercise some or all of the rights granted + herein, subject to the following conditions: + + o Redistribution of source code must retain this license file + (`FTL.TXT') unaltered; any additions, deletions or changes to + the original files must be clearly indicated in accompanying + documentation. The copyright notices of the unaltered, + original files must be preserved in all copies of source + files. 
+ + o Redistribution in binary form must provide a disclaimer that + states that the software is based in part of the work of the + FreeType Team, in the distribution documentation. We also + encourage you to put an URL to the FreeType web page in your + documentation, though this isn't mandatory. + + These conditions apply to any software derived from or based on + the FreeType Project, not just the unmodified files. If you use + our work, you must acknowledge us. However, no fee need be paid + to us. + +3. Advertising +-------------- + + Neither the FreeType authors and contributors nor you shall use + the name of the other for commercial, advertising, or promotional + purposes without specific prior written permission. + + We suggest, but do not require, that you use one or more of the + following phrases to refer to this software in your documentation + or advertising materials: `FreeType Project', `FreeType Engine', + `FreeType library', or `FreeType Distribution'. + + As you have not signed this license, you are not required to + accept it. However, as the FreeType Project is copyrighted + material, only this license, or another one contracted with the + authors, grants you the right to use, distribute, and modify it. + Therefore, by using, distributing, or modifying the FreeType + Project, you indicate that you understand and accept all the terms + of this license. + +4. Contacts +----------- + + There are two mailing lists related to FreeType: + + o freetype@nongnu.org + + Discusses general use and applications of FreeType, as well as + future and wanted additions to the library and distribution. + If you are looking for support, start in this list if you + haven't found anything to help you in the documentation. + + o freetype-devel@nongnu.org + + Discusses bugs, as well as engine internals, design issues, + specific licenses, porting, etc. 
+ + Our home page can be found at + + https://www.freetype.org + + +--- end of FTL.TXT --- + +The following license details are part of `src/bdf/README`: + +``` +License +******* + +Copyright (C) 2001-2002 by Francesco Zappa Nardelli + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ +*** Portions of the driver (that is, bdflib.c and bdf.h): + +Copyright 2000 Computing Research Labs, New Mexico State University +Copyright 2001-2002, 2011 Francesco Zappa Nardelli + +Permission is hereby granted, free of charge, to any person obtaining a +copy of this software and associated documentation files (the "Software"), +to deal in the Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, sublicense, +and/or sell copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +THE COMPUTING RESEARCH LAB OR NEW MEXICO STATE UNIVERSITY BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT +OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR +THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +Credits +******* + +This driver is based on excellent Mark Leisher's bdf library. If you +find something good in this driver you should probably thank him, not +me. 
+``` + +The following license details are part of `src/pcf/README`: + +``` +License +******* + +Copyright (C) 2000 by Francesco Zappa Nardelli + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +Credits +******* + +Keith Packard wrote the pcf driver found in XFree86. His work is at +the same time the specification and the sample implementation of the +PCF format. Undoubtedly, this driver is inspired from his work. +``` + + +---- + +HARFBUZZ + +HarfBuzz is licensed under the so-called "Old MIT" license. Details follow. +For parts of HarfBuzz that are licensed under different licenses see individual +files names COPYING in subdirectories where applicable. + +Copyright © 2010-2022 Google, Inc. +Copyright © 2015-2020 Ebrahim Byagowi +Copyright © 2019,2020 Facebook, Inc. 
+Copyright © 2012,2015 Mozilla Foundation +Copyright © 2011 Codethink Limited +Copyright © 2008,2010 Nokia Corporation and/or its subsidiary(-ies) +Copyright © 2009 Keith Stribley +Copyright © 2011 Martin Hosken and SIL International +Copyright © 2007 Chris Wilson +Copyright © 2005,2006,2020,2021,2022,2023 Behdad Esfahbod +Copyright © 2004,2007,2008,2009,2010,2013,2021,2022,2023 Red Hat, Inc. +Copyright © 1998-2005 David Turner and Werner Lemberg +Copyright © 2016 Igalia S.L. +Copyright © 2022 Matthias Clasen +Copyright © 2018,2021 Khaled Hosny +Copyright © 2018,2019,2020 Adobe, Inc +Copyright © 2013-2015 Alexei Podtelezhnikov + +For full copyright notices consult the individual files in the package. + + +Permission is hereby granted, without written agreement and without +license or royalty fees, to use, copy, modify, and distribute this +software and its documentation for any purpose, provided that the +above copyright notice and the following two paragraphs appear in +all copies of this software. + +IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE TO ANY PARTY FOR +DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES +ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN +IF THE COPYRIGHT HOLDER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. + +THE COPYRIGHT HOLDER SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, +BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS +ON AN "AS IS" BASIS, AND THE COPYRIGHT HOLDER HAS NO OBLIGATION TO +PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. 
+ + +---- + +LCMS2 + +Little CMS +Copyright (c) 1998-2020 Marti Maria Saguer + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +---- + +LIBAVIF + +Copyright 2019 Joe Drago. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +Files: src/obu.c + +Copyright © 2018-2019, VideoLAN and dav1d authors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +------------------------------------------------------------------------------ + +Files: third_party/iccjpeg/* + +In plain English: + +1. We don't promise that this software works. (But if you find any bugs, + please let us know!) +2. You can use this software for whatever you want. You don't have to pay us. +3. You may not pretend that you wrote this software. If you use it in a + program, you must acknowledge somewhere in your documentation that + you've used the IJG code. + +In legalese: + +The authors make NO WARRANTY or representation, either express or implied, +with respect to this software, its quality, accuracy, merchantability, or +fitness for a particular purpose. This software is provided "AS IS", and you, +its user, assume the entire risk as to its quality and accuracy. + +This software is copyright (C) 1991-2013, Thomas G. Lane, Guido Vollbeding. +All Rights Reserved except as specified below. + +Permission is hereby granted to use, copy, modify, and distribute this +software (or portions thereof) for any purpose, without fee, subject to these +conditions: +(1) If any part of the source code for this software is distributed, then this +README file must be included, with this copyright and no-warranty notice +unaltered; and any additions, deletions, or changes to the original files +must be clearly indicated in accompanying documentation. +(2) If only executable code is distributed, then the accompanying +documentation must state that "this software is based in part on the work of +the Independent JPEG Group". +(3) Permission for use of this software is granted only if the user accepts +full responsibility for any undesirable consequences; the authors accept +NO LIABILITY for damages of any kind. + +These conditions apply to any software derived from or based on the IJG code, +not just to the unmodified library. If you use our work, you ought to +acknowledge us. 
+ +Permission is NOT granted for the use of any IJG author's name or company name +in advertising or publicity relating to this software or products derived from +it. This software may be referred to only as "the Independent JPEG Group's +software". + +We specifically permit and encourage the use of this software as the basis of +commercial products, provided that all warranty or liability claims are +assumed by the product vendor. + + +The Unix configuration script "configure" was produced with GNU Autoconf. +It is copyright by the Free Software Foundation but is freely distributable. +The same holds for its supporting scripts (config.guess, config.sub, +ltmain.sh). Another support script, install-sh, is copyright by X Consortium +but is also freely distributable. + +The IJG distribution formerly included code to read and write GIF files. +To avoid entanglement with the Unisys LZW patent, GIF reading support has +been removed altogether, and the GIF writer has been simplified to produce +"uncompressed GIFs". This technique does not use the LZW algorithm; the +resulting GIF files are larger than usual, but are readable by all standard +GIF decoders. + +We are required to state that + "The Graphics Interchange Format(c) is the Copyright property of + CompuServe Incorporated. GIF(sm) is a Service Mark property of + CompuServe Incorporated." + +------------------------------------------------------------------------------ + +Files: contrib/gdk-pixbuf/* + +Copyright 2020 Emmanuel Gil Peyrot. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. 
Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +Files: android_jni/gradlew* + + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. 
+ + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +------------------------------------------------------------------------------ + +Files: third_party/libyuv/* + +Copyright 2011 The LibYuv Project Authors. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + * Neither the name of Google nor the names of its contributors may + be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +---- + +LIBJPEG + +1. We don't promise that this software works. (But if you find any bugs, + please let us know!) +2. You can use this software for whatever you want. You don't have to pay us. +3. You may not pretend that you wrote this software. If you use it in a + program, you must acknowledge somewhere in your documentation that + you've used the IJG code. + +In legalese: + +The authors make NO WARRANTY or representation, either express or implied, +with respect to this software, its quality, accuracy, merchantability, or +fitness for a particular purpose. This software is provided "AS IS", and you, +its user, assume the entire risk as to its quality and accuracy. + +This software is copyright (C) 1991-2020, Thomas G. Lane, Guido Vollbeding. +All Rights Reserved except as specified below. + +Permission is hereby granted to use, copy, modify, and distribute this +software (or portions thereof) for any purpose, without fee, subject to these +conditions: +(1) If any part of the source code for this software is distributed, then this +README file must be included, with this copyright and no-warranty notice +unaltered; and any additions, deletions, or changes to the original files +must be clearly indicated in accompanying documentation. +(2) If only executable code is distributed, then the accompanying +documentation must state that "this software is based in part on the work of +the Independent JPEG Group". 
+(3) Permission for use of this software is granted only if the user accepts +full responsibility for any undesirable consequences; the authors accept +NO LIABILITY for damages of any kind. + +These conditions apply to any software derived from or based on the IJG code, +not just to the unmodified library. If you use our work, you ought to +acknowledge us. + +Permission is NOT granted for the use of any IJG author's name or company name +in advertising or publicity relating to this software or products derived from +it. This software may be referred to only as "the Independent JPEG Group's +software". + +We specifically permit and encourage the use of this software as the basis of +commercial products, provided that all warranty or liability claims are +assumed by the product vendor. + + +---- + +LIBLZMA + +XZ Utils Licensing +================== + + Different licenses apply to different files in this package. Here + is a rough summary of which licenses apply to which parts of this + package (but check the individual files to be sure!): + + - liblzma is in the public domain. + + - xz, xzdec, and lzmadec command line tools are in the public + domain unless GNU getopt_long had to be compiled and linked + in from the lib directory. The getopt_long code is under + GNU LGPLv2.1+. + + - The scripts to grep, diff, and view compressed files have been + adapted from gzip. These scripts and their documentation are + under GNU GPLv2+. + + - All the documentation in the doc directory and most of the + XZ Utils specific documentation files in other directories + are in the public domain. + + - Translated messages are in the public domain. + + - The build system contains public domain files, and files that + are under GNU GPLv2+ or GNU GPLv3+. None of these files end up + in the binaries being built. + + - Test files and test code in the tests directory, and debugging + utilities in the debug directory are in the public domain. 
+ + - The extra directory may contain public domain files, and files + that are under various free software licenses. + + You can do whatever you want with the files that have been put into + the public domain. If you find public domain legally problematic, + take the previous sentence as a license grant. If you still find + the lack of copyright legally problematic, you have too many + lawyers. + + As usual, this software is provided "as is", without any warranty. + + If you copy significant amounts of public domain code from XZ Utils + into your project, acknowledging this somewhere in your software is + polite (especially if it is proprietary, non-free software), but + naturally it is not legally required. Here is an example of a good + notice to put into "about box" or into documentation: + + This software includes code from XZ Utils . + + The following license texts are included in the following files: + - COPYING.LGPLv2.1: GNU Lesser General Public License version 2.1 + - COPYING.GPLv2: GNU General Public License version 2 + - COPYING.GPLv3: GNU General Public License version 3 + + Note that the toolchain (compiler, linker etc.) may add some code + pieces that are copyrighted. Thus, it is possible that e.g. liblzma + binary wouldn't actually be in the public domain in its entirety + even though it contains no copyrighted code from the XZ Utils source + package. + + If you have questions, don't hesitate to ask the author(s) for more + information. + + +---- + +LIBPNG + +COPYRIGHT NOTICE, DISCLAIMER, and LICENSE +========================================= + +PNG Reference Library License version 2 +--------------------------------------- + + * Copyright (c) 1995-2022 The PNG Reference Library Authors. + * Copyright (c) 2018-2022 Cosmin Truta. + * Copyright (c) 2000-2002, 2004, 2006-2018 Glenn Randers-Pehrson. + * Copyright (c) 1996-1997 Andreas Dilger. + * Copyright (c) 1995-1996 Guy Eric Schalnat, Group 42, Inc. 
+ +The software is supplied "as is", without warranty of any kind, +express or implied, including, without limitation, the warranties +of merchantability, fitness for a particular purpose, title, and +non-infringement. In no event shall the Copyright owners, or +anyone distributing the software, be liable for any damages or +other liability, whether in contract, tort or otherwise, arising +from, out of, or in connection with the software, or the use or +other dealings in the software, even if advised of the possibility +of such damage. + +Permission is hereby granted to use, copy, modify, and distribute +this software, or portions hereof, for any purpose, without fee, +subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you + must not claim that you wrote the original software. If you + use this software in a product, an acknowledgment in the product + documentation would be appreciated, but is not required. + + 2. Altered source versions must be plainly marked as such, and must + not be misrepresented as being the original software. + + 3. This Copyright notice may not be removed or altered from any + source or altered source distribution. + + +PNG Reference Library License version 1 (for libpng 0.5 through 1.6.35) +----------------------------------------------------------------------- + +libpng versions 1.0.7, July 1, 2000, through 1.6.35, July 15, 2018 are +Copyright (c) 2000-2002, 2004, 2006-2018 Glenn Randers-Pehrson, are +derived from libpng-1.0.6, and are distributed according to the same +disclaimer and license as libpng-1.0.6 with the following individuals +added to the list of Contributing Authors: + + Simon-Pierre Cadieux + Eric S. Raymond + Mans Rullgard + Cosmin Truta + Gilles Vollant + James Yu + Mandar Sahastrabuddhe + Google Inc. 
+ Vadim Barkov + +and with the following additions to the disclaimer: + + There is no warranty against interference with your enjoyment of + the library or against infringement. There is no warranty that our + efforts or the library will fulfill any of your particular purposes + or needs. This library is provided with all faults, and the entire + risk of satisfactory quality, performance, accuracy, and effort is + with the user. + +Some files in the "contrib" directory and some configure-generated +files that are distributed with libpng have other copyright owners, and +are released under other open source licenses. + +libpng versions 0.97, January 1998, through 1.0.6, March 20, 2000, are +Copyright (c) 1998-2000 Glenn Randers-Pehrson, are derived from +libpng-0.96, and are distributed according to the same disclaimer and +license as libpng-0.96, with the following individuals added to the +list of Contributing Authors: + + Tom Lane + Glenn Randers-Pehrson + Willem van Schaik + +libpng versions 0.89, June 1996, through 0.96, May 1997, are +Copyright (c) 1996-1997 Andreas Dilger, are derived from libpng-0.88, +and are distributed according to the same disclaimer and license as +libpng-0.88, with the following individuals added to the list of +Contributing Authors: + + John Bowler + Kevin Bracey + Sam Bushell + Magnus Holmgren + Greg Roelofs + Tom Tanner + +Some files in the "scripts" directory have other copyright owners, +but are released under this license. + +libpng versions 0.5, May 1995, through 0.88, January 1996, are +Copyright (c) 1995-1996 Guy Eric Schalnat, Group 42, Inc. + +For the purposes of this copyright and license, "Contributing Authors" +is defined as the following set of individuals: + + Andreas Dilger + Dave Martindale + Guy Eric Schalnat + Paul Schmidt + Tim Wegner + +The PNG Reference Library is supplied "AS IS". The Contributing +Authors and Group 42, Inc. 
disclaim all warranties, expressed or +implied, including, without limitation, the warranties of +merchantability and of fitness for any purpose. The Contributing +Authors and Group 42, Inc. assume no liability for direct, indirect, +incidental, special, exemplary, or consequential damages, which may +result from the use of the PNG Reference Library, even if advised of +the possibility of such damage. + +Permission is hereby granted to use, copy, modify, and distribute this +source code, or portions hereof, for any purpose, without fee, subject +to the following restrictions: + + 1. The origin of this source code must not be misrepresented. + + 2. Altered versions must be plainly marked as such and must not + be misrepresented as being the original source. + + 3. This Copyright notice may not be removed or altered from any + source or altered source distribution. + +The Contributing Authors and Group 42, Inc. specifically permit, +without fee, and encourage the use of this source code as a component +to supporting the PNG file format in commercial products. If you use +this source code in a product, acknowledgment is not required but would +be appreciated. + + +---- + +LIBTIFF + +Copyright (c) 1988-1997 Sam Leffler +Copyright (c) 1991-1997 Silicon Graphics, Inc. + +Permission to use, copy, modify, distribute, and sell this software and +its documentation for any purpose is hereby granted without fee, provided +that (i) the above copyright notices and this permission notice appear in +all copies of the software and related documentation, and (ii) the names of +Sam Leffler and Silicon Graphics may not be used in any advertising or +publicity relating to the software without the specific, prior written +permission of Sam Leffler and Silicon Graphics. + +THE SOFTWARE IS PROVIDED "AS-IS" AND WITHOUT WARRANTY OF ANY KIND, +EXPRESS, IMPLIED OR OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY +WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
+ +IN NO EVENT SHALL SAM LEFFLER OR SILICON GRAPHICS BE LIABLE FOR +ANY SPECIAL, INCIDENTAL, INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND, +OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, +WHETHER OR NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY OF +LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE +OF THIS SOFTWARE. + + +---- + +LIBWEBP + +Copyright (c) 2010, Google Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + * Neither the name of Google nor the names of its contributors may + be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +---- + +LIBYUV + +Copyright 2011 The LibYuv Project Authors. All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + + * Neither the name of Google nor the names of its contributors may + be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +---- + +OPENJPEG + +* + * The copyright in this software is being made available under the 2-clauses + * BSD License, included below. This software may be subject to other third + * party and contributor rights, including patent rights, and no such rights + * are granted under this license. 
+ * + * Copyright (c) 2002-2014, Universite catholique de Louvain (UCL), Belgium + * Copyright (c) 2002-2014, Professor Benoit Macq + * Copyright (c) 2003-2014, Antonin Descampe + * Copyright (c) 2003-2009, Francois-Olivier Devaux + * Copyright (c) 2005, Herve Drolon, FreeImage Team + * Copyright (c) 2002-2003, Yannick Verschueren + * Copyright (c) 2001-2003, David Janssens + * Copyright (c) 2011-2012, Centre National d'Etudes Spatiales (CNES), France + * Copyright (c) 2012, CS Systemes d'Information, France + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS `AS IS' + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. 
+ */ + + +---- + +RAQM + +The MIT License (MIT) + +Copyright © 2015 Information Technology Authority (ITA) +Copyright © 2016 Khaled Hosny + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +---- + +XAU + +Copyright 1988, 1993, 1994, 1998 The Open Group + +Permission to use, copy, modify, distribute, and sell this software and its +documentation for any purpose is hereby granted without fee, provided that +the above copyright notice appear in all copies and that both that +copyright notice and this permission notice appear in supporting +documentation. + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +OPEN GROUP BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN +AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the name of The Open Group shall not be +used in advertising or otherwise to promote the sale, use or other dealings +in this Software without prior written authorization from The Open Group. + + +---- + +XCB + +Copyright (C) 2001-2006 Bart Massey, Jamey Sharp, and Josh Triplett. +All Rights Reserved. + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated +documentation files (the "Software"), to deal in the +Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, +sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall +be included in all copies or substantial portions of the +Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY +KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE +WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR +PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS +BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the names of the authors +or their institutions shall not be used in advertising or +otherwise to promote the sale, use or other dealings in this +Software without prior written authorization from the +authors. 
+ + +---- + +XDMCP + +Copyright 1989, 1998 The Open Group + +Permission to use, copy, modify, distribute, and sell this software and its +documentation for any purpose is hereby granted without fee, provided that +the above copyright notice appear in all copies and that both that +copyright notice and this permission notice appear in supporting +documentation. + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +OPEN GROUP BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN +AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +Except as contained in this notice, the name of The Open Group shall not be +used in advertising or otherwise to promote the sale, use or other dealings +in this Software without prior written authorization from The Open Group. + +Author: Keith Packard, MIT X Consortium + + +---- + +ZLIB + + (C) 1995-2017 Jean-loup Gailly and Mark Adler + + This software is provided 'as-is', without any express or implied + warranty. In no event will the authors be held liable for any damages + arising from the use of this software. + + Permission is granted to anyone to use this software for any purpose, + including commercial applications, and to alter it and redistribute it + freely, subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this software + in a product, an acknowledgment in the product documentation would be + appreciated but is not required. + 2. 
Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original software. + 3. This notice may not be removed or altered from any source distribution. + + Jean-loup Gailly Mark Adler + jloup@gzip.org madler@alumni.caltech.edu + +If you use the zlib library in a product, we would appreciate *not* receiving +lengthy legal documents to sign. The sources are provided for free but without +warranty of any kind. The library has been entirely written by Jean-loup +Gailly and Mark Adler; it does not include third-party code. + +If you redistribute modified sources, we would appreciate that you include in +the file ChangeLog history information documenting your changes. Please read +the FAQ for more information on the distribution of modified source versions. + + +platformdirs +MIT +https://github.com/tox-dev/platformdirs +MIT License + +Copyright (c) 2010-202x The platformdirs developers + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +pluggy +MIT License +UNKNOWN +The MIT License (MIT) + +Copyright (c) 2015 holger krekel (rather uses bitbucket/hpk42) + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +plyfile +GNU General Public License v3 or later (GPLv3+) +https://github.com/dranjan/python-plyfile + GNU GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU General Public License is a free, copyleft license for +software and other kinds of works. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +the GNU General Public License is intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. 
We, the Free Software Foundation, use the +GNU General Public License for most of our software; it applies also to +any other work released this way by its authors. You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. + + To protect your rights, we need to prevent others from denying you +these rights or asking you to surrender the rights. Therefore, you have +certain responsibilities if you distribute copies of the software, or if +you modify it: responsibilities to respect the freedom of others. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must pass on to the recipients the same +freedoms that you received. You must make sure that they, too, receive +or can get the source code. And you must show them these terms so they +know their rights. + + Developers that use the GNU GPL protect your rights with two steps: +(1) assert copyright on the software, and (2) offer you this License +giving you legal permission to copy, distribute and/or modify it. + + For the developers' and authors' protection, the GPL clearly explains +that there is no warranty for this free software. For both users' and +authors' sake, the GPL requires that modified versions be marked as +changed, so that their problems will not be attributed erroneously to +authors of previous versions. + + Some devices are designed to deny users access to install or run +modified versions of the software inside them, although the manufacturer +can do so. This is fundamentally incompatible with the aim of +protecting users' freedom to change the software. 
The systematic +pattern of such abuse occurs in the area of products for individuals to +use, which is precisely where it is most unacceptable. Therefore, we +have designed this version of the GPL to prohibit the practice for those +products. If such problems arise substantially in other domains, we +stand ready to extend this provision to those domains in future versions +of the GPL, as needed to protect the freedom of users. + + Finally, every program is threatened constantly by software patents. +States should not allow patents to restrict development and use of +software on general-purpose computers, but in those that do, we wish to +avoid the special danger that patents applied to a free program could +make it effectively proprietary. To prevent this, the GPL assures that +patents cannot be used to render the program non-free. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU General Public License. + + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". "Licensees" and +"recipients" may be individuals or organizations. + + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. 
Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. + + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. + + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. 
A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. + + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. + + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. 
You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. + + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. 
+ + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. + + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. 
+ + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. + + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. + + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. 
This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. + + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. 
For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. + + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. + + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. 
Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. + + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. +Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. 
+ + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. 
+ + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). + + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. + + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. + + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. 
Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. 
+ + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". + + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. + + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. 
+ + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. 
You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. + + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Use with the GNU Affero General Public License. + + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU Affero General Public License into a single +combined work, and to convey the resulting work. 
The terms of this +License will continue to apply to the part which is the covered work, +but the special requirements of the GNU Affero General Public License, +section 13, concerning interaction through a network will apply to the +combination as such. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU General Public License, you may choose any version ever published +by the Free Software Foundation. + + If the Program specifies that a proxy can decide which future +versions of the GNU General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. 
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. + + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. 
+ + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + + If the program does terminal interaction, make it output a short +notice like this when it starts in an interactive mode: + + Copyright (C) + This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, your program's commands +might be different; for a GUI interface, you would use an "about box". + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU GPL, see +. + + The GNU General Public License does not permit incorporating your program +into proprietary programs. If your program is a subroutine library, you +may consider it more useful to permit linking proprietary applications with +the library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. But first, please read +. 
+ + +polars +MIT License +https://www.pola.rs/ +Copyright (c) 2025 Ritchie Vink +Some portions Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +polars-runtime-32 +MIT License +https://www.pola.rs/ +Copyright (c) 2025 Ritchie Vink +Some portions Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +polyscope +MIT License +https://github.com/nmwsharp/polyscope +MIT License + +Copyright (c) 2017-2020 Nicholas Sharp and the Polyscope contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +portalocker +BSD-3-Clause +https://github.com/wolph/portalocker/ +Copyright 2022 Rick van Hattem + +Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: + +1. 
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +proglog +MIT +https://github.com/Edinburgh-Genome-Foundry/proglog +MIT License + +Copyright (c) 2017 Edinburgh Genome Foundry, University of Edinburgh + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +prometheus_client +Apache-2.0 AND BSD-2-Clause +https://github.com/prometheus/client_python + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +prompt_toolkit +BSD License +https://github.com/prompt-toolkit/python-prompt-toolkit +Copyright (c) 2014, Jonathan Slenders +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, this + list of conditions and the following disclaimer in the documentation and/or + other materials provided with the distribution. + +* Neither the name of the {organization} nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +propcache +Apache Software License +https://github.com/aio-libs/propcache + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +proto-plus +Apache Software License +https://github.com/googleapis/proto-plus-python + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +protobuf +3-Clause BSD License +https://developers.google.com/protocol-buffers/ +Copyright 2008 Google Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Code generated by the Protocol Buffer compiler is owned by the owner +of the input file used when generating it. This code is not +standalone and requires a support library to be linked with it. This +support library is itself covered by the above license. + + +psutil +BSD-3-Clause +https://github.com/giampaolo/psutil +BSD 3-Clause License + +Copyright (c) 2009, Jay Loden, Dave Daeschler, Giampaolo Rodola +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name of the psutil authors nor the names of its contributors + may be used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +psycopg2-binary +GNU Library or Lesser General Public License (LGPL) +https://psycopg.org/ +psycopg2 and the LGPL +--------------------- + +psycopg2 is free software: you can redistribute it and/or modify it +under the terms of the GNU Lesser General Public License as published +by the Free Software Foundation, either version 3 of the License, or +(at your option) any later version. + +psycopg2 is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public +License for more details. + +In addition, as a special exception, the copyright holders give +permission to link this program with the OpenSSL library (or with +modified versions of OpenSSL that use the same license as OpenSSL), +and distribute linked combinations including the two. + +You must obey the GNU Lesser General Public License in all respects for +all of the code used other than OpenSSL. If you modify file(s) with this +exception, you may extend this exception to your version of the file(s), +but you are not obligated to do so. If you do not wish to do so, delete +this exception statement from your version. If you delete this exception +statement from all source files in the program, then also delete it here. + +You should have received a copy of the GNU Lesser General Public License +along with psycopg2 (see the doc/ directory.) +If not, see . 
+ + +Alternative licenses +-------------------- + +The following BSD-like license applies (at your option) to the files following +the pattern ``psycopg/adapter*.{h,c}`` and ``psycopg/microprotocol*.{h,c}``: + + Permission is granted to anyone to use this software for any purpose, + including commercial applications, and to alter it and redistribute it + freely, subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this + software in a product, an acknowledgment in the product documentation + would be appreciated but is not required. + + 2. Altered source versions must be plainly marked as such, and must not + be misrepresented as being the original software. + + 3. This notice may not be removed or altered from any source distribution. + + +ptyprocess +ISC License (ISCL) +https://github.com/pexpect/ptyprocess +Ptyprocess is under the ISC license, as code derived from Pexpect. + http://opensource.org/licenses/ISC + +Copyright (c) 2013-2014, Pexpect development team +Copyright (c) 2012, Noah Spurrier + +PERMISSION TO USE, COPY, MODIFY, AND/OR DISTRIBUTE THIS SOFTWARE FOR ANY PURPOSE +WITH OR WITHOUT FEE IS HEREBY GRANTED, PROVIDED THAT THE ABOVE COPYRIGHT NOTICE +AND THIS PERMISSION NOTICE APPEAR IN ALL COPIES. THE SOFTWARE IS PROVIDED +"AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE +INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT +SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL +DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, +WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING +OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
+ + + +pure_eval +MIT License +http://github.com/alexmojaki/pure_eval +MIT License + +Copyright (c) 2019 Alex Hall + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +py-spy +MIT License +https://github.com/benfred/py-spy +The MIT License (MIT) + +Copyright (c) 2018-2019 Ben Frederickson + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +py3nvml +BSD License +https://github.com/fbcotter/py3nvml.git +COPYRIGHT +--------- +Copyright (c) 2011-2015, NVIDIA Corporation. All rights reserved. + +LICENSE +------- +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +- Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +- Neither the name of the NVIDIA Corporation nor the names of its contributors +may be used to endorse or promote products derived from this software without +specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + + +pyarrow +Apache Software License +https://arrow.apache.org/ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +-------------------------------------------------------------------------------- + +src/arrow/util (some portions): Apache 2.0, and 3-clause BSD + +Some portions of this module are derived from code in the Chromium project, +copyright (c) Google inc and (c) The Chromium Authors and licensed under the +Apache 2.0 License or the under the 3-clause BSD license: + + Copyright (c) 2013 The Chromium Authors. All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are + met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following disclaimer + in the documentation and/or other materials provided with the + distribution. + * Neither the name of Google Inc. nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +This project includes code from Daniel Lemire's FrameOfReference project. + +https://github.com/lemire/FrameOfReference/blob/6ccaf9e97160f9a3b299e23a8ef739e711ef0c71/src/bpacking.cpp +https://github.com/lemire/FrameOfReference/blob/146948b6058a976bc7767262ad3a2ce201486b93/scripts/turbopacking64.py + +Copyright: 2013 Daniel Lemire +Home page: http://lemire.me/en/ +Project page: https://github.com/lemire/FrameOfReference +License: Apache License Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 + +-------------------------------------------------------------------------------- + +This project includes code from the TensorFlow project + +Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +-------------------------------------------------------------------------------- + +This project includes code from the NumPy project. 
+ +https://github.com/numpy/numpy/blob/e1f191c46f2eebd6cb892a4bfe14d9dd43a06c4e/numpy/core/src/multiarray/multiarraymodule.c#L2910 + +https://github.com/numpy/numpy/blob/68fd82271b9ea5a9e50d4e761061dfcca851382a/numpy/core/src/multiarray/datetime.c + +Copyright (c) 2005-2017, NumPy Developers. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + + * Neither the name of the NumPy Developers nor the names of any + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +-------------------------------------------------------------------------------- + +This project includes code from the Boost project + +Boost Software License - Version 1.0 - August 17th, 2003 + +Permission is hereby granted, free of charge, to any person or organization +obtaining a copy of the software and accompanying documentation covered by +this license (the "Software") to use, reproduce, display, distribute, +execute, and transmit the Software, and to prepare derivative works of the +Software, and to permit third-parties to whom the Software is furnished to +do so, all subject to the following: + +The copyright notices in the Software and this entire statement, including +the above license grant, this restriction and the following disclaimer, +must be included in all copies of the Software, in whole or in part, and +all derivative works of the Software, unless such copies or derivative +works are solely in the form of machine-executable object code generated by +a source language processor. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT +SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE +FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, +ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + +-------------------------------------------------------------------------------- + +This project includes code from the FlatBuffers project + +Copyright 2014 Google Inc. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +-------------------------------------------------------------------------------- + +This project includes code from the tslib project + +Copyright 2015 Microsoft Corporation. All rights reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +-------------------------------------------------------------------------------- + +This project includes code from the jemalloc project + +https://github.com/jemalloc/jemalloc + +Copyright (C) 2002-2017 Jason Evans . +All rights reserved. +Copyright (C) 2007-2012 Mozilla Foundation. All rights reserved. +Copyright (C) 2009-2017 Facebook, Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: +1. Redistributions of source code must retain the above copyright notice(s), + this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright notice(s), + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY EXPRESS +OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO +EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +-------------------------------------------------------------------------------- + +This project includes code from the Go project, BSD 3-clause license + PATENTS +weak patent termination clause +(https://github.com/golang/go/blob/master/PATENTS). + +Copyright (c) 2009 The Go Authors. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +This project includes code from the hs2client + +https://github.com/cloudera/hs2client + +Copyright 2016 Cloudera Inc. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+ +-------------------------------------------------------------------------------- + +The script ci/scripts/util_wait_for_it.sh has the following license + +Copyright (c) 2016 Giles Hall + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +-------------------------------------------------------------------------------- + +The script r/configure has the following license (MIT) + +Copyright (c) 2017, Jeroen Ooms and Jim Hester + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +-------------------------------------------------------------------------------- + +cpp/src/arrow/util/logging.cc, cpp/src/arrow/util/logging.h and +cpp/src/arrow/util/logging-test.cc are adapted from +Ray Project (https://github.com/ray-project/ray) (Apache 2.0). + +Copyright (c) 2016 Ray Project (https://github.com/ray-project/ray) + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +-------------------------------------------------------------------------------- +The files cpp/src/arrow/vendored/datetime/date.h, cpp/src/arrow/vendored/datetime/tz.h, +cpp/src/arrow/vendored/datetime/tz_private.h, cpp/src/arrow/vendored/datetime/ios.h, +cpp/src/arrow/vendored/datetime/ios.mm, +cpp/src/arrow/vendored/datetime/tz.cpp are adapted from +Howard Hinnant's date library (https://github.com/HowardHinnant/date) +It is licensed under MIT license. 
+ +The MIT License (MIT) +Copyright (c) 2015, 2016, 2017 Howard Hinnant +Copyright (c) 2016 Adrian Colomitchi +Copyright (c) 2017 Florian Dang +Copyright (c) 2017 Paul Thompson +Copyright (c) 2018 Tomasz Kamiński + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ +-------------------------------------------------------------------------------- + +The file cpp/src/arrow/util/utf8.h includes code adapted from the page + https://bjoern.hoehrmann.de/utf-8/decoder/dfa/ +with the following license (MIT) + +Copyright (c) 2008-2009 Bjoern Hoehrmann + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +-------------------------------------------------------------------------------- + +The files in cpp/src/arrow/vendored/xxhash/ have the following license +(BSD 2-Clause License) + +xxHash Library +Copyright (c) 2012-2014, Yann Collet +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. 
+ +* Redistributions in binary form must reproduce the above copyright notice, this + list of conditions and the following disclaimer in the documentation and/or + other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +You can contact the author at : +- xxHash homepage: http://www.xxhash.com +- xxHash source repository : https://github.com/Cyan4973/xxHash + +-------------------------------------------------------------------------------- + +The files in cpp/src/arrow/vendored/double-conversion/ have the following license +(BSD 3-Clause License) + +Copyright 2006-2011, the V8 project authors. All rights reserved. +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + * Neither the name of Google Inc. 
nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +The files in cpp/src/arrow/vendored/uriparser/ have the following license +(BSD 3-Clause License) + +uriparser - RFC 3986 URI parsing library + +Copyright (C) 2007, Weijia Song +Copyright (C) 2007, Sebastian Pipping +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + + * Redistributions of source code must retain the above + copyright notice, this list of conditions and the following + disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials + provided with the distribution. + + * Neither the name of the nor the names of its + contributors may be used to endorse or promote products + derived from this software without specific prior written + permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS +FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE +COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, +STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED +OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +The files under dev/tasks/conda-recipes have the following license + +BSD 3-clause license +Copyright (c) 2015-2018, conda-forge +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors + may be used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR +TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF +THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +The files in cpp/src/arrow/vendored/utfcpp/ have the following license + +Copyright 2006-2018 Nemanja Trifunovic + +Permission is hereby granted, free of charge, to any person or organization +obtaining a copy of the software and accompanying documentation covered by +this license (the "Software") to use, reproduce, display, distribute, +execute, and transmit the Software, and to prepare derivative works of the +Software, and to permit third-parties to whom the Software is furnished to +do so, all subject to the following: + +The copyright notices in the Software and this entire statement, including +the above license grant, this restriction and the following disclaimer, +must be included in all copies of the Software, in whole or in part, and +all derivative works of the Software, unless such copies or derivative +works are solely in the form of machine-executable object code generated by +a source language processor. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. 
IN NO EVENT +SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE +FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, +ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + +-------------------------------------------------------------------------------- + +This project includes code from Apache Kudu. + + * cpp/cmake_modules/CompilerInfo.cmake is based on Kudu's cmake_modules/CompilerInfo.cmake + +Copyright: 2016 The Apache Software Foundation. +Home page: https://kudu.apache.org/ +License: http://www.apache.org/licenses/LICENSE-2.0 + +-------------------------------------------------------------------------------- + +This project includes code from Apache Impala (incubating), formerly +Impala. The Impala code and rights were donated to the ASF as part of the +Incubator process after the initial code imports into Apache Parquet. + +Copyright: 2012 Cloudera, Inc. +Copyright: 2016 The Apache Software Foundation. +Home page: http://impala.apache.org/ +License: http://www.apache.org/licenses/LICENSE-2.0 + +-------------------------------------------------------------------------------- + +This project includes code from Apache Aurora. + +* dev/release/{release,changelog,release-candidate} are based on the scripts from + Apache Aurora + +Copyright: 2016 The Apache Software Foundation. +Home page: https://aurora.apache.org/ +License: http://www.apache.org/licenses/LICENSE-2.0 + +-------------------------------------------------------------------------------- + +This project includes code from Snappy. + +* cpp/cmake_modules/{SnappyCMakeLists.txt,SnappyConfig.h} are based on code + from Google's Snappy project. + +Copyright: 2009 Google Inc. All rights reserved. +Homepage: https://github.com/google/snappy +License: 3-clause BSD + +-------------------------------------------------------------------------------- + +This project includes code from the manylinux project. 
+ +* python/manylinux1/scripts/{build_python.sh,python-tag-abi-tag.py, + requirements.txt} are based on code from the manylinux project. + +Copyright: 2016 manylinux +Homepage: https://github.com/pypa/manylinux +License: The MIT License (MIT) + +-------------------------------------------------------------------------------- + +This project includes code from the cymove project: + +* python/pyarrow/includes/common.pxd includes code from the cymove project + +The MIT License (MIT) +Copyright (c) 2019 Omer Ozarslan + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, +DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR +OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE +OR OTHER DEALINGS IN THE SOFTWARE. + +-------------------------------------------------------------------------------- + +The projects includes code from the Ursabot project under the dev/archery +directory. + +License: BSD 2-Clause + +Copyright 2019 RStudio, Inc. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. 
Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +This project include code from mingw-w64. + +* cpp/src/arrow/util/cpu-info.cc has a polyfill for mingw-w64 < 5 + +Copyright (c) 2009 - 2013 by the mingw-w64 project +Homepage: https://mingw-w64.org +License: Zope Public License (ZPL) Version 2.1. + +--------------------------------------------------------------------------------- + +This project include code from Google's Asylo project. 
+ +* cpp/src/arrow/result.h is based on status_or.h + +Copyright (c) Copyright 2017 Asylo authors +Homepage: https://asylo.dev/ +License: Apache 2.0 + +-------------------------------------------------------------------------------- + +This project includes code from Google's protobuf project + +* cpp/src/arrow/result.h ARROW_ASSIGN_OR_RAISE is based off ASSIGN_OR_RETURN +* cpp/src/arrow/util/bit_stream_utils.h contains code from wire_format_lite.h + +Copyright 2008 Google Inc. All rights reserved. +Homepage: https://developers.google.com/protocol-buffers/ +License: + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Code generated by the Protocol Buffer compiler is owned by the owner +of the input file used when generating it. This code is not +standalone and requires a support library to be linked with it. This +support library is itself covered by the above license. + +-------------------------------------------------------------------------------- + +3rdparty dependency LLVM is statically linked in certain binary distributions. +Additionally some sections of source code have been derived from sources in LLVM +and have been clearly labeled as such. LLVM has the following license: + +============================================================================== +The LLVM Project is under the Apache License v2.0 with LLVM Exceptions: +============================================================================== + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +---- LLVM Exceptions to the Apache 2.0 License ---- + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into an Object form of such source code, you +may redistribute such embedded portions in such Object form without complying +with the conditions of Sections 4(a), 4(b) and 4(d) of the License. + +In addition, if you combine or link compiled forms of this Software with +software that is licensed under the GPLv2 ("Combined Software") and if a +court of competent jurisdiction determines that the patent provision (Section +3), the indemnity provision (Section 9) or other Section of the License +conflicts with the conditions of the GPLv2, you may retroactively and +prospectively choose to deem waived or otherwise exclude such Section(s) of +the License, but only in their entirety and only with respect to the Combined +Software. + +============================================================================== +Software from third parties included in the LLVM Project: +============================================================================== +The LLVM Project contains third party software which is under different license +terms. 
All such code will be identified clearly using at least one of two +mechanisms: +1) It will be in a separate directory tree with its own `LICENSE.txt` or + `LICENSE` file at the top containing the specific license and restrictions + which apply to that software, or +2) It will contain specific license and restriction terms at the top of every + file. + +-------------------------------------------------------------------------------- + +3rdparty dependency gRPC is statically linked in certain binary +distributions, like the python wheels. gRPC has the following license: + +Copyright 2014 gRPC authors. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +-------------------------------------------------------------------------------- + +3rdparty dependency Apache Thrift is statically linked in certain binary +distributions, like the python wheels. Apache Thrift has the following license: + +Apache Thrift +Copyright (C) 2006 - 2019, The Apache Software Foundation + +This product includes software developed at +The Apache Software Foundation (http://www.apache.org/). + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+See the License for the specific language governing permissions and +limitations under the License. + +-------------------------------------------------------------------------------- + +3rdparty dependency Apache ORC is statically linked in certain binary +distributions, like the python wheels. Apache ORC has the following license: + +Apache ORC +Copyright 2013-2019 The Apache Software Foundation + +This product includes software developed by The Apache Software +Foundation (http://www.apache.org/). + +This product includes software developed by Hewlett-Packard: +(c) Copyright [2014-2015] Hewlett-Packard Development Company, L.P + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +-------------------------------------------------------------------------------- + +3rdparty dependency zstd is statically linked in certain binary +distributions, like the python wheels. ZSTD has the following license: + +BSD License + +For Zstandard software + +Copyright (c) 2016-present, Facebook, Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ + * Neither the name Facebook nor the names of its contributors may be used to + endorse or promote products derived from this software without specific + prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +3rdparty dependency lz4 is statically linked in certain binary +distributions, like the python wheels. lz4 has the following license: + +LZ4 Library +Copyright (c) 2011-2016, Yann Collet +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, this + list of conditions and the following disclaimer in the documentation and/or + other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +3rdparty dependency Brotli is statically linked in certain binary +distributions, like the python wheels. Brotli has the following license: + +Copyright (c) 2009, 2010, 2013-2016 by the Brotli Authors. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ +-------------------------------------------------------------------------------- + +3rdparty dependency rapidjson is statically linked in certain binary +distributions, like the python wheels. rapidjson and its dependencies have the +following licenses: + +Tencent is pleased to support the open source community by making RapidJSON +available. + +Copyright (C) 2015 THL A29 Limited, a Tencent company, and Milo Yip. +All rights reserved. + +If you have downloaded a copy of the RapidJSON binary from Tencent, please note +that the RapidJSON binary is licensed under the MIT License. +If you have downloaded a copy of the RapidJSON source code from Tencent, please +note that RapidJSON source code is licensed under the MIT License, except for +the third-party components listed below which are subject to different license +terms. Your integration of RapidJSON into your own projects may require +compliance with the MIT License, as well as the other licenses applicable to +the third-party components included within RapidJSON. To avoid the problematic +JSON license in your own projects, it's sufficient to exclude the +bin/jsonchecker/ directory, as it's the only code under the JSON license. +A copy of the MIT License is included in this file. + +Other dependencies and licenses: + + Open Source Software Licensed Under the BSD License: + -------------------------------------------------------------------- + + The msinttypes r29 + Copyright (c) 2006-2013 Alexander Chemeris + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. 
+ * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + * Neither the name of copyright holder nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND ANY + EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + DISCLAIMED. IN NO EVENT SHALL THE REGENTS AND CONTRIBUTORS BE LIABLE FOR + ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH + DAMAGE. + + Terms of the MIT License: + -------------------------------------------------------------------- + + Permission is hereby granted, free of charge, to any person obtaining a + copy of this software and associated documentation files (the "Software"), + to deal in the Software without restriction, including without limitation + the rights to use, copy, modify, merge, publish, distribute, sublicense, + and/or sell copies of the Software, and to permit persons to whom the + Software is furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included + in all copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + DEALINGS IN THE SOFTWARE. + +-------------------------------------------------------------------------------- + +3rdparty dependency snappy is statically linked in certain binary +distributions, like the python wheels. snappy has the following license: + +Copyright 2011, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + * Neither the name of Google Inc. nor the names of its contributors may be + used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +=== + +Some of the benchmark data in testdata/ is licensed differently: + + - fireworks.jpeg is Copyright 2013 Steinar H. Gunderson, and + is licensed under the Creative Commons Attribution 3.0 license + (CC-BY-3.0). See https://creativecommons.org/licenses/by/3.0/ + for more information. + + - kppkn.gtb is taken from the Gaviota chess tablebase set, and + is licensed under the MIT License. See + https://sites.google.com/site/gaviotachessengine/Home/endgame-tablebases-1 + for more information. + + - paper-100k.pdf is an excerpt (bytes 92160 to 194560) from the paper + “Combinatorial Modeling of Chromatin Features Quantitatively Predicts DNA + Replication Timing in _Drosophila_” by Federico Comoglio and Renato Paro, + which is licensed under the CC-BY license. See + http://www.ploscompbiol.org/static/license for more ifnormation. + + - alice29.txt, asyoulik.txt, plrabn12.txt and lcet10.txt are from Project + Gutenberg. The first three have expired copyrights and are in the public + domain; the latter does not have expired copyright, but is still in the + public domain according to the license information + (http://www.gutenberg.org/ebooks/53). + +-------------------------------------------------------------------------------- + +3rdparty dependency gflags is statically linked in certain binary +distributions, like the python wheels. gflags has the following license: + +Copyright (c) 2006, Google Inc. +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +3rdparty dependency glog is statically linked in certain binary +distributions, like the python wheels. glog has the following license: + +Copyright (c) 2008, Google Inc. +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +A function gettimeofday in utilities.cc is based on + +http://www.google.com/codesearch/p?hl=en#dR3YEbitojA/COPYING&q=GetSystemTimeAsFileTime%20license:bsd + +The license of this code is: + +Copyright (c) 2003-2008, Jouni Malinen and contributors +All Rights Reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. 
+ +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. Neither the name(s) of the above-listed copyright holder(s) nor the + names of its contributors may be used to endorse or promote products + derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +3rdparty dependency re2 is statically linked in certain binary +distributions, like the python wheels. re2 has the following license: + +Copyright (c) 2009 The RE2 Authors. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. 
+ * Neither the name of Google Inc. nor the names of its contributors + may be used to endorse or promote products derived from this + software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +3rdparty dependency c-ares is statically linked in certain binary +distributions, like the python wheels. c-ares has the following license: + +# c-ares license + +Copyright (c) 2007 - 2018, Daniel Stenberg with many contributors, see AUTHORS +file. + +Copyright 1998 by the Massachusetts Institute of Technology. + +Permission to use, copy, modify, and distribute this software and its +documentation for any purpose and without fee is hereby granted, provided that +the above copyright notice appear in all copies and that both that copyright +notice and this permission notice appear in supporting documentation, and that +the name of M.I.T. not be used in advertising or publicity pertaining to +distribution of the software without specific, written prior permission. +M.I.T. makes no representations about the suitability of this software for any +purpose. It is provided "as is" without express or implied warranty. 
+ +-------------------------------------------------------------------------------- + +3rdparty dependency zlib is redistributed as a dynamically linked shared +library in certain binary distributions, like the python wheels. In the future +this will likely change to static linkage. zlib has the following license: + +zlib.h -- interface of the 'zlib' general purpose compression library + version 1.2.11, January 15th, 2017 + + Copyright (C) 1995-2017 Jean-loup Gailly and Mark Adler + + This software is provided 'as-is', without any express or implied + warranty. In no event will the authors be held liable for any damages + arising from the use of this software. + + Permission is granted to anyone to use this software for any purpose, + including commercial applications, and to alter it and redistribute it + freely, subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this software + in a product, an acknowledgment in the product documentation would be + appreciated but is not required. + 2. Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original software. + 3. This notice may not be removed or altered from any source distribution. + + Jean-loup Gailly Mark Adler + jloup@gzip.org madler@alumni.caltech.edu + +-------------------------------------------------------------------------------- + +3rdparty dependency openssl is redistributed as a dynamically linked shared +library in certain binary distributions, like the python wheels. openssl +preceding version 3 has the following license: + + LICENSE ISSUES + ============== + + The OpenSSL toolkit stays under a double license, i.e. both the conditions of + the OpenSSL License and the original SSLeay license apply to the toolkit. + See below for the actual license texts. 
+ + OpenSSL License + --------------- + +/* ==================================================================== + * Copyright (c) 1998-2019 The OpenSSL Project. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * + * 3. All advertising materials mentioning features or use of this + * software must display the following acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit. (http://www.openssl.org/)" + * + * 4. The names "OpenSSL Toolkit" and "OpenSSL Project" must not be used to + * endorse or promote products derived from this software without + * prior written permission. For written permission, please contact + * openssl-core@openssl.org. + * + * 5. Products derived from this software may not be called "OpenSSL" + * nor may "OpenSSL" appear in their names without prior written + * permission of the OpenSSL Project. + * + * 6. Redistributions of any form whatsoever must retain the following + * acknowledgment: + * "This product includes software developed by the OpenSSL Project + * for use in the OpenSSL Toolkit (http://www.openssl.org/)" + * + * THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY + * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE OpenSSL PROJECT OR + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * ==================================================================== + * + * This product includes cryptographic software written by Eric Young + * (eay@cryptsoft.com). This product includes software written by Tim + * Hudson (tjh@cryptsoft.com). + * + */ + + Original SSLeay License + ----------------------- + +/* Copyright (C) 1995-1998 Eric Young (eay@cryptsoft.com) + * All rights reserved. + * + * This package is an SSL implementation written + * by Eric Young (eay@cryptsoft.com). + * The implementation was written so as to conform with Netscapes SSL. + * + * This library is free for commercial and non-commercial use as long as + * the following conditions are aheared to. The following conditions + * apply to all code found in this distribution, be it the RC4, RSA, + * lhash, DES, etc., code; not just the SSL code. The SSL documentation + * included with this distribution is covered by the same copyright terms + * except that the holder is Tim Hudson (tjh@cryptsoft.com). + * + * Copyright remains Eric Young's, and as such any Copyright notices in + * the code are not to be removed. + * If this package is used in a product, Eric Young should be given attribution + * as the author of the parts of the library used. + * This can be in the form of a textual message at program startup or + * in documentation (online or textual) provided with the package. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. All advertising materials mentioning features or use of this software + * must display the following acknowledgement: + * "This product includes cryptographic software written by + * Eric Young (eay@cryptsoft.com)" + * The word 'cryptographic' can be left out if the rouines from the library + * being used are not cryptographic related :-). + * 4. If you include any Windows specific code (or a derivative thereof) from + * the apps directory (application code) you must include an acknowledgement: + * "This product includes software written by Tim Hudson (tjh@cryptsoft.com)" + * + * THIS SOFTWARE IS PROVIDED BY ERIC YOUNG ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * The licence and distribution terms for any publically available version or + * derivative of this code cannot be changed. i.e. 
this code cannot simply be + * copied and put under another distribution licence + * [including the GNU Public Licence.] + */ + +-------------------------------------------------------------------------------- + +This project includes code from the rtools-backports project. + +* ci/scripts/PKGBUILD and ci/scripts/r_windows_build.sh are based on code + from the rtools-backports project. + +Copyright: Copyright (c) 2013 - 2019, Алексей and Jeroen Ooms. +All rights reserved. +Homepage: https://github.com/r-windows/rtools-backports +License: 3-clause BSD + +-------------------------------------------------------------------------------- + +Some code from pandas has been adapted for the pyarrow codebase. pandas is +available under the 3-clause BSD license, which follows: + +pandas license +============== + +Copyright (c) 2011-2012, Lambda Foundry, Inc. and PyData Development Team +All rights reserved. + +Copyright (c) 2008-2011 AQR Capital Management, LLC +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + + * Neither the name of the copyright holder nor the names of any + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +Some bits from DyND, in particular aspects of the build system, have been +adapted from libdynd and dynd-python under the terms of the BSD 2-clause +license + +The BSD 2-Clause License + + Copyright (C) 2011-12, Dynamic NDArray Developers + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are + met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Dynamic NDArray Developers list: + + * Mark Wiebe + * Continuum Analytics + +-------------------------------------------------------------------------------- + +Some source code from Ibis (https://github.com/cloudera/ibis) has been adapted +for PyArrow. Ibis is released under the Apache License, Version 2.0. + +-------------------------------------------------------------------------------- + +dev/tasks/homebrew-formulae/apache-arrow.rb has the following license: + +BSD 2-Clause License + +Copyright (c) 2009-present, Homebrew contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +---------------------------------------------------------------------- + +cpp/src/arrow/vendored/base64.cpp has the following license + +ZLIB License + +Copyright (C) 2004-2017 René Nyffenegger + +This source code is provided 'as-is', without any express or implied +warranty. In no event will the author be held liable for any damages arising +from the use of this software. + +Permission is granted to anyone to use this software for any purpose, including +commercial applications, and to alter it and redistribute it freely, subject to +the following restrictions: + +1. The origin of this source code must not be misrepresented; you must not + claim that you wrote the original source code. If you use this source code + in a product, an acknowledgment in the product documentation would be + appreciated but is not required. + +2. Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original source code. + +3. This notice may not be removed or altered from any source distribution. + +René Nyffenegger rene.nyffenegger@adp-gmbh.ch + +-------------------------------------------------------------------------------- + +This project includes code from Folly. + + * cpp/src/arrow/vendored/ProducerConsumerQueue.h + +is based on Folly's + + * folly/Portability.h + * folly/lang/Align.h + * folly/ProducerConsumerQueue.h + +Copyright: Copyright (c) Facebook, Inc. and its affiliates. 
+Home page: https://github.com/facebook/folly +License: http://www.apache.org/licenses/LICENSE-2.0 + +-------------------------------------------------------------------------------- + +The file cpp/src/arrow/vendored/musl/strptime.c has the following license + +Copyright © 2005-2020 Rich Felker, et al. + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ +-------------------------------------------------------------------------------- + +The file cpp/cmake_modules/BuildUtils.cmake contains code from + +https://gist.github.com/cristianadam/ef920342939a89fae3e8a85ca9459b49 + +which is made available under the MIT license + +Copyright (c) 2019 Cristian Adam + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +-------------------------------------------------------------------------------- + +The files in cpp/src/arrow/vendored/portable-snippets/ contain code from + +https://github.com/nemequ/portable-snippets + +and have the following copyright notice: + +Each source file contains a preamble explaining the license situation +for that file, which takes priority over this file. With the +exception of some code pulled in from other repositories (such as +µnit, an MIT-licensed project which is used for testing), the code is +public domain, released using the CC0 1.0 Universal dedication (*). 
+ +(*) https://creativecommons.org/publicdomain/zero/1.0/legalcode + +-------------------------------------------------------------------------------- + +The files in cpp/src/arrow/vendored/fast_float/ contain code from + +https://github.com/lemire/fast_float + +which is made available under the Apache License 2.0. + +-------------------------------------------------------------------------------- + +The file python/pyarrow/vendored/docscrape.py contains code from + +https://github.com/numpy/numpydoc/ + +which is made available under the BSD 2-clause license. + +-------------------------------------------------------------------------------- + +The file python/pyarrow/vendored/version.py contains code from + +https://github.com/pypa/packaging/ + +which is made available under both the Apache license v2.0 and the +BSD 2-clause license. + +-------------------------------------------------------------------------------- + +The files in cpp/src/arrow/vendored/pcg contain code from + +https://github.com/imneme/pcg-cpp + +and have the following copyright notice: + +Copyright 2014-2019 Melissa O'Neill , + and the PCG Project contributors. + +SPDX-License-Identifier: (Apache-2.0 OR MIT) + +Licensed under the Apache License, Version 2.0 (provided in +LICENSE-APACHE.txt and at http://www.apache.org/licenses/LICENSE-2.0) +or under the MIT license (provided in LICENSE-MIT.txt and at +http://opensource.org/licenses/MIT), at your option. This file may not +be copied, modified, or distributed except according to those terms. + +Distributed on an "AS IS" BASIS, WITHOUT WARRANTY OF ANY KIND, either +express or implied. See your chosen license for details. + +-------------------------------------------------------------------------------- +r/R/dplyr-count-tally.R (some portions) + +Some portions of this file are derived from code from + +https://github.com/tidyverse/dplyr/ + +which is made available under the MIT license + +Copyright (c) 2013-2019 RStudio and others. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the “Software”), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +-------------------------------------------------------------------------------- + +The file src/arrow/util/io_util.cc contains code from the CPython project +which is made available under the Python Software Foundation License Version 2. + +-------------------------------------------------------------------------------- + +3rdparty dependency opentelemetry-cpp is statically linked in certain binary +distributions. opentelemetry-cpp is made available under the Apache License 2.0. + +Copyright The OpenTelemetry Authors +SPDX-License-Identifier: Apache-2.0 + +-------------------------------------------------------------------------------- + +ci/conan/ is based on code from Conan Package and Dependency Manager. 
+ +Copyright (c) 2019 Conan.io + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +-------------------------------------------------------------------------------- + +3rdparty dependency UCX is redistributed as a dynamically linked shared +library in certain binary distributions. UCX has the following license: + +Copyright (c) 2014-2015 UT-Battelle, LLC. All rights reserved. +Copyright (C) 2014-2020 Mellanox Technologies Ltd. All rights reserved. +Copyright (C) 2014-2015 The University of Houston System. All rights reserved. +Copyright (C) 2015 The University of Tennessee and The University + of Tennessee Research Foundation. All rights reserved. +Copyright (C) 2016-2020 ARM Ltd. All rights reserved. +Copyright (c) 2016 Los Alamos National Security, LLC. All rights reserved. +Copyright (C) 2016-2020 Advanced Micro Devices, Inc. All rights reserved. +Copyright (C) 2019 UChicago Argonne, LLC. All rights reserved. +Copyright (c) 2018-2020 NVIDIA CORPORATION. All rights reserved. 
+Copyright (C) 2020 Huawei Technologies Co., Ltd. All rights reserved. +Copyright (C) 2016-2020 Stony Brook University. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +3. Neither the name of the copyright holder nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +The file dev/tasks/r/github.packages.yml contains code from + +https://github.com/ursa-labs/arrow-r-nightly + +which is made available under the Apache License 2.0. 
+ +-------------------------------------------------------------------------------- +.github/actions/sync-nightlies/action.yml (some portions) + +Some portions of this file are derived from code from + +https://github.com/JoshPiper/rsync-docker + +which is made available under the MIT license + +Copyright (c) 2020 Joshua Piper + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ +-------------------------------------------------------------------------------- +.github/actions/sync-nightlies/action.yml (some portions) + +Some portions of this file are derived from code from + +https://github.com/burnett01/rsync-deployments + +which is made available under the MIT license + +Copyright (c) 2019-2022 Contention +Copyright (c) 2019-2022 Burnett01 + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +-------------------------------------------------------------------------------- +java/vector/src/main/java/org/apache/arrow/vector/util/IntObjectHashMap.java +java/vector/src/main/java/org/apache/arrow/vector/util/IntObjectMap.java + +These files are derived from code from Netty, which is made available under the +Apache License 2.0. 
+ +-------------------------------------------------------------------------------- +cpp/src/arrow/util/math_internal.cc (some portions) + +Some portions of this file are derived from + +https://github.com/ankane/dist-rust/ + +which is made available under the MIT license + +The MIT License (MIT) + +Copyright (c) 2021-2023 Contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + +-------------------------------------------------------------------------------- +The files cpp/src/arrow/vendored/whereami/whereami.h, +cpp/src/arrow/vendored/whereami/whereami.cc are adapted from +Grégory Pakosz's whereami library (https://github.com/gpakosz/whereami) +It is dual licensed under both the WTFPLv2 and MIT licenses. 
+ +The WTFPLv2 License + DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE + Version 2, December 2004 + + Copyright (C) 2004 Sam Hocevar + + Everyone is permitted to copy and distribute verbatim or modified + copies of this license document, and changing it is allowed as long + as the name is changed. + + DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. You just DO WHAT THE FUCK YOU WANT TO. + 1. Bla bla bla + 2. Montesqieu et camembert, vive la France, zut alors! + +The MIT License (MIT) +Copyright Gregory Pakosz + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of +the Software, and to permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS +FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR +COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +pyasn1 +BSD-2-Clause +https://github.com/pyasn1/pyasn1 +Copyright (c) 2005-2020, Ilya Etingof +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +pyasn1_modules +BSD License +https://github.com/pyasn1/pyasn1-modules +Copyright (c) 2005-2020, Ilya Etingof +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +pycocotools +FreeBSD +https://github.com/ppwwyyxx/cocoapi +UNKNOWN + +pycparser +BSD-3-Clause +https://github.com/eliben/pycparser +pycparser -- A C parser in Python + +Copyright (c) 2008-2022, Eli Bendersky +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. +* Neither the name of the copyright holder nor the names of its contributors may + be used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT +OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +pycryptodomex +BSD License; Public Domain +https://www.pycryptodome.org +The source code in PyCryptodome is partially in the public domain +and partially released under the BSD 2-Clause license. + +In either case, there are minimal if no restrictions on the redistribution, +modification and usage of the software. + +Public domain +============= + +All code originating from PyCrypto is free and unencumbered software +released into the public domain. + +Anyone is free to copy, modify, publish, use, compile, sell, or +distribute this software, either in source code form or as a compiled +binary, for any purpose, commercial or non-commercial, and by any +means. + +In jurisdictions that recognize copyright laws, the author or authors +of this software dedicate any and all copyright interest in the +software to the public domain. We make this dedication for the benefit +of the public at large and to the detriment of our heirs and +successors. We intend this dedication to be an overt act of +relinquishment in perpetuity of all present and future rights to this +software under copyright law. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
+IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR +OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, +ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR +OTHER DEALINGS IN THE SOFTWARE. + +For more information, please refer to + +BSD license +=========== + +All direct contributions to PyCryptodome are released under the following +license. The copyright of each piece belongs to the respective author. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +pydantic +MIT +https://github.com/pydantic/pydantic +The MIT License (MIT) + +Copyright (c) 2017 to present Pydantic Services Inc. and individual contributors. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +pydantic_core +MIT +https://github.com/pydantic/pydantic-core +The MIT License (MIT) + +Copyright (c) 2022 Samuel Colvin + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +pydub +MIT License +http://pydub.com +Copyright (c) 2011 James Robert, http://jiaaro.com + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ +pygltflib +MIT License +https://gitlab.com/dodgyville/pygltflib +Copyright (c) 2018 Luke Miller + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +pyinstrument +BSD License +https://github.com/joerick/pyinstrument +Copyright (c) 2014-2020, Joe Rickerby and contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors +may be used to endorse or promote products derived from this software without +specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +pynput +GNU Lesser General Public License v3 (LGPLv3) +https://github.com/moses-palmer/pynput + GNU LESSER GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007-2024 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + + This version of the GNU Lesser General Public License incorporates +the terms and conditions of version 3 of the GNU General Public +License, supplemented by the additional permissions listed below. + + 0. Additional Definitions. + + As used herein, "this License" refers to version 3 of the GNU Lesser +General Public License, and the "GNU GPL" refers to version 3 of the GNU +General Public License. + + "The Library" refers to a covered work governed by this License, +other than an Application or a Combined Work as defined below. + + An "Application" is any work that makes use of an interface provided +by the Library, but which is not otherwise based on the Library. +Defining a subclass of a class defined by the Library is deemed a mode +of using an interface provided by the Library. 
+ + A "Combined Work" is a work produced by combining or linking an +Application with the Library. The particular version of the Library +with which the Combined Work was made is also called the "Linked +Version". + + The "Minimal Corresponding Source" for a Combined Work means the +Corresponding Source for the Combined Work, excluding any source code +for portions of the Combined Work that, considered in isolation, are +based on the Application, and not on the Linked Version. + + The "Corresponding Application Code" for a Combined Work means the +object code and/or source code for the Application, including any data +and utility programs needed for reproducing the Combined Work from the +Application, but excluding the System Libraries of the Combined Work. + + 1. Exception to Section 3 of the GNU GPL. + + You may convey a covered work under sections 3 and 4 of this License +without being bound by section 3 of the GNU GPL. + + 2. Conveying Modified Versions. + + If you modify a copy of the Library, and, in your modifications, a +facility refers to a function or data to be supplied by an Application +that uses the facility (other than as an argument passed when the +facility is invoked), then you may convey a copy of the modified +version: + + a) under this License, provided that you make a good faith effort to + ensure that, in the event an Application does not supply the + function or data, the facility still operates, and performs + whatever part of its purpose remains meaningful, or + + b) under the GNU GPL, with none of the additional permissions of + this License applicable to that copy. + + 3. Object Code Incorporating Material from Library Header Files. + + The object code form of an Application may incorporate material from +a header file that is part of the Library. 
You may convey such object +code under terms of your choice, provided that, if the incorporated +material is not limited to numerical parameters, data structure +layouts and accessors, or small macros, inline functions and templates +(ten or fewer lines in length), you do both of the following: + + a) Give prominent notice with each copy of the object code that the + Library is used in it and that the Library and its use are + covered by this License. + + b) Accompany the object code with a copy of the GNU GPL and this license + document. + + 4. Combined Works. + + You may convey a Combined Work under terms of your choice that, +taken together, effectively do not restrict modification of the +portions of the Library contained in the Combined Work and reverse +engineering for debugging such modifications, if you also do each of +the following: + + a) Give prominent notice with each copy of the Combined Work that + the Library is used in it and that the Library and its use are + covered by this License. + + b) Accompany the Combined Work with a copy of the GNU GPL and this license + document. + + c) For a Combined Work that displays copyright notices during + execution, include the copyright notice for the Library among + these notices, as well as a reference directing the user to the + copies of the GNU GPL and this license document. + + d) Do one of the following: + + 0) Convey the Minimal Corresponding Source under the terms of this + License, and the Corresponding Application Code in a form + suitable for, and under terms that permit, the user to + recombine or relink the Application with a modified version of + the Linked Version to produce a modified Combined Work, in the + manner specified by section 6 of the GNU GPL for conveying + Corresponding Source. + + 1) Use a suitable shared library mechanism for linking with the + Library. 
A suitable mechanism is one that (a) uses at run time + a copy of the Library already present on the user's computer + system, and (b) will operate properly with a modified version + of the Library that is interface-compatible with the Linked + Version. + + e) Provide Installation Information, but only if you would otherwise + be required to provide such information under section 6 of the + GNU GPL, and only to the extent that such information is + necessary to install and execute a modified version of the + Combined Work produced by recombining or relinking the + Application with a modified version of the Linked Version. (If + you use option 4d0, the Installation Information must accompany + the Minimal Corresponding Source and Corresponding Application + Code. If you use option 4d1, you must provide the Installation + Information in the manner specified by section 6 of the GNU GPL + for conveying Corresponding Source.) + + 5. Combined Libraries. + + You may place library facilities that are a work based on the +Library side by side in a single library together with other library +facilities that are not Applications and are not covered by this +License, and convey such a combined library under terms of your +choice, if you do both of the following: + + a) Accompany the combined library with a copy of the same work based + on the Library, uncombined with any other library facilities, + conveyed under the terms of this License. + + b) Give prominent notice with the combined library that part of it + is a work based on the Library, and explaining where to find the + accompanying uncombined form of the same work. + + 6. Revised Versions of the GNU Lesser General Public License. + + The Free Software Foundation may publish revised and/or new versions +of the GNU Lesser General Public License from time to time. Such new +versions will be similar in spirit to the present version, but may +differ in detail to address new problems or concerns. 
+ + Each version is given a distinguishing version number. If the +Library as you received it specifies that a certain numbered version +of the GNU Lesser General Public License "or any later version" +applies to it, you have the option of following the terms and +conditions either of that published version or of any later version +published by the Free Software Foundation. If the Library as you +received it does not specify a version number of the GNU Lesser +General Public License, you may choose any version of the GNU Lesser +General Public License ever published by the Free Software Foundation. + + If the Library as you received it specifies that a proxy can decide +whether future versions of the GNU Lesser General Public License shall +apply, that proxy's public statement of acceptance of any version is +permanent authorization for you to choose that version for the +Library. + + +pyparsing +MIT +https://github.com/pyparsing/pyparsing/ +Copyright (c) 2003-2025 Paul McGuire + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +pyproject_hooks +MIT License +https://github.com/pypa/pyproject-hooks +The MIT License (MIT) + +Copyright (c) 2017 Thomas Kluyver + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +pyrefly +UNKNOWN +https://pyrefly.org +UNKNOWN + +pyserial +BSD License +https://github.com/pyserial/pyserial +UNKNOWN + +pytest +MIT +https://docs.pytest.org/en/latest/ +The MIT License (MIT) + +Copyright (c) 2004 Holger Krekel and others + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +pytest-cov +MIT +https://pytest-cov.readthedocs.io/en/latest/changelog.html +The MIT License + +Copyright (c) 2010 Meme Dough + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +pytest-custom-exit-code +MIT License +https://github.com/yashtodi94/pytest-custom_exit_code + +The MIT License (MIT) + +Copyright (c) 2019 Yash Todi + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +pytest-datadir +MIT License +http://github.com/gabrielcnr/pytest-datadir +This is the MIT license: http://www.opensource.org/licenses/mit-license.php + +Copyright (c) 2015-2022 the pytest-datadir authors and contributors . 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy of this +software and associated documentation files (the "Software"), to deal in the Software +without restriction, including without limitation the rights to use, copy, modify, merge, +publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons +to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or +substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR +PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE +FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR +OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + + +pytest-env +MIT License +https://github.com/pytest-dev/pytest-env +MIT License + +Copyright (c) 2010-202x The pytest-env developers + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +pytest-instafail +BSD License +https://github.com/pytest-dev/pytest-instafail +Copyright (c) 2013-2016, Janne Vanhala + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* The names of the contributors may not be used to endorse or promote products + derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +pytest-regressions +MIT License +https://github.com/ESSS/pytest-regressions + +The MIT License (MIT) + +Copyright (c) 2018 ESSS + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +pytest-xdist +MIT +https://github.com/pytest-dev/pytest-xdist +MIT License + +Copyright (c) 2010 Holger Krekel and contributors. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +python-dateutil +Apache Software License; BSD License +https://github.com/dateutil/dateutil +Copyright 2017- Paul Ganssle +Copyright 2017- dateutil contributors (see AUTHORS file) + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +The above license applies to all contributions after 2017-12-01, as well as +all contributions that have been re-licensed (see AUTHORS file for the list of +contributors who have re-licensed their code). +-------------------------------------------------------------------------------- +dateutil - Extensions to the standard Python datetime module. + +Copyright (c) 2003-2011 - Gustavo Niemeyer +Copyright (c) 2012-2014 - Tomi Pieviläinen +Copyright (c) 2014-2016 - Yaron de Leeuw +Copyright (c) 2015- - Paul Ganssle +Copyright (c) 2015- - dateutil contributors (see AUTHORS file) + +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + * Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +The above BSD License Applies to all code, even that also covered by Apache 2.0. 
+ +python-discovery +MIT License +https://github.com/tox-dev/python-discovery +Permission is hereby granted, free of charge, to any person obtaining a +copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be included +in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +python-dotenv +BSD-3-Clause +https://github.com/theskumar/python-dotenv +Copyright (c) 2014, Saurabh Kumar (python-dotenv), 2013, Ted Tieken (django-dotenv-rw), 2013, Jacob Kaplan-Moss (django-dotenv) + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +- Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + +- Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +- Neither the name of django-dotenv nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +python-json-logger +BSD License +https://nhairs.github.io/python-json-logger +Copyright (c) 2011, Zakaria Zajac and the python-json-logger Contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. +* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +python-memcached +Python Software Foundation License +https://github.com/linsomniac/python-memcached +UNKNOWN + +python-multipart +Apache-2.0 +https://github.com/Kludex/python-multipart +Copyright 2012, Andrew Dunham + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + https://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + + + +python-xlib +GNU Lesser General Public License v2 or later (LGPLv2+) +https://github.com/python-xlib/python-xlib + GNU LESSER GENERAL PUBLIC LICENSE + Version 2.1, February 1999 + + Copyright (C) 1991, 1999 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +[This is the first released version of the Lesser GPL. It also counts + as the successor of the GNU Library Public License, version 2, hence + the version number 2.1.] + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. 
By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software--to make sure the software is free for all its users. + + This license, the Lesser General Public License, applies to some +specially designated software packages--typically libraries--of the +Free Software Foundation and other authors who decide to use it. You +can use it too, but we suggest you first think carefully about whether +this license or the ordinary General Public License is the better +strategy to use in any particular case, based on the explanations below. + + When we speak of free software, we are referring to freedom of use, +not price. Our General Public Licenses are designed to make sure that +you have the freedom to distribute copies of free software (and charge +for this service if you wish); that you receive source code or can get +it if you want it; that you can change the software and use pieces of +it in new free programs; and that you are informed that you can do +these things. + + To protect your rights, we need to make restrictions that forbid +distributors to deny you these rights or to ask you to surrender these +rights. These restrictions translate to certain responsibilities for +you if you distribute copies of the library or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link other code with the library, you must provide +complete object files to the recipients, so that they can relink them +with the library after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + We protect your rights with a two-step method: (1) we copyright the +library, and (2) we offer you this license, which gives you legal +permission to copy, distribute and/or modify the library. 
+ + To protect each distributor, we want to make it very clear that +there is no warranty for the free library. Also, if the library is +modified by someone else and passed on, the recipients should know +that what they have is not the original version, so that the original +author's reputation will not be affected by problems that might be +introduced by others. + + Finally, software patents pose a constant threat to the existence of +any free program. We wish to make sure that a company cannot +effectively restrict the users of a free program by obtaining a +restrictive license from a patent holder. Therefore, we insist that +any patent license obtained for a version of the library must be +consistent with the full freedom of use specified in this license. + + Most GNU software, including some libraries, is covered by the +ordinary GNU General Public License. This license, the GNU Lesser +General Public License, applies to certain designated libraries, and +is quite different from the ordinary General Public License. We use +this license for certain libraries in order to permit linking those +libraries into non-free programs. + + When a program is linked with a library, whether statically or using +a shared library, the combination of the two is legally speaking a +combined work, a derivative of the original library. The ordinary +General Public License therefore permits such linking only if the +entire combination fits its criteria of freedom. The Lesser General +Public License permits more lax criteria for linking other code with +the library. + + We call this license the "Lesser" General Public License because it +does Less to protect the user's freedom than the ordinary General +Public License. It also provides other free software developers Less +of an advantage over competing non-free programs. These disadvantages +are the reason we use the ordinary General Public License for many +libraries. 
However, the Lesser license provides advantages in certain +special circumstances. + + For example, on rare occasions, there may be a special need to +encourage the widest possible use of a certain library, so that it becomes +a de-facto standard. To achieve this, non-free programs must be +allowed to use the library. A more frequent case is that a free +library does the same job as widely used non-free libraries. In this +case, there is little to gain by limiting the free library to free +software only, so we use the Lesser General Public License. + + In other cases, permission to use a particular library in non-free +programs enables a greater number of people to use a large body of +free software. For example, permission to use the GNU C Library in +non-free programs enables many more people to use the whole GNU +operating system, as well as its variant, the GNU/Linux operating +system. + + Although the Lesser General Public License is Less protective of the +users' freedom, it does ensure that the user of a program that is +linked with the Library has the freedom and the wherewithal to run +that program using a modified version of the Library. + + The precise terms and conditions for copying, distribution and +modification follow. Pay close attention to the difference between a +"work based on the library" and a "work that uses the library". The +former contains code derived from the library, whereas the latter must +be combined with the library in order to run. + + GNU LESSER GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License Agreement applies to any software library or other +program which contains a notice placed by the copyright holder or +other authorized party saying it may be distributed under the terms of +this Lesser General Public License (also called "this License"). +Each licensee is addressed as "you". 
+ + A "library" means a collection of software functions and/or data +prepared so as to be conveniently linked with application programs +(which use some of those functions and data) to form executables. + + The "Library", below, refers to any such software library or work +which has been distributed under these terms. A "work based on the +Library" means either the Library or any derivative work under +copyright law: that is to say, a work containing the Library or a +portion of it, either verbatim or with modifications and/or translated +straightforwardly into another language. (Hereinafter, translation is +included without limitation in the term "modification".) + + "Source code" for a work means the preferred form of the work for +making modifications to it. For a library, complete source code means +all the source code for all modules it contains, plus any associated +interface definition files, plus the scripts used to control compilation +and installation of the library. + + Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running a program using the Library is not restricted, and output from +such a program is covered only if its contents constitute a work based +on the Library (independent of the use of the Library in a tool for +writing it). Whether that is true depends on what the Library does +and what the program that uses the Library does. + + 1. You may copy and distribute verbatim copies of the Library's +complete source code as you receive it, in any medium, provided that +you conspicuously and appropriately publish on each copy an +appropriate copyright notice and disclaimer of warranty; keep intact +all the notices that refer to this License and to the absence of any +warranty; and distribute a copy of this License along with the +Library. 
+ + You may charge a fee for the physical act of transferring a copy, +and you may at your option offer warranty protection in exchange for a +fee. + + 2. You may modify your copy or copies of the Library or any portion +of it, thus forming a work based on the Library, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) The modified work must itself be a software library. + + b) You must cause the files modified to carry prominent notices + stating that you changed the files and the date of any change. + + c) You must cause the whole of the work to be licensed at no + charge to all third parties under the terms of this License. + + d) If a facility in the modified Library refers to a function or a + table of data to be supplied by an application program that uses + the facility, other than as an argument passed when the facility + is invoked, then you must make a good faith effort to ensure that, + in the event an application does not supply such function or + table, the facility still operates, and performs whatever part of + its purpose remains meaningful. + + (For example, a function in a library to compute square roots has + a purpose that is entirely well-defined independent of the + application. Therefore, Subsection 2d requires that any + application-supplied function or table used by this function must + be optional: if the application does not supply it, the square + root function must still compute square roots.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Library, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. 
But when you +distribute the same sections as part of a whole which is a work based +on the Library, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote +it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library +with the Library (or with a work based on the Library) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may opt to apply the terms of the ordinary GNU General Public +License instead of this License to a given copy of the Library. To do +this, you must alter all the notices that refer to this License, so +that they refer to the ordinary GNU General Public License, version 2, +instead of to this License. (If a newer version than version 2 of the +ordinary GNU General Public License has appeared, then you can specify +that version instead if you wish.) Do not make any other change in +these notices. + + Once this change is made in a given copy, it is irreversible for +that copy, so the ordinary GNU General Public License applies to all +subsequent copies and derivative works made from that copy. + + This option is useful when you wish to copy part of the code of +the Library into a program that is not a library. + + 4. 
You may copy and distribute the Library (or a portion or +derivative of it, under Section 2) in object code or executable form +under the terms of Sections 1 and 2 above provided that you accompany +it with the complete corresponding machine-readable source code, which +must be distributed under the terms of Sections 1 and 2 above on a +medium customarily used for software interchange. + + If distribution of object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the +source code from the same place satisfies the requirement to +distribute the source code, even though third parties are not +compelled to copy the source along with the object code. + + 5. A program that contains no derivative of any portion of the +Library, but is designed to work with the Library by being compiled or +linked with it, is called a "work that uses the Library". Such a +work, in isolation, is not a derivative work of the Library, and +therefore falls outside the scope of this License. + + However, linking a "work that uses the Library" with the Library +creates an executable that is a derivative of the Library (because it +contains portions of the Library), rather than a "work that uses the +library". The executable is therefore covered by this License. +Section 6 states terms for distribution of such executables. + + When a "work that uses the Library" uses material from a header file +that is part of the Library, the object code for the work may be a +derivative work of the Library even though the source code is not. +Whether this is true is especially significant if the work can be +linked without the Library, or if the work is itself a library. The +threshold for this to be true is not precisely defined by law. 
+ + If such an object file uses only numerical parameters, data +structure layouts and accessors, and small macros and small inline +functions (ten lines or less in length), then the use of the object +file is unrestricted, regardless of whether it is legally a derivative +work. (Executables containing this object code plus portions of the +Library will still fall under Section 6.) + + Otherwise, if the work is a derivative of the Library, you may +distribute the object code for the work under the terms of Section 6. +Any executables containing that work also fall under Section 6, +whether or not they are linked directly with the Library itself. + + 6. As an exception to the Sections above, you may also combine or +link a "work that uses the Library" with the Library to produce a +work containing portions of the Library, and distribute that work +under terms of your choice, provided that the terms permit +modification of the work for the customer's own use and reverse +engineering for debugging such modifications. + + You must give prominent notice with each copy of the work that the +Library is used in it and that the Library and its use are covered by +this License. You must supply a copy of this License. If the work +during execution displays copyright notices, you must include the +copyright notice for the Library among them, as well as a reference +directing the user to the copy of this License. Also, you must do one +of these things: + + a) Accompany the work with the complete corresponding + machine-readable source code for the Library including whatever + changes were used in the work (which must be distributed under + Sections 1 and 2 above); and, if the work is an executable linked + with the Library, with the complete machine-readable "work that + uses the Library", as object code and/or source code, so that the + user can modify the Library and then relink to produce a modified + executable containing the modified Library. 
(It is understood + that the user who changes the contents of definitions files in the + Library will not necessarily be able to recompile the application + to use the modified definitions.) + + b) Use a suitable shared library mechanism for linking with the + Library. A suitable mechanism is one that (1) uses at run time a + copy of the library already present on the user's computer system, + rather than copying library functions into the executable, and (2) + will operate properly with a modified version of the library, if + the user installs one, as long as the modified version is + interface-compatible with the version that the work was made with. + + c) Accompany the work with a written offer, valid for at + least three years, to give the same user the materials + specified in Subsection 6a, above, for a charge no more + than the cost of performing this distribution. + + d) If distribution of the work is made by offering access to copy + from a designated place, offer equivalent access to copy the above + specified materials from the same place. + + e) Verify that the user has already received a copy of these + materials or that you have already sent this user a copy. + + For an executable, the required form of the "work that uses the +Library" must include any data and utility programs needed for +reproducing the executable from it. However, as a special exception, +the materials to be distributed need not include anything that is +normally distributed (in either source or binary form) with the major +components (compiler, kernel, and so on) of the operating system on +which the executable runs, unless that component itself accompanies +the executable. + + It may happen that this requirement contradicts the license +restrictions of other proprietary libraries that do not normally +accompany the operating system. Such a contradiction means you cannot +use both them and the Library together in an executable that you +distribute. + + 7. 
You may place library facilities that are a work based on the +Library side-by-side in a single library together with other library +facilities not covered by this License, and distribute such a combined +library, provided that the separate distribution of the work based on +the Library and of the other library facilities is otherwise +permitted, and provided that you do these two things: + + a) Accompany the combined library with a copy of the same work + based on the Library, uncombined with any other library + facilities. This must be distributed under the terms of the + Sections above. + + b) Give prominent notice with the combined library of the fact + that part of it is a work based on the Library, and explaining + where to find the accompanying uncombined form of the same work. + + 8. You may not copy, modify, sublicense, link with, or distribute +the Library except as expressly provided under this License. Any +attempt otherwise to copy, modify, sublicense, link with, or +distribute the Library is void, and will automatically terminate your +rights under this License. However, parties who have received copies, +or rights, from you under this License will not have their licenses +terminated so long as such parties remain in full compliance. + + 9. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Library or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Library (or any work based on the +Library), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Library or works based on it. + + 10. 
Each time you redistribute the Library (or any work based on the +Library), the recipient automatically receives a license from the +original licensor to copy, distribute, link with or modify the Library +subject to these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties with +this License. + + 11. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Library at all. For example, if a patent +license would not permit royalty-free redistribution of the Library by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply, +and the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system which is +implemented by public license practices. 
Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 12. If the distribution and/or use of the Library is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Library under this License may add +an explicit geographical distribution limitation excluding those countries, +so that distribution is permitted only in or among countries not thus +excluded. In such case, this License incorporates the limitation as if +written in the body of this License. + + 13. The Free Software Foundation may publish revised and/or new +versions of the Lesser General Public License from time to time. +Such new versions will be similar in spirit to the present version, +but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library +specifies a version number of this License which applies to it and +"any later version", you have the option of following the terms and +conditions either of that version or of any later version published by +the Free Software Foundation. If the Library does not specify a +license version number, you may choose any version ever published by +the Free Software Foundation. + + 14. If you wish to incorporate parts of the Library into other free +programs whose distribution conditions are incompatible with these, +write to the author to ask for permission. For software which is +copyrighted by the Free Software Foundation, write to the Free +Software Foundation; we sometimes make exceptions for this. 
Our +decision will be guided by the two goals of preserving the free status +of all derivatives of our free software and of promoting the sharing +and reuse of software generally. + + NO WARRANTY + + 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO +WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. +EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR +OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY +KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE +LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME +THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN +WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY +AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU +FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE +LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING +RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF +SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Libraries + + If you develop a new library, and you want it to be of the greatest +possible use to the public, we recommend making it free software that +everyone can redistribute and change. You can do so by permitting +redistribution under these terms (or, alternatively, under the terms of the +ordinary General Public License). + + To apply these terms, attach the following notices to the library. 
It is +safest to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least the +"copyright" line and a pointer to where the full notice is found. + + {description} + Copyright (C) {year} {fullname} + + This library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with this library; if not, write to the Free Software + Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 + USA + +Also add information on how to contact you by electronic and paper mail. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the library, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the + library `Frob' (a library for tweaking knobs) written by James Random + Hacker. + + {signature of Ty Coon}, 1 April 1990 + Ty Coon, President of Vice + +That's all there is to it! + + +pytorch-lightning +Apache Software License +https://github.com/Lightning-AI/lightning + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. 
+ + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2018-2021 William Falcon + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +pytorch-ranger +MIT License +https://github.com/mpariente/Ranger-Deep-Learning-Optimizer + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +pytz +MIT License +http://pythonhosted.org/pytz +Copyright (c) 2003-2019 Stuart Bishop + +Permission is hereby granted, free of charge, to any person obtaining a +copy of this software and associated documentation files (the "Software"), +to deal in the Software without restriction, including without limitation +the rights to use, copy, modify, merge, publish, distribute, sublicense, +and/or sell copies of the Software, and to permit persons to whom the +Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + + +pyzmq +BSD License +https://pyzmq.readthedocs.org +BSD 3-Clause License + +Copyright (c) 2009-2012, Brian Granger, Min Ragan-Kelley + +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +qwen-vl-utils +Apache Software License +https://github.com/QwenLM/Qwen2-VL/tree/main/qwen-vl-utils +UNKNOWN + +ray +Apache 2.0 +https://github.com/ray-project/ray +A. HISTORY OF THE SOFTWARE +========================== + +Python was created in the early 1990s by Guido van Rossum at Stichting +Mathematisch Centrum (CWI, see https://www.cwi.nl) in the Netherlands +as a successor of a language called ABC. Guido remains Python's +principal author, although it includes many contributions from others. 
+ +In 1995, Guido continued his work on Python at the Corporation for +National Research Initiatives (CNRI, see https://www.cnri.reston.va.us) +in Reston, Virginia where he released several versions of the +software. + +In May 2000, Guido and the Python core development team moved to +BeOpen.com to form the BeOpen PythonLabs team. In October of the same +year, the PythonLabs team moved to Digital Creations, which became +Zope Corporation. In 2001, the Python Software Foundation (PSF, see +https://www.python.org/psf/) was formed, a non-profit organization +created specifically to own Python-related Intellectual Property. +Zope Corporation was a sponsoring member of the PSF. + +All Python releases are Open Source (see https://opensource.org for +the Open Source Definition). Historically, most, but not all, Python +releases have also been GPL-compatible; the table below summarizes +the various releases. + + Release Derived Year Owner GPL- + from compatible? (1) + + 0.9.0 thru 1.2 1991-1995 CWI yes + 1.3 thru 1.5.2 1.2 1995-1999 CNRI yes + 1.6 1.5.2 2000 CNRI no + 2.0 1.6 2000 BeOpen.com no + 1.6.1 1.6 2001 CNRI yes (2) + 2.1 2.0+1.6.1 2001 PSF no + 2.0.1 2.0+1.6.1 2001 PSF yes + 2.1.1 2.1+2.0.1 2001 PSF yes + 2.1.2 2.1.1 2002 PSF yes + 2.1.3 2.1.2 2002 PSF yes + 2.2 and above 2.1.1 2001-now PSF yes + +Footnotes: + +(1) GPL-compatible doesn't mean that we're distributing Python under + the GPL. All Python licenses, unlike the GPL, let you distribute + a modified version without making your changes open source. The + GPL-compatible licenses make it possible to combine Python with + other software that is released under the GPL; the others don't. + +(2) According to Richard Stallman, 1.6.1 is not GPL-compatible, + because its license has a choice of law clause. According to + CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1 + is "not incompatible" with the GPL. 
+ +Thanks to the many outside volunteers who have worked under Guido's +direction to make these releases possible. + + +B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON +=============================================================== + +Python software and documentation are licensed under the +Python Software Foundation License Version 2. + +Starting with Python 3.8.6, examples, recipes, and other code in +the documentation are dual licensed under the PSF License Version 2 +and the Zero-Clause BSD license. + +Some software incorporated into Python is under different licenses. +The licenses are listed with code falling under that license. + + +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, PSF hereby +grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, +analyze, test, perform and/or display publicly, prepare derivative works, +distribute, and otherwise use Python alone or in any derivative version, +provided, however, that PSF's License Agreement and PSF's notice of copyright, +i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, +2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023 Python Software Foundation; +All Rights Reserved" are retained in Python alone or in any derivative version +prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. 
PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0 +------------------------------------------- + +BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1 + +1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an +office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the +Individual or Organization ("Licensee") accessing and otherwise using +this software in source or binary form and its associated +documentation ("the Software"). + +2. 
Subject to the terms and conditions of this BeOpen Python License +Agreement, BeOpen hereby grants Licensee a non-exclusive, +royalty-free, world-wide license to reproduce, analyze, test, perform +and/or display publicly, prepare derivative works, distribute, and +otherwise use the Software alone or in any derivative version, +provided, however, that the BeOpen Python License is retained in the +Software, alone or in any derivative version prepared by Licensee. + +3. BeOpen is making the Software available to Licensee on an "AS IS" +basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE +SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS +AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY +DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +5. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +6. This License Agreement shall be governed by and interpreted in all +respects by the law of the State of California, excluding conflict of +law provisions. Nothing in this License Agreement shall be deemed to +create any relationship of agency, partnership, or joint venture +between BeOpen and Licensee. This License Agreement does not grant +permission to use BeOpen trademarks or trade names in a trademark +sense to endorse or promote products or services of Licensee, or any +third party. As an exception, the "BeOpen Python" logos available at +http://www.pythonlabs.com/logos.html may be used according to the +permissions granted on that web page. + +7. 
By copying, installing or otherwise using the software, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1 +--------------------------------------- + +1. This LICENSE AGREEMENT is between the Corporation for National +Research Initiatives, having an office at 1895 Preston White Drive, +Reston, VA 20191 ("CNRI"), and the Individual or Organization +("Licensee") accessing and otherwise using Python 1.6.1 software in +source or binary form and its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, CNRI +hereby grants Licensee a nonexclusive, royalty-free, world-wide +license to reproduce, analyze, test, perform and/or display publicly, +prepare derivative works, distribute, and otherwise use Python 1.6.1 +alone or in any derivative version, provided, however, that CNRI's +License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) +1995-2001 Corporation for National Research Initiatives; All Rights +Reserved" are retained in Python 1.6.1 alone or in any derivative +version prepared by Licensee. Alternately, in lieu of CNRI's License +Agreement, Licensee may substitute the following text (omitting the +quotes): "Python 1.6.1 is made available subject to the terms and +conditions in CNRI's License Agreement. This Agreement together with +Python 1.6.1 may be located on the internet using the following +unique, persistent identifier (known as a handle): 1895.22/1013. This +Agreement may also be obtained from a proxy server on the internet +using the following URL: http://hdl.handle.net/1895.22/1013". + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python 1.6.1 or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python 1.6.1. + +4. 
CNRI is making Python 1.6.1 available to Licensee on an "AS IS" +basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. This License Agreement shall be governed by the federal +intellectual property law of the United States, including without +limitation the federal copyright law, and, to the extent such +U.S. federal law does not apply, by the law of the Commonwealth of +Virginia, excluding Virginia's conflict of law provisions. +Notwithstanding the foregoing, with regard to derivative works based +on Python 1.6.1 that incorporate non-separable material that was +previously distributed under the GNU General Public License (GPL), the +law of the Commonwealth of Virginia shall govern this License +Agreement only as to issues arising under or with respect to +Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this +License Agreement shall be deemed to create any relationship of +agency, partnership, or joint venture between CNRI and Licensee. This +License Agreement does not grant permission to use CNRI trademarks or +trade name in a trademark sense to endorse or promote products or +services of Licensee, or any third party. + +8. 
By clicking on the "ACCEPT" button where indicated, or by copying, +installing or otherwise using Python 1.6.1, Licensee agrees to be +bound by the terms and conditions of this License Agreement. + + ACCEPT + + +CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2 +-------------------------------------------------- + +Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, +The Netherlands. All rights reserved. + +Permission to use, copy, modify, and distribute this software and its +documentation for any purpose and without fee is hereby granted, +provided that the above copyright notice appear in all copies and that +both that copyright notice and this permission notice appear in +supporting documentation, and that the name of Stichting Mathematisch +Centrum or CWI not be used in advertising or publicity pertaining to +distribution of the software without specific, written prior +permission. + +STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO +THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE +FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT +OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + +ZERO-CLAUSE BSD LICENSE FOR CODE IN THE PYTHON DOCUMENTATION +---------------------------------------------------------------------- + +Permission to use, copy, modify, and/or distribute this software for any +purpose with or without fee is hereby granted. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH +REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY +AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, +INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM +LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR +OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + + +referencing +MIT +https://github.com/python-jsonschema/referencing +Copyright (c) 2022 Julian Berman + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +regex +Apache-2.0 AND CNRI-Python +https://github.com/mrabarnett/mrab-regex +This work was derived from the 're' module of CPython 2.6 and CPython 3.1, +copyright (c) 1998-2001 by Secret Labs AB and licensed under CNRI's Python 1.6 +license. + +All additions and alterations are licensed under the Apache 2.0 License. + + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. 
+ + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2020 Matthew Barnett + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +requests +Apache Software License +https://requests.readthedocs.io + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + +retinaface-py +MIT License +https://github.com/andresprados/Pytorch_Retinaface +MIT License + +Copyright (c) 2019 + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +rfc3339-validator +MIT License +https://github.com/naimetti/rfc3339-validator +MIT License + +Copyright (c) 2019, Nicolas Aimetti + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + + +rfc3986-validator +MIT License +https://github.com/naimetti/rfc3986-validator +MIT License + +Copyright (c) 2019, Nicolas Aimetti + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + + +rfc3987-syntax +MIT +https://github.com/willynilly/rfc3987-syntax +MIT License + +Copyright (c) 2025 Will Riley + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +rich +MIT License +https://github.com/Textualize/rich +Copyright (c) 2020 Will McGugan + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +robotmq +MIT License +https://github.com/yihuai-gao/robot-message-queue +MIT License + +Copyright (c) 2024 Yihuai Gao + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +rpds-py +MIT +https://github.com/crate-py/rpds +Copyright (c) 2023 Julian Berman + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +rsa +Apache Software License +https://stuvel.eu/rsa +Copyright 2011 Sybren A. Stüvel + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + https://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+ + +ruff +MIT License +https://docs.astral.sh/ruff +MIT License + +Copyright (c) 2022 Charles Marsh + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +end of terms and conditions + +The externally maintained libraries from which parts of the Software is derived +are: + +- flake8-comprehensions, licensed as follows: + """ + MIT License + + Copyright (c) 2017 Adam Johnson + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-no-pep420, licensed as follows: + """ + MIT License + + Copyright (c) 2020 Adam Johnson + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. 
+ """ + +- flake8-tidy-imports, licensed as follows: + """ + MIT License + + Copyright (c) 2017 Adam Johnson + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-return, licensed as follows: + """ + MIT License + + Copyright (c) 2019 Afonasev Evgeniy + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-2020, licensed as follows: + """ + Copyright (c) 2019 Anthony Sottile + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. 
+ """ + +- pyupgrade, licensed as follows: + """ + Copyright (c) 2017 Anthony Sottile + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. + """ + +- flake8-blind-except, licensed as follows: + """ + The MIT License (MIT) + + Copyright (c) 2014 Elijah Andrews + + Permission is hereby granted, free of charge, to any person obtaining a copy of + this software and associated documentation files (the "Software"), to deal in + the Software without restriction, including without limitation the rights to + use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of + the Software, and to permit persons to whom the Software is furnished to do so, + subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS + FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR + COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER + IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + """ + +- flake8-gettext, licensed as follows: + """ + BSD Zero Clause License + + Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted. + + THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + """ + +- flake8-implicit-str-concat, licensed as follows: + """ + The MIT License (MIT) + + Copyright (c) 2019 Dylan Turner + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. + """ + +- flake8-debugger, licensed as follows: + """ + MIT License + + Copyright (c) 2016 Joseph Kahn + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. 
+ """ + +- flake8-pyi, licensed as follows: + """ + The MIT License (MIT) + + Copyright (c) 2016 Łukasz Langa + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-print, licensed as follows: + """ + MIT License + + Copyright (c) 2016 Joseph Kahn + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-import-conventions, licensed as follows: + """ + MIT License + + Copyright (c) 2021 João Palmeiro + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. 
+ """ + +- flake8-simplify, licensed as follows: + """ + MIT License + + Copyright (c) 2020 Martin Thoma + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-slots, licensed as follows: + """ + Copyright (c) 2021 Dominic Davis-Foster + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. + IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, + DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR + OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE + OR OTHER DEALINGS IN THE SOFTWARE. + """ + +- flake8-todos, licensed as follows: + """ + Copyright (c) 2019 EclecticIQ. All rights reserved. + Copyright (c) 2020 Gram . All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are met: + + 1. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + 3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from this + software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE + FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + """ + +- flake8-unused-arguments, licensed as follows: + """ + MIT License + + Copyright (c) 2019 Nathan Hoad + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. 
+ """ + +- pygrep-hooks, licensed as follows: + """ + Copyright (c) 2018 Anthony Sottile + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. + """ + +- autoflake, licensed as follows: + """ + Copyright (C) 2012-2018 Steven Myint + + Permission is hereby granted, free of charge, to any person obtaining a copy of + this software and associated documentation files (the "Software"), to deal in + the Software without restriction, including without limitation the rights to + use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies + of the Software, and to permit persons to whom the Software is furnished to do + so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- autotyping, licensed as follows: + """ + MIT License + + Copyright (c) 2023 Jelle Zijlstra + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. 
+ """ + +- Flake8, licensed as follows: + """ + == Flake8 License (MIT) == + + Copyright (C) 2011-2013 Tarek Ziade + Copyright (C) 2012-2016 Ian Cordasco + + Permission is hereby granted, free of charge, to any person obtaining a copy of + this software and associated documentation files (the "Software"), to deal in + the Software without restriction, including without limitation the rights to + use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies + of the Software, and to permit persons to whom the Software is furnished to do + so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-bugbear, licensed as follows: + """ + The MIT License (MIT) + + Copyright (c) 2016 Łukasz Langa + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-commas, licensed as follows: + """ + The MIT License (MIT) + + Copyright (c) 2017 Thomas Grainger. + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. + + + Portions of this flake8-commas Software may utilize the following + copyrighted material, the use of which is hereby acknowledged. 
+ + Original flake8-commas: https://github.com/trevorcreech/flake8-commas/commit/e8563b71b1d5442e102c8734c11cb5202284293d + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. + """ + +- flynt, licensed as follows: + """ + MIT License + + Copyright (c) 2019-2022 Ilya Kamenshchikov + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- isort, licensed as follows: + """ + The MIT License (MIT) + + Copyright (c) 2013 Timothy Edmund Crosley + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. 
+ """ + +- pep8-naming, licensed as follows: + """ + Copyright © 2013 Florent Xicluna + + Licensed under the terms of the Expat License + + Permission is hereby granted, free of charge, to any person + obtaining a copy of this software and associated documentation files + (the "Software"), to deal in the Software without restriction, + including without limitation the rights to use, copy, modify, merge, + publish, distribute, sublicense, and/or sell copies of the Software, + and to permit persons to whom the Software is furnished to do so, + subject to the following conditions: + + The above copyright notice and this permission notice shall be + included in all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- pycodestyle, licensed as follows: + """ + Copyright © 2006-2009 Johann C. Rocholl + Copyright © 2009-2014 Florent Xicluna + Copyright © 2014-2020 Ian Lee + + Licensed under the terms of the Expat License + + Permission is hereby granted, free of charge, to any person + obtaining a copy of this software and associated documentation files + (the "Software"), to deal in the Software without restriction, + including without limitation the rights to use, copy, modify, merge, + publish, distribute, sublicense, and/or sell copies of the Software, + and to permit persons to whom the Software is furnished to do so, + subject to the following conditions: + + The above copyright notice and this permission notice shall be + included in all copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- pydocstyle, licensed as follows: + """ + Copyright (c) 2012 GreenSteam, + + Copyright (c) 2014-2020 Amir Rachum, + + Copyright (c) 2020 Sambhav Kothari, + + Permission is hereby granted, free of charge, to any person obtaining a copy of + this software and associated documentation files (the "Software"), to deal in + the Software without restriction, including without limitation the rights to + use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies + of the Software, and to permit persons to whom the Software is furnished to do + so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- Pyflakes, licensed as follows: + """ + Copyright 2005-2011 Divmod, Inc. 
+ Copyright 2013-2014 Florent Xicluna + + Permission is hereby granted, free of charge, to any person obtaining + a copy of this software and associated documentation files (the + "Software"), to deal in the Software without restriction, including + without limitation the rights to use, copy, modify, merge, publish, + distribute, sublicense, and/or sell copies of the Software, and to + permit persons to whom the Software is furnished to do so, subject to + the following conditions: + + The above copyright notice and this permission notice shall be + included in all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE + LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION + OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION + WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + """ + +- flake8-use-pathlib, licensed as follows: + """ + MIT License + + Copyright (c) 2021 Rodolphe Pelloux-Prayer + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- RustPython, licensed as follows: + """ + MIT License + + Copyright (c) 2020 RustPython Team + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-annotations, licensed as follows: + """ + MIT License + + Copyright (c) 2019 - Present S. 
Co1 + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-async, licensed as follows: + """ + MIT License + + Copyright (c) 2022 Cooper Lees + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-type-checking, licensed as follows: + """ + Copyright (c) 2021, Sondre Lillebø Gundersen + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name of pytest-{{ cookiecutter.plugin_name }} nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE + FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ """ + +- flake8-bandit, licensed as follows: + """ + Copyright (c) 2017 Tyler Wince + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. + """ + +- flake8-eradicate, licensed as follows: + """ + MIT License + + Copyright (c) 2018 Nikita Sobolev + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-quotes, licensed as follows: + """ + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in + all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + THE SOFTWARE. + """ + +- flake8-logging-format, licensed as follows: + """ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. 
+ + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright {yyyy} {name of copyright owner} + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + """ + +- flake8-raise, licensed as follows: + """ + MIT License + + Copyright (c) 2020 Jon Dufresne + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-self, licensed as follows: + """ + MIT License + + Copyright (c) 2023 Korijn van Golen + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-django, licensed under the GPL license. 
+ +- perflint, licensed as follows: + """ + MIT License + + Copyright (c) 2022 Anthony Shaw + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-logging, licensed as follows: + """ + MIT License + + Copyright (c) 2023 Adam Johnson + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- flake8-trio, licensed as follows: + """ + MIT License + + Copyright (c) 2022 Zac Hatfield-Dodds + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + +- Pyright, licensed as follows: + """ + MIT License + + Pyright - A static type checker for the Python language + Copyright (c) Microsoft Corporation. All rights reserved. 
+ + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE + """ + +- rust-analyzer/text-size, licensed under the MIT license: + """ + Permission is hereby granted, free of charge, to any + person obtaining a copy of this software and associated + documentation files (the "Software"), to deal in the + Software without restriction, including without + limitation the rights to use, copy, modify, merge, + publish, distribute, sublicense, and/or sell copies of + the Software, and to permit persons to whom the Software + is furnished to do so, subject to the following + conditions: + + The above copyright notice and this permission notice + shall be included in all copies or substantial portions + of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF + ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED + TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A + PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT + SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY + CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION + OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR + IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + DEALINGS IN THE SOFTWARE. + """ + +- rome/tools, licensed under the MIT license: + """ + MIT License + + Copyright (c) Rome Tools, Inc. and its affiliates. + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. 
+ """ + +- pydoclint, licensed as follows: + """ + MIT License + + Copyright (c) 2023 jsh9 + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + """ + + +s3fs +BSD License +http://github.com/fsspec/s3fs/ +Copyright (c) 2016, Continuum Analytics, Inc. and contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. + +Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +Neither the name of Continuum Analytics nor the names of any contributors +may be used to endorse or promote products derived from this software +without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +THE POSSIBILITY OF SUCH DAMAGE. + + +s3transfer +Apache Software License +https://github.com/boto/s3transfer + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +safehttpx +MIT License +https://github.com/gradio-app/safehttpx + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +safetensors +Apache Software License +https://github.com/huggingface/safetensors + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +scikit-image +BSD License +https://scikit-image.org +Files: * +Copyright: 2009-2022 the scikit-image team +License: BSD-3-Clause + +Files: doc/source/themes/scikit-image/layout.html +Copyright: 2007-2010 the Sphinx team +License: BSD-3-Clause + +Files: skimage/feature/_canny.py + skimage/filters/edges.py + skimage/filters/_rank_order.py + skimage/morphology/_skeletonize.py + skimage/morphology/tests/test_watershed.py + skimage/morphology/watershed.py + skimage/segmentation/heap_general.pxi + skimage/segmentation/heap_watershed.pxi + skimage/segmentation/_watershed.py + skimage/segmentation/_watershed_cy.pyx +Copyright: 2003-2009 Massachusetts Institute of Technology + 2009-2011 Broad Institute + 2003 Lee Kamentsky + 2003-2005 Peter J. 
Verveer +License: BSD-3-Clause + +Files: skimage/filters/thresholding.py + skimage/graph/_mcp.pyx + skimage/graph/heap.pyx +Copyright: 2009-2015 Board of Regents of the University of + Wisconsin-Madison, Broad Institute of MIT and Harvard, + and Max Planck Institute of Molecular Cell Biology and + Genetics + 2009 Zachary Pincus + 2009 Almar Klein +License: BSD-2-Clause + +File: skimage/morphology/grayreconstruct.py + skimage/morphology/tests/test_reconstruction.py +Copyright: 2003-2009 Massachusetts Institute of Technology + 2009-2011 Broad Institute + 2003 Lee Kamentsky +License: BSD-3-Clause + +File: skimage/morphology/_grayreconstruct.pyx +Copyright: 2003-2009 Massachusetts Institute of Technology + 2009-2011 Broad Institute + 2003 Lee Kamentsky + 2022 Gregory Lee (added a 64-bit integer variant for large images) +License: BSD-3-Clause + +File: skimage/segmentation/_expand_labels.py +Copyright: 2020 Broad Institute + 2020 CellProfiler team +License: BSD-3-Clause + +File: skimage/exposure/_adapthist.py +Copyright: 1994 Karel Zuiderveld +License: BSD-3-Clause + +Function: skimage/morphology/_skeletonize_various_cy.pyx:_skeletonize_loop +Copyright: 2003-2009 Massachusetts Institute of Technology + 2009-2011 Broad Institute + 2003 Lee Kamentsky +License: BSD-3-Clause + +Function: skimage/_shared/version_requirements.py:_check_version +Copyright: 2013 The IPython Development Team +License: BSD-3-Clause + +Function: skimage/_shared/version_requirements.py:is_installed +Copyright: 2009-2011 Pierre Raybaut +License: MIT + +File: skimage/feature/_fisher_vector.py +Copyright: 2014 2014 Dan Oneata +License: MIT + +File: skimage/_vendored/numpy_lookfor.py +Copyright: 2005-2023, NumPy Developers +License: BSD-3-Clause + +File: skimage/transform/_thin_plate_splines.py +Copyright: 2007 Zachary Pincus +License: BSD-3-Clause + +License: BSD-2-Clause + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following 
conditions +are met: +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. +. +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE HOLDERS OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +License: BSD-3-Clause + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. +3. Neither the name of the University nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. +. 
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE HOLDERS OR +CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, +EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, +PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +License: MIT + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +scipy +BSD License +https://scipy.org/ +Copyright (c) 2001-2002 Enthought, Inc. 2003, SciPy Developers. +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +---- + +This binary distribution of SciPy can also bundle the following software +(depending on the build): + + +Name: OpenBLAS +Files: scipy.libs/libscipy_openblas*.so +Description: bundled as a dynamically linked library +Availability: https://github.com/OpenMathLib/OpenBLAS/ +License: BSD-3-Clause + Copyright (c) 2011-2014, The OpenBLAS Project + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are + met: + + 1. 
Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + 3. Neither the name of the OpenBLAS project nor the names of + its contributors may be used to endorse or promote products + derived from this software without specific prior written + permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +Name: LAPACK +Files: scipy.libs/libscipy_openblas*.so +Description: bundled in OpenBLAS +Availability: https://github.com/OpenMathLib/OpenBLAS/ +License: BSD-3-Clause-Open-MPI + Copyright (c) 1992-2013 The University of Tennessee and The University + of Tennessee Research Foundation. All rights + reserved. + Copyright (c) 2000-2013 The University of California Berkeley. All + rights reserved. + Copyright (c) 2006-2013 The University of Colorado Denver. All rights + reserved. 
+ + $COPYRIGHT$ + + Additional copyrights may follow + + $HEADER$ + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are + met: + + - Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + - Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer listed + in this license in the documentation and/or other materials + provided with the distribution. + + - Neither the name of the copyright holders nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + + The copyright holders provide no reassurances that the source code + provided does not infringe any patent, copyright, or any other + intellectual property rights of third parties. The copyright holders + disclaim any liability to any recipient for claims brought against + recipient by any third party for infringement of that parties + intellectual property rights. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +Name: GCC runtime library +Files: scipy.libs/libgfortran*.so +Description: dynamically linked to files compiled with gcc +Availability: https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=libgfortran +License: GPL-3.0-or-later WITH GCC-exception-3.1 + Copyright (C) 2002-2017 Free Software Foundation, Inc. + + Libgfortran is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + Libgfortran is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + <http://www.gnu.org/licenses/>. + +---- + +Full text of license texts referred to above follows (that they are +listed below does not necessarily imply the conditions apply to the +present binary release): + +---- + +GCC RUNTIME LIBRARY EXCEPTION + +Version 3.1, 31 March 2009 + +Copyright (C) 2009 Free Software Foundation, Inc. + +Everyone is permitted to copy and distribute verbatim copies of this +license document, but changing it is not allowed. + +This GCC Runtime Library Exception ("Exception") is an additional +permission under section 7 of the GNU General Public License, version +3 ("GPLv3"). It applies to a given file (the "Runtime Library") that +bears a notice placed by the copyright holder of the file stating that +the file is governed by GPLv3 along with this Exception.
+ +When you use GCC to compile a program, GCC may combine portions of +certain GCC header files and runtime libraries with the compiled +program. The purpose of this Exception is to allow compilation of +non-GPL (including proprietary) programs to use, in this way, the +header files and runtime libraries covered by this Exception. + +0. Definitions. + +A file is an "Independent Module" if it either requires the Runtime +Library for execution after a Compilation Process, or makes use of an +interface provided by the Runtime Library, but is not otherwise based +on the Runtime Library. + +"GCC" means a version of the GNU Compiler Collection, with or without +modifications, governed by version 3 (or a specified later version) of +the GNU General Public License (GPL) with the option of using any +subsequent versions published by the FSF. + +"GPL-compatible Software" is software whose conditions of propagation, +modification and use would permit combination with GCC in accord with +the license of GCC. + +"Target Code" refers to output from any compiler for a real or virtual +target processor architecture, in executable form or suitable for +input to an assembler, loader, linker and/or execution +phase. Notwithstanding that, Target Code does not include data in any +format that is used as a compiler intermediate representation, or used +for producing a compiler intermediate representation. + +The "Compilation Process" transforms code entirely represented in +non-intermediate languages designed for human-written code, and/or in +Java Virtual Machine byte code, into Target Code. Thus, for example, +use of source code generators and preprocessors need not be considered +part of the Compilation Process, since the Compilation Process can be +understood as starting with the output of the generators or +preprocessors. 
+ +A Compilation Process is "Eligible" if it is done using GCC, alone or +with other GPL-compatible software, or if it is done without using any +work based on GCC. For example, using non-GPL-compatible Software to +optimize any GCC intermediate representations would not qualify as an +Eligible Compilation Process. + +1. Grant of Additional Permission. + +You have permission to propagate a work of Target Code formed by +combining the Runtime Library with Independent Modules, even if such +propagation would otherwise violate the terms of GPLv3, provided that +all Target Code was generated by Eligible Compilation Processes. You +may then convey such a combination under terms of your choice, +consistent with the licensing of the Independent Modules. + +2. No Weakening of GCC Copyleft. + +The availability of this Exception does not imply any general +presumption that third-party software is unaffected by the copyleft +requirements of the license of GCC. + +---- + + GNU GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU General Public License is a free, copyleft license for +software and other kinds of works. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +the GNU General Public License is intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. We, the Free Software Foundation, use the +GNU General Public License for most of our software; it applies also to +any other work released this way by its authors. You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. 
Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. + + To protect your rights, we need to prevent others from denying you +these rights or asking you to surrender the rights. Therefore, you have +certain responsibilities if you distribute copies of the software, or if +you modify it: responsibilities to respect the freedom of others. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must pass on to the recipients the same +freedoms that you received. You must make sure that they, too, receive +or can get the source code. And you must show them these terms so they +know their rights. + + Developers that use the GNU GPL protect your rights with two steps: +(1) assert copyright on the software, and (2) offer you this License +giving you legal permission to copy, distribute and/or modify it. + + For the developers' and authors' protection, the GPL clearly explains +that there is no warranty for this free software. For both users' and +authors' sake, the GPL requires that modified versions be marked as +changed, so that their problems will not be attributed erroneously to +authors of previous versions. + + Some devices are designed to deny users access to install or run +modified versions of the software inside them, although the manufacturer +can do so. This is fundamentally incompatible with the aim of +protecting users' freedom to change the software. The systematic +pattern of such abuse occurs in the area of products for individuals to +use, which is precisely where it is most unacceptable. Therefore, we +have designed this version of the GPL to prohibit the practice for those +products. 
If such problems arise substantially in other domains, we +stand ready to extend this provision to those domains in future versions +of the GPL, as needed to protect the freedom of users. + + Finally, every program is threatened constantly by software patents. +States should not allow patents to restrict development and use of +software on general-purpose computers, but in those that do, we wish to +avoid the special danger that patents applied to a free program could +make it effectively proprietary. To prevent this, the GPL assures that +patents cannot be used to render the program non-free. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU General Public License. + + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". "Licensees" and +"recipients" may be individuals or organizations. + + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. 
Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. + + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. + + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. 
+ + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. + + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. 
Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. + + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. + + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. 
+ + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. + + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. 
+ + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. + + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. 
If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. + + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. 
+ + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. + + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. 
+ + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. +Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. + + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the 
material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. + + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). + + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. 
+ + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. + + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. 
If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. + + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". + + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. 
+ + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. + + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. 
You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. + + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Use with the GNU Affero General Public License. + + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU Affero General Public License into a single +combined work, and to convey the resulting work. 
The terms of this +License will continue to apply to the part which is the covered work, +but the special requirements of the GNU Affero General Public License, +section 13, concerning interaction through a network will apply to the +combination as such. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU General Public License, you may choose any version ever published +by the Free Software Foundation. + + If the Program specifies that a proxy can decide which future +versions of the GNU General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. 
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. + + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. 
+ + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + + If the program does terminal interaction, make it output a short +notice like this when it starts in an interactive mode: + + Copyright (C) + This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, your program's commands +might be different; for a GUI interface, you would use an "about box". + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU GPL, see +. + + The GNU General Public License does not permit incorporating your program +into proprietary programs. If your program is a subroutine library, you +may consider it more useful to permit linking proprietary applications with +the library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. But first, please read +. 
+ + +Name: libquadmath +Files: scipy.libs/libquadmath*.so +Description: dynamically linked to files compiled with gcc +Availability: https://gcc.gnu.org/git/?p=gcc.git;a=tree;f=libquadmath +License: LGPL-2.1-or-later + + GCC Quad-Precision Math Library + Copyright (C) 2010-2019 Free Software Foundation, Inc. + Written by Francois-Xavier Coudert + + This file is part of the libquadmath library. + Libquadmath is free software; you can redistribute it and/or + modify it under the terms of the GNU Library General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + Libquadmath is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + https://www.gnu.org/licenses/old-licenses/lgpl-2.1.html + + +semantic-version +BSD License +https://github.com/rbarrois/python-semanticversion +Copyright (c) The python-semanticversion project +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +semver +BSD License +https://python-semver.readthedocs.io/en/latest/changelog.html +Copyright (c) 2013, Konstantine Rybnikov +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + Redistributions in binary form must reproduce the above copyright notice, this + list of conditions and the following disclaimer in the documentation and/or + other materials provided with the distribution. + + Neither the name of the python-semver org nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +sentencepiece +UNKNOWN +https://github.com/google/sentencepiece +UNKNOWN + +sentry-sdk +BSD License +https://github.com/getsentry/sentry-python +MIT License + +Copyright (c) 2018 Functional Software, Inc. dba Sentry + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +shellingham +ISC License (ISCL) +https://github.com/sarugaku/shellingham +Copyright (c) 2018, Tzu-ping Chung + +Permission to use, copy, modify, and distribute this software for any +purpose with or without fee is hereby granted, provided that the above +copyright notice and this permission notice appear in all copies. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES +WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR +ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF +OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + + +six +MIT License +https://github.com/benjaminp/six +Copyright (c) 2010-2024 Benjamin Peterson + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of +the Software, and to permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS +FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR +COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ + +slangtorch +MIT License +https://github.com/shader-slang/slang +SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +LLVM Exceptions to the Apache 2.0 License + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into an Object form of such source code, you +may redistribute such embedded portions in such Object form without complying +with the conditions of Sections 4(a), 4(b) and 4(d) of the License. + +In addition, if you combine or link compiled forms of this Software with +software that is licensed under the GPLv2 ("Combined Software") and if a +court of competent jurisdiction determines that the patent provision (Section +3), the indemnity provision (Section 9) or other Section of the License +conflicts with the conditions of the GPLv2, you may retroactively and +prospectively choose to deem waived or otherwise exclude such Section(s) of +the License, but only in their entirety and only with respect to the Combined +Software. 
+ + +smart_open +MIT License +https://github.com/piskvorky/smart_open +The MIT License (MIT) + +Copyright (c) 2015 Radim Řehůřek + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + + +smmap +BSD License +https://github.com/gitpython-developers/smmap +Copyright (C) 2010, 2011 Sebastian Thiel and contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +* Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. 
+ +* Neither the name of the async project nor the names of +its contributors may be used to endorse or promote products derived +from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + +sniffio +Apache Software License; MIT License +https://github.com/python-trio/sniffio +This software is made available under the terms of *either* of the +licenses found in LICENSE.APACHE2 or LICENSE.MIT. Contributions to are +made under the terms of *both* these licenses. + + +soundfile +BSD License +https://github.com/bastibe/python-soundfile + GNU LESSER GENERAL PUBLIC LICENSE + Version 2.1, February 1999 + + Copyright (C) 1991, 1999 Free Software Foundation, Inc. + 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +[This is the first released version of the Lesser GPL. It also counts + as the successor of the GNU Library Public License, version 2, hence + the version number 2.1.] + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. 
By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software--to make sure the software is free for all its users. + + This license, the Lesser General Public License, applies to some +specially designated software packages--typically libraries--of the +Free Software Foundation and other authors who decide to use it. You +can use it too, but we suggest you first think carefully about whether +this license or the ordinary General Public License is the better +strategy to use in any particular case, based on the explanations below. + + When we speak of free software, we are referring to freedom of use, +not price. Our General Public Licenses are designed to make sure that +you have the freedom to distribute copies of free software (and charge +for this service if you wish); that you receive source code or can get +it if you want it; that you can change the software and use pieces of +it in new free programs; and that you are informed that you can do +these things. + + To protect your rights, we need to make restrictions that forbid +distributors to deny you these rights or to ask you to surrender these +rights. These restrictions translate to certain responsibilities for +you if you distribute copies of the library or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link other code with the library, you must provide +complete object files to the recipients, so that they can relink them +with the library after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + We protect your rights with a two-step method: (1) we copyright the +library, and (2) we offer you this license, which gives you legal +permission to copy, distribute and/or modify the library. 
+ + To protect each distributor, we want to make it very clear that +there is no warranty for the free library. Also, if the library is +modified by someone else and passed on, the recipients should know +that what they have is not the original version, so that the original +author's reputation will not be affected by problems that might be +introduced by others. + + Finally, software patents pose a constant threat to the existence of +any free program. We wish to make sure that a company cannot +effectively restrict the users of a free program by obtaining a +restrictive license from a patent holder. Therefore, we insist that +any patent license obtained for a version of the library must be +consistent with the full freedom of use specified in this license. + + Most GNU software, including some libraries, is covered by the +ordinary GNU General Public License. This license, the GNU Lesser +General Public License, applies to certain designated libraries, and +is quite different from the ordinary General Public License. We use +this license for certain libraries in order to permit linking those +libraries into non-free programs. + + When a program is linked with a library, whether statically or using +a shared library, the combination of the two is legally speaking a +combined work, a derivative of the original library. The ordinary +General Public License therefore permits such linking only if the +entire combination fits its criteria of freedom. The Lesser General +Public License permits more lax criteria for linking other code with +the library. + + We call this license the "Lesser" General Public License because it +does Less to protect the user's freedom than the ordinary General +Public License. It also provides other free software developers Less +of an advantage over competing non-free programs. These disadvantages +are the reason we use the ordinary General Public License for many +libraries. 
However, the Lesser license provides advantages in certain +special circumstances. + + For example, on rare occasions, there may be a special need to +encourage the widest possible use of a certain library, so that it becomes +a de-facto standard. To achieve this, non-free programs must be +allowed to use the library. A more frequent case is that a free +library does the same job as widely used non-free libraries. In this +case, there is little to gain by limiting the free library to free +software only, so we use the Lesser General Public License. + + In other cases, permission to use a particular library in non-free +programs enables a greater number of people to use a large body of +free software. For example, permission to use the GNU C Library in +non-free programs enables many more people to use the whole GNU +operating system, as well as its variant, the GNU/Linux operating +system. + + Although the Lesser General Public License is Less protective of the +users' freedom, it does ensure that the user of a program that is +linked with the Library has the freedom and the wherewithal to run +that program using a modified version of the Library. + + The precise terms and conditions for copying, distribution and +modification follow. Pay close attention to the difference between a +"work based on the library" and a "work that uses the library". The +former contains code derived from the library, whereas the latter must +be combined with the library in order to run. + + GNU LESSER GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License Agreement applies to any software library or other +program which contains a notice placed by the copyright holder or +other authorized party saying it may be distributed under the terms of +this Lesser General Public License (also called "this License"). +Each licensee is addressed as "you". 
+ + A "library" means a collection of software functions and/or data +prepared so as to be conveniently linked with application programs +(which use some of those functions and data) to form executables. + + The "Library", below, refers to any such software library or work +which has been distributed under these terms. A "work based on the +Library" means either the Library or any derivative work under +copyright law: that is to say, a work containing the Library or a +portion of it, either verbatim or with modifications and/or translated +straightforwardly into another language. (Hereinafter, translation is +included without limitation in the term "modification".) + + "Source code" for a work means the preferred form of the work for +making modifications to it. For a library, complete source code means +all the source code for all modules it contains, plus any associated +interface definition files, plus the scripts used to control compilation +and installation of the library. + + Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running a program using the Library is not restricted, and output from +such a program is covered only if its contents constitute a work based +on the Library (independent of the use of the Library in a tool for +writing it). Whether that is true depends on what the Library does +and what the program that uses the Library does. + + 1. You may copy and distribute verbatim copies of the Library's +complete source code as you receive it, in any medium, provided that +you conspicuously and appropriately publish on each copy an +appropriate copyright notice and disclaimer of warranty; keep intact +all the notices that refer to this License and to the absence of any +warranty; and distribute a copy of this License along with the +Library. 
+ + You may charge a fee for the physical act of transferring a copy, +and you may at your option offer warranty protection in exchange for a +fee. + + 2. You may modify your copy or copies of the Library or any portion +of it, thus forming a work based on the Library, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) The modified work must itself be a software library. + + b) You must cause the files modified to carry prominent notices + stating that you changed the files and the date of any change. + + c) You must cause the whole of the work to be licensed at no + charge to all third parties under the terms of this License. + + d) If a facility in the modified Library refers to a function or a + table of data to be supplied by an application program that uses + the facility, other than as an argument passed when the facility + is invoked, then you must make a good faith effort to ensure that, + in the event an application does not supply such function or + table, the facility still operates, and performs whatever part of + its purpose remains meaningful. + + (For example, a function in a library to compute square roots has + a purpose that is entirely well-defined independent of the + application. Therefore, Subsection 2d requires that any + application-supplied function or table used by this function must + be optional: if the application does not supply it, the square + root function must still compute square roots.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Library, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. 
But when you +distribute the same sections as part of a whole which is a work based +on the Library, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote +it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library +with the Library (or with a work based on the Library) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may opt to apply the terms of the ordinary GNU General Public +License instead of this License to a given copy of the Library. To do +this, you must alter all the notices that refer to this License, so +that they refer to the ordinary GNU General Public License, version 2, +instead of to this License. (If a newer version than version 2 of the +ordinary GNU General Public License has appeared, then you can specify +that version instead if you wish.) Do not make any other change in +these notices. + + Once this change is made in a given copy, it is irreversible for +that copy, so the ordinary GNU General Public License applies to all +subsequent copies and derivative works made from that copy. + + This option is useful when you wish to copy part of the code of +the Library into a program that is not a library. + + 4. 
You may copy and distribute the Library (or a portion or +derivative of it, under Section 2) in object code or executable form +under the terms of Sections 1 and 2 above provided that you accompany +it with the complete corresponding machine-readable source code, which +must be distributed under the terms of Sections 1 and 2 above on a +medium customarily used for software interchange. + + If distribution of object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the +source code from the same place satisfies the requirement to +distribute the source code, even though third parties are not +compelled to copy the source along with the object code. + + 5. A program that contains no derivative of any portion of the +Library, but is designed to work with the Library by being compiled or +linked with it, is called a "work that uses the Library". Such a +work, in isolation, is not a derivative work of the Library, and +therefore falls outside the scope of this License. + + However, linking a "work that uses the Library" with the Library +creates an executable that is a derivative of the Library (because it +contains portions of the Library), rather than a "work that uses the +library". The executable is therefore covered by this License. +Section 6 states terms for distribution of such executables. + + When a "work that uses the Library" uses material from a header file +that is part of the Library, the object code for the work may be a +derivative work of the Library even though the source code is not. +Whether this is true is especially significant if the work can be +linked without the Library, or if the work is itself a library. The +threshold for this to be true is not precisely defined by law. 
+ + If such an object file uses only numerical parameters, data +structure layouts and accessors, and small macros and small inline +functions (ten lines or less in length), then the use of the object +file is unrestricted, regardless of whether it is legally a derivative +work. (Executables containing this object code plus portions of the +Library will still fall under Section 6.) + + Otherwise, if the work is a derivative of the Library, you may +distribute the object code for the work under the terms of Section 6. +Any executables containing that work also fall under Section 6, +whether or not they are linked directly with the Library itself. + + 6. As an exception to the Sections above, you may also combine or +link a "work that uses the Library" with the Library to produce a +work containing portions of the Library, and distribute that work +under terms of your choice, provided that the terms permit +modification of the work for the customer's own use and reverse +engineering for debugging such modifications. + + You must give prominent notice with each copy of the work that the +Library is used in it and that the Library and its use are covered by +this License. You must supply a copy of this License. If the work +during execution displays copyright notices, you must include the +copyright notice for the Library among them, as well as a reference +directing the user to the copy of this License. Also, you must do one +of these things: + + a) Accompany the work with the complete corresponding + machine-readable source code for the Library including whatever + changes were used in the work (which must be distributed under + Sections 1 and 2 above); and, if the work is an executable linked + with the Library, with the complete machine-readable "work that + uses the Library", as object code and/or source code, so that the + user can modify the Library and then relink to produce a modified + executable containing the modified Library. 
(It is understood + that the user who changes the contents of definitions files in the + Library will not necessarily be able to recompile the application + to use the modified definitions.) + + b) Use a suitable shared library mechanism for linking with the + Library. A suitable mechanism is one that (1) uses at run time a + copy of the library already present on the user's computer system, + rather than copying library functions into the executable, and (2) + will operate properly with a modified version of the library, if + the user installs one, as long as the modified version is + interface-compatible with the version that the work was made with. + + c) Accompany the work with a written offer, valid for at + least three years, to give the same user the materials + specified in Subsection 6a, above, for a charge no more + than the cost of performing this distribution. + + d) If distribution of the work is made by offering access to copy + from a designated place, offer equivalent access to copy the above + specified materials from the same place. + + e) Verify that the user has already received a copy of these + materials or that you have already sent this user a copy. + + For an executable, the required form of the "work that uses the +Library" must include any data and utility programs needed for +reproducing the executable from it. However, as a special exception, +the materials to be distributed need not include anything that is +normally distributed (in either source or binary form) with the major +components (compiler, kernel, and so on) of the operating system on +which the executable runs, unless that component itself accompanies +the executable. + + It may happen that this requirement contradicts the license +restrictions of other proprietary libraries that do not normally +accompany the operating system. Such a contradiction means you cannot +use both them and the Library together in an executable that you +distribute. + + 7. 
You may place library facilities that are a work based on the +Library side-by-side in a single library together with other library +facilities not covered by this License, and distribute such a combined +library, provided that the separate distribution of the work based on +the Library and of the other library facilities is otherwise +permitted, and provided that you do these two things: + + a) Accompany the combined library with a copy of the same work + based on the Library, uncombined with any other library + facilities. This must be distributed under the terms of the + Sections above. + + b) Give prominent notice with the combined library of the fact + that part of it is a work based on the Library, and explaining + where to find the accompanying uncombined form of the same work. + + 8. You may not copy, modify, sublicense, link with, or distribute +the Library except as expressly provided under this License. Any +attempt otherwise to copy, modify, sublicense, link with, or +distribute the Library is void, and will automatically terminate your +rights under this License. However, parties who have received copies, +or rights, from you under this License will not have their licenses +terminated so long as such parties remain in full compliance. + + 9. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Library or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Library (or any work based on the +Library), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Library or works based on it. + + 10. 
Each time you redistribute the Library (or any work based on the +Library), the recipient automatically receives a license from the +original licensor to copy, distribute, link with or modify the Library +subject to these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties with +this License. + + 11. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Library at all. For example, if a patent +license would not permit royalty-free redistribution of the Library by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply, +and the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system which is +implemented by public license practices. 
Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 12. If the distribution and/or use of the Library is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Library under this License may add +an explicit geographical distribution limitation excluding those countries, +so that distribution is permitted only in or among countries not thus +excluded. In such case, this License incorporates the limitation as if +written in the body of this License. + + 13. The Free Software Foundation may publish revised and/or new +versions of the Lesser General Public License from time to time. +Such new versions will be similar in spirit to the present version, +but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library +specifies a version number of this License which applies to it and +"any later version", you have the option of following the terms and +conditions either of that version or of any later version published by +the Free Software Foundation. If the Library does not specify a +license version number, you may choose any version ever published by +the Free Software Foundation. + + 14. If you wish to incorporate parts of the Library into other free +programs whose distribution conditions are incompatible with these, +write to the author to ask for permission. For software which is +copyrighted by the Free Software Foundation, write to the Free +Software Foundation; we sometimes make exceptions for this. 
Our +decision will be guided by the two goals of preserving the free status +of all derivatives of our free software and of promoting the sharing +and reuse of software generally. + + NO WARRANTY + + 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO +WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. +EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR +OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY +KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE +LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME +THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN +WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY +AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU +FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE +LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING +RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF +SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Libraries + + If you develop a new library, and you want it to be of the greatest +possible use to the public, we recommend making it free software that +everyone can redistribute and change. You can do so by permitting +redistribution under these terms (or, alternatively, under the terms of the +ordinary General Public License). + + To apply these terms, attach the following notices to the library. 
It is +safest to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least the +"copyright" line and a pointer to where the full notice is found. + + + Copyright (C) + + This library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with this library; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + +Also add information on how to contact you by electronic and paper mail. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the library, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the + library `Frob' (a library for tweaking knobs) written by James Random Hacker. + + , 1 April 1990 + Ty Coon, President of Vice + +That's all there is to it! 
+ + + +soupsieve +MIT +https://github.com/facelessuser/soupsieve +MIT License + +Copyright (c) 2018 - 2026 Isaac Muse + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +stack-data +MIT License +http://github.com/alexmojaki/stack_data +MIT License + +Copyright (c) 2019 Alex Hall + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +starlette +BSD-3-Clause +https://github.com/Kludex/starlette +Copyright © 2018, [Encode OSS Ltd](https://www.encode.io/). +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +sympy +BSD License +https://sympy.org +Copyright (c) 2006-2023 SymPy Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + a. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + b. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + c. Neither the name of SymPy nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. + + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. + +-------------------------------------------------------------------------------- + +Patches that were taken from the Diofant project (https://github.com/diofant/diofant) +are licensed as: + +Copyright (c) 2006-2018 SymPy Development Team, + 2013-2023 Sergey B Kirpichev + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + a. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + b. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + c. Neither the name of Diofant or SymPy nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. + + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. + +-------------------------------------------------------------------------------- + +Submodules taken from the multipledispatch project (https://github.com/mrocklin/multipledispatch) +are licensed as: + +Copyright (c) 2014 Matthew Rocklin + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + a. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + b. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + c. Neither the name of multipledispatch nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. + + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. + +-------------------------------------------------------------------------------- + +The files under the directory sympy/parsing/autolev/tests/pydy-example-repo +are directly copied from PyDy project and are licensed as: + +Copyright (c) 2009-2023, PyDy Authors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +* Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. +* Neither the name of this project nor the names of its contributors may be + used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL PYDY AUTHORS BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +The files under the directory sympy/parsing/latex +are directly copied from latex2sympy project and are licensed as: + +Copyright 2016, latex2sympy + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +tabulate +MIT +https://github.com/astanin/python-tabulate +Copyright (c) 2011-2020 Sergey Astanin and contributors + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +tensorboard +Apache Software License +https://github.com/tensorflow/tensorboard +# TensorBoard License + +TensorBoard is licensed Apache 2.0 and distributed with +vendored content licensed Apache 2.0, MIT, and BSD-3. + +## Table of Contents + +- tensorboard/pip_package/LICENSE.tensorflow +- external/npm/node_modules/d3/LICENSE +- external/com_google_fonts_roboto/LICENSE +- external/org_mozilla_bleach/LICENSE +- external/org_pythonhosted_webencodings/LICENSE +- third_party/bh_tsne.LICENSE + +## Licenses + + + +### tensorboard/pip_package/LICENSE.tensorflow + +Copyright 2017 The TensorFlow Authors. All rights reserved. + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. 
+ + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2017, The TensorFlow Authors. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + + +### external/npm/node_modules/d3/LICENSE + +Copyright 2010-2017 Mike Bostock +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the author nor the names of contributors may be used to + endorse or promote products derived from this software without specific prior + written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + +### external/com_google_fonts_roboto/LICENSE + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + + +### external/org_mozilla_bleach/LICENSE + +Copyright (c) 2014-2017, Mozilla Foundation + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + + + +### external/org_pythonhosted_webencodings/LICENSE + +Copyright (c) 2012 by Simon Sapin. + +Some rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + + * The names of the contributors may not be used to endorse or + promote products derived from this software without specific + prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + +### third_party/bh_tsne.LICENSE + +The MIT License (MIT) + +Copyright (c) 2015 Andrej Karpathy + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +tensorboard-data-server +Apache Software License +https://github.com/tensorflow/tensorboard/tree/master/tensorboard/data/server +UNKNOWN + +tensorstore +Apache-2.0 +https://github.com/google/tensorstore +Files: **/* + +Copyright 2018 The TensorStore Authors. All rights reserved. + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). 
+ + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. 
Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative 
Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2017, The TensorStore Authors. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +-------------------------------------------------------------------------------- + +Files: internal/utf8.cc + +Copyright (c) 2008-2009 Bjoern Hoehrmann + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +-------------------------------------------------------------------------------- + +Files: third_party/snappy/bundled.BUILD.bazel + +Copyright 2011 Google Inc. All Rights Reserved. +Author: sesse@google.com (Steinar H. Gunderson) + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-------------------------------------------------------------------------------- + +Files: tools/cmake/FindAVIF.cmake + +Copyright (C) 2021 Igalia S.L. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS'' +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS +BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF +THE POSSIBILITY OF SUCH DAMAGE. 
+ + + +termcolor +MIT +https://github.com/termcolor/termcolor +Copyright (c) 2008-2011 Volvox Development Team + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +terminado +BSD License +https://github.com/jupyter/terminado +BSD 2-Clause License + +- Copyright (c) 2014-, Jupyter development team +- Copyright (c) 2014, Ramalingam Saravanan + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +tifffile +BSD-3-Clause +https://www.cgohlke.com +BSD-3-Clause license + +Copyright (c) 2008-2026, Christoph Gohlke +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +tiktoken +MIT License + +Copyright (c) 2022 OpenAI, Shantanu Jain + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ +https://github.com/openai/tiktoken +MIT License + +Copyright (c) 2022 OpenAI, Shantanu Jain + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +timm +Apache Software License +https://github.com/huggingface/pytorch-image-models + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2019 Ross Wightman + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +tinycss2 +BSD License +https://www.courtbouillon.org/tinycss2 +BSD 3-Clause License + +Copyright (c) 2013-2020, Simon Sapin and contributors. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +tokenizers +Apache Software License +https://github.com/huggingface/tokenizers +UNKNOWN + +tomli +MIT +https://github.com/hukkin/tomli +MIT License + +Copyright (c) 2021 Taneli Hukkinen + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +tomli_w +MIT License +https://github.com/hukkin/tomli-w +MIT License + +Copyright (c) 2021 Taneli Hukkinen + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +tomlkit +MIT License +https://github.com/sdispater/tomlkit +Copyright (c) 2018 Sébastien Eustace + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +torch +BSD-3-Clause +https://pytorch.org +From PyTorch: + +Copyright (c) 2016- Facebook, Inc (Adam Paszke) +Copyright (c) 2014- Facebook, Inc (Soumith Chintala) +Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert) +Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu) +Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu) +Copyright (c) 2011-2013 NYU (Clement Farabet) +Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston) +Copyright (c) 2006 Idiap Research Institute (Samy Bengio) +Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz) + +From Caffe2: + +Copyright (c) 2016-present, Facebook Inc. All rights reserved. + +All contributions by Facebook: +Copyright (c) 2016 Facebook Inc. + +All contributions by Google: +Copyright (c) 2015 Google Inc. +All rights reserved. + +All contributions by Yangqing Jia: +Copyright (c) 2015 Yangqing Jia +All rights reserved. + +All contributions by Kakao Brain: +Copyright 2019-2020 Kakao Brain + +All contributions by Cruise LLC: +Copyright (c) 2022 Cruise LLC. +All rights reserved. + +All contributions by Tri Dao: +Copyright (c) 2024 Tri Dao. +All rights reserved. + +All contributions by Arm: +Copyright (c) 2021, 2023-2025 Arm Limited and/or its affiliates + +All contributions from Caffe: +Copyright(c) 2013, 2014, 2015, the respective contributors +All rights reserved. 
+ +All other contributions: +Copyright(c) 2015, 2016 the respective contributors +All rights reserved. + +Caffe2 uses a copyright model similar to Caffe: each contributor holds +copyright over their contributions to Caffe2. The project versioning records +all such contribution and copyright details. If a contributor wants to further +mark their specific copyright on a particular contribution, they should +indicate their copyright solely in the commit message of the change when it is +committed. + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + +3. Neither the names of Facebook, Deepmind Technologies, NYU, NEC Laboratories America + and IDIAP Research Institute nor the names of its contributors may be + used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. + + +The PyTorch repository and source distributions bundle several libraries that are +compatibly licensed. We list these here. + +Name: DCGM +License: Apache-2.0 +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM/LICENSE + +Name: FP16 +License: MIT +Files: /pytorch/third_party/FP16 + For details, see the files concatenated below: /pytorch/third_party/FP16/LICENSE + +Name: FXdiv +License: MIT +Files: /pytorch/third_party/FXdiv + For details, see the files concatenated below: /pytorch/third_party/FXdiv/LICENSE + +Name: NNPACK +License: BSD-2-Clause +Files: /pytorch/third_party/NNPACK + For details, see the files concatenated below: /pytorch/third_party/NNPACK/LICENSE + +Name: NVTX +License: Apache-2.0 with exception +Files: /pytorch/third_party/NVTX + For details, see the files concatenated below: /pytorch/third_party/NVTX/LICENSE.txt + +Name: VulkanMemoryAllocator +License: MIT +Files: /pytorch/third_party/VulkanMemoryAllocator + For details, see the files concatenated below: /pytorch/third_party/VulkanMemoryAllocator/LICENSE.txt + +Name: XNNPACK +License: BSD-3-Clause +Files: /pytorch/third_party/XNNPACK + For details, see the files concatenated below: /pytorch/third_party/XNNPACK/LICENSE + +Name: aiter +License: MIT +Files: /pytorch/third_party/aiter + For details, see the files concatenated below: 
/pytorch/third_party/aiter/LICENSE + +Name: benchmark +License: Apache-2.0 +Files: /pytorch/third_party/benchmark, + /pytorch/third_party/opentelemetry-cpp/third_party/benchmark, + /pytorch/third_party/protobuf/third_party/benchmark + For details, see the files concatenated below: /pytorch/third_party/benchmark/LICENSE, + /pytorch/third_party/opentelemetry-cpp/third_party/benchmark/LICENSE, + /pytorch/third_party/protobuf/third_party/benchmark/LICENSE + +Name: boost-vcpkg-helpers +License: MIT +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/boost-vcpkg-helpers + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/boost-vcpkg-helpers/LICENSE.txt + +Name: cJSON +License: MIT +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb/examples/rest/cJSON, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb/examples/rest/cJSON + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb/examples/rest/cJSON/LICENSE, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb/examples/rest/cJSON/LICENSE + +Name: catch2 +License: BSL-1.0 +Files: /pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/3rd_party/include/opentracing/catch2 + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/3rd_party/include/opentracing/catch2/LICENSE.txt + +Name: clog +License: BSD-2-Clause +Files: /pytorch/third_party/cpuinfo/deps/clog, + /pytorch/third_party/fbgemm/external/cpuinfo/deps/clog + For details, see the files concatenated below: /pytorch/third_party/cpuinfo/deps/clog/LICENSE, + /pytorch/third_party/fbgemm/external/cpuinfo/deps/clog/LICENSE + +Name: colorama +License: BSD-3-Clause +Files: 
/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM/testing/python3/libs_3rdparty/colorama + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM/testing/python3/libs_3rdparty/colorama/LICENSE.txt + +Name: composable_kernel +License: MIT +Files: /pytorch/third_party/aiter/3rdparty/composable_kernel, + /pytorch/third_party/composable_kernel, + /pytorch/third_party/fbgemm/external/composable_kernel, + /pytorch/third_party/flash-attention/csrc/composable_kernel + For details, see the files concatenated below: /pytorch/third_party/aiter/3rdparty/composable_kernel/LICENSE, + /pytorch/third_party/composable_kernel/LICENSE, + /pytorch/third_party/fbgemm/external/composable_kernel/LICENSE, + /pytorch/third_party/flash-attention/csrc/composable_kernel/LICENSE + +Name: cpp-httplib +License: MIT +Files: /pytorch/third_party/cpp-httplib + For details, see the files concatenated below: /pytorch/third_party/cpp-httplib/LICENSE + +Name: cpplint +License: BSD-3-Clause +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json/third_party/cpplint + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json/third_party/cpplint/LICENSE + +Name: cpr +License: MIT +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr/LICENSE + +Name: cpuinfo +License: BSD-2-Clause +Files: /pytorch/third_party/cpuinfo, + /pytorch/third_party/fbgemm/external/cpuinfo + For details, see the files concatenated below: /pytorch/third_party/cpuinfo/LICENSE, + /pytorch/third_party/fbgemm/external/cpuinfo/LICENSE + +Name: cudnn_frontend +License: MIT +Files: /pytorch/third_party/cudnn_frontend + For details, see the files concatenated below: /pytorch/third_party/cudnn_frontend/LICENSE.txt 
+ +Name: cutlass +License: BSD-3-Clause +Files: /pytorch/third_party/cutlass, + /pytorch/third_party/fbgemm/external/cutlass, + /pytorch/third_party/flash-attention/csrc/cutlass + For details, see the files concatenated below: /pytorch/third_party/cutlass/LICENSE.txt, + /pytorch/third_party/fbgemm/external/cutlass/LICENSE.txt, + /pytorch/third_party/flash-attention/csrc/cutlass/LICENSE.txt + +Name: dart +License: Apache-2.0 +Files: /pytorch/third_party/flatbuffers/dart + For details, see the files concatenated below: /pytorch/third_party/flatbuffers/dart/LICENSE + +Name: docs +License: Apache-2.0 with exception +Files: /pytorch/third_party/NVTX/docs + For details, see the files concatenated below: /pytorch/third_party/NVTX/docs/LICENSE.txt + +Name: doctest +License: MIT +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json/test/thirdparty/doctest + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json/test/thirdparty/doctest/LICENSE.txt + +Name: duktape-1.5.2 +License: MIT +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.5.2, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.5.2 + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.5.2/LICENSE.txt, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.5.2/LICENSE.txt + +Name: duktape-1.8.0 +License: MIT +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.8.0, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.8.0 + For details, see the files 
concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.8.0/LICENSE.txt, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.8.0/LICENSE.txt + +Name: dynolog +License: MIT +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/LICENSE + +Name: etw +License: MIT +Files: /pytorch/third_party/opentelemetry-cpp/exporters/etw/include/opentelemetry/exporters/etw + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/exporters/etw/include/opentelemetry/exporters/etw/LICENSE + +Name: expected +License: MIT +Files: /pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/3rd_party/include/opentracing/expected + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/3rd_party/include/opentracing/expected/LICENSE + +Name: fbgemm +License: BSD-3-Clause +Files: /pytorch/third_party/fbgemm + For details, see the files concatenated below: /pytorch/third_party/fbgemm/LICENSE + +Name: ffnvcodec +License: MIT with exception +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/ffnvcodec + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/ffnvcodec/LICENSE.txt + +Name: flash-attention +License: BSD-3-Clause +Files: /pytorch/third_party/flash-attention + For details, see the files concatenated below: /pytorch/third_party/flash-attention/LICENSE + +Name: flatbuffers +License: Apache-2.0 +Files: /pytorch/third_party/flatbuffers + For details, see the files concatenated below: /pytorch/third_party/flatbuffers/LICENSE + +Name: fmt +License: MIT with exception +Files: /pytorch/third_party/fmt, + 
/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt, + /pytorch/third_party/kineto/libkineto/third_party/fmt + For details, see the files concatenated below: /pytorch/third_party/fmt/LICENSE, + /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt/LICENSE.rst, + /pytorch/third_party/kineto/libkineto/third_party/fmt/LICENSE + +Name: gemmlowp +License: Apache-2.0 +Files: /pytorch/third_party/gemmlowp/gemmlowp + For details, see the files concatenated below: /pytorch/third_party/gemmlowp/gemmlowp/LICENSE + +Name: generator +License: Apache-2.0 +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest/googlemock/scripts/generator, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest/googlemock/scripts/generator, + /pytorch/third_party/protobuf/third_party/googletest/googlemock/scripts/generator, + /pytorch/third_party/tensorpipe/third_party/googletest/googlemock/scripts/generator + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest/googlemock/scripts/generator/LICENSE, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest/googlemock/scripts/generator/LICENSE, + /pytorch/third_party/protobuf/third_party/googletest/googlemock/scripts/generator/LICENSE, + /pytorch/third_party/tensorpipe/third_party/googletest/googlemock/scripts/generator/LICENSE + +Name: gettimeofday +License: Apache-2.0 +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/gettimeofday + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/gettimeofday/LICENSE + +Name: gloo +License: BSD-3-Clause +Files: /pytorch/third_party/gloo + For details, see the files concatenated below: /pytorch/third_party/gloo/LICENSE + +Name: googlemock +License: BSD-3-Clause +Files: 
/pytorch/third_party/protobuf/third_party/googletest/googlemock, + /pytorch/third_party/tensorpipe/third_party/googletest/googlemock + For details, see the files concatenated below: /pytorch/third_party/protobuf/third_party/googletest/googlemock/LICENSE, + /pytorch/third_party/tensorpipe/third_party/googletest/googlemock/LICENSE + +Name: googletest +License: BSD-3-Clause +Files: /pytorch/third_party/fbgemm/external/googletest, + /pytorch/third_party/googletest, + /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest, + /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest, + /pytorch/third_party/kineto/libkineto/third_party/googletest, + /pytorch/third_party/opentelemetry-cpp/third_party/googletest, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest, + /pytorch/third_party/protobuf/third_party/googletest, + /pytorch/third_party/protobuf/third_party/googletest/googletest, + /pytorch/third_party/tensorpipe/third_party/googletest, + /pytorch/third_party/tensorpipe/third_party/googletest/googletest + For details, see the files concatenated below: /pytorch/third_party/fbgemm/external/googletest/LICENSE, + /pytorch/third_party/googletest/LICENSE, + /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest/LICENSE, + /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest/LICENSE, + /pytorch/third_party/kineto/libkineto/third_party/googletest/LICENSE, + /pytorch/third_party/opentelemetry-cpp/third_party/googletest/LICENSE, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest/LICENSE, + /pytorch/third_party/protobuf/third_party/googletest/LICENSE, + /pytorch/third_party/protobuf/third_party/googletest/googletest/LICENSE, + /pytorch/third_party/tensorpipe/third_party/googletest/LICENSE, + 
/pytorch/third_party/tensorpipe/third_party/googletest/googletest/LICENSE + +Name: gtest +License: BSD-3-Clause +Files: /pytorch/third_party/ideep/mkl-dnn/tests/gtests/gtest + For details, see the files concatenated below: /pytorch/third_party/ideep/mkl-dnn/tests/gtests/gtest/LICENSE + +Name: hipify_torch +License: MIT +Files: /pytorch/third_party/fbgemm/external/hipify_torch + For details, see the files concatenated below: /pytorch/third_party/fbgemm/external/hipify_torch/LICENSE.txt + +Name: hstu +License: BSD-3-Clause +Files: /pytorch/third_party/fbgemm/fbgemm_gpu/experimental/hstu + For details, see the files concatenated below: /pytorch/third_party/fbgemm/fbgemm_gpu/experimental/hstu/LICENSE + +Name: hungarian +License: Permissive (free to use) +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/hungarian + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/hungarian/LICENSE.txt + +Name: ideep +License: MIT +Files: /pytorch/third_party/ideep + For details, see the files concatenated below: /pytorch/third_party/ideep/LICENSE + +Name: irrlicht +License: MIT +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/irrlicht + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/irrlicht/LICENSE.txt + +Name: kineto +License: BSD-3-Clause +Files: /pytorch/third_party/kineto + For details, see the files concatenated below: /pytorch/third_party/kineto/LICENSE + +Name: libnop +License: Apache-2.0 +Files: /pytorch/third_party/tensorpipe/third_party/libnop + For details, see the files concatenated below: /pytorch/third_party/tensorpipe/third_party/libnop/LICENSE + +Name: libstemmer +License: BSD-3-Clause +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/libstemmer + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/libstemmer/LICENSE + +Name: libuv +License: MIT +Files: 
/pytorch/third_party/tensorpipe/third_party/libuv + For details, see the files concatenated below: /pytorch/third_party/tensorpipe/third_party/libuv/LICENSE + +Name: mimalloc +License: MIT +Files: /pytorch/third_party/mimalloc + For details, see the files concatenated below: /pytorch/third_party/mimalloc/LICENSE + +Name: miniz-3.0.2 +License: MIT +Files: /pytorch/third_party/miniz-3.0.2 + For details, see the files concatenated below: /pytorch/third_party/miniz-3.0.2/LICENSE + +Name: mkl-dnn +License: Apache-2.0 +Files: /pytorch/third_party/ideep/mkl-dnn + For details, see the files concatenated below: /pytorch/third_party/ideep/mkl-dnn/LICENSE + +Name: ms-gsl +License: MIT +Files: /pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl/LICENSE + +Name: mx +License: MIT +Files: /pytorch/third_party/fbgemm/fbgemm_gpu/src/quantize_ops/mx, + /pytorch/third_party/fbgemm/fbgemm_gpu/test/quantize/mx + For details, see the files concatenated below: /pytorch/third_party/fbgemm/fbgemm_gpu/src/quantize_ops/mx/LICENSE, + /pytorch/third_party/fbgemm/fbgemm_gpu/test/quantize/mx/LICENSE + +Name: onnx +License: Apache-2.0 +Files: /pytorch/third_party/onnx + For details, see the files concatenated below: /pytorch/third_party/onnx/LICENSE + +Name: opentelemetry-cpp +License: Apache-2.0 +Files: /pytorch/third_party/opentelemetry-cpp + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/LICENSE + +Name: opentelemetry-proto +License: Apache-2.0 +Files: /pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto/LICENSE + +Name: opentracing-cpp +License: Apache-2.0 +Files: /pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp + For details, see the files concatenated below: 
/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/LICENSE + +Name: pdcurses +License: Public Domain for core +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/pdcurses + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/pdcurses/LICENSE + +Name: pfs +License: Apache-2.0 +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs/LICENSE + +Name: physac +License: MIT +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/physac + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/physac/LICENSE + +Name: pqp +License: Apache-2.0 +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/pqp + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/pqp/LICENSE + +Name: prometheus-cpp +License: MIT +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/LICENSE, + /pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/LICENSE + +Name: protobuf +License: BSD-3-Clause +Files: /pytorch/third_party/protobuf + For details, see the files concatenated below: /pytorch/third_party/protobuf/LICENSE + +Name: psimd +License: MIT +Files: /pytorch/third_party/psimd + For details, see the files concatenated below: /pytorch/third_party/psimd/LICENSE + +Name: pthreadpool +License: BSD-2-Clause +Files: /pytorch/third_party/pthreadpool + For details, see the files concatenated below: /pytorch/third_party/pthreadpool/LICENSE + +Name: pybind11 +License: BSD-3-Clause +Files: 
/pytorch/third_party/onnx/third_party/pybind11, + /pytorch/third_party/pybind11, + /pytorch/third_party/tensorpipe/third_party/pybind11 + For details, see the files concatenated below: /pytorch/third_party/onnx/third_party/pybind11/LICENSE, + /pytorch/third_party/pybind11/LICENSE, + /pytorch/third_party/tensorpipe/third_party/pybind11/LICENSE + +Name: python +License: Apache-2.0 with exception +Files: /pytorch/third_party/NVTX/python + For details, see the files concatenated below: /pytorch/third_party/NVTX/python/LICENSE.txt + +Name: python +License: BSD-3-Clause +Files: /pytorch/third_party/cutlass/python + For details, see the files concatenated below: /pytorch/third_party/cutlass/python/LICENSE.txt + +Name: python +License: BSD-3-Clause +Files: /pytorch/third_party/fbgemm/external/cutlass/python + For details, see the files concatenated below: /pytorch/third_party/fbgemm/external/cutlass/python/LICENSE.txt + +Name: python +License: BSD-3-Clause +Files: /pytorch/third_party/flash-attention/csrc/cutlass/python + For details, see the files concatenated below: /pytorch/third_party/flash-attention/csrc/cutlass/python/LICENSE.txt + +Name: python-peachpy +License: BSD-2-Clause +Files: /pytorch/third_party/python-peachpy + For details, see the files concatenated below: /pytorch/third_party/python-peachpy/LICENSE.rst + +Name: sigslot +License: Public Domain +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/sigslot + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/sigslot/LICENSE + +Name: sleef +License: BSL-1.0 +Files: /pytorch/third_party/sleef + For details, see the files concatenated below: /pytorch/third_party/sleef/LICENSE.txt + +Name: swift +License: Apache-2.0 +Files: /pytorch/third_party/flatbuffers/swift + For details, see the files concatenated below: /pytorch/third_party/flatbuffers/swift/LICENSE + +Name: tb_plugin +License: BSD-3-Clause +Files: /pytorch/third_party/kineto/tb_plugin + For 
details, see the files concatenated below: /pytorch/third_party/kineto/tb_plugin/LICENSE + +Name: tensorflow-common +License: MIT +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/tensorflow-common + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/tensorflow-common/LICENSE.txt + +Name: tensorpipe +License: BSD-3-Clause +Files: /pytorch/third_party/tensorpipe + For details, see the files concatenated below: /pytorch/third_party/tensorpipe/LICENSE.txt + +Name: test +License: MIT with exception +Files: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr/test + For details, see the files concatenated below: /pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr/test/LICENSE + +Name: variant +License: BSD-3-Clause +Files: /pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/3rd_party/include/opentracing/variant + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/3rd_party/include/opentracing/variant/LICENSE + +Name: vcpkg +License: MIT +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/LICENSE.txt + +Name: vulkan +License: Apache-2.0 with exception +Files: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/vulkan + For details, see the files concatenated below: /pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/vulkan/LICENSE.txt + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM/LICENSE +---------------------------------------------------------------------------------- +Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + + +/pytorch/third_party/FP16/LICENSE +--------------------------------- +The MIT License (MIT) + +Copyright (c) 2017 Facebook Inc. +Copyright (c) 2017 Georgia Institute of Technology +Copyright 2019 Google LLC + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +/pytorch/third_party/FXdiv/LICENSE +---------------------------------- +The MIT License (MIT) + +Copyright (c) 2017 Facebook Inc. 
+Copyright (c) 2016-2017 Marat Dukhan + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +/pytorch/third_party/NNPACK/LICENSE +----------------------------------- +Copyright (c) 2017 Facebook Inc. +Copyright (c) 2015-2017, Georgia Institute of Technology +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/NVTX/LICENSE.txt +------------------------------------- +============================================================================== +NVTX is under the Apache License v2.0 with LLVM Exceptions: +============================================================================== + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +---- LLVM Exceptions to the Apache 2.0 License ---- + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into an Object form of such source code, you +may redistribute such embedded portions in such Object form without complying +with the conditions of Sections 4(a), 4(b) and 4(d) of the License. + +In addition, if you combine or link compiled forms of this Software with +software that is licensed under the GPLv2 ("Combined Software") and if a +court of competent jurisdiction determines that the patent provision (Section +3), the indemnity provision (Section 9) or other Section of the License +conflicts with the conditions of the GPLv2, you may retroactively and +prospectively choose to deem waived or otherwise exclude such Section(s) of +the License, but only in their entirety and only with respect to the Combined +Software. + + + +/pytorch/third_party/VulkanMemoryAllocator/LICENSE.txt +------------------------------------------------------ +Copyright (c) 2017-2025 Advanced Micro Devices, Inc. All rights reserved. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +/pytorch/third_party/XNNPACK/LICENSE +------------------------------------ +BSD License + +For XNNPACK software + +Copyright (c) Facebook, Inc. and its affiliates. All rights reserved. +Copyright 2019 Google LLC + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name Facebook nor the names of its contributors may be used to + endorse or promote products derived from this software without specific + prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/aiter/LICENSE +---------------------------------- +Copyright © Advanced Micro Devices, Inc. All rights reserved. + +MIT License + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +/pytorch/third_party/benchmark/LICENSE +-------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). 
+ + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. 
Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative 
Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/opentelemetry-cpp/third_party/benchmark/LICENSE +-------------------------------------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/protobuf/third_party/benchmark/LICENSE +----------------------------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/boost-vcpkg-helpers/LICENSE.txt +---------------------------------------------------------------------------------------- +Copyright (c) Microsoft Corporation + +All rights reserved. + +MIT License + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb/examples/rest/cJSON/LICENSE +---------------------------------------------------------------------------------------------------------------------------------- +Copyright (c) 2009-2017 Dave Gamble and cJSON contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + + +/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb/examples/rest/cJSON/LICENSE +--------------------------------------------------------------------------------------------------------------- +Copyright (c) 2009-2017 Dave Gamble and cJSON contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + + +/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/3rd_party/include/opentracing/catch2/LICENSE.txt +------------------------------------------------------------------------------------------------------------------- +Boost Software License - Version 1.0 - August 17th, 2003 + +Permission is hereby granted, free of charge, to any person or organization +obtaining a copy of the software and accompanying documentation covered by +this license (the "Software") to use, reproduce, display, distribute, +execute, and transmit the Software, and to prepare derivative works of the +Software, and to permit third-parties to whom the Software is furnished to +do so, all subject to the following: + +The copyright notices in the Software and this entire statement, including +the above license grant, this restriction and the following disclaimer, +must be included in all copies of the Software, in whole or in part, and +all derivative works of the Software, unless such copies or derivative +works are solely in the form of machine-executable object code generated by +a source language processor. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT +SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE +FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, +ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + + +/pytorch/third_party/cpuinfo/deps/clog/LICENSE +---------------------------------------------- +Copyright (C) 2018 Marat Dukhan +Copyright (c) 2017-2018 Facebook Inc. +Copyright (c) 2017 Georgia Institute of Technology + +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/fbgemm/external/cpuinfo/deps/clog/LICENSE +-------------------------------------------------------------- +Copyright (C) 2018 Marat Dukhan +Copyright (c) 2017-2018 Facebook Inc. +Copyright (c) 2017 Georgia Institute of Technology + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. 
+ +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM/testing/python3/libs_3rdparty/colorama/LICENSE.txt +----------------------------------------------------------------------------------------------------------------------------- +Copyright (c) 2010 Jonathan Hartley + +Released under the New BSD license (reproduced below), or alternatively you may +use this software under any OSI approved open source license such as those at +http://opensource.org/licenses/alphabetical + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +* Neither the name(s) of the copyright holders, nor those of its contributors + may be used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + +/pytorch/third_party/aiter/3rdparty/composable_kernel/LICENSE +------------------------------------------------------------- +Copyright (c) 2018- , Advanced Micro Devices, Inc. (Chao Liu, Jing Zhang) +Copyright (c) 2019- , Advanced Micro Devices, Inc. (Letao Qin, Qianfeng Zhang, Liang Huang, Shaojie Wang) +Copyright (c) 2022- , Advanced Micro Devices, Inc. (Anthony Chang, Chunyu Lai, Illia Silin, Adam Osewski, Poyen Chen, Jehandad Khan) +Copyright (c) 2019-2021, Advanced Micro Devices, Inc. (Hanwen Chang) +Copyright (c) 2019-2020, Advanced Micro Devices, Inc. (Tejash Shah) +Copyright (c) 2020 , Advanced Micro Devices, Inc. (Xiaoyan Zhou) +Copyright (c) 2021-2022, Advanced Micro Devices, Inc. (Jianfeng Yan) + +SPDX-License-Identifier: MIT +Copyright (c) 2018-2025, Advanced Micro Devices, Inc. All rights reserved. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/composable_kernel/LICENSE +---------------------------------------------- +Copyright (c) 2018- , Advanced Micro Devices, Inc. (Chao Liu, Jing Zhang) +Copyright (c) 2019- , Advanced Micro Devices, Inc. (Letao Qin, Qianfeng Zhang, Liang Huang, Shaojie Wang) +Copyright (c) 2022- , Advanced Micro Devices, Inc. (Anthony Chang, Chunyu Lai, Illia Silin, Adam Osewski, Poyen Chen, Jehandad Khan) +Copyright (c) 2019-2021, Advanced Micro Devices, Inc. (Hanwen Chang) +Copyright (c) 2019-2020, Advanced Micro Devices, Inc. (Tejash Shah) +Copyright (c) 2020 , Advanced Micro Devices, Inc. (Xiaoyan Zhou) +Copyright (c) 2021-2022, Advanced Micro Devices, Inc. (Jianfeng Yan) + +SPDX-License-Identifier: MIT +Copyright (c) 2018-2025, Advanced Micro Devices, Inc. All rights reserved. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/fbgemm/external/composable_kernel/LICENSE +-------------------------------------------------------------- +Copyright (c) 2018- , Advanced Micro Devices, Inc. (Chao Liu, Jing Zhang) +Copyright (c) 2019- , Advanced Micro Devices, Inc. (Letao Qin, Qianfeng Zhang, Liang Huang, Shaojie Wang) +Copyright (c) 2022- , Advanced Micro Devices, Inc. (Anthony Chang, Chunyu Lai, Illia Silin, Adam Osewski, Poyen Chen, Jehandad Khan) +Copyright (c) 2019-2021, Advanced Micro Devices, Inc. (Hanwen Chang) +Copyright (c) 2019-2020, Advanced Micro Devices, Inc. (Tejash Shah) +Copyright (c) 2020 , Advanced Micro Devices, Inc. (Xiaoyan Zhou) +Copyright (c) 2021-2022, Advanced Micro Devices, Inc. (Jianfeng Yan) + +SPDX-License-Identifier: MIT +Copyright (c) 2018-2025, Advanced Micro Devices, Inc. All rights reserved. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/flash-attention/csrc/composable_kernel/LICENSE +------------------------------------------------------------------- +Copyright (c) 2018- , Advanced Micro Devices, Inc. (Chao Liu, Jing Zhang) +Copyright (c) 2019- , Advanced Micro Devices, Inc. (Letao Qin, Qianfeng Zhang, Liang Huang, Shaojie Wang) +Copyright (c) 2022- , Advanced Micro Devices, Inc. (Anthony Chang, Chunyu Lai, Illia Silin, Adam Osewski, Poyen Chen, Jehandad Khan) +Copyright (c) 2019-2021, Advanced Micro Devices, Inc. (Hanwen Chang) +Copyright (c) 2019-2020, Advanced Micro Devices, Inc. (Tejash Shah) +Copyright (c) 2020 , Advanced Micro Devices, Inc. (Xiaoyan Zhou) +Copyright (c) 2021-2022, Advanced Micro Devices, Inc. (Jianfeng Yan) + +SPDX-License-Identifier: MIT +Copyright (c) 2018-2024, Advanced Micro Devices, Inc. All rights reserved. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/cpp-httplib/LICENSE +---------------------------------------- +The MIT License (MIT) + +Copyright (c) 2017 yhirose + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json/third_party/cpplint/LICENSE +------------------------------------------------------------------------------------------------------ +cpplint.py and its corresponding unit tests are Copyright (C) 2009 Google Inc. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr/LICENSE +--------------------------------------------------------------------------------- +This license applies to everything except the contents of the "test" +directory and its subdirectories. + +MIT License + +Copyright (c) 2017-2021 Huu Nguyen +Copyright (c) 2022 libcpr and many other contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +/pytorch/third_party/cpuinfo/LICENSE +------------------------------------ +Copyright (c) 2019 Google LLC +Copyright (c) 2017-2018 Facebook Inc. +Copyright (C) 2012-2017 Georgia Institute of Technology +Copyright (C) 2010-2012 Marat Dukhan + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. 
+ +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/fbgemm/external/cpuinfo/LICENSE +---------------------------------------------------- +Copyright (c) 2019 Google LLC +Copyright (c) 2017-2018 Facebook Inc. +Copyright (C) 2012-2017 Georgia Institute of Technology +Copyright (C) 2010-2012 Marat Dukhan + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/cudnn_frontend/LICENSE.txt +----------------------------------------------- +/* + * Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + + +/pytorch/third_party/cutlass/LICENSE.txt +---------------------------------------- +Copyright (c) 2017 - 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: BSD-3-Clause + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +Certain files within this repository are subject to separate licensing terms: + +- The files located in the `python/CuTeDSL` directory are licensed under the + NVIDIA End User License Agreement (EULA). Please refer to + https://docs.nvidia.com/cutlass/media/docs/pythonDSL/license.html + for the full terms. + + +/pytorch/third_party/fbgemm/external/cutlass/LICENSE.txt +-------------------------------------------------------- +Copyright (c) 2017 - 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: BSD-3-Clause + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Certain files within this repository are subject to separate licensing terms: + +- The files located in the `python/CuTeDSL` directory are licensed under the + NVIDIA End User License Agreement (EULA). Please refer to + https://docs.nvidia.com/cutlass/media/docs/pythonDSL/license.html + for the full terms. + + +/pytorch/third_party/flash-attention/csrc/cutlass/LICENSE.txt +------------------------------------------------------------- +Copyright (c) 2017 - 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: BSD-3-Clause + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/flatbuffers/dart/LICENSE +--------------------------------------------- + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2014 Google Inc. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/NVTX/docs/LICENSE.txt +------------------------------------------ +============================================================================== +NVTX is under the Apache License v2.0 with LLVM Exceptions: +============================================================================== + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +---- LLVM Exceptions to the Apache 2.0 License ---- + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into an Object form of such source code, you +may redistribute such embedded portions in such Object form without complying +with the conditions of Sections 4(a), 4(b) and 4(d) of the License. + +In addition, if you combine or link compiled forms of this Software with +software that is licensed under the GPLv2 ("Combined Software") and if a +court of competent jurisdiction determines that the patent provision (Section +3), the indemnity provision (Section 9) or other Section of the License +conflicts with the conditions of the GPLv2, you may retroactively and +prospectively choose to deem waived or otherwise exclude such Section(s) of +the License, but only in their entirety and only with respect to the Combined +Software. 
+ + + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json/test/thirdparty/doctest/LICENSE.txt +-------------------------------------------------------------------------------------------------------------- +The MIT License (MIT) + +Copyright (c) 2016-2021 Viktor Kirilov + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.5.2/LICENSE.txt +------------------------------------------------------------------------------------------------------------------------------------------------ +=============== +Duktape license +=============== + +(http://opensource.org/licenses/MIT) + +Copyright (c) 2013-2016 by Duktape authors (see AUTHORS.rst) + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.5.2/LICENSE.txt +----------------------------------------------------------------------------------------------------------------------------- +=============== +Duktape license +=============== + +(http://opensource.org/licenses/MIT) + +Copyright (c) 2013-2016 by Duktape authors (see AUTHORS.rst) + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.8.0/LICENSE.txt +------------------------------------------------------------------------------------------------------------------------------------------------ +=============== +Duktape license +=============== + +(http://opensource.org/licenses/MIT) + +Copyright (c) 2013-2017 by Duktape authors (see AUTHORS.rst) + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb/src/third_party/duktape-1.8.0/LICENSE.txt +----------------------------------------------------------------------------------------------------------------------------- +=============== +Duktape license +=============== + +(http://opensource.org/licenses/MIT) + +Copyright (c) 2013-2017 by Duktape authors (see AUTHORS.rst) + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/LICENSE +----------------------------------------------------------------- +MIT License + +Copyright (c) Facebook, Inc. and its affiliates. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/opentelemetry-cpp/exporters/etw/include/opentelemetry/exporters/etw/LICENSE +------------------------------------------------------------------------------------------------ +TraceLogging Dynamic for Windows + +Copyright (c) Microsoft Corporation. All rights reserved. + +MIT License + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/3rd_party/include/opentracing/expected/LICENSE +----------------------------------------------------------------------------------------------------------------- +The MIT License (MIT) + +Copyright (c) 2015 Martin Moene +Copyright (c) 2015 Microsoft Corporation. All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + + +/pytorch/third_party/fbgemm/LICENSE +----------------------------------- +BSD License + +For FBGEMM software + +Copyright (c) Meta Platforms, Inc. and affiliates. All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name Facebook nor the names of its contributors may be used to + endorse or promote products derived from this software without specific + prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/ffnvcodec/LICENSE.txt +------------------------------------------------------------------------------ +GNU LESSER GENERAL PUBLIC LICENSE +Version 2.1, February 1999 + +Copyright (C) 1991, 1999 Free Software Foundation, Inc. 
+51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA +Everyone is permitted to copy and distribute verbatim copies +of this license document, but changing it is not allowed. + +[This is the first released version of the Lesser GPL. It also counts + as the successor of the GNU Library Public License, version 2, hence + the version number 2.1.] +Preamble +The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. + +This license, the Lesser General Public License, applies to some specially designated software packages--typically libraries--of the Free Software Foundation and other authors who decide to use it. You can use it too, but we suggest you first think carefully about whether this license or the ordinary General Public License is the better strategy to use in any particular case, based on the explanations below. + +When we speak of free software, we are referring to freedom of use, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish); that you receive source code or can get it if you want it; that you can change the software and use pieces of it in new free programs; and that you are informed that you can do these things. + +To protect your rights, we need to make restrictions that forbid distributors to deny you these rights or to ask you to surrender these rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library or if you modify it. + +For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. 
If you link other code with the library, you must provide complete object files to the recipients, so that they can relink them with the library after making changes to the library and recompiling it. And you must show them these terms so they know their rights. + +We protect your rights with a two-step method: (1) we copyright the library, and (2) we offer you this license, which gives you legal permission to copy, distribute and/or modify the library. + +To protect each distributor, we want to make it very clear that there is no warranty for the free library. Also, if the library is modified by someone else and passed on, the recipients should know that what they have is not the original version, so that the original author's reputation will not be affected by problems that might be introduced by others. + +Finally, software patents pose a constant threat to the existence of any free program. We wish to make sure that a company cannot effectively restrict the users of a free program by obtaining a restrictive license from a patent holder. Therefore, we insist that any patent license obtained for a version of the library must be consistent with the full freedom of use specified in this license. + +Most GNU software, including some libraries, is covered by the ordinary GNU General Public License. This license, the GNU Lesser General Public License, applies to certain designated libraries, and is quite different from the ordinary General Public License. We use this license for certain libraries in order to permit linking those libraries into non-free programs. + +When a program is linked with a library, whether statically or using a shared library, the combination of the two is legally speaking a combined work, a derivative of the original library. The ordinary General Public License therefore permits such linking only if the entire combination fits its criteria of freedom. 
The Lesser General Public License permits more lax criteria for linking other code with the library. + +We call this license the "Lesser" General Public License because it does Less to protect the user's freedom than the ordinary General Public License. It also provides other free software developers Less of an advantage over competing non-free programs. These disadvantages are the reason we use the ordinary General Public License for many libraries. However, the Lesser license provides advantages in certain special circumstances. + +For example, on rare occasions, there may be a special need to encourage the widest possible use of a certain library, so that it becomes a de-facto standard. To achieve this, non-free programs must be allowed to use the library. A more frequent case is that a free library does the same job as widely used non-free libraries. In this case, there is little to gain by limiting the free library to free software only, so we use the Lesser General Public License. + +In other cases, permission to use a particular library in non-free programs enables a greater number of people to use a large body of free software. For example, permission to use the GNU C Library in non-free programs enables many more people to use the whole GNU operating system, as well as its variant, the GNU/Linux operating system. + +Although the Lesser General Public License is Less protective of the users' freedom, it does ensure that the user of a program that is linked with the Library has the freedom and the wherewithal to run that program using a modified version of the Library. + +The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, whereas the latter must be combined with the library in order to run. + +TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +0. 
This License Agreement applies to any software library or other program which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Lesser General Public License (also called "this License"). Each licensee is addressed as "you". + +A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables. + +The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".) + +"Source code" for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library. + +Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does. + +1. 
You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library. + +You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. + +2. You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: + +a) The modified work must itself be a software library. +b) You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change. +c) You must cause the whole of the work to be licensed at no charge to all third parties under the terms of this License. +d) If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful. +(For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.) + +These requirements apply to the modified work as a whole. 
If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. + +3. You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices. + +Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy. + +This option is useful when you wish to copy part of the code of the Library into a program that is not a library. + +4. 
You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange. + +If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code. + +5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License. + +However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. Section 6 states terms for distribution of such executables. + +When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law. 
+ +If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.) + +Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself. + +6. As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications. + +You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things: + +a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. 
(It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.) +b) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (1) uses at run time a copy of the library already present on the user's computer system, rather than copying library functions into the executable, and (2) will operate properly with a modified version of the library, if the user installs one, as long as the modified version is interface-compatible with the version that the work was made with. +c) Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution. +d) If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place. +e) Verify that the user has already received a copy of these materials or that you have already sent this user a copy. +For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the materials to be distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. + +It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute. + +7. 
You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted, and provided that you do these two things: + +a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above. +b) Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. +8. You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. + +9. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it. + +10. Each time you redistribute the Library (or any work based on the Library), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. 
You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties with this License. + +11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. + +This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. + +12. 
If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. + +13. The Free Software Foundation may publish revised and/or new versions of the Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation. + +14. If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. + +NO WARRANTY + +15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + +16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + +END OF TERMS AND CONDITIONS +How to Apply These Terms to Your New Libraries +If you develop a new library, and you want it to be of the greatest possible use to the public, we recommend making it free software that everyone can redistribute and change. You can do so by permitting redistribution under these terms (or, alternatively, under the terms of the ordinary General Public License). + +To apply these terms, attach the following notices to the library. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. + +one line to give the library's name and an idea of what it does. 
+Copyright (C) year name of author + +This library is free software; you can redistribute it and/or +modify it under the terms of the GNU Lesser General Public +License as published by the Free Software Foundation; either +version 2.1 of the License, or (at your option) any later version. + +This library is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +Lesser General Public License for more details. + +You should have received a copy of the GNU Lesser General Public +License along with this library; if not, write to the Free Software +Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA +Also add information on how to contact you by electronic and paper mail. + +You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the library, if necessary. Here is a sample; alter the names: + +Yoyodyne, Inc., hereby disclaims all copyright interest in +the library `Frob' (a library for tweaking knobs) written +by James Random Hacker. + +signature of Ty Coon, 1 April 1990 +Ty Coon, President of Vice +That's all there is to it! + +/pytorch/third_party/flash-attention/LICENSE +-------------------------------------------- +BSD 3-Clause License + +Copyright (c) 2022, the respective contributors, as shown by the AUTHORS file. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/flatbuffers/LICENSE +---------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. 
+ + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/fmt/LICENSE +-------------------------------- +Copyright (c) 2012 - present, Victor Zverovich and {fmt} contributors + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ +--- Optional exception to the license --- + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into a machine-executable object form of such +source code, you may redistribute such embedded portions in such object form +without including the above copyright and permission notices. + + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt/LICENSE.rst +------------------------------------------------------------------------------------- +Copyright (c) 2012 - present, Victor Zverovich + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +--- Optional exception to the license --- + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into a machine-executable object form of such +source code, you may redistribute such embedded portions in such object form +without including the above copyright and permission notices. 
+ + +/pytorch/third_party/kineto/libkineto/third_party/fmt/LICENSE +------------------------------------------------------------- +Copyright (c) 2012 - present, Victor Zverovich and {fmt} contributors + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +--- Optional exception to the license --- + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into a machine-executable object form of such +source code, you may redistribute such embedded portions in such object form +without including the above copyright and permission notices. + + +/pytorch/third_party/gemmlowp/gemmlowp/LICENSE +---------------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. 
+ + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest/googlemock/scripts/generator/LICENSE +--------------------------------------------------------------------------------------------------------------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [2007] Neal Norwitz + Portions Copyright [2007] Google Inc. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest/googlemock/scripts/generator/LICENSE +-------------------------------------------------------------------------------------------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [2007] Neal Norwitz + Portions Copyright [2007] Google Inc. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/protobuf/third_party/googletest/googlemock/scripts/generator/LICENSE +----------------------------------------------------------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [2007] Neal Norwitz + Portions Copyright [2007] Google Inc. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/tensorpipe/third_party/googletest/googlemock/scripts/generator/LICENSE +------------------------------------------------------------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [2007] Neal Norwitz + Portions Copyright [2007] Google Inc. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/gettimeofday/LICENSE +----------------------------------------------------------------------------- +/* + * Copied from PostgreSQL source: + * http://doxygen.postgresql.org/gettimeofday_8c_source.html + * + */ + +/* + * gettimeofday.c + * Win32 gettimeofday() replacement + * + * src/port/gettimeofday.c + * + * Copyright (c) 2003 SRA, Inc. + * Copyright (c) 2003 SKC, Inc. + * + * Permission to use, copy, modify, and distribute this software and + * its documentation for any purpose, without fee, and without a + * written agreement is hereby granted, provided that the above + * copyright notice and this paragraph and the following two + * paragraphs appear in all copies. + * + * IN NO EVENT SHALL THE AUTHOR BE LIABLE TO ANY PARTY FOR DIRECT, + * INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING + * LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS + * DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED + * OF THE POSSIBILITY OF SUCH DAMAGE. + * + * THE AUTHOR SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE. 
THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS + * IS" BASIS, AND THE AUTHOR HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, + * SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. + */ + + +/pytorch/third_party/gloo/LICENSE +--------------------------------- +BSD License + +For Gloo software + +Copyright (c) 2017-present, Facebook, Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name Facebook nor the names of its contributors may be used to + endorse or promote products derived from this software without specific + prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/protobuf/third_party/googletest/googlemock/LICENSE +----------------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/tensorpipe/third_party/googletest/googlemock/LICENSE +------------------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. 
+ * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/fbgemm/external/googletest/LICENSE +------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. 
nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/googletest/LICENSE +--------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest/LICENSE +---------------------------------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest/LICENSE +---------------------------------------------------------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/kineto/libkineto/third_party/googletest/LICENSE +-------------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/opentelemetry-cpp/third_party/googletest/LICENSE +--------------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest/LICENSE +--------------------------------------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/protobuf/third_party/googletest/LICENSE +------------------------------------------------------------ +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/protobuf/third_party/googletest/googletest/LICENSE +----------------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/tensorpipe/third_party/googletest/LICENSE +-------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/tensorpipe/third_party/googletest/googletest/LICENSE +------------------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/ideep/mkl-dnn/tests/gtests/gtest/LICENSE +------------------------------------------------------------- +Copyright 2008, Google Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/fbgemm/external/hipify_torch/LICENSE.txt +------------------------------------------------------------- +MIT License + +Copyright (c) 2021-2024, Advanced Micro Devices, Inc. All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +/pytorch/third_party/fbgemm/fbgemm_gpu/experimental/hstu/LICENSE +---------------------------------------------------------------- +BSD 3-Clause License + +Copyright (c) 2022, the respective contributors, as shown by the AUTHORS file. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +/* + * SPDX-FileCopyrightText: Copyright (c) <2024> NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+ * SPDX-License-Identifier: LicenseRef-NvidiaProprietary + * + * NVIDIA CORPORATION, its affiliates and licensors retain all intellectual + * property and proprietary rights in and to this material, related + * documentation and any modifications thereto. Any use, reproduction, + * disclosure or distribution of this material and related documentation + * without an express license agreement from NVIDIA CORPORATION or + * its affiliates is strictly prohibited. + */ + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/hungarian/LICENSE.txt +------------------------------------------------------------------------------ +/******************************************************************** + ******************************************************************** + ** + ** libhungarian by Cyrill Stachniss, 2004 + ** + ** + ** Solving the Minimum Assignment Problem using the + ** Hungarian Method. + ** + ** ** This file may be freely copied and distributed! ** + ** + ** Parts of the used code was originally provided by the + ** "Stanford GraphGase", but I made changes to this code. + ** As asked by the copyright node of the "Stanford GraphGase", + ** I hereby proclaim that this file are *NOT* part of the + ** "Stanford GraphGase" distrubition! + ** + ** This file is distributed in the hope that it will be useful, + ** but WITHOUT ANY WARRANTY; without even the implied + ** warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR + ** PURPOSE. + ** + ******************************************************************** + ********************************************************************/ + + +/pytorch/third_party/ideep/LICENSE +---------------------------------- +Copyright (c) 2018 Intel Corporation. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/irrlicht/LICENSE.txt +----------------------------------------------------------------------------- +The Irrlicht Engine License +=========================== + +Copyright (C) 2002-2015 Nikolaus Gebhardt + +This software is provided 'as-is', without any express or implied +warranty. In no event will the authors be held liable for any damages +arising from the use of this software. + +Permission is granted to anyone to use this software for any purpose, +including commercial applications, and to alter it and redistribute it +freely, subject to the following restrictions: + +1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this software + in a product, an acknowledgement in the product documentation would be + appreciated but is not required. +2. 
Altered source versions must be clearly marked as such, and must not be + misrepresented as being the original software. +3. This notice may not be removed or altered from any source distribution. + +/pytorch/third_party/kineto/LICENSE +----------------------------------- +BSD License + +For Kineto software + +Copyright (c) Meta Platforms, Inc. and affiliates. + +All contributions by Microsoft: +Copyright (c) Microsoft Corporation. (The Azure AI Platform team) + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name Meta nor the names of its contributors may be used to + endorse or promote products derived from this software without specific + prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +/pytorch/third_party/tensorpipe/third_party/libnop/LICENSE +---------------------------------------------------------- +Copyright 2017 The Native Object Protocols Authors + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + https://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/libstemmer/LICENSE +--------------------------------------------------------------------------- +Snowball - License +Except where explicitly noted, all the software given out on this Snowball site is covered by the 3-clause BSD License: + +Copyright (c) 2001, Dr Martin Porter, +Copyright (c) 2002, Richard Boulton. +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +Essentially, all this means is that you can do what you like with the code, except claim another Copyright for it, or claim that it is issued under a different license. The software is also issued without warranties, which means that if anyone suffers through its use, they cannot come back and sue you. You also have to alert anyone to whom you give the Snowball software to the fact that it is covered by the BSD license. + +We have not bothered to insert the licensing arrangement into the text of the Snowball software. + + +/pytorch/third_party/tensorpipe/third_party/libuv/LICENSE +--------------------------------------------------------- +Copyright (c) 2015-present libuv project contributors. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to +deal in the Software without restriction, including without limitation the +rights to use, copy, modify, merge, publish, distribute, sublicense, and/or +sell copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS +IN THE SOFTWARE. + + +/pytorch/third_party/mimalloc/LICENSE +------------------------------------- +MIT License + +Copyright (c) 2018-2025 Microsoft Corporation, Daan Leijen + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/miniz-3.0.2/LICENSE +---------------------------------------- +Copyright 2013-2014 RAD Game Tools and Valve Software +Copyright 2010-2014 Rich Geldreich and Tenacious Software LLC + +All Rights Reserved. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +/pytorch/third_party/ideep/mkl-dnn/LICENSE +------------------------------------------ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. 
+ + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + ============================================================================ + + Copyright 2016-2023 Intel Corporation + Copyright 2018 YANDEX LLC + Copyright 2019-2023 FUJITSU LIMITED + Copyright 2020-2023 Arm Ltd. and affiliates + Copyright 2020-2022 Codeplay Software Limited + Copyright 2021 Alanna Tempest + Copyright 2022-2023 IBM Corporation + Copyright 2023 KNS Group LLC (YADRO) + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + This distribution includes third party software ("third party programs"). + This third party software, even if included with the distribution of + the Intel software, may be governed by separate license terms, including + without limitation, third party license terms, other Intel software license + terms, and open source software license terms. These separate license terms + govern your use of the third party programs as set forth in the + "THIRD-PARTY-PROGRAMS" file. + + +/pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl/LICENSE +----------------------------------------------------------------- +Copyright (c) 2015 Microsoft Corporation. All rights reserved. + +This code is licensed under the MIT License (MIT). 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + + +/pytorch/third_party/fbgemm/fbgemm_gpu/src/quantize_ops/mx/LICENSE +------------------------------------------------------------------ + MIT License + + Copyright (c) Microsoft Corporation. + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE + + +/pytorch/third_party/fbgemm/fbgemm_gpu/test/quantize/mx/LICENSE +--------------------------------------------------------------- + MIT License + + Copyright (c) Microsoft Corporation. + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE + + +/pytorch/third_party/onnx/LICENSE +--------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. 
+ + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/opentelemetry-cpp/LICENSE +---------------------------------------------- + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto/LICENSE +------------------------------------------------------------------------------ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/LICENSE +-------------------------------------------------------------------------- + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright The OpenTracing Authors + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/pdcurses/LICENSE +------------------------------------------------------------------------- +The core package is in the public domain, but small portions of PDCurses are subject to copyright under various licenses. + +The win32 files are released to the public domain. + +If you use PDCurses in an application, an acknowledgement would be appreciated, but is not mandatory. If you make corrections or enhancements to PDCurses, please forward them to the current maintainer for the benefit of other users. + +This software is provided AS IS with NO WARRANTY whatsoever. + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs/LICENSE +--------------------------------------------------------------------------------- + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + Copyright 2020-present Daniel Trugman + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/physac/LICENSE +----------------------------------------------------------------------- +MIT License + +Copyright (c) 2022 Víctor Fisac + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/pqp/LICENSE +-------------------------------------------------------------------- +Copyright 1999 University of North Carolina at Chapel Hill. +All rights reserved. + +Permission to use, copy, modify, and distribute this software and its +documentation for educational, research, and non-profit purposes, without fee, +and without a written agreement is hereby granted, provided that the above +copyright notice and the following three paragraphs appear in all copies. 
+ +IN NO EVENT SHALL THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL BE LIABLE TO +ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, +INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS +DOCUMENTATION, EVEN IF THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL HAS +BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + +THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL SPECIFICALLY DISCLAIMS ANY +WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED +HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF NORTH CAROLINA AT +CHAPEL HILL HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, +ENHANCEMENTS, OR MODIFICATIONS. + +The authors may be contacted via: + +US Mail: Eric Larsen, Stefan Gottschalk + Department of Computer Science + Sitterson Hall, CB #3175 + University of North Carolina + Chapel Hill, NC 27599-3175 + +Phone: (919) 962-1749 + +Email: geom@cs.unc.edu + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/LICENSE +-------------------------------------------------------------------------------------------- +MIT License + +Copyright (c) 2016-2021 Jupp Mueller +Copyright (c) 2017-2022 Gregor Jasny + +And many contributors, see +https://github.com/jupp0r/prometheus-cpp/graphs/contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/LICENSE +------------------------------------------------------------------------- +MIT License + +Copyright (c) 2016-2021 Jupp Mueller +Copyright (c) 2017-2022 Gregor Jasny + +And many contributors, see +https://github.com/jupp0r/prometheus-cpp/graphs/contributors + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/protobuf/LICENSE +------------------------------------- +Copyright 2008 Google Inc. All rights reserved. 
+ +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Code generated by the Protocol Buffer compiler is owned by the owner +of the input file used when generating it. This code is not +standalone and requires a support library to be linked with it. This +support library is itself covered by the above license. + + +/pytorch/third_party/psimd/LICENSE +---------------------------------- +The MIT License (MIT) + +Copyright (c) 2017 Facebook Inc. 
+Copyright (c) 2014-2017 Georgia Institute of Technology +Copyright 2019 Google LLC + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +/pytorch/third_party/pthreadpool/LICENSE +---------------------------------------- +Copyright 2019 Google LLC +Copyright (c) 2017 Facebook Inc. +Copyright (c) 2015-2017 Georgia Institute of Technology +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + + +/pytorch/third_party/onnx/third_party/pybind11/LICENSE +------------------------------------------------------ +Copyright (c) 2016 Wenzel Jakob , All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Please also refer to the file .github/CONTRIBUTING.md, which clarifies licensing of +external contributions to this project including patches, pull requests, etc. + + +/pytorch/third_party/pybind11/LICENSE +------------------------------------- +Copyright (c) 2016 Wenzel Jakob , All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Please also refer to the file .github/CONTRIBUTING.md, which clarifies licensing of +external contributions to this project including patches, pull requests, etc. + + +/pytorch/third_party/tensorpipe/third_party/pybind11/LICENSE +------------------------------------------------------------ +Copyright (c) 2016 Wenzel Jakob , All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Please also refer to the file CONTRIBUTING.md, which clarifies licensing of +external contributions to this project including patches, pull requests, etc. + + +/pytorch/third_party/NVTX/python/LICENSE.txt +-------------------------------------------- +============================================================================== +NVTX is under the Apache License v2.0 with LLVM Exceptions: +============================================================================== + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. 
+ + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +---- LLVM Exceptions to the Apache 2.0 License ---- + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into an Object form of such source code, you +may redistribute such embedded portions in such Object form without complying +with the conditions of Sections 4(a), 4(b) and 4(d) of the License. + +In addition, if you combine or link compiled forms of this Software with +software that is licensed under the GPLv2 ("Combined Software") and if a +court of competent jurisdiction determines that the patent provision (Section +3), the indemnity provision (Section 9) or other Section of the License +conflicts with the conditions of the GPLv2, you may retroactively and +prospectively choose to deem waived or otherwise exclude such Section(s) of +the License, but only in their entirety and only with respect to the Combined +Software. + + + +/pytorch/third_party/cutlass/python/LICENSE.txt +----------------------------------------------- +Copyright (c) 2017 - 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: BSD-3-Clause + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. 
Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/fbgemm/external/cutlass/python/LICENSE.txt +--------------------------------------------------------------- +Copyright (c) 2017 - 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: BSD-3-Clause + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. 
Neither the name of the copyright holder nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/flash-attention/csrc/cutlass/python/LICENSE.txt +-------------------------------------------------------------------- +Copyright (c) 2017 - 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: BSD-3-Clause + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/python-peachpy/LICENSE.rst +----------------------------------------------- +============================== +PeachPy license (2-clause BSD) +============================== + +Copyright (c) 2017, Facebook Inc. +Copyright (c) 2013-2017, Georgia Institute of Technology +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/sigslot/LICENSE +------------------------------------------------------------------------ +License +The sigslot library has been placed in the public domain. This means that you are free to use it however you like. + +The author takes no responsibility or liability of any kind for any use that you may make of this library. + +If you screw up, it's your fault. + +If the library screws up, you got it for free, so you should have tested it better - it's still your responsibility. 
+ +/pytorch/third_party/sleef/LICENSE.txt +-------------------------------------- +Boost Software License - Version 1.0 - August 17th, 2003 + +Permission is hereby granted, free of charge, to any person or organization +obtaining a copy of the software and accompanying documentation covered by +this license (the "Software") to use, reproduce, display, distribute, +execute, and transmit the Software, and to prepare derivative works of the +Software, and to permit third-parties to whom the Software is furnished to +do so, all subject to the following: + +The copyright notices in the Software and this entire statement, including +the above license grant, this restriction and the following disclaimer, +must be included in all copies of the Software, in whole or in part, and +all derivative works of the Software, unless such copies or derivative +works are solely in the form of machine-executable object code generated by +a source language processor. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT +SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE +FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, +ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + + +/pytorch/third_party/flatbuffers/swift/LICENSE +---------------------------------------------- + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +/pytorch/third_party/kineto/tb_plugin/LICENSE +--------------------------------------------- +BSD License + +For Kineto software + +Copyright (c) Facebook, Inc. and its affiliates. All rights reserved. + +All contributions by Microsoft: +Copyright (c) Microsoft Corporation. (The Azure AI Platform team) + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ + * Neither the name Facebook nor the names of its contributors may be used to + endorse or promote products derived from this software without specific + prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/tensorflow-common/LICENSE.txt +-------------------------------------------------------------------------------------- +Copyright (c) Microsoft Corporation + +All rights reserved. + +MIT License + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +/pytorch/third_party/tensorpipe/LICENSE.txt +------------------------------------------- +BSD License + +For TensorPipe software + +Copyright (c) Meta Platforms, Inc. and affiliates. All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + * Neither the name Meta nor the names of its contributors may be used to + endorse or promote products derived from this software without specific + prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ + +/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr/test/LICENSE +-------------------------------------------------------------------------------------- +This license applies to everything inside this directory and all +subdirectories. + + GNU GENERAL PUBLIC LICENSE + Version 3, 29 June 2007 + + Copyright (C) 2007 Free Software Foundation, Inc. + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The GNU General Public License is a free, copyleft license for +software and other kinds of works. + + The licenses for most software and other practical works are designed +to take away your freedom to share and change the works. By contrast, +the GNU General Public License is intended to guarantee your freedom to +share and change all versions of a program--to make sure it remains free +software for all its users. We, the Free Software Foundation, use the +GNU General Public License for most of our software; it applies also to +any other work released this way by its authors. You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +them if you wish), that you receive source code or can get it if you +want it, that you can change the software or use pieces of it in new +free programs, and that you know you can do these things. + + To protect your rights, we need to prevent others from denying you +these rights or asking you to surrender the rights. Therefore, you have +certain responsibilities if you distribute copies of the software, or if +you modify it: responsibilities to respect the freedom of others. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must pass on to the recipients the same +freedoms that you received. 
You must make sure that they, too, receive +or can get the source code. And you must show them these terms so they +know their rights. + + Developers that use the GNU GPL protect your rights with two steps: +(1) assert copyright on the software, and (2) offer you this License +giving you legal permission to copy, distribute and/or modify it. + + For the developers' and authors' protection, the GPL clearly explains +that there is no warranty for this free software. For both users' and +authors' sake, the GPL requires that modified versions be marked as +changed, so that their problems will not be attributed erroneously to +authors of previous versions. + + Some devices are designed to deny users access to install or run +modified versions of the software inside them, although the manufacturer +can do so. This is fundamentally incompatible with the aim of +protecting users' freedom to change the software. The systematic +pattern of such abuse occurs in the area of products for individuals to +use, which is precisely where it is most unacceptable. Therefore, we +have designed this version of the GPL to prohibit the practice for those +products. If such problems arise substantially in other domains, we +stand ready to extend this provision to those domains in future versions +of the GPL, as needed to protect the freedom of users. + + Finally, every program is threatened constantly by software patents. +States should not allow patents to restrict development and use of +software on general-purpose computers, but in those that do, we wish to +avoid the special danger that patents applied to a free program could +make it effectively proprietary. To prevent this, the GPL assures that +patents cannot be used to render the program non-free. + + The precise terms and conditions for copying, distribution and +modification follow. + + TERMS AND CONDITIONS + + 0. Definitions. + + "This License" refers to version 3 of the GNU General Public License. 
+ + "Copyright" also means copyright-like laws that apply to other kinds of +works, such as semiconductor masks. + + "The Program" refers to any copyrightable work licensed under this +License. Each licensee is addressed as "you". "Licensees" and +"recipients" may be individuals or organizations. + + To "modify" a work means to copy from or adapt all or part of the work +in a fashion requiring copyright permission, other than the making of an +exact copy. The resulting work is called a "modified version" of the +earlier work or a work "based on" the earlier work. + + A "covered work" means either the unmodified Program or a work based +on the Program. + + To "propagate" a work means to do anything with it that, without +permission, would make you directly or secondarily liable for +infringement under applicable copyright law, except executing it on a +computer or modifying a private copy. Propagation includes copying, +distribution (with or without modification), making available to the +public, and in some countries other activities as well. + + To "convey" a work means any kind of propagation that enables other +parties to make or receive copies. Mere interaction with a user through +a computer network, with no transfer of a copy, is not conveying. + + An interactive user interface displays "Appropriate Legal Notices" +to the extent that it includes a convenient and prominently visible +feature that (1) displays an appropriate copyright notice, and (2) +tells the user that there is no warranty for the work (except to the +extent that warranties are provided), that licensees may convey the +work under this License, and how to view a copy of this License. If +the interface presents a list of user commands or options, such as a +menu, a prominent item in the list meets this criterion. + + 1. Source Code. + + The "source code" for a work means the preferred form of the work +for making modifications to it. "Object code" means any non-source +form of a work. 
+ + A "Standard Interface" means an interface that either is an official +standard defined by a recognized standards body, or, in the case of +interfaces specified for a particular programming language, one that +is widely used among developers working in that language. + + The "System Libraries" of an executable work include anything, other +than the work as a whole, that (a) is included in the normal form of +packaging a Major Component, but which is not part of that Major +Component, and (b) serves only to enable use of the work with that +Major Component, or to implement a Standard Interface for which an +implementation is available to the public in source code form. A +"Major Component", in this context, means a major essential component +(kernel, window system, and so on) of the specific operating system +(if any) on which the executable work runs, or a compiler used to +produce the work, or an object code interpreter used to run it. + + The "Corresponding Source" for a work in object code form means all +the source code needed to generate, install, and (for an executable +work) run the object code and to modify the work, including scripts to +control those activities. However, it does not include the work's +System Libraries, or general-purpose tools or generally available free +programs which are used unmodified in performing those activities but +which are not part of the work. For example, Corresponding Source +includes interface definition files associated with source files for +the work, and the source code for shared libraries and dynamically +linked subprograms that the work is specifically designed to require, +such as by intimate data communication or control flow between those +subprograms and other parts of the work. + + The Corresponding Source need not include anything that users +can regenerate automatically from other parts of the Corresponding +Source. + + The Corresponding Source for a work in source code form is that +same work. + + 2. 
Basic Permissions. + + All rights granted under this License are granted for the term of +copyright on the Program, and are irrevocable provided the stated +conditions are met. This License explicitly affirms your unlimited +permission to run the unmodified Program. The output from running a +covered work is covered by this License only if the output, given its +content, constitutes a covered work. This License acknowledges your +rights of fair use or other equivalent, as provided by copyright law. + + You may make, run and propagate covered works that you do not +convey, without conditions so long as your license otherwise remains +in force. You may convey covered works to others for the sole purpose +of having them make modifications exclusively for you, or provide you +with facilities for running those works, provided that you comply with +the terms of this License in conveying all material for which you do +not control copyright. Those thus making or running the covered works +for you must do so exclusively on your behalf, under your direction +and control, on terms that prohibit them from making any copies of +your copyrighted material outside their relationship with you. + + Conveying under any other circumstances is permitted solely under +the conditions stated below. Sublicensing is not allowed; section 10 +makes it unnecessary. + + 3. Protecting Users' Legal Rights From Anti-Circumvention Law. + + No covered work shall be deemed part of an effective technological +measure under any applicable law fulfilling obligations under article +11 of the WIPO copyright treaty adopted on 20 December 1996, or +similar laws prohibiting or restricting circumvention of such +measures. 
+ + When you convey a covered work, you waive any legal power to forbid +circumvention of technological measures to the extent such circumvention +is effected by exercising rights under this License with respect to +the covered work, and you disclaim any intention to limit operation or +modification of the work as a means of enforcing, against the work's +users, your or third parties' legal rights to forbid circumvention of +technological measures. + + 4. Conveying Verbatim Copies. + + You may convey verbatim copies of the Program's source code as you +receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice; +keep intact all notices stating that this License and any +non-permissive terms added in accord with section 7 apply to the code; +keep intact all notices of the absence of any warranty; and give all +recipients a copy of this License along with the Program. + + You may charge any price or no price for each copy that you convey, +and you may offer support or warranty protection for a fee. + + 5. Conveying Modified Source Versions. + + You may convey a work based on the Program, or the modifications to +produce it from the Program, in the form of source code under the +terms of section 4, provided that you also meet all of these conditions: + + a) The work must carry prominent notices stating that you modified + it, and giving a relevant date. + + b) The work must carry prominent notices stating that it is + released under this License and any conditions added under section + 7. This requirement modifies the requirement in section 4 to + "keep intact all notices". + + c) You must license the entire work, as a whole, under this + License to anyone who comes into possession of a copy. This + License will therefore apply, along with any applicable section 7 + additional terms, to the whole of the work, and all its parts, + regardless of how they are packaged. 
This License gives no + permission to license the work in any other way, but it does not + invalidate such permission if you have separately received it. + + d) If the work has interactive user interfaces, each must display + Appropriate Legal Notices; however, if the Program has interactive + interfaces that do not display Appropriate Legal Notices, your + work need not make them do so. + + A compilation of a covered work with other separate and independent +works, which are not by their nature extensions of the covered work, +and which are not combined with it such as to form a larger program, +in or on a volume of a storage or distribution medium, is called an +"aggregate" if the compilation and its resulting copyright are not +used to limit the access or legal rights of the compilation's users +beyond what the individual works permit. Inclusion of a covered work +in an aggregate does not cause this License to apply to the other +parts of the aggregate. + + 6. Conveying Non-Source Forms. + + You may convey a covered work in object code form under the terms +of sections 4 and 5, provided that you also convey the +machine-readable Corresponding Source under the terms of this License, +in one of these ways: + + a) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by the + Corresponding Source fixed on a durable physical medium + customarily used for software interchange. 
+ + b) Convey the object code in, or embodied in, a physical product + (including a physical distribution medium), accompanied by a + written offer, valid for at least three years and valid for as + long as you offer spare parts or customer support for that product + model, to give anyone who possesses the object code either (1) a + copy of the Corresponding Source for all the software in the + product that is covered by this License, on a durable physical + medium customarily used for software interchange, for a price no + more than your reasonable cost of physically performing this + conveying of source, or (2) access to copy the + Corresponding Source from a network server at no charge. + + c) Convey individual copies of the object code with a copy of the + written offer to provide the Corresponding Source. This + alternative is allowed only occasionally and noncommercially, and + only if you received the object code with such an offer, in accord + with subsection 6b. + + d) Convey the object code by offering access from a designated + place (gratis or for a charge), and offer equivalent access to the + Corresponding Source in the same way through the same place at no + further charge. You need not require recipients to copy the + Corresponding Source along with the object code. If the place to + copy the object code is a network server, the Corresponding Source + may be on a different server (operated by you or a third party) + that supports equivalent copying facilities, provided you maintain + clear directions next to the object code saying where to find the + Corresponding Source. Regardless of what server hosts the + Corresponding Source, you remain obligated to ensure that it is + available for as long as needed to satisfy these requirements. 
+ + e) Convey the object code using peer-to-peer transmission, provided + you inform other peers where the object code and Corresponding + Source of the work are being offered to the general public at no + charge under subsection 6d. + + A separable portion of the object code, whose source code is excluded +from the Corresponding Source as a System Library, need not be +included in conveying the object code work. + + A "User Product" is either (1) a "consumer product", which means any +tangible personal property which is normally used for personal, family, +or household purposes, or (2) anything designed or sold for incorporation +into a dwelling. In determining whether a product is a consumer product, +doubtful cases shall be resolved in favor of coverage. For a particular +product received by a particular user, "normally used" refers to a +typical or common use of that class of product, regardless of the status +of the particular user or of the way in which the particular user +actually uses, or expects or is expected to use, the product. A product +is a consumer product regardless of whether the product has substantial +commercial, industrial or non-consumer uses, unless such uses represent +the only significant mode of use of the product. + + "Installation Information" for a User Product means any methods, +procedures, authorization keys, or other information required to install +and execute modified versions of a covered work in that User Product from +a modified version of its Corresponding Source. The information must +suffice to ensure that the continued functioning of the modified object +code is in no case prevented or interfered with solely because +modification has been made. 
+ + If you convey an object code work under this section in, or with, or +specifically for use in, a User Product, and the conveying occurs as +part of a transaction in which the right of possession and use of the +User Product is transferred to the recipient in perpetuity or for a +fixed term (regardless of how the transaction is characterized), the +Corresponding Source conveyed under this section must be accompanied +by the Installation Information. But this requirement does not apply +if neither you nor any third party retains the ability to install +modified object code on the User Product (for example, the work has +been installed in ROM). + + The requirement to provide Installation Information does not include a +requirement to continue to provide support service, warranty, or updates +for a work that has been modified or installed by the recipient, or for +the User Product in which it has been modified or installed. Access to a +network may be denied when the modification itself materially and +adversely affects the operation of the network or violates the rules and +protocols for communication across the network. + + Corresponding Source conveyed, and Installation Information provided, +in accord with this section must be in a format that is publicly +documented (and with an implementation available to the public in +source code form), and must require no special password or key for +unpacking, reading or copying. + + 7. Additional Terms. + + "Additional permissions" are terms that supplement the terms of this +License by making exceptions from one or more of its conditions. +Additional permissions that are applicable to the entire Program shall +be treated as though they were included in this License, to the extent +that they are valid under applicable law. 
If additional permissions +apply only to part of the Program, that part may be used separately +under those permissions, but the entire Program remains governed by +this License without regard to the additional permissions. + + When you convey a copy of a covered work, you may at your option +remove any additional permissions from that copy, or from any part of +it. (Additional permissions may be written to require their own +removal in certain cases when you modify the work.) You may place +additional permissions on material, added by you to a covered work, +for which you have or can give appropriate copyright permission. + + Notwithstanding any other provision of this License, for material you +add to a covered work, you may (if authorized by the copyright holders of +that material) supplement the terms of this License with terms: + + a) Disclaiming warranty or limiting liability differently from the + terms of sections 15 and 16 of this License; or + + b) Requiring preservation of specified reasonable legal notices or + author attributions in that material or in the Appropriate Legal + Notices displayed by works containing it; or + + c) Prohibiting misrepresentation of the origin of that material, or + requiring that modified versions of such material be marked in + reasonable ways as different from the original version; or + + d) Limiting the use for publicity purposes of names of licensors or + authors of the material; or + + e) Declining to grant rights under trademark law for use of some + trade names, trademarks, or service marks; or + + f) Requiring indemnification of licensors and authors of that + material by anyone who conveys the material (or modified versions of + it) with contractual assumptions of liability to the recipient, for + any liability that these contractual assumptions directly impose on + those licensors and authors. + + All other non-permissive additional terms are considered "further +restrictions" within the meaning of section 10. 
If the Program as you +received it, or any part of it, contains a notice stating that it is +governed by this License along with a term that is a further +restriction, you may remove that term. If a license document contains +a further restriction but permits relicensing or conveying under this +License, you may add to a covered work material governed by the terms +of that license document, provided that the further restriction does +not survive such relicensing or conveying. + + If you add terms to a covered work in accord with this section, you +must place, in the relevant source files, a statement of the +additional terms that apply to those files, or a notice indicating +where to find the applicable terms. + + Additional terms, permissive or non-permissive, may be stated in the +form of a separately written license, or stated as exceptions; +the above requirements apply either way. + + 8. Termination. + + You may not propagate or modify a covered work except as expressly +provided under this License. Any attempt otherwise to propagate or +modify it is void, and will automatically terminate your rights under +this License (including any patent licenses granted under the third +paragraph of section 11). + + However, if you cease all violation of this License, then your +license from a particular copyright holder is reinstated (a) +provisionally, unless and until the copyright holder explicitly and +finally terminates your license, and (b) permanently, if the copyright +holder fails to notify you of the violation by some reasonable means +prior to 60 days after the cessation. + + Moreover, your license from a particular copyright holder is +reinstated permanently if the copyright holder notifies you of the +violation by some reasonable means, this is the first time you have +received notice of violation of this License (for any work) from that +copyright holder, and you cure the violation prior to 30 days after +your receipt of the notice. 
+ + Termination of your rights under this section does not terminate the +licenses of parties who have received copies or rights from you under +this License. If your rights have been terminated and not permanently +reinstated, you do not qualify to receive new licenses for the same +material under section 10. + + 9. Acceptance Not Required for Having Copies. + + You are not required to accept this License in order to receive or +run a copy of the Program. Ancillary propagation of a covered work +occurring solely as a consequence of using peer-to-peer transmission +to receive a copy likewise does not require acceptance. However, +nothing other than this License grants you permission to propagate or +modify any covered work. These actions infringe copyright if you do +not accept this License. Therefore, by modifying or propagating a +covered work, you indicate your acceptance of this License to do so. + + 10. Automatic Licensing of Downstream Recipients. + + Each time you convey a covered work, the recipient automatically +receives a license from the original licensors, to run, modify and +propagate that work, subject to this License. You are not responsible +for enforcing compliance by third parties with this License. + + An "entity transaction" is a transaction transferring control of an +organization, or substantially all assets of one, or subdividing an +organization, or merging organizations. If propagation of a covered +work results from an entity transaction, each party to that +transaction who receives a copy of the work also receives whatever +licenses to the work the party's predecessor in interest had or could +give under the previous paragraph, plus a right to possession of the +Corresponding Source of the work from the predecessor in interest, if +the predecessor has it or can get it with reasonable efforts. + + You may not impose any further restrictions on the exercise of the +rights granted or affirmed under this License. 
For example, you may +not impose a license fee, royalty, or other charge for exercise of +rights granted under this License, and you may not initiate litigation +(including a cross-claim or counterclaim in a lawsuit) alleging that +any patent claim is infringed by making, using, selling, offering for +sale, or importing the Program or any portion of it. + + 11. Patents. + + A "contributor" is a copyright holder who authorizes use under this +License of the Program or a work on which the Program is based. The +work thus licensed is called the contributor's "contributor version". + + A contributor's "essential patent claims" are all patent claims +owned or controlled by the contributor, whether already acquired or +hereafter acquired, that would be infringed by some manner, permitted +by this License, of making, using, or selling its contributor version, +but do not include claims that would be infringed only as a +consequence of further modification of the contributor version. For +purposes of this definition, "control" includes the right to grant +patent sublicenses in a manner consistent with the requirements of +this License. + + Each contributor grants you a non-exclusive, worldwide, royalty-free +patent license under the contributor's essential patent claims, to +make, use, sell, offer for sale, import and otherwise run, modify and +propagate the contents of its contributor version. + + In the following three paragraphs, a "patent license" is any express +agreement or commitment, however denominated, not to enforce a patent +(such as an express permission to practice a patent or covenant not to +sue for patent infringement). To "grant" such a patent license to a +party means to make such an agreement or commitment not to enforce a +patent against the party. 
+ + If you convey a covered work, knowingly relying on a patent license, +and the Corresponding Source of the work is not available for anyone +to copy, free of charge and under the terms of this License, through a +publicly available network server or other readily accessible means, +then you must either (1) cause the Corresponding Source to be so +available, or (2) arrange to deprive yourself of the benefit of the +patent license for this particular work, or (3) arrange, in a manner +consistent with the requirements of this License, to extend the patent +license to downstream recipients. "Knowingly relying" means you have +actual knowledge that, but for the patent license, your conveying the +covered work in a country, or your recipient's use of the covered work +in a country, would infringe one or more identifiable patents in that +country that you have reason to believe are valid. + + If, pursuant to or in connection with a single transaction or +arrangement, you convey, or propagate by procuring conveyance of, a +covered work, and grant a patent license to some of the parties +receiving the covered work authorizing them to use, propagate, modify +or convey a specific copy of the covered work, then the patent license +you grant is automatically extended to all recipients of the covered +work and works based on it. + + A patent license is "discriminatory" if it does not include within +the scope of its coverage, prohibits the exercise of, or is +conditioned on the non-exercise of one or more of the rights that are +specifically granted under this License. 
You may not convey a covered +work if you are a party to an arrangement with a third party that is +in the business of distributing software, under which you make payment +to the third party based on the extent of your activity of conveying +the work, and under which the third party grants, to any of the +parties who would receive the covered work from you, a discriminatory +patent license (a) in connection with copies of the covered work +conveyed by you (or copies made from those copies), or (b) primarily +for and in connection with specific products or compilations that +contain the covered work, unless you entered into that arrangement, +or that patent license was granted, prior to 28 March 2007. + + Nothing in this License shall be construed as excluding or limiting +any implied license or other defenses to infringement that may +otherwise be available to you under applicable patent law. + + 12. No Surrender of Others' Freedom. + + If conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot convey a +covered work so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you may +not convey it at all. For example, if you agree to terms that obligate you +to collect a royalty for further conveying from those to whom you convey +the Program, the only way you could satisfy both those terms and this +License would be to refrain entirely from conveying the Program. + + 13. Use with the GNU Affero General Public License. + + Notwithstanding any other provision of this License, you have +permission to link or combine any covered work with a work licensed +under version 3 of the GNU Affero General Public License into a single +combined work, and to convey the resulting work. 
The terms of this +License will continue to apply to the part which is the covered work, +but the special requirements of the GNU Affero General Public License, +section 13, concerning interaction through a network will apply to the +combination as such. + + 14. Revised Versions of this License. + + The Free Software Foundation may publish revised and/or new versions of +the GNU General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + + Each version is given a distinguishing version number. If the +Program specifies that a certain numbered version of the GNU General +Public License "or any later version" applies to it, you have the +option of following the terms and conditions either of that numbered +version or of any later version published by the Free Software +Foundation. If the Program does not specify a version number of the +GNU General Public License, you may choose any version ever published +by the Free Software Foundation. + + If the Program specifies that a proxy can decide which future +versions of the GNU General Public License can be used, that proxy's +public statement of acceptance of a version permanently authorizes you +to choose that version for the Program. + + Later license versions may give you additional or different +permissions. However, no additional obligations are imposed on any +author or copyright holder as a result of your choosing to follow a +later version. + + 15. Disclaimer of Warranty. + + THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY +APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT +HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY +OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, +THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. 
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM +IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF +ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. Limitation of Liability. + + IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS +THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE +USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF +DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD +PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), +EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF +SUCH DAMAGES. + + 17. Interpretation of Sections 15 and 16. + + If the disclaimer of warranty and limitation of liability provided +above cannot be given local legal effect according to their terms, +reviewing courts shall apply local law that most closely approximates +an absolute waiver of all civil liability in connection with the +Program, unless a warranty or assumption of liability accompanies a +copy of the Program in return for a fee. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +state the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. 
+ + + Copyright (C) + + This program is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . + +Also add information on how to contact you by electronic and paper mail. + + If the program does terminal interaction, make it output a short +notice like this when it starts in an interactive mode: + + Copyright (C) + This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, your program's commands +might be different; for a GUI interface, you would use an "about box". + + You should also get your employer (if you work as a programmer) or school, +if any, to sign a "copyright disclaimer" for the program, if necessary. +For more information on this, and how to apply and follow the GNU GPL, see +. + + The GNU General Public License does not permit incorporating your program +into proprietary programs. If your program is a subroutine library, you +may consider it more useful to permit linking proprietary applications with +the library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. But first, please read +. 
+ +/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp/3rd_party/include/opentracing/variant/LICENSE +---------------------------------------------------------------------------------------------------------------- +Copyright (c) MapBox +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +- Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. +- Redistributions in binary form must reproduce the above copyright notice, this + list of conditions and the following disclaimer in the documentation and/or + other materials provided with the distribution. +- Neither the name "MapBox" nor the names of its contributors may be + used to endorse or promote products derived from this software without + specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON +ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/LICENSE.txt +-------------------------------------------------------------- +MIT License + +Copyright (c) Microsoft Corporation + +Permission is hereby granted, free of charge, to any person obtaining a copy of this +software and associated documentation files (the "Software"), to deal in the Software +without restriction, including without limitation the rights to use, copy, modify, +merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to the following +conditions: + +The above copyright notice and this permission notice shall be included in all copies +or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF +CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE +OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +/pytorch/third_party/opentelemetry-cpp/tools/vcpkg/ports/vulkan/LICENSE.txt +--------------------------------------------------------------------------- +/* +* +* Licensed under the Apache License, Version 2.0 (the "License"); +* you may not use this file except in compliance with the License. +* You may obtain a copy of the License at +* +* http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, software +* distributed under the License is distributed on an "AS IS" BASIS, +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +* See the License for the specific language governing permissions and +* limitations under the License. 
+*/ + + +Apache License +Version 2.0, January 2004 +http://www.apache.org/licenses/ + +TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + +1. Definitions. + +"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. + +"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. + +"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. + +"You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. + +"Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. + +"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. + +"Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). + +"Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. 
For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. + +"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." + +"Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. + +2. Grant of Copyright License. + +Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. + +3. Grant of Patent License. 
+ +Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. + +4. Redistribution. + +You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: + +You must give any other recipients of the Work or Derivative Works a copy of this License; and +You must cause any modified files to carry prominent notices stating that You changed the files; and +You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and +If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the 
Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. +You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. + +5. Submission of Contributions. + +Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. + +6. Trademarks. + +This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. + +7. Disclaimer of Warranty. 
+ +Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. + +8. Limitation of Liability. + +In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. + +9. Accepting Warranty or Additional Liability. + +While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. 
+ +END OF TERMS AND CONDITIONS + +=============================================================================================================================================== + +/Copyright (C) 2012 LunarG, Inc. +//All rights reserved. +// +//Redistribution and use in source and binary forms, with or without +//modification, are permitted provided that the following conditions +//are met: +// +// Redistributions of source code must retain the above copyright +// notice, this list of conditions and the following disclaimer. +// +// Redistributions in binary form must reproduce the above +// copyright notice, this list of conditions and the following +// disclaimer in the documentation and/or other materials provided +// with the distribution. +// +// Neither the name of LunarG Inc. nor the names of its +// contributors may be used to endorse or promote products derived +// from this software without specific prior written permission. +// +//THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +//"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +//LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS +//FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE +//COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +//INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +//BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +//LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +//CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +//LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +//ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +//POSSIBILITY OF SUCH DAMAGE. 
+ +=============================================================================================================================================== + +#============================================================================= +# Copyright 2007-2009 Kitware, Inc. +# Copyright 2007-2008 Miguel A. Figueroa-Villanueva +# +# Distributed under the OSI-approved BSD License (the "License"); +# see accompanying file Copyright_cmake.txt for details. +# +# This software is distributed WITHOUT ANY WARRANTY; without even the +# implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +# See the License for more information. +#============================================================================= +# (To distributed this file outside of CMake, substitute the full +# License text for the above reference.) + + +============================================================================================================================================== + +// +// Copyright (C) 2015-2018 Google, Inc. +// Copyright (C) +// +// All rights reserved. +// +// Redistribution and use in source and binary forms, with or without +// modification, are permitted provided that the following conditions +// are met: +// +// Redistributions of source code must retain the above copyright +// notice, this list of conditions and the following disclaimer. +// +// Redistributions in binary form must reproduce the above +// copyright notice, this list of conditions and the following +// disclaimer in the documentation and/or other materials provided +// with the distribution. +// +// Neither the name of 3Dlabs Inc. Ltd. nor the names of its +// contributors may be used to endorse or promote products derived +// from this software without specific prior written permission. 
+// +// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS +// FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE +// COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +// INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +// BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +// LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +// LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +// ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +// POSSIBILITY OF SUCH DAMAGE. +// + +========================================================================================================================================== + +Note: This license has also been called the "New BSD License" or "Modified BSD License". See also the 2-clause BSD License. +Copyright +Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: +1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. +2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. +3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. 
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +========================================================================================================================================== + +/* +* xxHash - Fast Hash algorithm +* Copyright (C) 2012-2016, Yann Collet +* +* BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php) +* +* Redistribution and use in source and binary forms, with or without +* modification, are permitted provided that the following conditions are +* met: +* +* * Redistributions of source code must retain the above copyright +* notice, this list of conditions and the following disclaimer. +* * Redistributions in binary form must reproduce the above +* copyright notice, this list of conditions and the following disclaimer +* in the documentation and/or other materials provided with the +* distribution. +* +* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +* A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +* +* You can contact the author at : +* - xxHash homepage: http://www.xxhash.com +* - xxHash source repository : https://github.com/Cyan4973/xxHash +*/ + + +=========================================================================================================================================== + +# Copyright (C) 2018 Google, Inc. +# +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# +# Redistributions in binary form must reproduce the above +# copyright notice, this list of conditions and the following +# disclaimer in the documentation and/or other materials provided +# with the distribution. +# +# Neither the name of Google Inc. nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS +# FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE +# COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; +# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +# POSSIBILITY OF SUCH DAMAGE. + +========================================================================================================================================== + +/* A Bison parser, made by GNU Bison 3.0.4. */ + +/* Bison implementation for Yacc-like parsers in C +Copyright (C) 1984, 1989-1990, 2000-2015 Free Software Foundation, Inc. +This program is free software: you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation, either version 3 of the License, or +(at your option) any later version. +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. +You should have received a copy of the GNU General Public License +along with this program. If not, see . */ + +/* As a special exception, you may create a larger work that contains +part or all of the Bison parser skeleton and distribute that work +under terms of your choice, so long as that work isn't itself a +parser generator using the skeleton or a modified version thereof +as a parser skeleton. 
Alternatively, if you modify or redistribute +the parser skeleton itself, you may (at your option) remove this +special exception, which will cause the skeleton and the resulting +Bison output files to be licensed under the GNU General Public +License without this special exception. +This special exception was added by the Free Software Foundation in +version 2.2 of Bison. */ + +/* C LALR(1) parser skeleton written by Richard Stallman, by +simplifying the original so-called "semantic" parser. */ + +/* All symbols defined below should begin with yy or YY, to avoid +infringing on user name space. This should be done even for local +variables, as they might otherwise be expanded by user macros. +There are some unavoidable exceptions within include files to +define necessary library symbols; they are noted "INFRINGES ON +USER NAME SPACE" below. */ + +============================================================================================================================================== + +copyright : [ +Copyright (c) 2017 The Khronos Group Inc., +, +Permission is hereby granted, free of charge, to any person obtaining a copy, +of this software and/or associated documentation files (the \Materials\"),", +to deal in the Materials without restriction, including without limitation, +the rights to use, copy, modify, merge, publish, distribute, sublicense,, +and/or sell copies of the Materials, and to permit persons to whom the, +Materials are furnished to do so, subject to the following conditions:, +, +The above copyright notice and this permission notice shall be included in, +all copies or substantial portions of the Materials., +, +MODIFICATIONS TO THIS FILE MAY MEAN IT NO LONGER ACCURATELY REFLECTS KHRONOS, +STANDARDS. 
THE UNMODIFIED, NORMATIVE VERSIONS OF KHRONOS SPECIFICATIONS AND, +HEADER INFORMATION ARE LOCATED AT https://www.khronos.org/registry/ , +, +THE MATERIALS ARE PROVIDED \AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS", +OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL, +THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER, +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING, +FROM,OUT OF OR IN CONNECTION WITH THE MATERIALS OR THE USE OR OTHER DEALINGS, +IN THE MATERIALS. + +============================================================================================================================================= + +CMake - Cross Platform Makefile Generator +Copyright 2000-2009 Kitware, Inc., Insight Software Consortium +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +* Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +* Neither the names of Kitware, Inc., the Insight Software Consortium, +nor the names of their contributors may be used to endorse or promote +products derived from this software without specific prior written +permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +AS IS AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------------------------------------------------------------------ + +The above copyright and license notice applies to distributions of +CMake in source and binary form. Some source files contain additional +notices of original copyright by their contributors; see each source +for details. Third-party software packages supplied with CMake under +compatible licenses provide their own copyright notices documented in +corresponding subdirectories. + +------------------------------------------------------------------------------ + +CMake was initially developed by Kitware with the following sponsorship: + +* National Library of Medicine at the National Institutes of Health +as part of the Insight Segmentation and Registration Toolkit (ITK). + +* US National Labs (Los Alamos, Livermore, Sandia) ASC Parallel +Visualization Initiative. + +* National Alliance for Medical Image Computing (NAMIC) is funded by the +National Institutes of Health through the NIH Roadmap for Medical Research, +Grant U54 EB005149. + +* Kitware, Inc. + +======================================================================================================================================== + +The authors of this software are Rob Pike and Ken Thompson. +* Copyright (c) 2002 by Lucent Technologies. 
+* Permission to use, copy, modify, and distribute this software for any +* purpose without fee is hereby granted, provided that this entire notice +* is included in all copies of any software which is or includes a copy +* or modification of this software and in all copies of the supporting +* documentation for such software. +* THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED +* WARRANTY. IN PARTICULAR, NEITHER THE AUTHORS NOR LUCENT TECHNOLOGIES MAKE ANY +* REPRESENTATION OR WARRANTY OF ANY KIND CONCERNING THE MERCHANTABILITY +* OF THIS SOFTWARE OR ITS FITNESS FOR ANY PARTICULAR PURPOSE. + + +======================================================================================================================================== + +Copyright (c) 2015-2018 Baldur Karlsson + +Copyright (c) 2014 Crytek + +Copyright (c) 1998-2018 Third party code and tools + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ +========================================================================================================================================= + +/* +Copyright (c) 2009 Dave Gamble +Copyright (c) 2015-2016 The Khronos Group Inc. +Copyright (c) 2015-2016 Valve Corporation +Copyright (c) 2015-2016 LunarG, Inc. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+*/ + +=========================================================================================================================================== + +Copyright (c) 2005 - 2017 G-Truc Creation + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + + +========================================================================================================================================== + +/* +The JsonCpp library's source code, including accompanying documentation, +tests and demonstration applications, are licensed under the following +conditions... +The author (Baptiste Lepilleur) explicitly disclaims copyright in all +jurisdictions which recognize such a disclaimer. In such jurisdictions, +this software is released into the Public Domain. +In jurisdictions which do not recognize Public Domain property (e.g. Germany as of +2010), this software is Copyright (c) 2007-2010 by Baptiste Lepilleur, and is +released under the terms of the MIT License (see below). 
+In jurisdictions which recognize Public Domain property, the user of this +software may choose to accept it either as 1) Public Domain, 2) under the +conditions of the MIT License (see below), or 3) under the terms of dual +Public Domain/MIT License conditions described here, as they choose. +The MIT License is about as close to Public Domain as a license can get, and is +described in clear, concise terms at: +http://en.wikipedia.org/wiki/MIT_License + +The full text of the MIT License follows: + +Copyright (c) 2007-2010 Baptiste Lepilleur +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation +files (the "Software"), to deal in the Software without +restriction, including without limitation the rights to use, copy, +modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS +BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +========================================================================================================================================== + +/** +* `murmurhash.h' - murmurhash +* +* copyright (c) 2014 joseph werle +* Copyright (c) 2015-2016 The Khronos Group Inc. +* Copyright (c) 2015-2016 Valve Corporation +* Copyright (c) 2015-2016 LunarG, Inc. 
+* +* Permission is hereby granted, free of charge, to any person obtaining a copy +* of this software and/or associated documentation files (the "Materials"), to +* deal in the Materials without restriction, including without limitation the +* rights to use, copy, modify, merge, publish, distribute, sublicense, and/or +* sell copies of the Materials, and to permit persons to whom the Materials are +* furnished to do so, subject to the following conditions: +* +* The above copyright notice(s) and this permission notice shall be included in +* all copies or substantial portions of the Materials. +* +* THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +* +* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, +* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR +* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MATERIALS OR THE +* USE OR OTHER DEALINGS IN THE MATERIALS. +*/ + +========================================================================================================================================= + +Licenced as X11: http://www.kryogenix.org/code/browser/licence.html +This basically means: do what you want with it. 
+ +========================================================================================================================================= + +/////////////////////////////////////////////////////////////////////////////////// +/// OpenGL Mathematics (glm.g-truc.net) +/// +/// Copyright (c) 2005 - 2014 G-Truc Creation (www.g-truc.net) +/// Permission is hereby granted, free of charge, to any person obtaining a copy +/// of this software and associated documentation files (the "Software"), to deal +/// in the Software without restriction, including without limitation the rights +/// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +/// copies of the Software, and to permit persons to whom the Software is +/// furnished to do so, subject to the following conditions: +/// +/// The above copyright notice and this permission notice shall be included in +/// all copies or substantial portions of the Software. +/// +/// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +/// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +/// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +/// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +/// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +/// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +/// THE SOFTWARE. +/// +/// @ref core +/// @file glm/common.hpp +/// @date 2013-12-24 / 2013-12-24 +/// @author Christophe Riccio +/////////////////////////////////////////////////////////////////////////////////// + + +========================================================================================================================================== + +// LICENSE +// +// This software is in the public domain. 
Where that dedication is not +// recognized, you are granted a perpetual, irrevocable license to copy, +// distribute, and modify this file as you see fit. +// + +========================================================================================================================================== + +Simple DirectMedia Layer +Copyright (C) 1997-2018 Sam Lantinga + +This software is provided 'as-is', without any express or implied +warranty. In no event will the authors be held liable for any damages +arising from the use of this software. + +Permission is granted to anyone to use this software for any purpose, +including commercial applications, and to alter it and redistribute it +freely, subject to the following restrictions: + +1. The origin of this software must not be misrepresented; you must not +claim that you wrote the original software. If you use this software +in a product, an acknowledgment in the product documentation would be +appreciated but is not required. +2. Altered source versions must be plainly marked as such, and must not be +misrepresented as being the original software. +3. This notice may not be removed or altered from any source distribution. + +========================================================================================================================================= + +/****************************************************************************\ +Copyright (c) 2002, NVIDIA Corporation. + +NVIDIA Corporation("NVIDIA") supplies this software to you in +consideration of your agreement to the following terms, and your use, +installation, modification or redistribution of this NVIDIA software +constitutes acceptance of these terms. If you do not agree with these +terms, please do not use, install, modify or redistribute this NVIDIA +software. 
+ +In consideration of your agreement to abide by the following terms, and +subject to these terms, NVIDIA grants you a personal, non-exclusive +license, under NVIDIA's copyrights in this original NVIDIA software (the +NVIDIA Software), to use, reproduce, modify and redistribute the +NVIDIA Software, with or without modifications, in source and/or binary +forms; provided that if you redistribute the NVIDIA Software, you must +retain the copyright notice of NVIDIA, this notice and the following +text and disclaimers in all such redistributions of the NVIDIA Software. +Neither the name, trademarks, service marks nor logos of NVIDIA +Corporation may be used to endorse or promote products derived from the +NVIDIA Software without specific prior written permission from NVIDIA. +Except as expressly stated in this notice, no other rights or licenses +express or implied, are granted by NVIDIA herein, including but not +limited to any patent rights that may be infringed by your derivative +works or by other works in which the NVIDIA Software may be +incorporated. No hardware is licensed hereunder. + +THE NVIDIA SOFTWARE IS BEING PROVIDED ON AN "AS IS" BASIS, WITHOUT +WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED, +INCLUDING WITHOUT LIMITATION, WARRANTIES OR CONDITIONS OF TITLE, +NON-INFRINGEMENT, MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR +ITS USE AND OPERATION EITHER ALONE OR IN COMBINATION WITH OTHER +PRODUCTS. 
+ +IN NO EVENT SHALL NVIDIA BE LIABLE FOR ANY SPECIAL, INDIRECT, +INCIDENTAL, EXEMPLARY, CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, LOST PROFITS; PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF +USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) OR ARISING IN ANY WAY +OUT OF THE USE, REPRODUCTION, MODIFICATION AND/OR DISTRIBUTION OF THE +NVIDIA SOFTWARE, HOWEVER CAUSED AND WHETHER UNDER THEORY OF CONTRACT, +TORT (INCLUDING NEGLIGENCE), STRICT LIABILITY OR OTHERWISE, EVEN IF +NVIDIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +\****************************************************************************/ + +================================================================================================================================================== + +This software is provided 'as-is', without any express or implied +warranty. In no event will the authors be held liable for any damages +arising from the use of this software. + +Permission is granted to anyone to use this software for any purpose, +including commercial applications, and to alter it and redistribute it +freely, subject to the following restrictions: + +1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this software + in a product, an acknowledgment in the product documentation would be + appreciated but is not required. +2. Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original software. +3. This notice may not be removed or altered from any source distribution. + + +================================================================================================================================================== + +GNU LESSER GENERAL PUBLIC LICENSE +Version 3, 29 June 2007 + +Copyright (C) 2007 Free Software Foundation, Inc. + +Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. 
+ +This version of the GNU Lesser General Public License incorporates the terms and conditions of version 3 of the GNU General Public License, supplemented by the additional permissions listed below. + +0. Additional Definitions. + +As used herein, "this License" refers to version 3 of the GNU Lesser General Public License, and the "GNU GPL" refers to version 3 of the GNU General Public License. + +"The Library" refers to a covered work governed by this License, other than an Application or a Combined Work as defined below. + +An "Application" is any work that makes use of an interface provided by the Library, but which is not otherwise based on the Library. Defining a subclass of a class defined by the Library is deemed a mode of using an interface provided by the Library. + +A "Combined Work" is a work produced by combining or linking an Application with the Library. The particular version of the Library with which the Combined Work was made is also called the "Linked Version". + +The "Minimal Corresponding Source" for a Combined Work means the Corresponding Source for the Combined Work, excluding any source code for portions of the Combined Work that, considered in isolation, are based on the Application, and not on the Linked Version. + +The "Corresponding Application Code" for a Combined Work means the object code and/or source code for the Application, including any data and utility programs needed for reproducing the Combined Work from the Application, but excluding the System Libraries of the Combined Work. + +1. Exception to Section 3 of the GNU GPL. + +You may convey a covered work under sections 3 and 4 of this License without being bound by section 3 of the GNU GPL. + +2. Conveying Modified Versions. 
+ +If you modify a copy of the Library, and, in your modifications, a facility refers to a function or data to be supplied by an Application that uses the facility (other than as an argument passed when the facility is invoked), then you may convey a copy of the modified version: + +a) under this License, provided that you make a good faith effort to ensure that, in the event an Application does not supply the function or data, the facility still operates, and performs whatever part of its purpose remains meaningful, or +b) under the GNU GPL, with none of the additional permissions of this License applicable to that copy. +3. Object Code Incorporating Material from Library Header Files. + +The object code form of an Application may incorporate material from a header file that is part of the Library. You may convey such object code under terms of your choice, provided that, if the incorporated material is not limited to numerical parameters, data structure layouts and accessors, or small macros, inline functions and templates (ten or fewer lines in length), you do both of the following: + +a) Give prominent notice with each copy of the object code that the Library is used in it and that the Library and its use are covered by this License. +b) Accompany the object code with a copy of the GNU GPL and this license document. +4. Combined Works. + +You may convey a Combined Work under terms of your choice that, taken together, effectively do not restrict modification of the portions of the Library contained in the Combined Work and reverse engineering for debugging such modifications, if you also do each of the following: + +a) Give prominent notice with each copy of the Combined Work that the Library is used in it and that the Library and its use are covered by this License. +b) Accompany the Combined Work with a copy of the GNU GPL and this license document. 
+c) For a Combined Work that displays copyright notices during execution, include the copyright notice for the Library among these notices, as well as a reference directing the user to the copies of the GNU GPL and this license document. +d) Do one of the following: +0) Convey the Minimal Corresponding Source under the terms of this License, and the Corresponding Application Code in a form suitable for, and under terms that permit, the user to recombine or relink the Application with a modified version of the Linked Version to produce a modified Combined Work, in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source. +1) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (a) uses at run time a copy of the Library already present on the user's computer system, and (b) will operate properly with a modified version of the Library that is interface-compatible with the Linked Version. +e) Provide Installation Information, but only if you would otherwise be required to provide such information under section 6 of the GNU GPL, and only to the extent that such information is necessary to install and execute a modified version of the Combined Work produced by recombining or relinking the Application with a modified version of the Linked Version. (If you use option 4d0, the Installation Information must accompany the Minimal Corresponding Source and Corresponding Application Code. If you use option 4d1, you must provide the Installation Information in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source.) +5. Combined Libraries. 
+ +You may place library facilities that are a work based on the Library side by side in a single library together with other library facilities that are not Applications and are not covered by this License, and convey such a combined library under terms of your choice, if you do both of the following: + +a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities, conveyed under the terms of this License. +b) Give prominent notice with the combined library that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. +6. Revised Versions of the GNU Lesser General Public License. + +The Free Software Foundation may publish revised and/or new versions of the GNU Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library as you received it specifies that a certain numbered version of the GNU Lesser General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that published version or of any later version published by the Free Software Foundation. If the Library as you received it does not specify a version number of the GNU Lesser General Public License, you may choose any version of the GNU Lesser General Public License ever published by the Free Software Foundation. + +If the Library as you received it specifies that a proxy can decide whether future versions of the GNU Lesser General Public License shall apply, that proxy's public statement of acceptance of any version is permanent authorization for you to choose that version for the Library. 
+ + +torch-optimizer +Apache Software License +https://github.com/jettify/pytorch-optimizer +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). 
+ + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. 
Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative 
Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "{}" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright 2020 Nikolay Novik (https://github.com/jettify) + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + + +torch_fidelity +Apache License 2.0 +https://www.github.com/toshas/torch-fidelity +Copyright 2020 Anton Obukhov + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+ + +torchcodec +BSD 3-Clause License + +Copyright 2024 Meta + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice,this list +of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, this +list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors may +be used to endorse or promote products derived from this software without specific +prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY +EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES +OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT +SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR +BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. + +UNKNOWN +BSD 3-Clause License + +Copyright 2024 Meta + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice,this list +of conditions and the following disclaimer. + +2. 
Redistributions in binary form must reproduce the above copyright notice, this +list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors may +be used to endorse or promote products derived from this software without specific +prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY +EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES +OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT +SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR +BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. + + +torchdata +BSD License +https://github.com/pytorch/data +BSD 3-Clause License + +Copyright (c) 2021-present, Facebook, Inc. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. 
+ +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +torchmetrics +Apache Software License +https://github.com/Lightning-AI/torchmetrics + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. 
+ + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2020-2022 Lightning-AI team + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +torchtitan +BSD 3-Clause License + +(c) Meta Platforms, Inc. and affiliates. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice,this list +of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, this +list of conditions and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its contributors may +be used to endorse or promote products derived from this software without specific +prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY +EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES +OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT +SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR +BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN +ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH +DAMAGE. + + +torchvision +BSD +https://github.com/pytorch/vision +BSD 3-Clause License + +Copyright (c) Soumith Chintala 2016, +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +tornado +Apache Software License +http://www.tornadoweb.org/ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +tqdm +MPL-2.0 AND MIT +https://tqdm.github.io +`tqdm` is a product of collaborative work. +Unless otherwise stated, all authors (see commit logs) retain copyright +for their respective work, and release the work under the MIT licence +(text below). + +Exceptions or notable authors are listed below +in reverse chronological order: + +* files: * + MPL-2.0 2015-2026 (c) Casper da Costa-Luis + [casperdcl](https://github.com/casperdcl). +* files: tqdm/_tqdm.py + MIT 2016 (c) [PR #96] on behalf of Google Inc. +* files: tqdm/_tqdm.py README.rst .gitignore + MIT 2013 (c) Noam Yorav-Raphael, original author. + +[PR #96]: https://github.com/tqdm/tqdm/pull/96 + + +Mozilla Public Licence (MPL) v. 2.0 - Exhibit A +----------------------------------------------- + +This Source Code Form is subject to the terms of the +Mozilla Public License, v. 2.0. +If a copy of the MPL was not distributed with this project, +You can obtain one at https://mozilla.org/MPL/2.0/. 
+ + +MIT License (MIT) +----------------- + +Copyright (c) 2013 noamraph + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of +the Software, and to permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS +FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR +COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +traitlets +BSD License +https://github.com/ipython/traitlets +BSD 3-Clause License + +- Copyright (c) 2001-, IPython Development Team + +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +transformer_engine +UNKNOWN +UNKNOWN + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + + +transformers +Apache Software License +https://github.com/huggingface/transformers +Copyright 2018- The Hugging Face team. All rights reserved. + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. 
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +trimesh +MIT License +https://github.com/mikedh/trimesh +The MIT License (MIT) + +Copyright (c) 2023 Michael Dawson-Haggerty + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +triton +MIT License +https://github.com/triton-lang/triton/ +/* +* Copyright 2018-2020 Philippe Tillet +* Copyright 2020-2022 OpenAI +* +* Permission is hereby granted, free of charge, to any person obtaining +* a copy of this software and associated documentation files +* (the "Software"), to deal in the Software without restriction, +* including without limitation the rights to use, copy, modify, merge, +* publish, distribute, sublicense, and/or sell copies of the Software, +* and to permit persons to whom the Software is furnished to do so, +* subject to the following conditions: +* +* The above copyright notice and this permission notice shall be +* included in all copies or substantial portions of the Software. +* +* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +* CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +*/ + + +trove-classifiers +Apache Software License +https://github.com/pypa/trove-classifiers + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. 
+ + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +typeguard +MIT +UNKNOWN +This is the MIT license: http://www.opensource.org/licenses/mit-license.php + +Copyright (c) Alex Grönholm + +Permission is hereby granted, free of charge, to any person obtaining a copy of this +software and associated documentation files (the "Software"), to deal in the Software +without restriction, including without limitation the rights to use, copy, modify, merge, +publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons +to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or +substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR +PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE +FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR +OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +DEALINGS IN THE SOFTWARE. + + +typer +MIT License +https://github.com/fastapi/typer +The MIT License (MIT) + +Copyright (c) 2019 Sebastián Ramírez + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. 
+ + +typing-inspect +MIT License +https://github.com/ilevkivskyi/typing_inspect +The MIT License (MIT) + +Copyright (c) 2017-2019 Ivan Levkivskyi + +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal in +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of +the Software, and to permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS +FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR +COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +typing-inspection +MIT +https://github.com/pydantic/typing-inspection +MIT License + +Copyright (c) Pydantic Services Inc. 2025 to present + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +typing_extensions +PSF-2.0 +https://github.com/python/typing_extensions +A. HISTORY OF THE SOFTWARE +========================== + +Python was created in the early 1990s by Guido van Rossum at Stichting +Mathematisch Centrum (CWI, see https://www.cwi.nl) in the Netherlands +as a successor of a language called ABC. Guido remains Python's +principal author, although it includes many contributions from others. + +In 1995, Guido continued his work on Python at the Corporation for +National Research Initiatives (CNRI, see https://www.cnri.reston.va.us) +in Reston, Virginia where he released several versions of the +software. + +In May 2000, Guido and the Python core development team moved to +BeOpen.com to form the BeOpen PythonLabs team. In October of the same +year, the PythonLabs team moved to Digital Creations, which became +Zope Corporation. In 2001, the Python Software Foundation (PSF, see +https://www.python.org/psf/) was formed, a non-profit organization +created specifically to own Python-related Intellectual Property. +Zope Corporation was a sponsoring member of the PSF. + +All Python releases are Open Source (see https://opensource.org for +the Open Source Definition). Historically, most, but not all, Python +releases have also been GPL-compatible; the table below summarizes +the various releases. + + Release Derived Year Owner GPL- + from compatible? 
(1) + + 0.9.0 thru 1.2 1991-1995 CWI yes + 1.3 thru 1.5.2 1.2 1995-1999 CNRI yes + 1.6 1.5.2 2000 CNRI no + 2.0 1.6 2000 BeOpen.com no + 1.6.1 1.6 2001 CNRI yes (2) + 2.1 2.0+1.6.1 2001 PSF no + 2.0.1 2.0+1.6.1 2001 PSF yes + 2.1.1 2.1+2.0.1 2001 PSF yes + 2.1.2 2.1.1 2002 PSF yes + 2.1.3 2.1.2 2002 PSF yes + 2.2 and above 2.1.1 2001-now PSF yes + +Footnotes: + +(1) GPL-compatible doesn't mean that we're distributing Python under + the GPL. All Python licenses, unlike the GPL, let you distribute + a modified version without making your changes open source. The + GPL-compatible licenses make it possible to combine Python with + other software that is released under the GPL; the others don't. + +(2) According to Richard Stallman, 1.6.1 is not GPL-compatible, + because its license has a choice of law clause. According to + CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1 + is "not incompatible" with the GPL. + +Thanks to the many outside volunteers who have worked under Guido's +direction to make these releases possible. + + +B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON +=============================================================== + +Python software and documentation are licensed under the +Python Software Foundation License Version 2. + +Starting with Python 3.8.6, examples, recipes, and other code in +the documentation are dual licensed under the PSF License Version 2 +and the Zero-Clause BSD license. + +Some software incorporated into Python is under different licenses. +The licenses are listed with code falling under that license. + + +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 +-------------------------------------------- + +1. This LICENSE AGREEMENT is between the Python Software Foundation +("PSF"), and the Individual or Organization ("Licensee") accessing and +otherwise using this software ("Python") in source or binary form and +its associated documentation. + +2. 
Subject to the terms and conditions of this License Agreement, PSF hereby +grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, +analyze, test, perform and/or display publicly, prepare derivative works, +distribute, and otherwise use Python alone or in any derivative version, +provided, however, that PSF's License Agreement and PSF's notice of copyright, +i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, +2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023 Python Software Foundation; +All Rights Reserved" are retained in Python alone or in any derivative version +prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" +basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any +relationship of agency, partnership, or joint venture between PSF and +Licensee. 
This License Agreement does not grant permission to use PSF +trademarks or trade name in a trademark sense to endorse or promote +products or services of Licensee, or any third party. + +8. By copying, installing or otherwise using Python, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0 +------------------------------------------- + +BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1 + +1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an +office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the +Individual or Organization ("Licensee") accessing and otherwise using +this software in source or binary form and its associated +documentation ("the Software"). + +2. Subject to the terms and conditions of this BeOpen Python License +Agreement, BeOpen hereby grants Licensee a non-exclusive, +royalty-free, world-wide license to reproduce, analyze, test, perform +and/or display publicly, prepare derivative works, distribute, and +otherwise use the Software alone or in any derivative version, +provided, however, that the BeOpen Python License is retained in the +Software, alone or in any derivative version prepared by Licensee. + +3. BeOpen is making the Software available to Licensee on an "AS IS" +basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE +SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS +AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY +DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +5. 
This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +6. This License Agreement shall be governed by and interpreted in all +respects by the law of the State of California, excluding conflict of +law provisions. Nothing in this License Agreement shall be deemed to +create any relationship of agency, partnership, or joint venture +between BeOpen and Licensee. This License Agreement does not grant +permission to use BeOpen trademarks or trade names in a trademark +sense to endorse or promote products or services of Licensee, or any +third party. As an exception, the "BeOpen Python" logos available at +http://www.pythonlabs.com/logos.html may be used according to the +permissions granted on that web page. + +7. By copying, installing or otherwise using the software, Licensee +agrees to be bound by the terms and conditions of this License +Agreement. + + +CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1 +--------------------------------------- + +1. This LICENSE AGREEMENT is between the Corporation for National +Research Initiatives, having an office at 1895 Preston White Drive, +Reston, VA 20191 ("CNRI"), and the Individual or Organization +("Licensee") accessing and otherwise using Python 1.6.1 software in +source or binary form and its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, CNRI +hereby grants Licensee a nonexclusive, royalty-free, world-wide +license to reproduce, analyze, test, perform and/or display publicly, +prepare derivative works, distribute, and otherwise use Python 1.6.1 +alone or in any derivative version, provided, however, that CNRI's +License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) +1995-2001 Corporation for National Research Initiatives; All Rights +Reserved" are retained in Python 1.6.1 alone or in any derivative +version prepared by Licensee. 
Alternately, in lieu of CNRI's License +Agreement, Licensee may substitute the following text (omitting the +quotes): "Python 1.6.1 is made available subject to the terms and +conditions in CNRI's License Agreement. This Agreement together with +Python 1.6.1 may be located on the internet using the following +unique, persistent identifier (known as a handle): 1895.22/1013. This +Agreement may also be obtained from a proxy server on the internet +using the following URL: http://hdl.handle.net/1895.22/1013". + +3. In the event Licensee prepares a derivative work that is based on +or incorporates Python 1.6.1 or any part thereof, and wants to make +the derivative work available to others as provided herein, then +Licensee hereby agrees to include in any such work a brief summary of +the changes made to Python 1.6.1. + +4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS" +basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR +IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND +DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS +FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT +INFRINGE ANY THIRD PARTY RIGHTS. + +5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON +1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS +A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, +OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. This License Agreement will automatically terminate upon a material +breach of its terms and conditions. + +7. This License Agreement shall be governed by the federal +intellectual property law of the United States, including without +limitation the federal copyright law, and, to the extent such +U.S. federal law does not apply, by the law of the Commonwealth of +Virginia, excluding Virginia's conflict of law provisions. 
+Notwithstanding the foregoing, with regard to derivative works based +on Python 1.6.1 that incorporate non-separable material that was +previously distributed under the GNU General Public License (GPL), the +law of the Commonwealth of Virginia shall govern this License +Agreement only as to issues arising under or with respect to +Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this +License Agreement shall be deemed to create any relationship of +agency, partnership, or joint venture between CNRI and Licensee. This +License Agreement does not grant permission to use CNRI trademarks or +trade name in a trademark sense to endorse or promote products or +services of Licensee, or any third party. + +8. By clicking on the "ACCEPT" button where indicated, or by copying, +installing or otherwise using Python 1.6.1, Licensee agrees to be +bound by the terms and conditions of this License Agreement. + + ACCEPT + + +CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2 +-------------------------------------------------- + +Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, +The Netherlands. All rights reserved. + +Permission to use, copy, modify, and distribute this software and its +documentation for any purpose and without fee is hereby granted, +provided that the above copyright notice appear in all copies and that +both that copyright notice and this permission notice appear in +supporting documentation, and that the name of Stichting Mathematisch +Centrum or CWI not be used in advertising or publicity pertaining to +distribution of the software without specific, written prior +permission. 
+ +STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO +THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE +FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT +OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + +ZERO-CLAUSE BSD LICENSE FOR CODE IN THE PYTHON DOCUMENTATION +---------------------------------------------------------------------- + +Permission to use, copy, modify, and/or distribute this software for any +purpose with or without fee is hereby granted. + +THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH +REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY +AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, +INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM +LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR +OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR +PERFORMANCE OF THIS SOFTWARE. + + +tyro +MIT License +UNKNOWN +MIT License + +Copyright (c) 2024 Brent Yi + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +tzdata +Apache-2.0 +https://github.com/python/tzdata +Apache Software License 2.0 + +Copyright (c) 2020, Paul Ganssle (Google) + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + + +uri-template +MIT License +https://gitlab.linss.com/open-source/python/uri-template +MIT License + +Copyright (c) 2020 Peter Linss + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. 
+ +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +urllib3 +MIT +https://github.com/urllib3/urllib3/blob/main/CHANGES.rst +MIT License + +Copyright (c) 2008-2020 Andrey Petrov and contributors. + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +userpath +MIT +https://github.com/ofek/userpath +MIT License + +Copyright (c) 2017-present Ofek Lev + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +uv +Apache Software License; MIT License +https://pypi.org/project/uv/ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +uvicorn +BSD-3-Clause +https://uvicorn.dev/ +Copyright © 2017-present, [Encode OSS Ltd](https://www.encode.io/). +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +* Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +uvloop +Apache Software License; MIT License +UNKNOWN +Copyright (C) 2016-present the uvloop authors and contributors. + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright (c) 2015-present MagicStack Inc. 
http://magic.io + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +virtualenv +MIT +https://github.com/pypa/virtualenv +Copyright (c) 2020-202x The virtualenv developers + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +wandb +MIT License +https://github.com/wandb/wandb +MIT License + +Copyright (c) 2021 Weights and Biases, Inc. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +watchfiles +MIT License +https://github.com/samuelcolvin/watchfiles +The MIT License (MIT) + +Copyright (c) 2017 to present Samuel Colvin + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +wcmatch +MIT +https://github.com/facelessuser/wcmatch +MIT License + +Copyright (c) 2018 - 2025 Isaac Muse + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +webcolors +BSD License +UNKNOWN +Copyright (c) James Bennett, and contributors. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. 
+ * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + * Neither the name of the author nor the names of other + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +webdataset +BSD-3-Clause +http://github.com/webdataset/webdataset +Copyright 2020 NVIDIA CORPORATION. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright notice, +this list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED +TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR +PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +webencodings +BSD License +https://github.com/SimonSapin/python-webencodings +UNKNOWN + +websocket-client +Apache Software License +https://github.com/websocket-client/websocket-client.git + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. 
+ + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. 
You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. 
Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright 2025 engn33r + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + + +websockets +BSD-3-Clause +https://github.com/python-websockets/websockets +Copyright (c) Aymeric Augustin and contributors + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + * Neither the name of the copyright holder nor the names of its contributors + may be used to endorse or promote products derived from this software + without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +widgetsnbextension +BSD License +http://jupyter.org +Copyright (c) 2015 Project Jupyter Contributors +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +wrapt +BSD License +https://github.com/GrahamDumpleton/wrapt +Copyright (c) 2013-2023, Graham Dumpleton +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +* Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +* Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) +ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. 
+ + +xatlas +MIT License + + Copyright (c) 2021 Markus Worchel + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE. + +https://github.com/mworchel/xatlas-python +MIT License + +Copyright (c) 2021 Markus Worchel + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + + +xattr +MIT +https://github.com/xattr/xattr +This is the MIT license. This software may also be distributed under the same terms as Python (the PSF license). + +Copyright (c) 2004 Bob Ippolito. + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +xmltodict +MIT +https://github.com/martinblech/xmltodict +Copyright (C) 2012 Martin Blech and individual contributors. 
+ +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +xxhash +BSD License +https://github.com/ifduyue/python-xxhash +Copyright (c) 2014-2024, Yue Du +All rights reserved. + +Redistribution and use in source and binary forms, with or without modification, +are permitted provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND +ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +yacs +Apache Software License +https://github.com/rbgirshick/yacs +Apache License +Version 2.0, January 2004 +http://www.apache.org/licenses/ + +TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + +1. Definitions. + +"License" shall mean the terms and conditions for use, reproduction, +and distribution as defined by Sections 1 through 9 of this document. + +"Licensor" shall mean the copyright owner or entity authorized by +the copyright owner that is granting the License. + +"Legal Entity" shall mean the union of the acting entity and all +other entities that control, are controlled by, or are under common +control with that entity. For the purposes of this definition, +"control" means (i) the power, direct or indirect, to cause the +direction or management of such entity, whether by contract or +otherwise, or (ii) ownership of fifty percent (50%) or more of the +outstanding shares, or (iii) beneficial ownership of such entity. + +"You" (or "Your") shall mean an individual or Legal Entity +exercising permissions granted by this License. + +"Source" form shall mean the preferred form for making modifications, +including but not limited to software source code, documentation +source, and configuration files. + +"Object" form shall mean any form resulting from mechanical +transformation or translation of a Source form, including but +not limited to compiled object code, generated documentation, +and conversions to other media types. 
+ +"Work" shall mean the work of authorship, whether in Source or +Object form, made available under the License, as indicated by a +copyright notice that is included in or attached to the work +(an example is provided in the Appendix below). + +"Derivative Works" shall mean any work, whether in Source or Object +form, that is based on (or derived from) the Work and for which the +editorial revisions, annotations, elaborations, or other modifications +represent, as a whole, an original work of authorship. For the purposes +of this License, Derivative Works shall not include works that remain +separable from, or merely link (or bind by name) to the interfaces of, +the Work and Derivative Works thereof. + +"Contribution" shall mean any work of authorship, including +the original version of the Work and any modifications or additions +to that Work or Derivative Works thereof, that is intentionally +submitted to Licensor for inclusion in the Work by the copyright owner +or by an individual or Legal Entity authorized to submit on behalf of +the copyright owner. For the purposes of this definition, "submitted" +means any form of electronic, verbal, or written communication sent +to the Licensor or its representatives, including but not limited to +communication on electronic mailing lists, source code control systems, +and issue tracking systems that are managed by, or on behalf of, the +Licensor for the purpose of discussing and improving the Work, but +excluding communication that is conspicuously marked or otherwise +designated in writing by the copyright owner as "Not a Contribution." + +"Contributor" shall mean Licensor and any individual or Legal Entity +on behalf of whom a Contribution has been received by Licensor and +subsequently incorporated within the Work. + +2. Grant of Copyright License. 
Subject to the terms and conditions of +this License, each Contributor hereby grants to You a perpetual, +worldwide, non-exclusive, no-charge, royalty-free, irrevocable +copyright license to reproduce, prepare Derivative Works of, +publicly display, publicly perform, sublicense, and distribute the +Work and such Derivative Works in Source or Object form. + +3. Grant of Patent License. Subject to the terms and conditions of +this License, each Contributor hereby grants to You a perpetual, +worldwide, non-exclusive, no-charge, royalty-free, irrevocable +(except as stated in this section) patent license to make, have made, +use, offer to sell, sell, import, and otherwise transfer the Work, +where such license applies only to those patent claims licensable +by such Contributor that are necessarily infringed by their +Contribution(s) alone or by combination of their Contribution(s) +with the Work to which such Contribution(s) was submitted. If You +institute patent litigation against any entity (including a +cross-claim or counterclaim in a lawsuit) alleging that the Work +or a Contribution incorporated within the Work constitutes direct +or contributory patent infringement, then any patent licenses +granted to You under this License for that Work shall terminate +as of the date such litigation is filed. + +4. Redistribution. 
You may reproduce and distribute copies of the +Work or Derivative Works thereof in any medium, with or without +modifications, and in Source or Object form, provided that You +meet the following conditions: + +(a) You must give any other recipients of the Work or +Derivative Works a copy of this License; and + +(b) You must cause any modified files to carry prominent notices +stating that You changed the files; and + +(c) You must retain, in the Source form of any Derivative Works +that You distribute, all copyright, patent, trademark, and +attribution notices from the Source form of the Work, +excluding those notices that do not pertain to any part of +the Derivative Works; and + +(d) If the Work includes a "NOTICE" text file as part of its +distribution, then any Derivative Works that You distribute must +include a readable copy of the attribution notices contained +within such NOTICE file, excluding those notices that do not +pertain to any part of the Derivative Works, in at least one +of the following places: within a NOTICE text file distributed +as part of the Derivative Works; within the Source form or +documentation, if provided along with the Derivative Works; or, +within a display generated by the Derivative Works, if and +wherever such third-party notices normally appear. The contents +of the NOTICE file are for informational purposes only and +do not modify the License. You may add Your own attribution +notices within Derivative Works that You distribute, alongside +or as an addendum to the NOTICE text from the Work, provided +that such additional attribution notices cannot be construed +as modifying the License. 
+ +You may add Your own copyright statement to Your modifications and +may provide additional or different license terms and conditions +for use, reproduction, or distribution of Your modifications, or +for any such Derivative Works as a whole, provided Your use, +reproduction, and distribution of the Work otherwise complies with +the conditions stated in this License. + +5. Submission of Contributions. Unless You explicitly state otherwise, +any Contribution intentionally submitted for inclusion in the Work +by You to the Licensor shall be under the terms and conditions of +this License, without any additional terms or conditions. +Notwithstanding the above, nothing herein shall supersede or modify +the terms of any separate license agreement you may have executed +with Licensor regarding such Contributions. + +6. Trademarks. This License does not grant permission to use the trade +names, trademarks, service marks, or product names of the Licensor, +except as required for reasonable and customary use in describing the +origin of the Work and reproducing the content of the NOTICE file. + +7. Disclaimer of Warranty. Unless required by applicable law or +agreed to in writing, Licensor provides the Work (and each +Contributor provides its Contributions) on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or +implied, including, without limitation, any warranties or conditions +of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A +PARTICULAR PURPOSE. You are solely responsible for determining the +appropriateness of using or redistributing the Work and assume any +risks associated with Your exercise of permissions under this License. + +8. Limitation of Liability. 
In no event and under no legal theory, +whether in tort (including negligence), contract, or otherwise, +unless required by applicable law (such as deliberate and grossly +negligent acts) or agreed to in writing, shall any Contributor be +liable to You for damages, including any direct, indirect, special, +incidental, or consequential damages of any character arising as a +result of this License or out of the use or inability to use the +Work (including but not limited to damages for loss of goodwill, +work stoppage, computer failure or malfunction, or any and all +other commercial damages or losses), even if such Contributor +has been advised of the possibility of such damages. + +9. Accepting Warranty or Additional Liability. While redistributing +the Work or Derivative Works thereof, You may choose to offer, +and charge a fee for, acceptance of support, warranty, indemnity, +or other liability obligations and/or rights consistent with this +License. However, in accepting such obligations, You may act only +on Your own behalf and on Your sole responsibility, not on behalf +of any other Contributor, and only if You agree to indemnify, +defend, and hold each Contributor harmless for any liability +incurred by, or claims asserted against, such Contributor by reason +of your accepting any such warranty or additional liability. + +END OF TERMS AND CONDITIONS + +APPENDIX: How to apply the Apache License to your work. + +To apply the Apache License to your work, attach the following +boilerplate notice, with the fields enclosed by brackets "[]" +replaced with your own identifying information. (Don't include +the brackets!) The text should be enclosed in the appropriate +comment syntax for the file format. We also recommend that a +file or class name and description of purpose be included on the +same "printed page" as the copyright notice for easier +identification within third-party archives. 
+ +Copyright [yyyy] [name of copyright owner] + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + + +yarl +Apache-2.0 +https://github.com/aio-libs/yarl + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. 
+ + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +zarr +MIT +https://github.com/zarr-developers/zarr-python +The MIT License (MIT) + +Copyright (c) 2015-2025 Zarr Developers + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
+ + +zipp +MIT +https://github.com/jaraco/zipp +MIT License + +Copyright (c) 2025 + +Permission is hereby granted, free of charge, to any person obtaining a copy of this software and +associated documentation files (the "Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the +following conditions: + +The above copyright notice and this permission notice shall be included in all copies or substantial +portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT +LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO +EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER +IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE +USE OR OTHER DEALINGS IN THE SOFTWARE. diff --git a/cosmos-inference/CHANGELOG.md b/cosmos-inference/CHANGELOG.md new file mode 100644 index 00000000..7f787639 --- /dev/null +++ b/cosmos-inference/CHANGELOG.md @@ -0,0 +1,33 @@ +# Change log + +## Unreleased + +- New features + +- Breaking changes + +## 1.2.2 (May 14, 2026) + +- New features + - Add [action policy closed-loop evaluation](./docs/action_policy_closed_loop_eval.md). + +## 1.2.1 (May 08, 2026) + +- New features + - Add [action policy post-training (SFT)](./docs/training.md). + +## 1.2.0 (May 05, 2026) + +- New features + - Add action modalities (Forward Dynamics, Inverse Dynamics, Policy) for Cosmos3-Nano model. + - Upgrade Cosmos3-Nano checkpoint to improve T2V, I2V quality. + +## 1.1.1 (May 01, 2026) + +- New features + - Add DCP checkpoint conversion/inference. 
+ +## 1.1.0 (April 29, 2026) + +- New features + - Add [Post-Training (Supervised Fine-Tuning)](./docs/training.md). diff --git a/cosmos-inference/CONTRIBUTING.md b/cosmos-inference/CONTRIBUTING.md new file mode 100644 index 00000000..d4eaae27 --- /dev/null +++ b/cosmos-inference/CONTRIBUTING.md @@ -0,0 +1,121 @@ +# Contributing + + + +______________________________________________________________________ + +**Table of Contents** + +- [Setup](#setup) +- [Test](#test) + - [Run Linting and Formatting](#run-linting-and-formatting) + - [Run Tests](#run-tests) + - [Run a Single Test](#run-a-single-test) +- [Code Reviews](#code-reviews) +- [Signing Your Work](#signing-your-work) + +______________________________________________________________________ + + + +We'd love to receive your patches and contributions. Please keep your PRs in draft until you are ready for us to review them. + +## Setup + +Install system dependencies: + +[just](https://just.systems/man/en/pre-built-binaries.html#pre-built-binaries) + +```shell +uv tool install -U rust-just +``` + +To see all available `just` commands, run: + +```shell +just +``` + +## Test + +### Run Linting and Formatting + +```shell +just lint +``` + +This runs the linters and applies auto-fixes, so we recommend committing your changes first. + +### Run Tests + +```shell +just test +``` + +Test levels (`--levels`): + +0. Smoke tests. Requires >= 1 GPU. +1. Partial E2E tests. Requires >= 8 GPUs. +2. Full E2E tests. Requires >= 8 GPUs. + +Test outputs are saved to `outputs/pytest/`. To monitor a test, open `console.log`/`debug.log`. + +### Run a Single Test + +```shell +# List tests to get the test name +just test-list +# Run the test +just test-single [--pdb] +``` + +## Code Reviews + +All submissions, including submissions by project members, require review. We use GitHub pull requests for this purpose.
Consult +[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more information on using pull requests. + +## Signing Your Work + +- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. + + - Any contribution which contains commits that are not Signed-Off will not be accepted. + +- To sign off on a commit you simply use the `--signoff` (or `-s`) option when committing your changes: + + ```bash + git commit -s -m "Add cool feature." + ``` + + This will append the following to your commit message: + + ``` + Signed-off-by: Your Name + ``` + +- Full text of the DCO: + + ``` + Developer Certificate of Origin + Version 1.1 + + Copyright (C) 2004, 2006 The Linux Foundation and its contributors. + 1 Letterman Drive + Suite D4700 + San Francisco, CA, 94129 + + Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. + ``` + + ``` + Developer's Certificate of Origin 1.1 + + By making a contribution to this project, I certify that: + + (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or + + (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or + + (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it. 
+ + (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved. + ``` diff --git a/cosmos-inference/Dockerfile b/cosmos-inference/Dockerfile new file mode 100644 index 00000000..62cb8378 --- /dev/null +++ b/cosmos-inference/Dockerfile @@ -0,0 +1,71 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Dockerfile using uv environment. + +ARG CUDA_VERSION=13.0.2 +ARG BASE_IMAGE=nvidia/cuda:${CUDA_VERSION}-cudnn-devel-ubuntu24.04 +FROM ${BASE_IMAGE} + +# Set the DEBIAN_FRONTEND environment variable to avoid interactive prompts during apt operations. 
+ENV DEBIAN_FRONTEND=noninteractive + +# Install packages +RUN --mount=type=cache,target=/var/cache/apt \ + --mount=type=cache,target=/var/lib/apt \ + apt-get update && \ + apt-get install -y --no-install-recommends \ + curl \ + ffmpeg \ + git \ + git-lfs \ + tree \ + wget + +# Install uv: https://docs.astral.sh/uv/getting-started/installation/ +# https://github.com/astral-sh/uv-docker-example/blob/main/Dockerfile +COPY --from=ghcr.io/astral-sh/uv:0.10.8 /uv /uvx /usr/local/bin/ +# Copy from the cache instead of linking since it's a mounted volume +ENV UV_LINK_MODE=copy +# Cache python downloads +ENV UV_PYTHON_CACHE_DIR=/root/.cache/uv/python + +# Install just: https://just.systems/man/en/pre-built-binaries.html +RUN curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to /usr/local/bin --tag 1.46.0 + +ENV PATH="/root/.local/bin:$PATH" + +WORKDIR /workspace + +# Install python +RUN --mount=type=cache,target=/root/.cache/uv \ + --mount=type=bind,source=.python-version,target=.python-version \ + uv python install + +# Install into virtual environment +RUN echo "$CUDA_VERSION" | sed -E 's/^([0-9]+)\.([0-9]+).*/cu\1\2/' > /root/.cuda-name +RUN --mount=type=cache,target=/root/.cache/uv \ + --mount=type=bind,source=uv.lock,target=uv.lock \ + --mount=type=bind,source=pyproject.toml,target=pyproject.toml \ + --mount=type=bind,source=.python-version,target=.python-version \ + uv sync --locked --no-install-project --no-editable --all-extras --group=$(cat /root/.cuda-name) +ENV PATH="/workspace/.venv/bin:$PATH" + +# Triton bundled ptxas doesn't support latest GPU architectures +ENV TRITON_PTXAS_PATH="/usr/local/cuda/bin/ptxas" + +ENTRYPOINT ["/workspace/docker/entrypoint.sh"] + +CMD ["/bin/bash"] diff --git a/cosmos-inference/LICENSE b/cosmos-inference/LICENSE new file mode 100644 index 00000000..1bffec96 --- /dev/null +++ b/cosmos-inference/LICENSE @@ -0,0 +1,222 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ 
+ + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. 
For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +================================================================================ + THIRD-PARTY ATTRIBUTIONS +================================================================================ + +This product includes code adapted from HuggingFace Transformers +(https://github.com/huggingface/transformers), licensed under the Apache +License, Version 2.0. + + Copyright 2024 The Qwen team, Alibaba Group and the HuggingFace Inc. team. + Copyright 2025 The Qwen team, Alibaba Group and the HuggingFace Inc. team. + Copyright 2025 The Qwen Team and The HuggingFace Inc. team. + All rights reserved. 
+ +The following files are adapted from HuggingFace Transformers: + + cosmos3/_src/vfm/models/llm/qwen3/configuration_qwen3.py + cosmos3/_src/vfm/models/llm/qwen3/qwen3.py + cosmos3/_src/vfm/models/vlm/qwen3_vl/configuration_qwen3_vl.py + cosmos3/_src/vfm/models/vlm/qwen3_vl/qwen3_vl.py + cosmos3/_src/vfm/models/vlm/qwen3_vl/video_processing_qwen3_vl.py diff --git a/cosmos-inference/README.md b/cosmos-inference/README.md new file mode 100644 index 00000000..be09b0e6 --- /dev/null +++ b/cosmos-inference/README.md @@ -0,0 +1,131 @@ +

+ NVIDIA Cosmos +

+ +

🤗 Hugging Face | Paper Draft

+ +# Cosmos3 + +- [Gallery](./docs/gallery.md) +- [Quickstart](#setup) +- [Setup](./docs/setup.md) +- [Prompting](./docs/prompting.md) +- [Inference](./docs/inference.md) +- [Post-Training (Supervised Fine-Tuning)](./docs/training.md) + - [JSONL Dataset](./docs/dataset_jsonl.md) + - [Action Policy Closed-Loop Evaluation on LIBERO](./docs/action_policy_closed_loop_eval.md) +- Reference + - [Environment Variables](./docs/environment_variables.md) + - [FAQ](./docs/faq.md) + - [AGENTS.md](./AGENTS.md) + +## Overview + +**Cosmos3** is a world foundation model that unifies understanding and generation within a single Mixture-of-Transformer (MoT) architecture. Two tightly coupled towers—a **Reasoner** (vision-language model) and a **Generator** (world simulator)—share latent representations so that structured perception directly grounds realistic, temporally consistent simulation. + +


+ +One model, many capabilities: + +| Input Modality | Output Modality | Application | EA1 | +| ----------------------- | --------------- | --------------------- | ------------ | +| Video \| Text | Video | Video Generator | ✅ | +| Video \| Text | Text | Vision Language Model | Coming soon! | +| Action \| Video \| Text | Video | World Model | ✅ | +| Video \| Text | Video & Action | Policy Model | ✅ | + +## Supported Features (Cosmos3 EA1 — Robotics Backbone) + +### User Stories + +- **Video Backbone**: Evaluate and benchmark the model’s task understanding and review its architecture to inform codebase decisions. + +### Base Model Specifications + +| Spec | Value | +| ---------------- | -------------------------------------------------------------------------- | +| Model Size | Nano, Super | +| Resolution | 256p / 480p / 720p | +| Frame Rate (FPS) | 10–30 | +| Num of Frames | Default: 189 (max by resolution: `256p → 400`, `480p → 300`, `720p → 200`) | +| Max Duration | Variable | +| View | Single view only | + +## Setup + +For more details and alternative installation methods, see [Setup](./docs/setup.md#installation). + +Install system dependencies: + +```shell +sudo apt-get install -y --no-install-recommends curl ffmpeg libx11-dev tree wget +``` + +Install the package with `uv`: + +```shell +uv sync --all-extras --group=cu130-train +source .venv/bin/activate && export LD_LIBRARY_PATH= +``` + +## Prompting + +See [Prompting](./docs/prompting.md). + +## Inference + +For more details, see [Inference](./docs/inference.md). 
+ +Generate a single sample with 1 GPU: + +```shell +python -m cosmos3.scripts.inference \ + --parallelism-preset=latency \ + -i "inputs/omni/t2v.json" \ + -o outputs/omni_nano \ + --checkpoint-path Cosmos3-Nano \ + --seed=0 +``` + +Generate multiple samples with 8 GPUs (~5 mins): + +```shell +torchrun --nproc-per-node=8 -m cosmos3.scripts.inference \ + --parallelism-preset=throughput \ + -i "inputs/omni/*.json" \ + -o outputs/omni_nano \ + --checkpoint-path Cosmos3-Nano \ + --seed=0 +``` + +**Note:** The progress bar only prints on rank 0. + +To see all available arguments: + +```shell +python -m cosmos3.scripts.inference --help +``` + +### Models + +| Model | Arguments | Modalities | +| ------------- | --------------------------------- | ----------------------------------- | +| Cosmos3-Nano | `--checkpoint-path=Cosmos3-Nano` | All | +| Cosmos3-Super | `--checkpoint-path=Cosmos3-Super` | Text2Image, Text2Video, Image2Video | + +### Modalities + +| Modality | Example | +| ------------------ | -------------------------------------------------------------------------------------------------- | +| `text2image` | [`-i "inputs/omni/t2i.json"`](inputs/omni/t2i.json) | +| `text2video` | [`-i "inputs/omni/t2v.json"`](inputs/omni/t2v.json) | +| `image2video` | [`-i "inputs/omni/i2v.json"`](inputs/omni/i2v.json) | +| `forward_dynamics` | [`-i "inputs/omni/action_forward_dynamics*.json"`](inputs/omni/action_forward_dynamics_robot.json) | +| `inverse_dynamics` | [`-i "inputs/omni/action_inverse_dynamics*.json"`](inputs/omni/action_inverse_dynamics_av.json) | +| `policy` | [`-i "inputs/omni/action_policy*.json"`](inputs/omni/action_policy_robot.json) | + +To generate all examples, use `-i "inputs/omni/*.json"`. + +## Training + +See [Training](./docs/training.md). 
diff --git a/cosmos-inference/ci/.link-check.json b/cosmos-inference/ci/.link-check.json new file mode 100644 index 00000000..3546a997 --- /dev/null +++ b/cosmos-inference/ci/.link-check.json @@ -0,0 +1,10 @@ +{ + "ignorePatterns": [ + { + "pattern": "localhost" + }, + { + "pattern": "^https://github-production-user-asset" + } + ] +} diff --git a/cosmos-inference/ci/.markdown-toc-creator.toml b/cosmos-inference/ci/.markdown-toc-creator.toml new file mode 100644 index 00000000..650901b0 --- /dev/null +++ b/cosmos-inference/ci/.markdown-toc-creator.toml @@ -0,0 +1,19 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +[tool.markdown_toc_creator] +proactive = false +exclude = '/_src/' +quiet = true diff --git a/cosmos-inference/ci/.pre-commit-config-base.yaml b/cosmos-inference/ci/.pre-commit-config-base.yaml new file mode 100644 index 00000000..3aaa7f72 --- /dev/null +++ b/cosmos-inference/ci/.pre-commit-config-base.yaml @@ -0,0 +1,26 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +repos: + - repo: https://github.com/pre-commit/pre-commit-hooks + rev: v6.0.0 + hooks: + - id: check-added-large-files + args: ['--maxkb=10000'] # 10MB + - id: forbid-submodules + - repo: https://github.com/gitleaks/gitleaks + rev: v8.30.0 + hooks: + - id: gitleaks diff --git a/cosmos-inference/ci/license.txt b/cosmos-inference/ci/license.txt new file mode 100644 index 00000000..8f179ca4 --- /dev/null +++ b/cosmos-inference/ci/license.txt @@ -0,0 +1,14 @@ +SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +SPDX-License-Identifier: Apache-2.0 + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. diff --git a/cosmos-inference/ci/uv_lock.sh b/cosmos-inference/ci/uv_lock.sh new file mode 100755 index 00000000..3a6189c9 --- /dev/null +++ b/cosmos-inference/ci/uv_lock.sh @@ -0,0 +1,22 @@ +#!/usr/bin/env bash +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# + +# Generate uv lock files for projects. + +set -euo pipefail + +for file in "$@"; do + project_dir="$(dirname "$file")" + if ! uv lock -q --check --project "$project_dir" &>/dev/null; then + echo "Updating lock file for '$project_dir'" >&2 + uv lock -q --project "$project_dir" + fi +done diff --git a/cosmos-inference/ci/uv_lock_script.sh b/cosmos-inference/ci/uv_lock_script.sh new file mode 100755 index 00000000..83975315 --- /dev/null +++ b/cosmos-inference/ci/uv_lock_script.sh @@ -0,0 +1,23 @@ +#!/usr/bin/env bash +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# + +# Generate uv lock files for scripts. + +set -euo pipefail + +for file in "$@"; do + if head -n1 "$file" | grep -q '^#!/usr/bin/env -S uv run --script'; then + if ! uv lock -q --check --script "$file" &>/dev/null; then + echo "Updating lock file for '$file'" >&2 + uv lock -q --script "$file" + fi + fi +done diff --git a/cosmos-inference/conftest.py b/cosmos-inference/conftest.py new file mode 100644 index 00000000..22543723 --- /dev/null +++ b/cosmos-inference/conftest.py @@ -0,0 +1,282 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from cosmos3._src.imaginaire.lazy_config import lazy_call + + +lazy_call._CONVERT_TARGET_TO_STRING = True + +import gc +import os +from functools import cache +from pathlib import Path + +import pytest + +from cosmos3.fixtures.args import ALL_LEVELS, ALL_NUM_GPUS, ALLOWED_GPUS_BY_LEVEL, Args, get_args, init_args + + +@pytest.fixture(scope="module") +def original_datadir(request: pytest.FixtureRequest) -> Path: + root_dir = request.config.rootpath + relative_path = request.path.with_suffix("").relative_to(root_dir) + return root_dir / "tests/data" / relative_path + + +@cache +def _get_available_gpus() -> int: + import pynvml + + try: + pynvml.nvmlInit() + device_count = pynvml.nvmlDeviceGetCount() + pynvml.nvmlShutdown() + return device_count + except pynvml.NVMLError as e: + print(f"WARNING: Failed to get available GPUs: {e}") + return 0 + + +def pytest_addoption(parser: pytest.Parser): + parser.addoption("--manual", action="store_true", default=False, help="Run manual tests") + parser.addoption( + "--num-gpus", + default=None, + type=int, + choices=ALL_NUM_GPUS, + help="Run tests with the specified number of GPUs", + ) + parser.addoption("--levels", default=None, help="Run tests with the specified levels (comma-separated list)") + + +def pytest_xdist_auto_num_workers(config: pytest.Config) -> int | None: + num_gpus: int | None = config.option.num_gpus + if num_gpus is None: + return 1 + if num_gpus == 0: + return None + + available_gpus = _get_available_gpus() + if available_gpus < num_gpus: + raise ValueError(f"Not enough GPUs available. 
Required: {num_gpus}, Available: {available_gpus}") + return available_gpus // num_gpus + + +def pytest_configure(config: pytest.Config): + args = Args.from_config(config) + init_args(args) + + if ( + args.num_gpus is not None + and args.levels is not None + and all(args.num_gpus not in ALLOWED_GPUS_BY_LEVEL[level] for level in args.levels) + ): + pytest.exit(f"No tests for {args.num_gpus} GPUs and levels {args.levels}.", returncode=0) + + if args.worker_id == "master": + return + + if args.worker_index > 1: + if args.num_gpus is None: + raise NotImplementedError("Running parallel tests requires --num-gpus to be set.") + + # Check if there are enough GPUs available. + if args.num_gpus is not None and args.num_gpus > 0: + required_gpus = args.num_gpus * (args.worker_index + 1) + else: + required_gpus = 1 + available_gpus = _get_available_gpus() + if available_gpus < required_gpus: + raise ValueError(f"Not enough GPUs available. Required: {required_gpus}, Available: {available_gpus}") + + # Limit threading to reduce contention + import torch + + torch.set_num_threads(1) + torch.set_num_interop_threads(1) + + +def _get_marker(item: pytest.Item, name: str) -> pytest.Mark | None: + markers = list(item.iter_markers(name=name)) + if not markers: + return None + marker = markers[0] + for other_marker in markers[1:]: + if other_marker != marker: + raise ValueError(f"Multiple different markers found for {name}: {markers}") + return marker + + +def _parse_level_marker(mark: pytest.Mark) -> int: + if len(mark.args) != 1: + raise ValueError(f"Invalid arguments: {mark.args}") + if mark.kwargs: + raise ValueError(f"Invalid keyword arguments: {mark.kwargs}") + level = mark.args[0] + if level not in ALL_LEVELS: + raise ValueError(f"Invalid level {level} not in {ALL_LEVELS}") + return level + + +def _parse_gpus_marker(mark: pytest.Mark) -> int: + if len(mark.args) != 1: + raise ValueError(f"Invalid arguments: {mark.args}") + if mark.kwargs: + raise ValueError(f"Invalid keyword
arguments: {mark.kwargs}") + required_gpus = int(mark.args[0]) + if required_gpus not in ALL_NUM_GPUS: + raise ValueError(f"Invalid number of GPUs {required_gpus} not in {ALL_NUM_GPUS}") + return required_gpus + + +def pytest_collection_modifyitems(config: pytest.Config, items: list[pytest.Item]): + args = get_args() + + for item in items: + manual_mark = _get_marker(item, "manual") + level_mark = _get_marker(item, "level") + gpus_mark = _get_marker(item, "gpus") + try: + level = _parse_level_marker(level_mark) if level_mark else 0 + gpus = _parse_gpus_marker(gpus_mark) if gpus_mark else 0 + except ValueError as e: + pytest.fail(f"Invalid marker on test {item.name}: {e}") + assert False, "unreachable" + + allowed_gpus = ALLOWED_GPUS_BY_LEVEL[level] + if gpus not in allowed_gpus: + pytest.fail(f"Level {level} tests must have {allowed_gpus} GPUs, but {item.name} has {gpus} GPUs") + + # Check if the test should be skipped + if not args.enable_manual and manual_mark is not None: + item.add_marker(pytest.mark.skip(reason="test requires --manual")) + if args.levels is not None and level not in args.levels: + item.add_marker(pytest.mark.skip(reason=f"test requires --levels={level}")) + if args.num_gpus is not None and gpus != args.num_gpus: + item.add_marker(pytest.mark.skip(reason=f"test requires --num-gpus={gpus}")) + available_gpus = _get_available_gpus() + if gpus > available_gpus: + item.add_marker( + pytest.mark.skip(reason=f"test requires {gpus} GPUs, but only {available_gpus} are available") + ) + + # Exclude skipped tests + selected_items = [] + deselected_items = [] + for item in items: + if item.get_closest_marker("skip"): + deselected_items.append(item) + continue + selected_items.append(item) + items[:] = selected_items + config.hook.pytest_deselected(items=deselected_items) + + +def pytest_runtest_setup(item: pytest.Item): + import torch + + args = get_args() + + gpus_mark = item.get_closest_marker(name="gpus") + try: + gpus = _parse_gpus_marker(gpus_mark) 
if gpus_mark else 0 + except ValueError as e: + pytest.fail(f"Invalid marker on test {item.name}: {e}") + assert False, "unreachable" + + # Limit the number of GPUs used by the test + if gpus > 0: + device_start = args.worker_index * gpus + device_end = device_start + gpus + os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(map(str, range(device_start, device_end))) + os.environ["NUM_GPUS"] = str(gpus) + else: + device = 0 + os.environ["CUDA_VISIBLE_DEVICES"] = str(device) + os.environ["NUM_GPUS"] = "1" + + test_max_processes = int(os.environ.get("TEST_MAX_PROCESSES", "8")) + device_memory_fraction = 1 / max(args.worker_count, test_max_processes) + os.environ["DEVICE_MEMORY_FRACTION"] = str(device_memory_fraction) + torch.cuda.set_per_process_memory_fraction(device_memory_fraction) + + +@pytest.fixture(autouse=True) +def init_cosmos_test(tmp_path: Path, monkeypatch: pytest.MonkeyPatch): + from cosmos3.common.init import _init_log_console, _init_log_files + + monkeypatch.setenv("IMAGINAIRE_OUTPUT_ROOT", str(tmp_path / "imaginaire4-output")) + + _init_log_console() + _init_log_files(tmp_path) + + yield + + +@pytest.fixture(autouse=True) +def init_torch_test(): + import torch + + from cosmos3.common.init import set_seed + + # Reproducibility + set_seed(0) + + yield + + # Cleanup memory + gc.collect() + if torch.cuda.is_available(): + torch.cuda.empty_cache() + + + +_WHITELIST_ENV_VARS = { + "LD_LIBRARY_PATH", + "QT_QPA_FONTDIR", + "QT_QPA_PLATFORM_PLUGIN_PATH", + "TORCHINDUCTOR_CACHE_DIR", +} + + +@pytest.fixture(autouse=True) +def detect_env_modifications(): + original_env = dict(os.environ) + + yield + + new_env = dict(os.environ) + + for env in [original_env, new_env]: + for k in list(env.keys()): + if k.startswith("PYTEST_") or k in _WHITELIST_ENV_VARS: + del env[k] + if new_env != original_env: + added, removed, modified = _compare_dict(new_env, original_env) + os.environ.clear() + os.environ.update(original_env) + raise ValueError( + f"Environment variables modified 
by test! Use 'monkeypatch.setenv' to temporarily modify environment variables. \n" + f"Added: {added}\n" + f"Removed: {removed}\n" + f"Modified: {modified}" + ) + + +def _compare_dict(actual: dict[str, str], expected: dict[str, str]) -> tuple[set[str], set[str], set[str]]: + added = set(actual) - set(expected) + removed = set(expected) - set(actual) + modified = {k for k in expected if k in actual and expected[k] != actual[k]} + return added, removed, modified diff --git a/cosmos-inference/cosmos3/__init__.py b/cosmos-inference/cosmos3/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/imaginaire/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/README.md b/cosmos-inference/cosmos3/_src/imaginaire/attention/README.md new file mode 100644 index 00000000..750e7c32 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/README.md @@ -0,0 +1,47 @@ +# Imaginaire Attention Subpackage + +A subpackage within cosmos3._src.imaginaire that integrates only the best and most reliable +solutions, and provides simple APIs to end-users. + +For more information, please refer to the [docs](docs/). + +## Basic API + +```python +from cosmos3._src.imaginaire.attention import attention + +output = attention( + query=query, + key=key, + value=value, +) +``` + +- **Optional** `scale`: attention (softmax/dot product) scale. Defaults to `head_dim ** -0.5`. +- **Optional** `return_lse`: returns logsumexp if `True` +- **Optional** `backend`: explicitly set backend instead of automatically selecting the best compatible + +## Tensor layouts + +Imaginaire Attention only supports one tensor memory layout: +heads-last torch contiguous (`torch.contiguous_format`). + +With this layout, input tensors `query`, `key`, and `value` are represented as rank-4 tensors, with +dimension 0 representing batch, dimension 1 representing sequence length, dimension 2 representing +attention heads, and dimension 3 representing head dimension. +This layout is also consistent with the `contiguous_format` memory layout in PyTorch, meaning the +right-most dimension (head dimension) is the major dimension (has stride 1), and tokens from +different heads are interleaved. 
+ +```python +def verify_heads_last_contig_tensor(x: Tensor): + assert x.shape[0] == batch + assert x.shape[1] == seqlen + assert x.shape[2] == heads + assert x.shape[3] == head_dim + + assert x.stride(3) == 1 + assert x.stride(2) == head_dim + assert x.stride(1) == heads * head_dim + assert x.stride(0) == heads * head_dim * seqlen +``` diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/__init__.py new file mode 100644 index 00000000..72d51932 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/__init__.py @@ -0,0 +1,37 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. 
+ + +""" + +from cosmos3._src.imaginaire.attention.frontend import ( + attention, + merge_attentions, + multi_dimensional_attention, + multi_dimensional_attention_varlen, + spatio_temporal_attention, +) + +__all__ = [ + "attention", + "multi_dimensional_attention", + "multi_dimensional_attention_varlen", + "spatio_temporal_attention", + "merge_attentions", +] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/backends.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/backends.py new file mode 100644 index 00000000..bac2e441 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/backends.py @@ -0,0 +1,418 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. 
+ +Frontend APIs +""" + +import torch + +from cosmos3._src.imaginaire.attention.flash2.checks import flash2_attention_check +from cosmos3._src.imaginaire.attention.flash3.checks import flash3_attention_check +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.imaginaire.attention.natten.checks import natten_attention_check, natten_multi_dim_attention_check +from cosmos3._src.imaginaire.attention.utils import get_arch_tag +from cosmos3._src.imaginaire.attention.utils.environment import ( + filter_attention_backends, + filter_multi_dim_attention_backends, +) +from cosmos3._src.imaginaire.attention.utils.safe_ops import log +from cosmos3._src.imaginaire.attention.utils.safe_ops.functools import lru_cache + + +BACKEND_CHECK_MAP = { + "natten": natten_attention_check, + "flash2": flash2_attention_check, + "flash3": flash3_attention_check, +} + +BACKEND_MULTI_DIM_CHECK_MAP = { + "natten": natten_multi_dim_attention_check, +} + + +def is_backend_compatible( + backend: str, + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + is_causal: bool, + causal_type: CausalType | None, + is_varlen: bool, + deterministic: bool = False, + raise_error: bool = False, +) -> bool: + """ + Input validation function for a specified backend. + Runs the common and backend-specific checks. Returns False if any checks fail, otherwise True. + + Parameters: + backend (str): selected backend. + + query_shape (torch.Size): Shape of 4-D query tensor (`[batch, seqlen, heads, head_dim]`). + + key_shape (torch.Size): Shape of 4-D key tensor (`[batch, seqlen_kv, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D value tensor (`[batch, seqlen_kv, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference).
+ + is_causal (bool): whether or not causal masking is enabled. + + causal_type (CausalType): causal masking mode. Choices: `CausalType.TopLeft`, + `CausalType.BottomRight`. Required when `is_causal = True`. + + is_varlen (bool): whether or not a variable length (varlen) use case. Must be inferred + beforehand based on arguments such as seqlens_{Q,KV} or cumulative_seqlen_{Q,KV} being + passed. + + deterministic (bool): Deterministic backward pass required. + + raise_error (bool): whether to raise an error if any checks fail or no backend is selected, + instead of just returning False. Default is False. + + Returns: + success (bool): whether use case is compatible with the backend. + + """ + if backend is None: + raise ValueError("Cannot pass None backend to is_backend_compatible.") + + if backend not in BACKEND_CHECK_MAP: + raise ValueError(f"Unrecognized backend name {backend}.") + + return BACKEND_CHECK_MAP[backend]( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + device=device, + requires_grad=requires_grad, + is_causal=is_causal, + causal_type=causal_type, + is_varlen=is_varlen, + deterministic=deterministic, + raise_error=raise_error, + ) + + +def get_backend_list(arch_tag: int) -> list[str]: + """ + Returns list of supported backends according to arch tag (attention.utils.get_arch_tag). + Backends are ordered based on their known performance levels, so that the best-performing + compatible backend is selected. + + The returned list can be filtered via environment variable. + See `filter_attention_backends` for details. + + Parameters: + arch_tag (int): Arch tag for the current CUDA device. Example: 80 for A100, 90 for H100. + + Returns: + backend_list (list[str]): a list of backend names (string). Empty if device is not supported. 
+ + """ + + if arch_tag < 75: + log.debug(f"Minimum architecture supported for Attention is 75, got {arch_tag=}.") + return [] + + default_backends = [] + if arch_tag == 90: + default_backends = [ + "flash3", + "natten", + "flash2", + ] + elif arch_tag in [100, 103]: + default_backends = [ + "natten", + "flash2", + ] + elif arch_tag >= 80: + default_backends = [ + "flash2", + "natten", + ] + else: + default_backends = ["natten"] + + # Apply environment variable filtering + return filter_attention_backends(default_backends) + + +@lru_cache +def choose_backend( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + is_causal: bool, + causal_type: CausalType | None, + is_varlen: bool, + deterministic: bool = False, + backend: str | None = None, + raise_error: bool = True, +) -> str | None: + """ + Selects a compatible backend, unless one is already selected, which runs its corresponding + checks. + + Parameters: + query_shape (torch.Size): Shape of 4-D query tensor (`[batch, seqlen, heads, head_dim]`). + + key_shape (torch.Size): Shape of 4-D key tensor (`[batch, seqlen_kv, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D value tensor (`[batch, seqlen_kv, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + is_causal (bool): whether or not causal masking is enabled. + + causal_type (CausalType): causal masking mode. Choices: `CausalType.TopLeft`, + `CausalType.BottomRight`. Required when `is_causal = True`. + + is_varlen (bool): whether or not a variable length (varlen) use case. Must be inferred + beforehand based on arguments such as seqlens_{Q,KV} or cumulative_seqlen_{Q,KV} being + passed. + + deterministic (bool): Deterministic backward pass required. + + backend (str | None): selected backend, if any. 
+ + raise_error (bool): whether to raise an error if any checks fail or no backend is selected, + instead of just returning None. Default is **True**. + + Returns: + backend (str | None): selected backend, or None if no backends are compatible. + + """ + if backend is not None: + if is_backend_compatible( + backend=backend, + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + device=device, + requires_grad=requires_grad, + is_causal=is_causal, + causal_type=causal_type, + is_varlen=is_varlen, + deterministic=deterministic, + raise_error=raise_error, + ): + return backend + return None + + arch_tag = get_arch_tag(device) + backend_list = get_backend_list(arch_tag) + for backend in backend_list: + if is_backend_compatible( + backend=backend, + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + device=device, + requires_grad=requires_grad, + is_causal=is_causal, + causal_type=causal_type, + is_varlen=is_varlen, + deterministic=deterministic, + raise_error=False, + ): + return backend + + if not raise_error: + return None + + raise ValueError( + "Could not find a compatible Attention backend for this use case / device. " + "Try running with debug logs to find out why." + ) + + +def is_multi_dim_backend_compatible( + backend: str, + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + deterministic: bool = False, + raise_error: bool = False, +) -> bool: + """ + Input validation function for a specified multi-dimensional backend. + Runs the common and backend-specific checks. Returns False if any checks fail, otherwise True. + + Parameters: + backend (str): selected backend. + + query_shape (torch.Size): Shape of 4-D, 5-D, or 6-D query tensor (`[batch, *token_layout_shape, heads, head_dim]`).
+ + key_shape (torch.Size): Shape of 4-D, 5-D, or 6-D key tensor (`[batch, *token_layout_shape, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D, 5-D, or 6-D value tensor (`[batch, *token_layout_shape, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + deterministic (bool): Deterministic backward pass required. + + raise_error (bool): whether to raise an error if any checks fail or no backend is selected, + instead of just returning False. Default is False. + + Returns: + success (bool): whether use case is compatible with the backend. + + """ + if backend is None: + raise ValueError("Cannot pass None backend to is_multi_dim_backend_compatible.") + + if backend not in BACKEND_MULTI_DIM_CHECK_MAP: + raise ValueError(f"Unrecognized backend name {backend}.") + + return BACKEND_MULTI_DIM_CHECK_MAP[backend]( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + device=device, + requires_grad=requires_grad, + deterministic=deterministic, + raise_error=raise_error, + ) + + +def get_multi_dim_backend_list(arch_tag: int) -> list[str]: + """ + Returns list of supported multi-dimensional backends according to arch tag (attention.utils.get_arch_tag). + Backends are ordered based on their known performance levels, so that the best-performing + compatible backend is selected. + + The returned list can be filtered via environment variable. + See `filter_multi_dim_attention_backends` for details. + + Parameters: + arch_tag (int): Arch tag for the current CUDA device. Example: 80 for A100, 90 for H100. + + Returns: + backend_list (list[str]): a list of backend names (string). Empty if device is not supported.
+ + """ + + if arch_tag < 75: + log.debug(f"Minimum architecture supported for Multi-Dimensional Attention is 75, got {arch_tag=}.") + return [] + + # NATTEN is the only supported backend for now + default_backends = ["natten"] + + # Apply environment variable filtering + return filter_multi_dim_attention_backends(default_backends) + + +@lru_cache +def choose_multi_dim_backend( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + deterministic: bool = False, + backend: str | None = None, +) -> str: + """ + Selects a compatible multi-dimensional backend, unless one is already selected, which runs its + corresponding checks. + + Parameters: + query_shape (torch.Size): Shape of 4-D, 5-D, or 6-D query tensor (`[batch, *token_layout_shape, heads, head_dim]`). + + key_shape (torch.Size): Shape of 4-D, 5-D, or 6-D key tensor (`[batch, *token_layout_shape, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D, 5-D, or 6-D value tensor (`[batch, *token_layout_shape, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + deterministic (bool): Deterministic backward pass required. + + backend (str | None): selected backend, if any. + + Returns: + backend (str): selected backend. 
+ + """ + if backend is not None: + assert is_multi_dim_backend_compatible( + backend=backend, + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + device=device, + requires_grad=requires_grad, + deterministic=deterministic, + raise_error=True, + ) + return backend + + arch_tag = get_arch_tag(device) + backend_list = get_multi_dim_backend_list(arch_tag) + for backend in backend_list: + if is_multi_dim_backend_compatible( + backend=backend, + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + device=device, + requires_grad=requires_grad, + deterministic=deterministic, + raise_error=False, + ): + return backend + + raise ValueError( + "Could not find a compatible Multi-Dimensional Attention backend for this use case / device. " + "Try running with debug logs to find out why." + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/checks.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/checks.py new file mode 100644 index 00000000..0fb4f568 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/checks.py @@ -0,0 +1,634 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. 
+ +Common, op-specific, and backend-specific checks +""" + +from collections.abc import Sequence +from functools import partial +from typing import Any + +import torch +from torch import Tensor + +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.imaginaire.attention.utils import log_or_raise_error +from cosmos3._src.imaginaire.attention.utils.environment import is_torch_compiling +from cosmos3._src.imaginaire.attention.varlen import generate_varlen_parameters + + +def universal_tensor_checks( + query: Tensor, key: Tensor, value: Tensor, raise_error: bool = True +) -> bool: # query/key/value: [B,*,H,D] + """ + Universal tensor validation: checks sparse/nested tensors and ensures device/dtype consistency. + This should be called by users before extracting tensor properties for tensorless APIs. + + Parameters: + query (Tensor): Query tensor. + key (Tensor): Key tensor. + value (Tensor): Value tensor. + raise_error (bool): Whether to raise an error if checks fail. Default is True. + + Returns: + success (bool): Whether all checks pass. 
+ """ + target_fn = partial(log_or_raise_error, raise_error=raise_error) + + if query.is_sparse or key.is_sparse or value.is_sparse: + target_fn("This operation does not support sparse tensors.", exception=NotImplementedError) + return False + + if query.is_nested or key.is_nested or value.is_nested: + target_fn("This operation does not support nested tensors.", exception=NotImplementedError) + return False + + if query.device != key.device or query.device != value.device: + target_fn( + f"Query, key, and value must be on the same device, got {query.device=}, {key.device=}, {value.device=}.", + exception=ValueError, + ) + return False + + if query.dtype != key.dtype or query.dtype != value.dtype: + target_fn( + f"Query, key, and value must have the same data type, got {query.dtype=}, {key.dtype=}, {value.dtype=}.", + exception=ValueError, + ) + return False + + return True + + +def assert_universal_tensor_checks(query: Tensor, key: Tensor, value: Tensor) -> None: # query/key/value: [B,*,H,D] + """ + Universal tensor validation using assertions for backend functions. + Checks sparse/nested tensors and ensures device/dtype consistency. + requires_grad is intentionally not compared across query, key, and value. + + This is intended for internal backend use only. Users should not call backend functions directly. + Assertions are stripped when Python runs with the -O flag, so this is appropriate for post-frontend checks. + + Parameters: + query (Tensor): Query tensor. + key (Tensor): Key tensor. + value (Tensor): Value tensor.
+ """ + assert not query.is_sparse and not key.is_sparse and not value.is_sparse, "Sparse tensors not supported" + assert not query.is_nested and not key.is_nested and not value.is_nested, "Nested tensors not supported" + assert query.device == key.device == value.device, ( + f"Device mismatch: {query.device=}, {key.device=}, {value.device=}" + ) + assert query.dtype == key.dtype == value.dtype, f"Dtype mismatch: {query.dtype=}, {key.dtype=}, {value.dtype=}" + # Disabled: requires_grad may differ if differentiable queries attend to non-differentiable + # keys, e.g. when attending to a KV-cache during training. + # assert query.requires_grad == key.requires_grad == value.requires_grad, ( + # f"requires_grad mismatch: {query.requires_grad=}, {key.requires_grad=}, {value.requires_grad=}" + # ) + + +def _universal_attention_checks( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + requires_grad: bool, + supported_dtypes_forward: list[torch.dtype] | None = None, + supported_dtypes_backward: list[torch.dtype] | None = None, + supports_mla: bool = True, + supports_gqa_mqa: bool = True, + raise_error: bool = True, + backend_name: str | None = None, +) -> bool: + backend_name = backend_name or "Attention" + + target_fn = partial(log_or_raise_error, raise_error=raise_error) + + query_dim = len(query_shape) + key_dim = len(key_shape) + value_dim = len(value_shape) + + if query_dim != key_dim or query_dim != value_dim: + target_fn( + f"Q, K, and V must have the same rank, got {query_dim=}, {key_dim=}, {value_dim=}.", + exception=ValueError, + ) + return False + + if query_shape[0] != key_shape[0] or query_shape[0] != value_shape[0]: + target_fn( + f"Q, K, and V must match in batch size, got {query_shape[0]=}, {key_shape[0]=}, {value_shape[0]=}.", + exception=ValueError, + ) + return False + + if query_shape[-1] != key_shape[-1]: + target_fn( + f"Q and K head dims must match, got {query_shape[-1]=}, {key_shape[-1]=}.", + 
exception=ValueError, + ) + return False + + if key_shape[-2] != value_shape[-2]: + target_fn( + f"K and V must always have the same number of heads, got {key_shape[-2]=}, {value_shape[-2]=}.", + exception=ValueError, + ) + return False + + if not supports_mla and query_shape[-1] != value_shape[-1]: + target_fn( + f"{backend_name} does not support different head dims for QK and V, got " + f"{query_shape[-1]=}, {value_shape[-1]=}.", + exception=ValueError, + ) + return False + + if not supports_gqa_mqa and (query_shape[-2] != key_shape[-2] or query_shape[-2] != value_shape[-2]): + target_fn( + f"{backend_name} does not support GQA/MQA, therefore number of heads in Q, K, and V " + f"must match, got {query_shape[-2]=}, {key_shape[-2]=}, {value_shape[-2]=}.", + exception=ValueError, + ) + return False + + if supports_gqa_mqa: + heads_q = query_shape[-2] + heads_kv = key_shape[-2] + + if heads_q < heads_kv or heads_q % heads_kv != 0: + target_fn( + f"KV heads must evenly divide Q heads, got {heads_q=}, {heads_kv=}.", + exception=ValueError, + ) + return False + + # Caller must ensure dtype consistency via universal_tensor_checks + if supported_dtypes_forward is not None and dtype not in supported_dtypes_forward: + target_fn( + f"{backend_name} does not support forward pass (inference) with data type {dtype}; " + f"supported dtypes: {supported_dtypes_forward}.", + exception=ValueError, + ) + return False + + if supported_dtypes_backward is not None and requires_grad and dtype not in supported_dtypes_backward: + target_fn( + f"{backend_name} does not support backward pass (training) with data type {dtype}; " + f"supported dtypes: {supported_dtypes_backward}.", + exception=ValueError, + ) + return False + + return True + + +def attention_tensor_checks( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + requires_grad: bool, + supported_dtypes_forward: list[torch.dtype] | None = None, + supported_dtypes_backward: 
list[torch.dtype] | None = None, + supports_mla: bool = True, + supports_gqa_mqa: bool = True, + raise_error: bool = True, + backend_name: str | None = None, +) -> bool: + backend_name = backend_name or "Attention" + + if not _universal_attention_checks( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + requires_grad=requires_grad, + supported_dtypes_forward=supported_dtypes_forward, + supported_dtypes_backward=supported_dtypes_backward, + supports_mla=supports_mla, + supports_gqa_mqa=supports_gqa_mqa, + raise_error=raise_error, + backend_name=backend_name, + ): + return False + + target_fn = partial(log_or_raise_error, raise_error=raise_error) + + query_dim = len(query_shape) + if query_dim != 4: + target_fn( + f"Attention expects 4-D tensors as inputs, got {query_dim=}.", + exception=ValueError, + ) + return False + + if key_shape[1] != value_shape[1]: + target_fn( + f"K and V must match in sequence length, got {key_shape[1]=}, {value_shape[1]=}.", + exception=ValueError, + ) + return False + + return True + + +def varlen_tensor_checks( + query: Tensor, # [1,S_total_Q,H,D] + key: Tensor, # [1,S_total_KV,H_KV,D] + value: Tensor, # [1,S_total_KV,H_KV,D_V] + seqlens_Q: Tensor | None = None, # [B] + seqlens_KV: Tensor | None = None, # [B] + cumulative_seqlen_Q: Tensor | None = None, # [B+1] + cumulative_seqlen_KV: Tensor | None = None, # [B+1] + max_seqlen_Q: int | None = None, + max_seqlen_KV: int | None = None, +) -> ( + tuple[None, None, int, int] | tuple[Tensor, Tensor, int, int] +): # (cumseqlen_Q[B+1], cumseqlen_KV[B+1], max_seqlen_Q, max_seqlen_KV) + if query.shape[0] != key.shape[0] or query.shape[0] != value.shape[0]: + raise ValueError( + f"Q, K, and V must match in batch size, got {query.shape[0]=}, {key.shape[0]=}, {value.shape[0]=}." + ) + + + if not is_torch_compiling(): + # Validate max_seqlen values: neither can be negative, and they must be + # both zero/None (not varlen) or both positive (varlen). 
+ if (max_seqlen_Q is not None and max_seqlen_Q < 0) or (max_seqlen_KV is not None and max_seqlen_KV < 0): + raise ValueError( + f"max_seqlen_Q and max_seqlen_KV cannot be negative, got {max_seqlen_Q=}, {max_seqlen_KV=}." + ) + + if (max_seqlen_Q == 0) != (max_seqlen_KV == 0): + raise ValueError( + "max_seqlen_Q and max_seqlen_KV must either both be 0/None (not varlen) or both be positive " + f"(varlen), got {max_seqlen_Q=}, {max_seqlen_KV=}." + ) + + if all( + x is None + for x in [ + seqlens_Q, + seqlens_KV, + cumulative_seqlen_Q, + cumulative_seqlen_KV, + ] + ) and all( + x is None or x == 0 + for x in [ + max_seqlen_Q, + max_seqlen_KV, + ] + ): + # Not varlen + return None, None, 0, 0 + + if seqlens_Q is not None or seqlens_KV is not None: + # Generate cumulative_seqlen_{Q,KV}, max_seqlen_{Q,KV}, total_seqlen_{Q,KV} + # based on user input + return generate_varlen_parameters( + query=query, + key=key, + value=value, + seqlens_Q=seqlens_Q, + seqlens_KV=seqlens_KV, + ) + + # Validate user-input cumulative_seqlen_{Q,KV}, max_seqlen_{Q,KV}, total_seqlen_{Q,KV} + + # Mismatch (one 0, the other positive) is already caught by the early check above. + # This feature may require support in the backends themselves; see NATTEN PR: + # https://github.com/SHI-Labs/NATTEN/pull/327 + if any( + x is None + for x in [ + cumulative_seqlen_Q, + cumulative_seqlen_KV, + max_seqlen_Q, + max_seqlen_KV, + ] + ): + raise ValueError( + "Variable length Attention requires all of cumulative_seqlen_{Q,KV} and max_seqlen_{Q,KV} to be set." + ) + + if query.shape[0] != 1: + raise ValueError( + f"Variable length Attention only supports sequence-packed memory layout (batch = 1), got {query.shape[0]=}." 
+ ) + + assert cumulative_seqlen_Q is not None + assert cumulative_seqlen_KV is not None + assert max_seqlen_Q is not None + assert max_seqlen_KV is not None + + if not isinstance(max_seqlen_Q, int) or not isinstance(max_seqlen_KV, int): + raise ValueError( + f"max_seqlen_Q and max_seqlen_KV must be ints, got {type(max_seqlen_Q)=}, {type(max_seqlen_KV)=}." + ) + + total_seqlen_Q = query.shape[1] + total_seqlen_KV = key.shape[1] + + + if not is_torch_compiling(): + # When both max_seqlens are 0, skip bounds checks (skip kernel / empty-batch case). + # Mismatch is already caught by the early check, so at this point either both are 0 or both are positive. + if max_seqlen_Q > 0 or max_seqlen_KV > 0: + if max_seqlen_Q > total_seqlen_Q: + raise ValueError( + f"Maximum sequence length cannot exceed total, got {max_seqlen_Q=}, {total_seqlen_Q=}." + ) + + if max_seqlen_KV > total_seqlen_KV: + raise ValueError( + f"Maximum sequence length cannot exceed total, got {max_seqlen_KV=}, {total_seqlen_KV=}." + ) + + if max_seqlen_Q < 1 or max_seqlen_KV < 1: + raise ValueError( + f"Maximum sequence length cannot be less than 1, got {max_seqlen_Q=}, {max_seqlen_KV=}." + ) + + if not isinstance(cumulative_seqlen_Q, Tensor) or not isinstance(cumulative_seqlen_KV, Tensor): + raise ValueError("cumulative_seqlen_Q and cumulative_seqlen_KV must both be tensors.") + + if cumulative_seqlen_Q.device != query.device or cumulative_seqlen_KV.device != query.device: + raise ValueError( + "cumulative_seqlen_Q and cumulative_seqlen_KV must be on the same device as QKV, but " + f"{cumulative_seqlen_Q.device=}, {cumulative_seqlen_KV.device=}, {query.device=}." + ) + + if cumulative_seqlen_Q.dtype != torch.int32 or cumulative_seqlen_KV.dtype != torch.int32: + raise ValueError( + "cumulative_seqlen_Q and cumulative_seqlen_KV must both be torch.int32 tensors, got " + f"{cumulative_seqlen_Q.dtype=}, {cumulative_seqlen_KV.dtype=}." 
+ ) + + if cumulative_seqlen_Q.dim() != 1 or cumulative_seqlen_KV.dim() != 1: + raise ValueError( + "cumulative_seqlen_Q and cumulative_seqlen_KV must both be 1-D tensors, got " + f"{cumulative_seqlen_Q.dim()=}, {cumulative_seqlen_KV.dim()=}." + ) + + if cumulative_seqlen_Q.shape[0] != cumulative_seqlen_KV.shape[0]: + raise ValueError( + "cumulative_seqlen_Q and cumulative_seqlen_KV must match in size, got " + f"{cumulative_seqlen_Q.shape=}, {cumulative_seqlen_KV.shape=}." + ) + + if cumulative_seqlen_Q.shape[0] < 2: + raise ValueError( + "cumulative_seqlen_Q and cumulative_seqlen_KV must contain at least 2 elements, got " + f"{cumulative_seqlen_Q.shape=}, {cumulative_seqlen_KV.shape=}." + ) + + return ( + cumulative_seqlen_Q, + cumulative_seqlen_KV, + max_seqlen_Q, + max_seqlen_KV, + ) + + +def attention_param_checks( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + is_causal: bool, + causal_type: CausalType, +): + if is_causal and (causal_type is None or not isinstance(causal_type, CausalType)): + raise ValueError( + f"Argument causal_type must be specified as an enum instance of CausalType when is_causal=True, got {causal_type=}." + ) + + assert len(query_shape) == len(key_shape) == len(value_shape) == 4 + assert key_shape[1] == value_shape[1] + if is_causal and causal_type == CausalType.DontCare and query_shape[1] != key_shape[1]: + raise ValueError( + "Causal mask type DontCare is only valid when seqlen_q == seqlen_kv, got " + f"{query_shape[1]=}, {key_shape[1]=}." 
+ ) + + +def multi_dim_attention_tensor_checks( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + requires_grad: bool, + supported_dtypes_forward: list[torch.dtype] | None = None, + supported_dtypes_backward: list[torch.dtype] | None = None, + supports_mla: bool = True, + supports_gqa_mqa: bool = True, + raise_error: bool = True, + backend_name: str | None = None, +) -> bool: + backend_name = backend_name or "Multi-Dimensional Attention" + + if not _universal_attention_checks( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + requires_grad=requires_grad, + supported_dtypes_forward=supported_dtypes_forward, + supported_dtypes_backward=supported_dtypes_backward, + supports_mla=supports_mla, + supports_gqa_mqa=supports_gqa_mqa, + raise_error=raise_error, + backend_name=backend_name, + ): + return False + + target_fn = partial(log_or_raise_error, raise_error=raise_error) + + query_dim = len(query_shape) + if query_dim not in [4, 5, 6]: + target_fn( + f"Multi-Dimensional Attention supports 4-D, 5-D, or 6-D tensors as inputs, got {query_dim=}.", + exception=ValueError, + ) + return False + + num_dims = query_dim - 3 # minus batch, heads, head_dim + + q_token_layout_shape = query_shape[1 : 1 + num_dims] + k_token_layout_shape = key_shape[1 : 1 + num_dims] + v_token_layout_shape = value_shape[1 : 1 + num_dims] + + if q_token_layout_shape != k_token_layout_shape or q_token_layout_shape != v_token_layout_shape: + target_fn( + "Q, K and V must match in their token layout shapes in multi-dimensional attention, " + f"got {q_token_layout_shape=}, {k_token_layout_shape=}, {v_token_layout_shape=}.", + exception=ValueError, + ) + return False + + return True + + +def check_valid_tuple_or_element( + param: Any, num_dims: int, typename: type, raise_error: bool = False, param_name: str = "unknown" +) -> tuple | None: + if isinstance(param, typename): + return tuple(param for _ in 
range(num_dims)) + + if isinstance(param, Sequence) and len(param) == num_dims and all(isinstance(x, typename) for x in param): + return tuple(x for x in param) + + if raise_error: + raise ValueError(f"Invalid value for parameter {param_name}: {param}.") + return None + + +def multi_dim_attention_param_filter_tensorless( + token_layout_shape: tuple, + window_size: tuple | int = -1, + stride: tuple | int = 1, + dilation: tuple | int = 1, + is_causal: tuple | bool = False, +) -> tuple[tuple, tuple, tuple, tuple]: + """ + Converts all multi-dimensional parameters to standard types. + """ + + if not isinstance(token_layout_shape, tuple) or any(not isinstance(x, int) for x in token_layout_shape): + raise ValueError(f"token_layout_shape must be an integer tuple, got {token_layout_shape=}.") + + num_dims = len(token_layout_shape) + assert num_dims in [1, 2, 3] + + window_size_ = check_valid_tuple_or_element(window_size, num_dims, int) + if window_size_ is None: + raise ValueError( + f"Parameter 'window_size' must be either an int or tuple of {num_dims} ints, got {window_size=}." + ) + + stride_ = check_valid_tuple_or_element(stride, num_dims, int) + if stride_ is None: + raise ValueError(f"Parameter 'stride' must be either an int or tuple of {num_dims} ints, got {stride=}.") + + dilation_ = check_valid_tuple_or_element(dilation, num_dims, int) + if dilation_ is None: + raise ValueError(f"Parameter 'dilation' must be either an int or tuple of {num_dims} ints, got {dilation=}.") + + is_causal_ = check_valid_tuple_or_element(is_causal, num_dims, bool) + if is_causal_ is None: + raise ValueError( + f"Parameter 'is_causal' must be either a boolean or tuple of {num_dims} booleans, got {is_causal=}." 
+ ) + + # Map -1 windows to corresponding size in token layout + window_size_ = tuple(w if w != -1 else x for x, w in zip(token_layout_shape, window_size_)) + + return window_size_, stride_, dilation_, is_causal_ + + +def multi_dim_attention_param_checks_tensorless( + token_layout_shape: tuple, + window_size: tuple, + stride: tuple, + dilation: tuple, + is_causal: tuple, +): + """ + Validates multi-dimensional parameters. + """ + + if not isinstance(token_layout_shape, tuple) or any(not isinstance(x, int) for x in token_layout_shape): + raise ValueError(f"token_layout_shape must be an integer tuple, got {token_layout_shape=}.") + + num_dims = len(token_layout_shape) + assert num_dims in [1, 2, 3] + + if any(x <= 1 for x in token_layout_shape): + raise ValueError(f"Token layout dimensions must all be >= 2, got {token_layout_shape=}.") + + if any(w <= 1 for w in window_size): + raise ValueError( + "Parameter 'window_size' must be either -1 (no sparsity) or >= 2 along every dimension, " + f"got {window_size=}." + ) + + if any(w * d > x for x, w, d in zip(token_layout_shape, window_size, dilation)): + raise ValueError( + "The product of 'window_size' and 'dilation' cannot be greater than the input " + f"(token layout shape), got {window_size=}, {dilation=}, {token_layout_shape=}." + ) + + if any(s < 1 for s in stride): + raise ValueError(f"Parameter 'stride' allows positive integers only, got {stride=}.") + + if any(s > w for w, s in zip(window_size, stride)): + raise ValueError( + f"Parameter 'stride' cannot be greater than window size along any dimension, got {window_size=}, {stride=}." 
+ ) + + if any(d < 1 for d in dilation): + raise ValueError(f"Parameter 'dilation' allows positive integers only, got {dilation=}.") + + +def multi_dim_attention_param_filter( + query: Tensor, # [B,*token_layout_shape,H,D] + window_size: tuple | int = -1, + stride: tuple | int = 1, + dilation: tuple | int = 1, + is_causal: tuple | bool = False, +) -> tuple[tuple, tuple, tuple, tuple, tuple]: + """ + Converts all multi-dimensional parameters to standard types. + """ + assert query.dim() in [4, 5, 6] + num_dims = query.dim() - 3 + token_layout_shape = tuple(s for s in query.shape[1 : 1 + num_dims]) + + window_size_, stride_, dilation_, is_causal_ = multi_dim_attention_param_filter_tensorless( + token_layout_shape=token_layout_shape, + window_size=window_size, + stride=stride, + dilation=dilation, + is_causal=is_causal, + ) + + return token_layout_shape, window_size_, stride_, dilation_, is_causal_ + + +def multi_dim_attention_param_checks( + query: Tensor, # [B,*token_layout_shape,H,D] + window_size: tuple, + stride: tuple, + dilation: tuple, + is_causal: tuple, +): + """ + Validates multi-dimensional parameters. 
+ """ + assert query.dim() in [4, 5, 6] + num_dims = query.dim() - 3 + token_layout_shape = tuple(s for s in query.shape[1 : 1 + num_dims]) + + multi_dim_attention_param_checks_tensorless( + token_layout_shape=token_layout_shape, + window_size=window_size, + stride=stride, + dilation=dilation, + is_causal=is_causal, + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/README.md b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/README.md new file mode 100644 index 00000000..d72e7535 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/README.md @@ -0,0 +1,10 @@ +# Imaginaire Attention Subpackage Docs + +- [Basic API & Intro](../README.md) +- Docs (you are here) + - [Backends](backends.md) + - Features + - [Basic features](features.md) + - [Multi-dimensional Attention](multi-dim.md) + - [Spatio-Temporal Attention](multi-dim.md#spatio-temporal-attention) + - [APIs](apis.md) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/apis.md b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/apis.md new file mode 100644 index 00000000..ad8dafe9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/apis.md @@ -0,0 +1,28 @@ +# Imaginaire Attention Subpackage Docs > APIs + +## Attention + +::: cosmos3._src.imaginaire.attention + options: + heading_level: 3 + show_object_full_path: true + members: + - attention + +## Multi-Dimensional Attention + +::: cosmos3._src.imaginaire.attention + options: + heading_level: 3 + show_object_full_path: true + members: + - multi_dimensional_attention + +### Spatio-Temporal Attention + +::: cosmos3._src.imaginaire.attention + options: + heading_level: 3 + show_object_full_path: true + members: + - spatio_temporal_attention diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/backends.md b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/backends.md new file mode 100644 index 00000000..ba3f5d88 --- /dev/null +++ 
b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/backends.md @@ -0,0 +1,61 @@ +# Imaginaire Attention Subpackage Docs > Backends + +The goal is to support as many stable and reliable backends as possible, both for feature coverage, +and for delivering the best performance. + +## NATTEN + +[NATTEN](https://natten.org) ships standard Attention kernels in addition to sparse / +multi-dimensional kernels. + +Minimum version required: `0.21.5.dev3`. + +### Feature coverage + +| Feat/Backend | Ampere/RTX | Hopper | Blackwell | +| ------------ | ------------------ | ------ | ------------------ | +| Causal mask | :white_check_mark: | | :white_check_mark: | +| Varlen | :white_check_mark: | | :white_check_mark: | +| GQA/MQA | | | :white_check_mark: | +| MLA | :white_check_mark: | | | + +This backend supports torch compile. + +## Flash Attention v2 + +Flash Attention v2 (original C++ kernels) is available under the `flash2` backend. +Requires the `flash_attn` package. + +Minimum version required: `2.7.0`. +Maximum version supported: `2.7.4.post1`. + +This backend supports torch compile. + +### Feature coverage + +| Feat/Backend | Ampere/RTX | +| ------------ | ------------------ | +| Causal mask | :white_check_mark: | +| Varlen | :white_check_mark: | +| GQA/MQA | :white_check_mark: | +| MLA | | + +## Flash Attention v3 + +Flash Attention v3 (original C++ kernels) is available under the `flash3` backend. +Requires the `flash_attn_3` package. + +Version required: `3.0.0.b*`. + +### Feature coverage + +| Feat/Backend | Ampere/RTX | +| ------------ | ------------------ | +| Causal mask | :white_check_mark: | +| Varlen | :white_check_mark: | +| GQA/MQA | :white_check_mark: | +| MLA | | + +MLA is technically supported, but disabled due to an API bug in the backward pass. + +Torch compile is NOT yet supported for this backend. 
diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/features.md b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/features.md new file mode 100644 index 00000000..d14e3f19 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/features.md @@ -0,0 +1,151 @@ +# Imaginaire Attention Subpackage Docs > Features + +## Causal mask + +Causal masking requires the causal mask type to be specified explicitly. +For example, simply passing `is_causal=True` will fail: + +```python +output = attention( + query=query, + key=key, + value=value, + is_causal=True +) +``` + +Result: + +``` +ValueError: Argument causal_type must be specified when is_causal=True. +``` + +There are currently two types of causal masking that are supported, and many popular backends tend +to support only one. It's therefore critical to choose the correct one for your application. + +```python +from cosmos3._src.imaginaire.attention.masks import CausalType + +# Causal type choices: +# - CausalType.TopLeft +# - CausalType.BottomRight + +output = attention( + query=query, + key=key, + value=value, + is_causal=True, + causal_type=CausalType.TopLeft, +) +``` + +### Top-left causal mask + +Q sequence length = KV sequence length = 5 + +| | K1 | K2 | K3 | K4 | K5 | +| --- | -------- | -------- | -------- | -------- | -------- | +| Q1 | ✓ | ✗ | ✗ | ✗ | ✗ | +| Q2 | ✓ | ✓ | ✗ | ✗ | ✗ | +| Q3 | ✓ | ✓ | ✓ | ✗ | ✗ | +| Q4 | ✓ | ✓ | ✓ | ✓ | ✗ | +| Q5 | ✓ | ✓ | ✓ | ✓ | ✓ | + +Q sequence length = 2, KV sequence length = 5 + +| | K1 | K2 | K3 | K4 | K5 | +| --- | -------- | -------- | -------- | -------- | -------- | +| Q1 | ✓ | ✗ | ✗ | ✗ | ✗ | +| Q2 | ✓ | ✓ | ✗ | ✗ | ✗ | + +Q sequence length = 5, KV sequence length = 2 + +| | K1 | K2 | +| --- | -------- | -------- | +| Q1 | ✓ | ✗ | +| Q2 | ✓ | ✓ | +| Q3 | ✓ | ✓ | +| Q4 | ✓ | ✓ | +| Q5 | ✓ | ✓ | + +### Bottom-right causal mask + +Q sequence length = KV sequence length = 5 + +| | K1 | K2 | K3 | K4 | K5 | +| --- | -------- | -------- | 
-------- | -------- | -------- | +| Q1 | ✓ | ✗ | ✗ | ✗ | ✗ | +| Q2 | ✓ | ✓ | ✗ | ✗ | ✗ | +| Q3 | ✓ | ✓ | ✓ | ✗ | ✗ | +| Q4 | ✓ | ✓ | ✓ | ✓ | ✗ | +| Q5 | ✓ | ✓ | ✓ | ✓ | ✓ | + +(identical to top-left in this special case) + +Q sequence length = 2, KV sequence length = 5 + +| | K1 | K2 | K3 | K4 | K5 | +| --- | -------- | -------- | -------- | -------- | -------- | +| Q1 | ✓ | ✓ | ✓ | ✓ | ✗ | +| Q2 | ✓ | ✓ | ✓ | ✓ | ✓ | + +Q sequence length = 5, KV sequence length = 2 + +| | K1 | K2 | +| --- | -------- | -------- | +| Q1 | ✗ | ✗ | +| Q2 | ✗ | ✗ | +| Q3 | ✗ | ✗ | +| Q4 | ✓ | ✗ | +| Q5 | ✓ | ✓ | + +## GQA/MQA + +Simply pass `key` and `value` without repeating attention heads. + +**NOTE**: `key`/`value` heads must evenly divide `query` heads. + +**NOTE**: the behavior is similar to `repeat_interleave`, not `repeat`. + +## Variable length + +**(Less efficient option)** Pass sequence lengths directly: + +```python +output = attention( + query=query, + key=key, + value=value, + seqlens_Q=torch.tensor(sequence_length_list_Q, device=query.device), + seqlens_KV=torch.tensor(sequence_length_list_KV, device=query.device), +) +``` + +This recomputes the maximum sequence lengths and cumulative sums (with the additional +padding) on every call. + +**(More efficient option)** Compute cumulative sequence lengths and maximums once, and reuse them: + +```python +from cosmos3._src.imaginaire.attention.varlen import generate_varlen_parameters + +# seqlens_Q / seqlens_KV: 1-D integer tensors of shape [B], one length per sequence, for the tensors +# they correspond to. 
+( + cumulative_seqlen_Q, + cumulative_seqlen_KV, + max_seqlen_Q, + max_seqlen_KV, +) = generate_varlen_parameters(query, key, value, seqlens_Q, seqlens_KV) + +# in all attention layers that follow: +output = attention( + query=query, + key=key, + value=value, + cumulative_seqlen_Q=cumulative_seqlen_Q, + cumulative_seqlen_KV=cumulative_seqlen_KV, + max_seqlen_Q=max_seqlen_Q, + max_seqlen_KV=max_seqlen_KV, +) +``` diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/multi-dim.md b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/multi-dim.md new file mode 100644 index 00000000..fb4bb215 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/docs/multi-dim.md @@ -0,0 +1,113 @@ +# Imaginaire Attention Subpackage Docs > Features > Multi-Dimensional Attention + +Multi-Dimensional Attention is the primary API for handling various complex masks and sparsity +patterns, such as the spatio-temporal mask, and sliding window attention. + +## Basic API + +```python +from cosmos3._src.imaginaire.attention import multi_dimensional_attention + +output = multi_dimensional_attention( + query=query, + key=key, + value=value, +) +``` + +Sparsity parameters: + +- **Optional** `window_size`: allows reducing the attention span by limiting each token's context to + a local sliding window. References: + - [Image Transformer](https://arxiv.org/abs/1802.05751) + - [Stand-alone self-attention](https://arxiv.org/abs/1906.05909) + - [Neighborhood attention transformer](https://arxiv.org/abs/2204.07143) +- **Optional** `dilation`: introduces gaps between the tokens within a sliding window, capturing + global context without more computation. + Reference: [Dilated neighborhood attention transformer](https://arxiv.org/abs/2209.15001) + +Other masking parameters: + +- **Optional** `stride`: introduces delays into the sliding window, for **potential** efficiency + gains. Reference: [Generalized Neighborhood Attention](https://arxiv.org/abs/2504.16922). 
+- **Optional** `is_causal`: allows causally masking individual dimensions. This parameter can + implement the spatio-temporal mask (causal masking across temporal dimension, bi-directional + along space). + +All sparsity / masking parameters can be specified **per dimension**. +The key feature of `multi_dimensional_attention` over the standard `attention` API is supporting +multi-dimensional layouts of tokens (i.e. multi-dimensional feature maps). + +This means `query`, `key` and `value` are not necessarily 4-D tensors; they can be 4-D, 5-D, or 6-D, +representing 1-D, 2-D, and 3-D token layouts (see [Tensor layouts](#tensor-layouts)). + +- **Optional** `scale`: attention (softmax/dot product) scale. Defaults to `head_dim ** -0.5`. +- **Optional** `return_lse`: returns logsumexp if `True` +- **Optional** `backend`: explicitly set backend instead of automatically selecting the best compatible + +## Tensor layouts + +In addition to requiring the [contiguous heads-last tensor layout](../README.md#tensor-layouts), +Multi-Dimensional Attention also requires the "sequence length" dimension to be unrolled / unfolded +back into its original representation: + +```python +# 1-D case: language, audio +batch, X, heads, head_dim = query_1d.shape +# _ +# ^ +# | +# |-----> token layout shape + +# 2-D case: images +batch, X, Y, heads, head_dim = query_2d.shape +# ____ +# ^ +# | +# |-----> token layout shape + +# 3-D case: videos / 3-D images +batch, X, Y, Z, heads, head_dim = query_3d.shape +# _______ +# ^ +# | +# |------> token layout shape +``` + +Multi-Dimensional Attention also requires the shapes of `query`, `key` and `value` to match along +those dimensions, henceforth called the **token layout shape**: + +```python +assert query_1d.shape[1:2] == key_1d.shape[1:2] == value_1d.shape[1:2] + +assert query_2d.shape[1:3] == key_2d.shape[1:3] == value_2d.shape[1:3] + +assert query_3d.shape[1:4] == key_3d.shape[1:4] == value_3d.shape[1:4] +``` + +This is because of the large number 
of sparsity / masking features (and their combinations) +supported, which is mainly possible by making the assumption that query and context coordinate +spaces are the same, eliminating the requirement for a mapping between the two. + +Problems with a different query and key/value token layout shape may be supported in the future. + +## Backends + +The only backend supporting multi-dimensional attention for now is `natten`. + +## Spatio-Temporal Attention + +Spatio-Temporal attention (causal masking across the time dimension, and no masking / bi-directional +across spatial dimensions) is a special case of Multi-Dimensional Attention. +You can either implement it by marking `is_causal` as expected in `multi_dimensional_attention`, or +directly use `spatio_temporal_attention`: + +```python +from cosmos3._src.imaginaire.attention import spatio_temporal_attention + +output = spatio_temporal_attention( + query=query, + key=key, + value=value, +) +``` diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/__init__.py new file mode 100644 index 00000000..4e9c6a2a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/__init__.py @@ -0,0 +1,85 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Flash Attention v2 (flash2) Backend +""" + +import torch + +from cosmos3._src.imaginaire.attention.utils.safe_ops import log +from cosmos3._src.imaginaire.attention.utils.version import version_in_range + +# We lock to safe releases of Flash 2 +# We will have a separate backend identifier for 2025 releases with CuTeDSL +# kernels. +FLASH_ATTENTION_V2_MIN_VERSION = "2.7.0" +FLASH_ATTENTION_V2_MAX_VERSION = "2.7.4.post1" + + +def flash2_supported() -> bool: + """ + Returns whether Flash Attention is supported in this environment. + Requirements are: + * Presence of CUDA Runtime (via PyTorch) + * Presence of Flash Attention, meeting minimum version requirements + + This check guards imports / dependencies on the Flash Attention package. + """ + if not torch.cuda.is_available(): + log.debug("Flash Attention v2 is not supported because PyTorch did not detect CUDA runtime.") + return False + + try: + import flash_attn + + except ImportError: + log.debug("Flash Attention v2 is not supported because the Python package was not found.") + return False + except Exception as e: + log.debug(f"Flash Attention v2 is not supported because importing the Python package failed: {e}") + return False + + flash2_version_str = None + if not hasattr(flash_attn, "__version__"): + from importlib.metadata import version + + flash2_version_str = version("flash_attn") + else: + flash2_version_str = flash_attn.__version__ + + if not version_in_range(flash2_version_str, FLASH_ATTENTION_V2_MIN_VERSION, FLASH_ATTENTION_V2_MAX_VERSION): + log.debug( + "Flash Attention v2 build is not supported; this backend only supports versions " + f"{FLASH_ATTENTION_V2_MIN_VERSION} through {FLASH_ATTENTION_V2_MAX_VERSION}, got " + f"{flash2_version_str}." 
+ ) + return False + + return True + + +FLASH2_SUPPORTED = flash2_supported() + +if FLASH2_SUPPORTED: + from cosmos3._src.imaginaire.attention.flash2.functions import flash2_attention + +else: + from cosmos3._src.imaginaire.attention.flash2.stubs import flash2_attention + +__all__ = ["flash2_attention", "FLASH2_SUPPORTED"] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/checks.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/checks.py new file mode 100644 index 00000000..87c44748 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/checks.py @@ -0,0 +1,130 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. 
+ +Flash Attention v2 (flash2) backend checks +""" + +from functools import partial + +import torch + +from cosmos3._src.imaginaire.attention.checks import attention_param_checks, attention_tensor_checks +from cosmos3._src.imaginaire.attention.flash2 import FLASH2_SUPPORTED +from cosmos3._src.imaginaire.attention.flash2.meta import get_bwd_dtypes, get_fwd_dtypes +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.imaginaire.attention.utils import get_arch_tag, log_or_raise_error + + +def flash2_attention_check( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + is_causal: bool, + causal_type: CausalType, + is_varlen: bool, + deterministic: bool = False, + raise_error: bool = False, +) -> bool: + """ + Input validation function for the flash2 backend. + + Parameters: + query_shape (torch.Size): Shape of 4-D query tensor (`[batch, seqlen, heads, head_dim]`). + + key_shape (torch.Size): Shape of 4-D key tensor (`[batch, seqlen_kv, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D value tensor (`[batch, seqlen_kv, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + is_causal (bool): whether or not causal masking is enabled. + + causal_type (CausalType): causal masking mode. Choices: `CausalType.TopLeft`, + `CausalType.BottomRight`. Required when `is_causal = True`. + + is_varlen (bool): whether or not a variable length (varlen) use case. Must be inferred + beforehand based on arguments such as seqlens_{Q,KV} or cumulative_seqlen_{Q,KV} being + passed. + + deterministic (bool): Deterministic backward pass required. + + raise_error (bool): whether to raise an error if any checks fail or no backend is selected, + instead of just returning False. Default is False. 
+ + Returns: + success (bool): whether use case is compatible with flash2 backend. + + """ + target_fn = partial(log_or_raise_error, raise_error=raise_error) + + if not FLASH2_SUPPORTED: + target_fn( + "Flash Attention v2 (flash2) is not supported in this environment. Run with debug logs to find out why, or choose another backend.", + exception=RuntimeError, + ) + return False + + if is_varlen: + target_fn( + "Flash Attention v2 (flash2) varlen is banned due to instability. Please choose another backend.", + exception=ValueError, + ) + return False + + arch_tag = get_arch_tag(device) + fwd_dtypes = get_fwd_dtypes(arch_tag) + bwd_dtypes = get_bwd_dtypes(arch_tag) + if not attention_tensor_checks( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + requires_grad=requires_grad, + supported_dtypes_forward=fwd_dtypes, + supported_dtypes_backward=bwd_dtypes, + supports_mla=False, + supports_gqa_mqa=True, + raise_error=raise_error, + backend_name="Flash Attention v2 (flash2)", + ): + target_fn("Flash Attention v2 (flash2) does not support the given inputs.", exception=RuntimeError) + return False + + # Verifies causal_type is a CausalType instance when is_causal + # Verifies DontCare is not used unless seqlen_q == seqlen_kv + attention_param_checks( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + is_causal=is_causal, + causal_type=causal_type, + ) + + if is_causal and causal_type not in [CausalType.BottomRight, CausalType.DontCare]: + target_fn("Flash Attention v2 only supports bottom-right causal masking.", exception=RuntimeError) + return False + + return True diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/functions.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/functions.py new file mode 100644 index 00000000..b1dd380d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/functions.py @@ -0,0 +1,198 @@ +# SPDX-FileCopyrightText: 
Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Flash Attention v2 (flash2) Backend: intermediate APIs +Only safe to import when FLASH2_SUPPORTED is True. +""" + +from flash_attn.flash_attn_interface import flash_attn_func, flash_attn_varlen_func +from torch import Tensor + +from cosmos3._src.imaginaire.attention.checks import assert_universal_tensor_checks +from cosmos3._src.imaginaire.attention.flash2.checks import flash2_attention_check +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.imaginaire.attention.utils.environment import is_torch_compiling + + +def flash2_attention( + query: Tensor, + key: Tensor, + value: Tensor, + is_causal: bool = False, + causal_type: CausalType | None = None, + scale: float | None = None, + cumulative_seqlen_Q: Tensor | None = None, + cumulative_seqlen_KV: Tensor | None = None, + max_seqlen_Q: int | None = None, + max_seqlen_KV: int | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + """ + Runs Flash Attention v2 on given operands (Q, K, V) with the heads-last contiguous layout + (`[batch, seqlen, heads, head_dim]`). 
+ + Parameters: + query (Tensor): 4-D query tensor, with the heads-last contiguous layout + (`[batch, seqlen, heads, head_dim]`) + + key (Tensor): 4-D key tensor, with the heads-last contiguous layout + (`[batch, seqlen_kv, heads_kv, head_dim]`) + + value (Tensor): 4-D value tensor, with heads-last contiguous layout + (`[batch, seqlen_kv, heads_kv, head_dim_v]`) + + is_causal (bool): whether or not causal masking is enabled. Default is False. + + causal_type (CausalType): causal masking mode. Choices: `CausalType.TopLeft`, + `CausalType.BottomRight`. Required when `is_causal = True`. + + scale (float | None): Dot product scale (attention scale). Defaults to head_dim ** -0.5. + + cumulative_seqlen_Q (Tensor | None): (varlen) Optional 1-D tensor with size `batch + 1` + indicating the cumulative sum of number of query tokens in each batch, with an + additional 0 element in the beginning. Must be passed together with + `cumulative_seqlen_KV` and `max_seqlen_{Q,KV}`. + + cumulative_seqlen_KV (Tensor | None): (varlen) Optional 1-D tensor with size `batch + 1` + indicating the cumulative sum of number of key/value tokens in each batch, with an + additional 0 element in the beginning. Must be passed together with + `cumulative_seqlen_Q` and `max_seqlen_{Q,KV}`. + + max_seqlen_Q (int | None): (varlen) Optional integer indicating the maximum query + sequence length in all batches. Must be passed together with `cumulative_seqlen_{Q,KV}` + and `max_seqlen_KV`. + + max_seqlen_KV (int | None): (varlen) Optional integer indicating the maximum key/value + sequence length in all batches. Must be passed together with `cumulative_seqlen_{Q,KV}` + and `max_seqlen_Q`. + + Other Parameters: + return_lse (bool): Whether to return the logsumexp values. Default is False. + + backend_kwargs (dict | None): Key-value pair for passing arguments specific to Flash's + attention operator, if any. + + deterministic (bool): Deterministic backward pass required. 
+ + Returns: + output (Tensor): 4-D output tensor, with the heads-last contiguous layout + (`[batch, seqlen, heads, head_dim_v]`). + + logsumexp (Tensor): logsumexp tensor, with the heads-last contiguous layout + (`[batch, seqlen, heads, 1]`). Only returned when return_lse is True. + NOTE: this tensor is not contiguous in this backend (Flash2) and it should not be made + contiguous unless we can guarantee its results aren't merged via `merge_attentions`. + """ + + is_varlen = cumulative_seqlen_Q is not None + assert_universal_tensor_checks(query, key, value) + + backend_kwargs = backend_kwargs.copy() if backend_kwargs is not None else {} + # Determinism in backend_kwargs supersedes primary flag, if set to True + if "deterministic" in backend_kwargs: + deterministic = deterministic or backend_kwargs["deterministic"] + del backend_kwargs["deterministic"] + + assert flash2_attention_check( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, + device=query.device, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + is_causal=is_causal, + causal_type=causal_type, + is_varlen=is_varlen, + deterministic=deterministic, + raise_error=True, + ) + + # This check introduces recompiles + if not is_torch_compiling(): + if is_varlen and max_seqlen_Q == max_seqlen_KV == 0: + raise NotImplementedError( + "You're trying to use varlen attention with the flash2 backend and " + "an empty batch, which is not yet supported by flash2." 
+ ) + + scale = scale if scale is not None else query.shape[-1] ** -0.5 + + if is_varlen: + assert query.shape[0] == key.shape[0] == value.shape[0] == 1 + q = query.squeeze(0) # [total_tokens,H,D] + k = key.squeeze(0) # [total_tokens,Hkv,D] + v = value.squeeze(0) # [total_tokens,Hkv,Dv] + assert q.dim() == k.dim() == v.dim() == 3 + out, lse_, _ = flash_attn_varlen_func( + q=q, + k=k, + v=v, + cu_seqlens_q=cumulative_seqlen_Q, + cu_seqlens_k=cumulative_seqlen_KV, + max_seqlen_q=max_seqlen_Q, + max_seqlen_k=max_seqlen_KV, + softmax_scale=scale, + causal=is_causal, + return_attn_probs=True, + deterministic=deterministic, + **backend_kwargs, + # window_size=(-1, -1), + # dropout_p=0.0, + # softcap=0.0, # 0.0 means deactivated + # alibi_slopes=None, + # block_table=None, + ) + assert out.dim() == 3 # [total_tokens,H,Dv] + assert lse_.dim() == 2 # [H,total_tokens] + + output = out.unsqueeze(0) # [1,total_tokens,H,Dv] + lse = lse_.unsqueeze(0) # [1,H,total_tokens] + + else: + output, lse, _ = flash_attn_func( # output: [B,N,H,Dv], lse: [B,H,N] + q=query, + k=key, + v=value, + softmax_scale=scale, + causal=is_causal, + return_attn_probs=True, + deterministic=deterministic, + **backend_kwargs, + # window_size=(-1, -1), + # dropout_p=0.0, + # softcap=0.0, # 0.0 means deactivated + # alibi_slopes=None, + ) + + assert isinstance(output, Tensor) + assert isinstance(lse, Tensor) + assert output.dim() == 4 # [B,N,H,Dv] or [1,total_tokens,H,Dv] + assert lse.dim() == 3 # [B,H,N] or [1,H,total_tokens] + + + # NOTE: do not make lse contiguous here. All output and lse tensors passed into `merge_attentions` must have the same data + # pointer as their corresponding attention autograd ops! 
+ lse = lse.permute(0, 2, 1) # [B,N,H] or [1,total_tokens,H] + + if return_lse: + return output, lse + + return output diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/meta.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/meta.py new file mode 100644 index 00000000..b0a813a5 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/meta.py @@ -0,0 +1,64 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Flash Attention v2 (flash2) Backend: metadata +Always safe to import (as long as torch is available.) +""" + +import torch + +from cosmos3._src.imaginaire.attention.utils.safe_ops import log + + +def get_fwd_dtypes(arch_tag: int) -> list[torch.dtype]: + """ + Returns data type choices for forward pass according to arch tag (attention.utils.get_arch_tag). + + Parameters: + arch_tag (int): Arch tag for the current CUDA device. Example: 80 for A100, 90 for H100. + + Returns: + data_type_choices (list): a list of PyTorch data types. Empty if device is not supported. 
+ + """ + + if arch_tag < 80: + log.debug("Flash Attention v2 (flash2) is not supported because compute capability is below the minimum (8.0).") + return [] + + return [torch.float16, torch.bfloat16] + + +def get_bwd_dtypes(arch_tag: int) -> list[torch.dtype]: + """ + Returns data type choices for backward pass according to arch tag (attention.utils.get_arch_tag). + + Parameters: + arch_tag (int): Arch tag for the current CUDA device. Example: 80 for A100, 90 for H100. + + Returns: + data_type_choices (list): a list of PyTorch data types. Empty if device is not supported. + + """ + + if arch_tag < 80: + log.debug("Flash Attention v2 (flash2) is not supported because compute capability is below the minimum (8.0).") + return [] + + return [torch.float16, torch.bfloat16] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/stubs.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/stubs.py new file mode 100644 index 00000000..05093ad8 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash2/stubs.py @@ -0,0 +1,47 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Flash Attention v2 (flash2) Backend: intermediate API stubs +Always safe to import (as long as torch is available.) 
+""" + +from torch import Tensor + +from cosmos3._src.imaginaire.attention.masks import CausalType + + +def flash2_attention( + query: Tensor, + key: Tensor, + value: Tensor, + is_causal: bool = False, + causal_type: CausalType | None = None, + scale: float | None = None, + cumulative_seqlen_Q: Tensor | None = None, + cumulative_seqlen_KV: Tensor | None = None, + max_seqlen_Q: int | None = None, + max_seqlen_KV: int | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + raise RuntimeError( + "Tried to run Flash Attention v2, but it is not supported / available. " + "Try running with debug logs enabled to see why." + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/__init__.py new file mode 100644 index 00000000..845e8838 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/__init__.py @@ -0,0 +1,74 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. 
+ +Flash Attention v3 (flash3) Backend +""" + +import torch + +from cosmos3._src.imaginaire.attention.utils.safe_ops import log +from cosmos3._src.imaginaire.attention.utils.version import version_at_least + +FLASH_ATTENTION_V3_MIN_VERSION = "1.0.3" + + +def flash3_supported() -> bool: + """ + Returns whether Flash Attention v3 is supported in this environment. + Requirements are: + * Presence of CUDA Runtime (via PyTorch) + * Presence of Flash Attention v3, meeting minimum version requirements + + This check guards imports / dependencies on the Flash Attention v3 package. + """ + if not torch.cuda.is_available(): + log.debug("Flash Attention v3 is not supported because PyTorch did not detect CUDA runtime.") + return False + + try: + # pyrefly: ignore # missing-import + import flash_attn_3_nv + + except ImportError: + log.debug("Flash Attention v3 is not supported because the Python package ('flash_attn_3_nv') was not found.") + return False + except Exception as e: + log.debug(f"Flash Attention v3 is not supported because importing the Python package failed: {e}") + return False + + if not version_at_least(flash_attn_3_nv.__version__, FLASH_ATTENTION_V3_MIN_VERSION): + log.debug( + f"Flash Attention v3 ('flash_attn_3_nv') build is not supported; minimum required version is " + f"{FLASH_ATTENTION_V3_MIN_VERSION}, got {flash_attn_3_nv.__version__}." 
+ ) + return False + + return True + + +FLASH3_SUPPORTED = flash3_supported() + + +if FLASH3_SUPPORTED: + from cosmos3._src.imaginaire.attention.flash3.functions import flash3_attention + +else: + from cosmos3._src.imaginaire.attention.flash3.stubs import flash3_attention + +__all__ = ["flash3_attention", "FLASH3_SUPPORTED"] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/checks.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/checks.py new file mode 100644 index 00000000..94b6d9fc --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/checks.py @@ -0,0 +1,138 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. 
+ +Flash Attention v3 (flash3) backend checks +""" + +from functools import partial + +import torch + +from cosmos3._src.imaginaire.attention.checks import attention_param_checks, attention_tensor_checks +from cosmos3._src.imaginaire.attention.flash3 import FLASH3_SUPPORTED +from cosmos3._src.imaginaire.attention.flash3.meta import get_bwd_dtypes, get_fwd_dtypes +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.imaginaire.attention.utils import get_arch_tag, log_or_raise_error + + +def flash3_attention_check( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + is_causal: bool, + causal_type: CausalType, + is_varlen: bool, + deterministic: bool = False, + raise_error: bool = False, +) -> bool: + """ + Input validation function for the flash3 backend. + + Parameters: + query_shape (torch.Size): Shape of 4-D query tensor (`[batch, seqlen, heads, head_dim]`). + + key_shape (torch.Size): Shape of 4-D key tensor (`[batch, seqlen_kv, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D value tensor (`[batch, seqlen_kv, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + is_causal (bool): whether or not causal masking is enabled. + + causal_type (CausalType): causal masking mode. Choices: `CausalType.TopLeft`, + `CausalType.BottomRight`. Required when `is_causal = True`. + + is_varlen (bool): whether or not a variable length (varlen) use case. Must be inferred + beforehand based on arguments such as seqlens_{Q,KV} or cumulative_seqlen_{Q,KV} being + passed. + + deterministic (bool): Deterministic backward pass required. + + raise_error (bool): whether to raise an error if any checks fail or no backend is selected, + instead of just returning False. Default is False. 
+ + Returns: + success (bool): whether use case is compatible with flash3 backend. + + """ + target_fn = partial(log_or_raise_error, raise_error=raise_error) + + if not FLASH3_SUPPORTED: + target_fn( + "Flash Attention v3 (flash3) is not supported in this environment. Run with debug logs to find out why, or choose another backend.", + exception=RuntimeError, + ) + return False + + arch_tag = get_arch_tag(device) + fwd_dtypes = get_fwd_dtypes(arch_tag) + bwd_dtypes = get_bwd_dtypes(arch_tag) + if not attention_tensor_checks( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + requires_grad=requires_grad, + supported_dtypes_forward=fwd_dtypes, + supported_dtypes_backward=bwd_dtypes, + # flash3 supports MLA, unlike flash2, but with some constraints + # disabled for now due to API bug + supports_mla=False, + supports_gqa_mqa=True, + raise_error=raise_error, + backend_name="Flash Attention v3 (flash3)", + ): + target_fn("Flash Attention v3 (flash3) does not support the given inputs.", exception=RuntimeError) + return False + + # MLA constraints + if query_shape[-1] != value_shape[-1]: + head_dim_q = query_shape[-1] + head_dim_v = value_shape[-1] + if not ((head_dim_q <= 64 and head_dim_v <= 512) or (128 <= head_dim_q <= 192 and 96 <= head_dim_v <= 128)): + target_fn( + "Flash Attention v3 (flash3) does not support this head dim combination. 
" + "Expected either head_dim_qk <= 64 and head_dim_v <= 512, or 128 <= head_dim_qk <= 192 " + f"and 96 <= head_dim_v <= 128, got {head_dim_q=}, {head_dim_v=}.", + exception=ValueError, + ) + return False + + # Verifies causal_type is a CausalType instance when is_causal + # Verifies DontCare is not used unless seqlen_q == seqlen_kv + attention_param_checks( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + is_causal=is_causal, + causal_type=causal_type, + ) + + if is_causal and causal_type not in [CausalType.BottomRight, CausalType.DontCare]: + target_fn("Flash Attention v3 only supports bottom-right causal masking.", exception=ValueError) + return False + + return True diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/functions.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/functions.py new file mode 100644 index 00000000..1ac796f8 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/functions.py @@ -0,0 +1,213 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Flash Attention v3 (flash3) Backend: intermediate APIs +Only safe to import when FLASH3_SUPPORTED is True. 
+""" + +import inspect + +# pyrefly: ignore # missing-import +from flash_attn_3_nv.flash_attn_interface import flash_attn_func, flash_attn_varlen_func +from torch import Tensor + + +# The flash_attn_3_nv version carries no reflection of the commit hash, so we have to manually inspect the signatures +HAS_RETURN_ATTN_PROBS = "return_attn_probs" in inspect.signature(flash_attn_func).parameters + +from cosmos3._src.imaginaire.attention.checks import assert_universal_tensor_checks +from cosmos3._src.imaginaire.attention.flash3.checks import flash3_attention_check +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.imaginaire.attention.utils.environment import is_torch_compiling + + +def flash3_attention( + query: Tensor, + key: Tensor, + value: Tensor, + is_causal: bool = False, + causal_type: CausalType | None = None, + scale: float | None = None, + cumulative_seqlen_Q: Tensor | None = None, + cumulative_seqlen_KV: Tensor | None = None, + max_seqlen_Q: int | None = None, + max_seqlen_KV: int | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + """ + Runs Flash Attention v3 on given operands (Q, K, V) with the heads-last contiguous layout + (`[batch, seqlen, heads, head_dim]`). + + Parameters: + query (Tensor): 4-D query tensor, with the heads-last contiguous layout + (`[batch, seqlen, heads, head_dim]`) + + key (Tensor): 4-D key tensor, with the heads-last contiguous layout + (`[batch, seqlen_kv, heads_kv, head_dim]`) + + value (Tensor): 4-D value tensor, with heads-last contiguous layout + (`[batch, seqlen_kv, heads_kv, head_dim_v]`) + + is_causal (bool): whether or not causal masking is enabled. Default is False. + + causal_type (CausalType): causal masking mode. Choices: `CausalType.TopLeft`, + `CausalType.BottomRight`. Required when `is_causal = True`. + + scale (float | None): Dot product scale (attention scale). Defaults to head_dim ** -0.5. 
+ + cumulative_seqlen_Q (Tensor | None): (varlen) Optional 1-D tensor with size `batch + 1` + indicating the cumulative sum of number of query tokens in each batch, with an + additional 0 element in the beginning. Must be passed together with + `cumulative_seqlen_KV` and `max_seqlen_{Q,KV}`. + + cumulative_seqlen_KV (Tensor | None): (varlen) Optional 1-D tensor with size `batch + 1` + indicating the cumulative sum of number of key/value tokens in each batch, with an + additional 0 element in the beginning. Must be passed together with + `cumulative_seqlen_Q` and `max_seqlen_{Q,KV}`. + + max_seqlen_Q (int | None): (varlen) Optional integer indicating the maximum query + sequence length in all batches. Must be passed together with `cumulative_seqlen_{Q,KV}` + and `max_seqlen_KV`. + + max_seqlen_KV (int | None): (varlen) Optional integer indicating the maximum key/value + sequence length in all batches. Must be passed together with `cumulative_seqlen_{Q,KV}` + and `max_seqlen_Q`. + + Other Parameters: + return_lse (bool): Whether to return the logsumexp values. Default is False. + + backend_kwargs (dict | None): Key-value pair for passing arguments specific to Flash's + attention operator, if any. + + deterministic (bool): Deterministic backward pass required. + + Returns: + output (Tensor): 4-D output tensor, with the heads-last contiguous layout + (`[batch, seqlen, heads, head_dim_v]`). + + logsumexp (Tensor): logsumexp tensor, with the heads-last contiguous layout + (`[batch, seqlen, heads, 1]`). Only returned when return_lse is True. + NOTE: this tensor is not contiguous in this backend (Flash3) and it should not be made + contiguous unless we can guarantee its results aren't merged via `merge_attentions`. 
+ """ + + is_varlen = cumulative_seqlen_Q is not None + assert_universal_tensor_checks(query, key, value) + + backend_kwargs = backend_kwargs.copy() if backend_kwargs is not None else {} + # Determinism in backend_kwargs supersedes primary flag, if set to True + if "deterministic" in backend_kwargs: + deterministic = deterministic or backend_kwargs["deterministic"] + del backend_kwargs["deterministic"] + + assert flash3_attention_check( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, + device=query.device, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + is_causal=is_causal, + causal_type=causal_type, + is_varlen=is_varlen, + deterministic=deterministic, + raise_error=True, + ) + + # This check introduces recompiles + if not is_torch_compiling(): + if is_varlen and max_seqlen_Q == max_seqlen_KV == 0: + raise NotImplementedError( + "You're trying to use varlen attention with the flash3 backend and " + "an empty batch, which is not yet supported by flash3." 
+ ) + + scale = scale if scale is not None else query.shape[-1] ** -0.5 + + if HAS_RETURN_ATTN_PROBS: + backend_kwargs["return_attn_probs"] = True + + if is_varlen: + assert query.shape[0] == key.shape[0] == value.shape[0] == 1 + q = query.squeeze(0) # [total_tokens,H,D] + k = key.squeeze(0) # [total_tokens,Hkv,D] + v = value.squeeze(0) # [total_tokens,Hkv,Dv] + assert q.dim() == k.dim() == v.dim() == 3 + out, lse_ = flash_attn_varlen_func( + q=q, + k=k, + v=v, + cu_seqlens_q=cumulative_seqlen_Q, + cu_seqlens_k=cumulative_seqlen_KV, + max_seqlen_q=max_seqlen_Q, + max_seqlen_k=max_seqlen_KV, + softmax_scale=scale, + causal=is_causal, + deterministic=deterministic, + **backend_kwargs, + # qv=None, + # q_descale=None, k_descale=None, v_descale=None, + # attention_chunk=0, + # num_splits=1, + # pack_gqa=None, + # sm_margin=0, + # window_size=(-1, -1), + # softcap=0.0, # 0.0 means deactivated + ) + assert out.dim() == 3 # [total_tokens,H,Dv] + assert lse_.dim() == 2 # [H,total_tokens] + + output = out.unsqueeze(0) # [1,total_tokens,H,Dv] + lse = lse_.unsqueeze(0) # [1,H,total_tokens] + + else: + output, lse = flash_attn_func( # output: [B,N,H,Dv], lse: [B,H,N] + q=query, + k=key, + v=value, + softmax_scale=scale, + causal=is_causal, + deterministic=deterministic, + **backend_kwargs, + # qv=None, + # q_descale=None, k_descale=None, v_descale=None, + # attention_chunk=0, + # num_splits=1, + # pack_gqa=None, + # sm_margin=0, + # window_size=(-1, -1), + # softcap=0.0, # 0.0 means deactivated + ) + + assert isinstance(output, Tensor) + assert isinstance(lse, Tensor) + assert output.dim() == 4 # [B,N,H,Dv] or [1,total_tokens,H,Dv] + assert lse.dim() == 3 # [B,H,N] or [1,H,total_tokens] + + + # NOTE: do not make lse contiguous here. All output and lse tensors passed into `merge_attentions` must have the same data + # pointer as their corresponding attention autograd ops! 
+ lse = lse.permute(0, 2, 1) # [B,N,H] or [1,total_tokens,H] + + if return_lse: + return output, lse + + return output diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/meta.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/meta.py new file mode 100644 index 00000000..3229d037 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/meta.py @@ -0,0 +1,64 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Flash Attention v3 (flash3) Backend: metadata +Always safe to import (as long as torch is available.) +""" + +import torch + +from cosmos3._src.imaginaire.attention.utils.safe_ops import log + + +def get_fwd_dtypes(arch_tag: int) -> list[torch.dtype]: + """ + Returns data type choices for forward pass according to arch tag (attention.utils.get_arch_tag). + + Parameters: + arch_tag (int): Arch tag for the current CUDA device. Example: 80 for A100, 90 for H100. + + Returns: + data_type_choices (list): a list of PyTorch data types. Empty if device is not supported. 
+ + """ + + if arch_tag != 90: + log.debug("Flash Attention v3 (flash3) only supports compute capability 9.0 (Hopper).") + return [] + + return [torch.float16, torch.bfloat16] + + +def get_bwd_dtypes(arch_tag: int) -> list[torch.dtype]: + """ + Returns data type choices for backward pass according to arch tag (attention.utils.get_arch_tag). + + Parameters: + arch_tag (int): Arch tag for the current CUDA device. Example: 80 for A100, 90 for H100. + + Returns: + data_type_choices (list): a list of PyTorch data types. Empty if device is not supported. + + """ + + if arch_tag != 90: + log.debug("Flash Attention v3 (flash3) only supports compute capability 9.0 (Hopper).") + return [] + + return [torch.float16, torch.bfloat16] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/stubs.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/stubs.py new file mode 100644 index 00000000..9fcd9774 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/flash3/stubs.py @@ -0,0 +1,47 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Flash Attention v3 (flash3) Backend: intermediate API stubs +Always safe to import (as long as torch is available.) 
+""" + +from torch import Tensor + +from cosmos3._src.imaginaire.attention.masks import CausalType + + +def flash3_attention( + query: Tensor, + key: Tensor, + value: Tensor, + is_causal: bool = False, + causal_type: CausalType | None = None, + scale: float | None = None, + cumulative_seqlen_Q: Tensor | None = None, + cumulative_seqlen_KV: Tensor | None = None, + max_seqlen_Q: int | None = None, + max_seqlen_KV: int | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + raise RuntimeError( + "Tried to run Flash Attention v3, but it is not supported / available. " + "Try running with debug logs enabled to see why." + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/frontend.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/frontend.py new file mode 100644 index 00000000..9a001b87 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/frontend.py @@ -0,0 +1,769 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. 
+ +Frontend APIs +""" + +import torch +from torch import Tensor + +from cosmos3._src.imaginaire.attention.backends import choose_backend, choose_multi_dim_backend +from cosmos3._src.imaginaire.attention.checks import ( + attention_param_checks, + attention_tensor_checks, + multi_dim_attention_param_checks, + multi_dim_attention_param_filter, + multi_dim_attention_tensor_checks, + universal_tensor_checks, + varlen_tensor_checks, +) +from cosmos3._src.imaginaire.attention.flash2 import flash2_attention +from cosmos3._src.imaginaire.attention.flash3 import flash3_attention +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.imaginaire.attention.natten import natten_attention, natten_multi_dim_attention +from cosmos3._src.imaginaire.attention.utils.environment import filter_attention_merge_backends +from cosmos3._src.imaginaire.attention.utils.safe_ops import log + + +# Map backend names to their frontend attention API +BACKEND_MAP = { + "natten": natten_attention, + "flash2": flash2_attention, + "flash3": flash3_attention, +} + +MULTI_DIM_BACKEND_MAP = { + "natten": natten_multi_dim_attention, +} + + +def attention( + query: Tensor, # [B,S_Q,H,D] + key: Tensor, # [B,S_KV,H_KV,D] + value: Tensor, # [B,S_KV,H_KV,D_V] + is_causal: bool = False, + causal_type: CausalType | None = None, + scale: float | None = None, + # varlen parameters + seqlens_Q: Tensor | None = None, # [B] + seqlens_KV: Tensor | None = None, # [B] + cumulative_seqlen_Q: Tensor | None = None, # [B+1] + cumulative_seqlen_KV: Tensor | None = None, # [B+1] + max_seqlen_Q: int | None = None, + max_seqlen_KV: int | None = None, + # backend & misc parameters + backend: str | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: # [B,S_Q,H,D_V] or ([B,S_Q,H,D_V], [B,S_Q,H,1]) + """ + Runs Attention on given operands (Q, K, V) with the heads-last contiguous layout + (`[batch, seqlen, 
heads, head_dim]`). + + Varlen Attention is only supported for the sequence-packed layout: QKV tensors have batch size + 1, and tokens from different batches are concatenated without any padding along the sequence + dimension. Sequence lengths for different batches can be provided in two ways: + 1. `seqlens_Q` and `seqlens_KV` (less efficient): only provide the sequence lengths as + integer tensors (must be on the same device as QKV), and cumulative and maximum sequence + lengths are recomputed on each call. + 2. `cumulative_seqlen_{Q,KV}` and `max_seqlen_{Q,KV}` (more efficient): + compute cumulative and maximum sequence lengths. `cumulative_seqlen_{Q,KV}` are integer + tensors on the same device as QKV containing the cumulative sum of `seqlens_{Q,KV}`, + with an additional `0` element in the beginning, therefore sized `batch+1`. + `max_seqlen_{Q,KV}` are integers (not Tensors) that represent the maximum sequence + lengths for Q and KV among all sequence batches. + You can use `generate_varlen_parameters` to generate these + parameters: + ```python3 + from cosmos3._src.imaginaire.attention.varlen import generate_varlen_parameters + ( + cumulative_seqlen_Q, + cumulative_seqlen_KV, + max_seqlen_Q, + max_seqlen_KV, + ) = generate_varlen_parameters(q, k, v, seqlens_Q, seqlens_KV) + ``` + + Parameters: + query (Tensor): 4-D query tensor, with the heads-last contiguous layout + (`[batch, seqlen_q, heads, head_dim]`) + + key (Tensor): 4-D key tensor, with the heads-last contiguous layout + (`[batch, seqlen_kv, heads_kv, head_dim]`) + + value (Tensor): 4-D value tensor, with heads-last contiguous layout + (`[batch, seqlen_kv, heads_kv, head_dim_v]`) + + is_causal (bool): whether or not causal masking is enabled. Default is False. + + causal_type (CausalType): causal masking mode. Choices: `CausalType.TopLeft`, + `CausalType.BottomRight`, `CausalType.DontCare` (only valid when seqlen_q == seqlen_kv). + Required when `is_causal = True`. 
+ + scale (float | None): Dot product scale (attention scale). Defaults to head_dim ** -0.5. + + seqlens_Q (Tensor | None): (varlen) Optional 1-D tensor with size `batch` + indicating the number of query tokens in each batch. Must be passed together with + `seqlens_KV`. + + seqlens_KV (Tensor | None): (varlen) Optional 1-D tensor with size `batch` + indicating the number of key/value tokens in each batch. Must be passed together with + `seqlens_Q`. + + cumulative_seqlen_Q (Tensor | None): (varlen) Optional 1-D tensor with size `batch + 1` + indicating the cumulative sum of number of query tokens in each batch, with an + additional 0 element in the beginning. Must be passed together with + `cumulative_seqlen_KV` and `max_seqlen_{Q,KV}`. + + cumulative_seqlen_KV (Tensor | None): (varlen) Optional 1-D tensor with size `batch + 1` + indicating the cumulative sum of number of key/value tokens in each batch, with an + additional 0 element in the beginning. Must be passed together with + `cumulative_seqlen_Q` and `max_seqlen_{Q,KV}`. + + max_seqlen_Q (int | None): (varlen) Optional integer indicating the maximum query + sequence length in all batches. Must be passed together with `cumulative_seqlen_{Q,KV}` + and `max_seqlen_KV`. + + max_seqlen_KV (int | None): (varlen) Optional integer indicating the maximum key/value + sequence length in all batches. Must be passed together with `cumulative_seqlen_{Q,KV}` + and `max_seqlen_Q`. + + Other Parameters: + backend (str | None): Backend to run with. If unspecified (default), it will try to + select the best available. + + return_lse (bool): Whether to return the logsumexp values. Default is False. + + backend_kwargs (dict | None): Key-value pair for passing arguments specific to the backend's + attention operator, if any. Only valid when a specific backend is selected (backend is + not None). + + deterministic (bool): Whether to enforce deterministic backward pass. Default is False. 
+ When True, backends are selected based on deterministic support. + + Returns: + output (Tensor): 4-D output tensor, with the heads-last contiguous layout + (`[batch, seqlen_q, heads, head_dim_v]`). + + logsumexp (Tensor): logsumexp tensor, with the heads-last contiguous layout + (`[batch, seqlen_q, heads, 1]`). Only returned when return_lse is True. + NOTE: this tensor is not guaranteed to be contiguous with some backends and it should + not be made contiguous unless we can guarantee its results aren't merged via + `merge_attentions`. + """ + + assert universal_tensor_checks(query=query, key=key, value=value, raise_error=True) + + assert attention_tensor_checks( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + raise_error=True, + ) + + attention_param_checks( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + is_causal=is_causal, + causal_type=causal_type, + ) + + ( + cumulative_seqlen_Q, + cumulative_seqlen_KV, + max_seqlen_Q, + max_seqlen_KV, + ) = varlen_tensor_checks( + query=query, + key=key, + value=value, + seqlens_Q=seqlens_Q, + seqlens_KV=seqlens_KV, + cumulative_seqlen_Q=cumulative_seqlen_Q, + cumulative_seqlen_KV=cumulative_seqlen_KV, + max_seqlen_Q=max_seqlen_Q, + max_seqlen_KV=max_seqlen_KV, + ) + is_varlen = cumulative_seqlen_Q is not None + + scale = scale if scale is not None else query.shape[-1] ** -0.5 + + if backend is None and backend_kwargs is not None: + backend_kwargs = None + log.debug("A backend was not specified, but got backend_kwargs. Ignoring... ") + + if backend is not None and backend not in BACKEND_MAP: + raise ValueError(f"Selected {backend=}, but available choices are {BACKEND_MAP.keys()}. 
") + + compatible_backend = choose_backend( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, + device=query.device, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + is_causal=is_causal, + causal_type=causal_type, + is_varlen=is_varlen, + deterministic=deterministic, + backend=backend, + raise_error=False, + ) + + # Either incompatible backend specified by user, or no compatible backends found + if compatible_backend is None and backend is None: + raise ValueError( + "Could not find a compatible Attention backend for this use case / device. " + "Try running with debug logs to find out why." + ) + elif compatible_backend is None: + raise ValueError( + f"Selected Attention backend {backend} is incompatible with this use case / device. " + "Try running with debug logs to find out why." + ) + + assert compatible_backend in BACKEND_MAP + return BACKEND_MAP[compatible_backend]( + query=query, + key=key, + value=value, + is_causal=is_causal, + causal_type=causal_type, + scale=scale, + cumulative_seqlen_Q=cumulative_seqlen_Q, + cumulative_seqlen_KV=cumulative_seqlen_KV, + max_seqlen_Q=max_seqlen_Q, + max_seqlen_KV=max_seqlen_KV, + return_lse=return_lse, + backend_kwargs=backend_kwargs, + deterministic=deterministic, + ) + + +def multi_dimensional_attention( + query: Tensor, # [B,*token_layout_shape,H,D] + key: Tensor, # [B,*token_layout_shape,H_KV,D] + value: Tensor, # [B,*token_layout_shape,H_KV,D_V] + window_size: tuple | int = -1, + stride: tuple | int = 1, + dilation: tuple | int = 1, + is_causal: tuple | bool = False, + scale: float | None = None, + # backend & misc parameters + backend: str | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> ( + Tensor | tuple[Tensor, Tensor] +): # [B,*token_layout_shape,H,D_V] or ([B,*token_layout_shape,H,D_V], [B,*token_layout_shape,H]) + """ + Runs Multi-Dimensional Attention on given 
operands (Q, K, V) with the heads-last contiguous + layout (`[batch, *, heads, head_dim]`). Supports up to and including 3 dimensions: + * 1-D: `[batch, X, heads, head_dim]`, with masking arguments expecting tuples of size 1. + * 2-D: `[batch, X, Y, heads, head_dim]`, with masking arguments expecting tuples of size 2. + * 3-D: `[batch, X, Y, Z, heads, head_dim]`, with masking arguments expecting tuples of size 3. + + The dimensions here refer to the layout of tokens; that is the arrangement of tokens for each + batch/head, or the `[X]`, `[X, Y]`, `[X, Y, Z]` part of the input shape. + We refer to these as the "token layout shape". + + For now, it is always expected that Q, K, and V match in the sizes of those dimensions. + + Masking arguments, all of which can be set uniformly across all dimensions or per dimension, are: + * `window_size`: determines the sliding window size. -1 is interpreted as the maximum window + size. Must be either -1 or at least 2 and at most the token layout shape. + For example, if inputs are `[batch, X, Y, Z, heads_{q,kv}, head_dim_{qk,v}]`, + `window_size` must be either an integer == -1 or an integer <= `min(X, Y, Z)`, + or a tuple of size 3 corresponding to the three dimensions / axes, where: + * `window_size[0] == -1 or 2 <= window_size[0] <= X` + * `window_size[1] == -1 or 2 <= window_size[1] <= Y` + * `window_size[2] == -1 or 2 <= window_size[2] <= Z` + When `window_size` is set to the maximum for any dimension, we're effectively performing + self attention (no sparsity) along that dimension. + Default is -1 (self attention). + + * `stride`: determines the step size of the sliding window. Only matters when the + corresponding `window_size` is not -1 / maximum (self attention). + Default is 1, indicating the smallest sliding window delay. + Larger values trade off translational equivariance for potentially improved efficiency. + Maximum value for `stride` along each dimension is the corresponding `window_size`. 
+ If `stride == window_size` along any dimension, it is equivalent to blocked / windowed + attention (from works such as Swin Transformer, SAM, ViTDet, etc) along that dimension, + meaning no overlap between windows. + For more details, please refer to the GNA paper: + https://arxiv.org/abs/2504.16922 + + * `dilation`: introduces gaps between tokens in a sliding window, similarly to dilated + convolution. + Default is 1, indicating no gaps. + Maximum value is the largest positive integer that satisfies + `window_size * dilation <= token_layout_shape` along that dimension. + Higher dilation means more sparse and global context. Lower dilation means more + locality. + For more details, please refer to the DiNAT paper: + https://arxiv.org/abs/2209.15001 + + * `is_causal`: per-dimension causal mask. + + Parameters: + query (Tensor): 4-D, 5-D, or 6-D query tensor, with the heads-last contiguous layout + (`[batch, *token_layout_shape, heads, head_dim]`) + + key (Tensor): 4-D, 5-D, or 6-D key tensor, with the heads-last contiguous layout + (`[batch, *token_layout_shape, heads_kv, head_dim]`) + + value (Tensor): 4-D, 5-D, or 6-D value tensor, with heads-last contiguous layout + (`[batch, *token_layout_shape, heads_kv, head_dim_v]`) + + window_size (tuple | int): Attention window (kernel) size / shape. If an + integer, it will be repeated for all dimensions. For example `window_size=3`, when + `len(token_layout_shape) == 3`, is interpreted as `window_size=(3, 3, 3)`. + `-1`s are replaced with the corresponding `token_layout_shape`. + Final window size must satisfy `2 <= window_size <= token_layout_shape`. + Default is -1 (no sparsity). + + stride (tuple | int): Sliding window step size/shape. If an integer, it will be repeated + for all dimensions. For example `stride=2`, when `len(token_layout_shape) == 3`, is + interpreted as `stride=(2, 2, 2)`. + Final stride must satisfy `1 <= stride <= window_size`. + Default is 1. + + dilation (tuple | int): Dilation step size/shape. 
If an integer, it will be repeated for + all dimensions. For example `dilation=4`, when `len(token_layout_shape) == 3`, is + interpreted as `dilation=(4, 4, 4)`. + Final dilation must satisfy `2 <= dilation * window_size <= token_layout_shape`. + Default is 1. + + is_causal (tuple | bool): Toggle causal masking. If a boolean, it will be repeated for all + dimensions. For example `is_causal=True`, when `len(token_layout_shape) == 3`, is + interpreted as `is_causal=(True, True, True)`. + Default is False. + + scale (float | None): Dot product scale (attention scale). Defaults to head_dim ** -0.5. + + Other Parameters: + backend (str | None): Backend to run with. If unspecified (default), it will try to + select the best available. + + return_lse (bool): Whether to return the logsumexp values. Default is False. + + backend_kwargs (dict | None): Key-value pair for passing arguments specific to the backend's + multi-dim / sparse attention operator, if any. Only valid when a specific backend is + selected (backend is not None). + + deterministic (bool): Whether to enforce deterministic backward pass. Default is False. + When True, backends are selected based on deterministic support. + + Returns: + output (Tensor): 4-D, 5-D, or 6-D output tensor, with the heads-last contiguous layout + (`[batch, *token_layout_shape, heads, head_dim_v]`). + + logsumexp (Tensor): logsumexp tensor, with the heads-last contiguous layout + (`[batch, *token_layout_shape, heads]`). Only returned when return_lse is True. 
+ """ + + assert universal_tensor_checks(query=query, key=key, value=value, raise_error=True) + + assert multi_dim_attention_tensor_checks( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + raise_error=True, + ) + + token_layout_shape, window_size, stride, dilation, is_causal = multi_dim_attention_param_filter( + query, + window_size=window_size, + stride=stride, + dilation=dilation, + is_causal=is_causal, + ) + num_dims = len(token_layout_shape) + + # Automatic transformation for 1s in token layout + # I.e. Attention over a (1, 16, 32) token layout is identical to over a (16, 32) + + token_layout_ones = [i for i in range(num_dims) if token_layout_shape[i] == 1] + if len(token_layout_ones) > 0: + token_layout_t = tuple(s for i, s in enumerate(token_layout_shape) if i not in token_layout_ones) + window_size_t = tuple(w for i, w in enumerate(window_size) if i not in token_layout_ones) + stride_t = tuple(s for i, s in enumerate(stride) if i not in token_layout_ones) + dilation_t = tuple(d for i, d in enumerate(dilation) if i not in token_layout_ones) + is_causal_t = tuple(c for i, c in enumerate(is_causal) if i not in token_layout_ones) + + assert all(x >= 2 for x in token_layout_t) + assert all(w >= 2 for w in window_size_t) + + query_t = query.reshape( + query.shape[0], *token_layout_t, query.shape[-2], query.shape[-1] + ) # [B,*token_layout_t,H,D] + key_t = key.reshape(key.shape[0], *token_layout_t, key.shape[-2], key.shape[-1]) # [B,*token_layout_t,H_KV,D] + value_t = value.reshape( + value.shape[0], *token_layout_t, value.shape[-2], value.shape[-1] + ) # [B,*token_layout_t,H_KV,D_V] + output_shape = [x for x in query.shape[:-1]] + [value.shape[-1]] + + if not torch.compiler.is_compiling(): + log.debug( + "This Multi-Dimensional Attention problem has 1s in the token layout, which can be simplified from " + f"<{token_layout_shape=}, 
{window_size=}, {stride=}, {dilation=}, {is_causal=}> into " + f"<{token_layout_t=}, {window_size_t=}, {stride_t=}, {dilation_t=}, {is_causal_t=}>." + ) + + output_t, lse_t = multi_dimensional_attention( + query=query_t, + key=key_t, + value=value_t, + window_size=window_size_t, + stride=stride_t, + dilation=dilation_t, + is_causal=is_causal_t, + scale=scale, + backend=backend, + return_lse=True, + backend_kwargs=backend_kwargs, + deterministic=deterministic, + ) + output = output_t.reshape(*output_shape) # [B,*token_layout_shape,H,D_V] + lse = lse_t.reshape(*output_shape[:-1]) # [B,*token_layout_shape,H] + if return_lse: + return output, lse + return output + + multi_dim_attention_param_checks( + query, + window_size=window_size, + stride=stride, + dilation=dilation, + is_causal=is_causal, + ) + + # Fast path for self attention problems + if all(x == w for x, w in zip(token_layout_shape, window_size)) and ( + not any(c for c in is_causal) or num_dims == 1 + ): + if not torch.compiler.is_compiling(): + log.debug( + "This Multi-Dimensional Attention problem is implementable with standard Attention: " + f"{token_layout_shape=}, {window_size=}, {is_causal=}." 
+ ) + if backend is not None: + log.debug(f"Ignoring {backend=} and backend args...") + + query_1d = query.flatten(1, num_dims) # [B,S,H,D] + key_1d = key.flatten(1, num_dims) # [B,S,H_KV,D] + value_1d = value.flatten(1, num_dims) # [B,S,H_KV,D_V] + is_causal_1d = is_causal[0] + output_shape = [x for x in query.shape[:-1]] + [value.shape[-1]] + + output_1d, lse_1d = attention( + query_1d, + key_1d, + value_1d, + scale=scale, + is_causal=is_causal_1d, + causal_type=CausalType.DontCare, + return_lse=True, + deterministic=deterministic, + ) + output = output_1d.reshape(*output_shape) # [B,*token_layout_shape,H,D_V] + lse = lse_1d.reshape(*output_shape[:-1]) # [B,*token_layout_shape,H] + if return_lse: + return output, lse + return output + + scale = scale if scale is not None else query.shape[-1] ** -0.5 + + if backend is None and backend_kwargs is not None: + backend_kwargs = None + log.debug("A backend was not specified, but got backend_kwargs. Ignoring... ") + + backend = choose_multi_dim_backend( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, + device=query.device, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + deterministic=deterministic, + backend=backend, + ) + + if backend not in MULTI_DIM_BACKEND_MAP: + raise ValueError(f"Selected {backend=}, but available choices are {MULTI_DIM_BACKEND_MAP.keys()}. 
") + + return MULTI_DIM_BACKEND_MAP[backend]( + query=query, + key=key, + value=value, + window_size=window_size, + stride=stride, + dilation=dilation, + is_causal=is_causal, + scale=scale, + return_lse=return_lse, + backend_kwargs=backend_kwargs, + deterministic=deterministic, + ) + + +def multi_dimensional_attention_varlen( + query: Tensor, # [1,S_total,H,D] + key: Tensor, # [1,S_total,H_KV,D] + value: Tensor, # [1,S_total,H_KV,D_V] + metadata: dict, + scale: float | None = None, + backend: str | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: # [1,S_total,H,D_V] or ([1,S_total,H,D_V], [1,S_total,H]) + """ + Runs Variable-Length Multi-Dimensional Attention on sequence-packed QKV tensors. + + This operation performs sparse/multi-dimensional attention on variable-length sequences + where tokens from different samples with different spatial layouts are concatenated + along the sequence dimension. Each sample can have its own spatial dimensions + (e.g., different height/width for 2D layouts). + + The metadata should be pre-computed using `configure_varlen_metadata` and reused + across forward/backward passes for efficiency. + + **Requires NATTEN >= 0.21.9.dev0 and either Hopper or Blackwell DC-class architecture** + + Parameters: + query (Tensor): 4-D query tensor with sequence-packed layout + (`[1, seqlen_total, heads, head_dim]`) + + key (Tensor): 4-D key tensor with sequence-packed layout + (`[1, seqlen_total, heads_kv, head_dim]`) + + value (Tensor): 4-D value tensor with sequence-packed layout + (`[1, seqlen_total, heads_kv, head_dim_v]`) + + metadata (dict): Pre-computed varlen metadata from `imaginaire.varlen.generate_multi_dim_varlen_parameters`. + + scale (float | None): Attention scale. Defaults to head_dim ** -0.5. + + Other Parameters: + backend (str | None): Backend to run with. If unspecified (default), it will try to + select the best available. 
+ + return_lse (bool): Whether to return logsumexp values. Default is False. + + backend_kwargs (dict | None): Backend-specific arguments. + + deterministic (bool): Whether to enforce deterministic backward pass. Default is False. + Not supported for this operation (Hopper and Blackwell FNA backends do not support determinism). + + Returns: + output (Tensor): 4-D output tensor with sequence-packed layout + (`[1, seqlen_total, heads, head_dim_v]`). + + logsumexp (Tensor): logsumexp tensor (`[1, seqlen_total, heads]`). + Only returned when return_lse is True. + """ + # For now, NATTEN is the only backend that supports varlen multi-dimensional attention + from cosmos3._src.imaginaire.attention.natten import natten_supported + + if not natten_supported(): + raise RuntimeError("multi_dimensional_attention_varlen requires NATTEN. Please install or upgrade NATTEN.") + + if backend is not None and backend != "natten": + raise ValueError( + f"multi_dimensional_attention_varlen currently only supports 'natten' backend, got {backend=}." 
+ ) + + # Import NATTEN's varlen function + from cosmos3._src.imaginaire.attention.natten.functions import natten_multi_dim_attention_varlen + + return natten_multi_dim_attention_varlen( + query=query, + key=key, + value=value, + metadata=metadata, + scale=scale, + return_lse=return_lse, + backend_kwargs=backend_kwargs, + deterministic=deterministic, + ) + + +def spatio_temporal_attention( + query: Tensor, # [B,T,H_spatial,W_spatial,H,D] + key: Tensor, # [B,T,H_spatial,W_spatial,H_KV,D] + value: Tensor, # [B,T,H_spatial,W_spatial,H_KV,D_V] + window_size: tuple | int = -1, + stride: tuple | int = 1, + dilation: tuple | int = 1, + scale: float | None = None, + # backend & misc parameters + backend: str | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> ( + Tensor | tuple[Tensor, Tensor] +): # [B,T,H_spatial,W_spatial,H,D_V] or ([B,T,H_spatial,W_spatial,H,D_V], [B,T,H_spatial,W_spatial,H]) + """ + Runs Spatio-Temporal Attention on unflattened QKV with the heads-last contiguous layout + (`[batch, T, H, W, heads, head_dim]`). + For now, it is always expected that Q, K, and V match in their shapes. + + Parameters: + query (Tensor): 6-D query tensor, with the heads-last contiguous layout + (`[batch, T, H, W, heads, head_dim]`) + + key (Tensor): 6-D key tensor, with the heads-last contiguous layout + (`[batch, T, H, W, heads_kv, head_dim]`) + + value (Tensor): 6-D value tensor, with heads-last contiguous layout + (`[batch, T, H, W, heads_kv, head_dim_v]`) + + window_size (tuple | int): Attention window (kernel) size / shape. If an + integer, it will be repeated for all dimensions. For example `window_size=3` is + interpreted as `window_size=(3, 3, 3)`. + `-1`s are replaced with the corresponding value in `(T, H, W)`. + Default is -1 (no sparsity). + + stride (tuple | int): Sliding window step size/shape. If an integer, it will be repeated + for all dimensions. 
For example `stride=2` is interpreted as `stride=(2, 2, 2)`. + Final stride must satisfy `1 <= stride <= window_size`. + Default is 1. + + dilation (tuple | int): Dilation step size/shape. If an integer, it will be repeated for + all dimensions. For example `dilation=4` is interpreted as `dilation=(4, 4, 4)`. + Final dilation must satisfy `2 <= dilation * window_size <= (T, H, W)`. + Default is 1. + + scale (float | None): Dot product scale (attention scale). Defaults to head_dim ** -0.5. + + Other Parameters: + backend (str | None): Backend to run with. If unspecified (default), it will try to + select the best available. + + return_lse (bool): Whether to return the logsumexp values. Default is False. + + backend_kwargs (dict | None): Key-value pair for passing arguments specific to the backend's + multi-dim / sparse attention operator, if any. Only valid when a specific backend is + selected (backend is not None). + + deterministic (bool): Whether to enforce deterministic backward pass. Default is False. + When True, backends are selected based on deterministic support. + + Returns: + output (Tensor): 6-D output tensor, with the heads-last contiguous layout + (`[batch, T, H, W, heads, head_dim_v]`). + + logsumexp (Tensor): logsumexp tensor, with the heads-last contiguous layout + (`[batch, T, H, W, heads]`). Only returned when return_lse is True. + """ + if query.dim() != 6: + raise ValueError( + "Spatio-Temporal Attention requires 6-D input tensors ([batch, T, H, W, heads, head_dim]), " + f"got {query.shape=}." 
+ ) + + return multi_dimensional_attention( + query=query, + key=key, + value=value, + window_size=window_size, + stride=stride, + dilation=dilation, + is_causal=(True, False, False), + scale=scale, + backend=backend, + return_lse=return_lse, + backend_kwargs=backend_kwargs, + deterministic=deterministic, + ) + + +def merge_attentions( + outputs: list[Tensor], # list of [B,S,H,D_V] + lse_tensors: list[Tensor], # list of [B,S,H] + torch_compile: bool = False, +) -> tuple[Tensor, Tensor]: # ([B,S,H,D_V], [B,S,H]) + """ + Merges multiple attention outputs that share the same query. + + **NOTE: the user is responsible for ensuring ALL output and LSE tensors have the same data + pointer as the outputs from the corresponding Attention operations for correct backpropagation!** + + **NOTE: requires NATTEN** + + **NOTE: for backpropagation, only two outputs can be merged for now.** + + Takes multiple attention outputs computed from the same set of query but w.r.t. different + key/value pairs, and merges them as if all key/value pairs had been concatenated. + This enables patterns like: + - Combining local and global attention (e.g., sparse + dense context) + - Pipelined context parallelism + + The NATTEN backend can be controlled via environment variable filtering. + See `filter_attention_merge_backends` for details. + + Parameters: + outputs (list[Tensor]): List of 4-D attention output tensors, with the heads-last layout + (`[batch, seqlen, heads, head_dim]`). Must contain at least 2 tensors. + + lse_tensors (list[Tensor]): List of 3-D logsumexp tensors, with the heads-last layout + (`[batch, seqlen, heads]`). Must match length of `outputs`. + + torch_compile (bool): Attempt to use `torch.compile` to fuse the underlying elementwise + operations. Default is False. + + Returns: + output (Tensor): Merged attention output tensor (`[batch, seqlen, heads, head_dim]`). + + logsumexp (Tensor): Updated logsumexp tensor (`[batch, seqlen, heads]`). 
+ """ + # For now, NATTEN is the only backend that provides merge_attentions + + # Check if NATTEN is allowed by environment variable + allowed_backends = filter_attention_merge_backends(["natten"]) + + if "natten" not in allowed_backends: + raise RuntimeError( + "merge_attentions requires NATTEN backend, but it has been disabled via environment variable. " + "NATTEN is currently the only supported backend for merge_attentions." + ) + + from cosmos3._src.imaginaire.attention.natten import natten_supported + + if not natten_supported(): + raise RuntimeError("merge_attentions requires NATTEN. Please upgrade NATTEN to use attention merging.") + + # Import and use NATTEN's merge_attentions + from natten.functional import merge_attentions as natten_merge_attentions + + return natten_merge_attentions( + outputs=outputs, + lse_tensors=lse_tensors, + torch_compile=torch_compile, + use_autograd_fix=True, # Always use autograd fix for correct backprop + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/masks.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/masks.py new file mode 100644 index 00000000..3cad6c02 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/masks.py @@ -0,0 +1,61 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Mask utilities +""" + +from enum import Enum + + +class CausalType(Enum): + """ + Different types of causal masking supported by backends of interest. + """ + + # Top-Left: Simplified: mask if q_idx < kv_idx + # CUTLASS / NATTEN default + # Q = 2, KV = 5: + # O____ + # OO___ + # + # Q = 5, KV = 2: + # O_ + # OO + # OO + # OO + # OO + TopLeft = 0 + + # Bottom-right: mask if q_idx + KV - Q < kv_idx + # Flash Attention default + # Q = 2, KV = 5: + # OOOO_ + # OOOOO + # + # Q = 5, KV = 2: + # __ + # __ + # __ + # O_ + # OO + BottomRight = 1 + + # When seqlen_q == seqlen_kv, we don't care about the causal type + # because top-left and bottom-right are equivalent + DontCare = 2 diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/__init__.py new file mode 100644 index 00000000..ad2492e6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/__init__.py @@ -0,0 +1,127 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. 
+ +NATTEN Backend +""" + +import torch + +from cosmos3._src.imaginaire.attention.utils.safe_ops import log +from cosmos3._src.imaginaire.attention.utils.version import version_at_least + +# 0.21.5.dev1 patches some varlen issues +# 0.21.5.dev2 adds torch compile support +# 0.21.5.dev3 fixes a few compat issues for older torch versions +# 0.21.5.dev6 gqa/mqa support +# 0.21.5.dev9 fixes attention merging +NATTEN_MIN_VERSION = "0.21.5.dev9" + +# Hopper FMHA causal and varlen support +NATTEN_HOPPER_CAUSAL_VARLEN_VERSION = "0.21.6.dev3" + +# Blackwell-FMHA Deterministic bwd support +NATTEN_BLACKWELL_DETERMINISTIC_VERSION = "0.21.6.dev7" + +# Blackwell-FMHA/FNA support extended to head dims meeting alignment constraint and <= 128 +NATTEN_BLACKWELL_PARTIAL_HEAD_DIM_VERSION = "0.21.6.dev8" + +# 0.21.9.dev0 adds varlen multi-dimensional (sparse) attention +NATTEN_VARLEN_MULTI_DIM_VERSION = "0.21.9.dev0" + + +def get_natten_version() -> str: + try: + import natten + except Exception: # also covers ImportError, which is a subclass of Exception + return "0.0.0" + + return natten.__version__ + + +def natten_version_satisfies(min_version_str: str) -> bool: + """ + Check if the installed NATTEN version satisfies a specific minimum version requirement. + + Parameters: + min_version_str (str): Minimum version string (e.g., "0.21.5" or "0.21.5.dev12"). + + Returns: + bool: True if NATTEN is installed and meets the minimum version requirement. + """ + return version_at_least(get_natten_version(), min_version_str) + + +def natten_supported() -> bool: + """ + Returns whether NATTEN is supported in this environment. + Requirements are: + * Presence of CUDA Runtime (via PyTorch) + * Presence of NATTEN, meeting minimum version requirements + + This check guards imports / dependencies on the NATTEN package.
+ """ + if not torch.cuda.is_available(): + log.debug("NATTEN Attention is not supported because PyTorch did not detect CUDA runtime.") + return False + + try: + import natten + except ImportError: + log.debug("NATTEN Attention is not supported because the Python package was not found.") + return False + except Exception as e: + log.debug(f"NATTEN Attention is not supported because importing the Python package failed: {e}") + return False + + if not version_at_least(natten.__version__, NATTEN_MIN_VERSION): + log.debug( + f"NATTEN Attention is not supported due to insufficient NATTEN version " + f"{natten.__version__}, expected at least {NATTEN_MIN_VERSION}." + ) + return False + + return True + + +NATTEN_SUPPORTED = natten_supported() + +if NATTEN_SUPPORTED: + from cosmos3._src.imaginaire.attention.natten.functions import ( + natten_attention, + natten_multi_dim_attention, + natten_multi_dim_attention_varlen, + ) + +else: + from cosmos3._src.imaginaire.attention.natten.stubs import ( + natten_attention, + natten_multi_dim_attention, + natten_multi_dim_attention_varlen, + ) + +__all__ = [ + "natten_attention", + "natten_multi_dim_attention", + "natten_multi_dim_attention_varlen", + "NATTEN_SUPPORTED", + "NATTEN_MIN_VERSION", + "NATTEN_VARLEN_MULTI_DIM_VERSION", + "get_natten_version", + "natten_version_satisfies", +] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/checks.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/checks.py new file mode 100644 index 00000000..c6369cd2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/checks.py @@ -0,0 +1,521 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +NATTEN backend checks +""" + +from functools import partial + +import torch + +from cosmos3._src.imaginaire.attention.checks import ( + attention_param_checks, + attention_tensor_checks, + multi_dim_attention_tensor_checks, +) +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.imaginaire.attention.natten import ( + NATTEN_BLACKWELL_DETERMINISTIC_VERSION, + NATTEN_BLACKWELL_PARTIAL_HEAD_DIM_VERSION, + NATTEN_HOPPER_CAUSAL_VARLEN_VERSION, + NATTEN_SUPPORTED, + get_natten_version, + natten_version_satisfies, +) +from cosmos3._src.imaginaire.attention.natten.meta import get_bwd_dtypes, get_fwd_dtypes +from cosmos3._src.imaginaire.attention.utils import get_arch_tag, is_fp8, log_or_raise_error +from cosmos3._src.imaginaire.attention.utils.safe_ops import log +from cosmos3._src.imaginaire.attention.utils.safe_ops.functools import lru_cache + + +def dtype_supported( + dtype: torch.dtype, requires_grad: bool, dtypes_fwd: list[torch.dtype], dtypes_bwd: list[torch.dtype] | None = None +) -> bool: + """ + Helper determining whether dtype is supported with different sets of supported dtypes for + training and inference (forward+backward and forward). + + Parameters: + dtype (torch.dtype): tensor element type. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + dtypes_fwd (list[torch.dtype]): list of dtypes allowed for inference only (when not + tensor.requires_grad). 
+ + dtypes_bwd (list[torch.dtype] | None): Optional list of dtypes allowed for training only + (when tensor.requires_grad), if different from dtypes_fwd. + + """ + if requires_grad and dtypes_bwd is not None: + return dtype in dtypes_bwd + return dtype in dtypes_fwd + + +@lru_cache +def choose_natten_backend( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + is_causal: bool, + is_varlen: bool, + deterministic: bool = False, + requires_fna: bool = False, + raise_error: bool = False, +) -> str | None: + """ + Chooses an FMHA backend in NATTEN (cutlass-fmha, hopper-fmha, blackwell-fmha) for the current + use case based on features needed and current GPU architecture. + + Using tensor shapes, it infers whether MLA (head_dim_value != head_dim_qk) or + GQA/MQA (heads_kv != heads_q) are required. + Using device, it infers GPU architecture and compatible backends. + Using arguments is_causal and is_varlen, and other inferred features, it picks the best + available backend. + + It is possible for no backend to be selected, if the combination of features is not available in + any one of the NATTEN backends, in which case it will return None. + + Parameters: + query_shape (torch.Size): Shape of 4-D query tensor (`[batch, seqlen, heads, head_dim]`). + + key_shape (torch.Size): Shape of 4-D key tensor (`[batch, seqlen_kv, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D value tensor (`[batch, seqlen_kv, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + is_causal (bool): whether or not causal masking is enabled. + + is_varlen (bool): whether or not a variable length (varlen) use case. Must be inferred + beforehand based on arguments such as seqlens_{Q,KV} or cumulative_seqlen_{Q,KV} being + passed. 
+ + deterministic (bool): Deterministic backward pass required. + + requires_fna (bool): Whether the selection is for FNA kernels (sometimes they have different + feature coverage compared to their FMHA counterparts.) + + raise_error (bool): whether to raise an error if no backend is selected, instead of just + returning None. Default is False. + + Returns: + backend (str | None): selected NATTEN backend, if any compatible. + + """ + natten_version = get_natten_version() + + target_fn = partial(log_or_raise_error, raise_error=raise_error) + + + arch_tag = get_arch_tag(device) + + is_mla = query_shape[-1] != value_shape[-1] + head_dim = max(query_shape[-1], value_shape[-1]) + + # banning devices not supported since CUDA 13.0 for simplicity + if arch_tag < 75: + log.debug("NATTEN is not supported because compute capability is below the minimum (7.5).") + return None + + # blackwell-fmha: sm100 and sm103 only. + # limitations: no mla (TBD). + blackwell_fmha_fwd_dtypes = [torch.float16, torch.bfloat16, torch.float8_e5m2, torch.float8_e4m3fn] + blackwell_fmha_bwd_dtypes = [torch.float16, torch.bfloat16] + blackwell_dtype_supported = dtype_supported( + dtype=dtype, + requires_grad=requires_grad, + dtypes_fwd=blackwell_fmha_fwd_dtypes, + dtypes_bwd=blackwell_fmha_bwd_dtypes, + ) + + blackwell_deterministic_supported = natten_version_satisfies(NATTEN_BLACKWELL_DETERMINISTIC_VERSION) + blackwell_deterministic_blocked = deterministic and (requires_fna or not blackwell_deterministic_supported) + + blackwell_partial_head_dim_support = natten_version_satisfies(NATTEN_BLACKWELL_PARTIAL_HEAD_DIM_VERSION) + blackwell_head_dim_alignment_constraint = 16 if is_fp8(dtype) else 8 + blackwell_head_dim_in_range = 0 < head_dim and head_dim <= 128 + blackwell_head_dim_alignment_met = head_dim % blackwell_head_dim_alignment_constraint == 0 + blackwell_head_dim_supported = head_dim in [32, 64, 128] or ( + blackwell_partial_head_dim_support and blackwell_head_dim_in_range and 
blackwell_head_dim_alignment_met + ) + + if ( + arch_tag in [100, 103] + and not is_mla + and blackwell_dtype_supported + and blackwell_head_dim_supported + and not blackwell_deterministic_blocked + ): + return "blackwell-fmha" + else: + reason = "" + if blackwell_deterministic_blocked: + reason += "Deterministic mode requested but not supported. " + if arch_tag not in [100, 103]: + reason += f"Incompatible architecture ({arch_tag}, expected 100 or 103). " + if is_mla: + reason += "Use case is MLA (head_dim_qk != head_dim_value). " + if not blackwell_dtype_supported: + if requires_grad: + reason += ( + f"Data type {dtype} is not in list of supported dtypes for training: {blackwell_fmha_bwd_dtypes}. " + ) + else: + reason += ( + f"Data type {dtype} is not in list of supported dtypes for inference: {blackwell_fmha_fwd_dtypes}. " + ) + if not blackwell_head_dim_supported: + reason += f"{head_dim=} is not supported with {dtype=} (natten {natten_version})" + log.debug(f"NATTEN backend blackwell-fmha is not compatible. Reason: {reason}") + + # hopper-fmha: sm90 only. + # limitations: no mla. + # varlen and causal masking support was added in NATTEN_HOPPER_CAUSAL_VARLEN_VERSION + hopper_fmha_dtypes = [torch.float16, torch.bfloat16] + dtype_supported_hopper = dtype_supported(dtype=dtype, requires_grad=requires_grad, dtypes_fwd=hopper_fmha_dtypes) + head_dim_supported_hopper = (head_dim in [32, 64, 128, 256] and not requires_grad) or head_dim in [32, 64, 128] + hopper_varlen_causal_supported = natten_version_satisfies(NATTEN_HOPPER_CAUSAL_VARLEN_VERSION) + hopper_varlen_causal_check = hopper_varlen_causal_supported or (not is_varlen and not is_causal) + if ( + arch_tag == 90 + and hopper_varlen_causal_check + and not is_mla + and dtype_supported_hopper + and head_dim_supported_hopper + and not deterministic + ): + return "hopper-fmha" + else: + reason = "" + if deterministic: + reason += "Deterministic mode requested but hopper-fmha does not support it. 
" + if arch_tag != 90: + reason += f"Incompatible architecture ({arch_tag}, expected 90). " + if is_causal and not hopper_varlen_causal_supported: + reason += ( + "Use case is causal, which is only supported since natten " + + f"{NATTEN_HOPPER_CAUSAL_VARLEN_VERSION}, detected version: {natten_version}. " + ) + if is_varlen and not hopper_varlen_causal_supported: + reason += ( + "Use case is varlen, which is only supported since natten " + + f"{NATTEN_HOPPER_CAUSAL_VARLEN_VERSION}, detected version: {natten_version}. " + ) + if is_mla: + reason += "Use case is MLA (head_dim_qk != head_dim_value). " + if not dtype_supported_hopper: + reason += f"Data type {dtype} is not in list of supported dtypes: {hopper_fmha_dtypes}. " + if not head_dim_supported_hopper: + reason += f"{head_dim=} with {requires_grad=} is not supported. " + log.debug(f"NATTEN backend hopper-fmha is not compatible. Reason: {reason}") + + # cutlass-fmha: targets sm50, sm70, sm75, sm80 (supports sm80+) + # limitations: none. + cutlass_fmha_dtypes = [torch.float32, torch.float16, torch.bfloat16] + dtype_supported_cutlass = dtype_supported(dtype=dtype, requires_grad=requires_grad, dtypes_fwd=cutlass_fmha_dtypes) + head_dim_supported_cutlass = head_dim % 8 == 0 + if dtype_supported_cutlass and head_dim_supported_cutlass: + return "cutlass-fmha" + else: + reason = "" + if not dtype_supported_cutlass: + reason += f"Data type {dtype} is not in list of supported dtypes: {cutlass_fmha_dtypes}. " + if not head_dim_supported_cutlass: + reason += f"{head_dim=} is not supported. " + log.debug(f"NATTEN backend cutlass-fmha is not compatible. 
Reason: {reason}") + + target_fn( + f"Could not find a compatible NATTEN FMHA backend for {arch_tag=}, {is_causal=}, {is_varlen=}, {is_mla=}.", + exception=RuntimeError, + ) + return None + + +def natten_attention_check( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + is_causal: bool, + causal_type: CausalType, + is_varlen: bool, + deterministic: bool = False, + raise_error: bool = False, +) -> bool: + """ + Input validation function for the NATTEN backend. + Runs the common checks in addition to trying to find a compatible NATTEN backend. If any checks + fail, or no compatible backend is found in NATTEN, returns False. + + Parameters: + query_shape (torch.Size): Shape of 4-D query tensor (`[batch, seqlen, heads, head_dim]`). + + key_shape (torch.Size): Shape of 4-D key tensor (`[batch, seqlen_kv, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D value tensor (`[batch, seqlen_kv, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + is_causal (bool): whether or not causal masking is enabled. + + causal_type (CausalType): causal masking mode. Choices: `CausalType.TopLeft`, + `CausalType.BottomRight`. Required when `is_causal = True`. + + is_varlen (bool): whether or not a variable length (varlen) use case. Must be inferred + beforehand based on arguments such as seqlens_{Q,KV} or cumulative_seqlen_{Q,KV} being + passed. + + deterministic (bool): Deterministic backward pass required. + + raise_error (bool): whether to raise an error if any checks fail or no backend is selected, + instead of just returning False. Default is False. + + Returns: + success (bool): whether use case is compatible with NATTEN backend. 
+ + """ + target_fn = partial(log_or_raise_error, raise_error=raise_error) + + if not NATTEN_SUPPORTED: + target_fn( + "NATTEN is not supported in this environment. Run with debug logs to find out why, or choose another backend.", + exception=RuntimeError, + ) + return False + + arch_tag = get_arch_tag(device) + fwd_dtypes = get_fwd_dtypes(arch_tag) + bwd_dtypes = get_bwd_dtypes(arch_tag) + if not attention_tensor_checks( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + requires_grad=requires_grad, + supported_dtypes_forward=fwd_dtypes, + supported_dtypes_backward=bwd_dtypes, + supports_mla=True, + supports_gqa_mqa=True, + raise_error=raise_error, + backend_name="NATTEN Attention", + ): + target_fn("NATTEN does not support the given inputs.", exception=RuntimeError) + return False + + # Verifies causal_type is a CausalType instance when is_causal + # Verifies DontCare is not used unless seqlen_q == seqlen_kv + attention_param_checks( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + is_causal=is_causal, + causal_type=causal_type, + ) + + if is_causal and causal_type not in [CausalType.TopLeft, CausalType.DontCare]: + target_fn("NATTEN Attention only supports top-left causal masking for now.", exception=RuntimeError) + return False + + natten_backend = choose_natten_backend( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + device=device, + requires_grad=requires_grad, + is_causal=is_causal, + is_varlen=is_varlen, + deterministic=deterministic, + raise_error=raise_error, + ) + + if natten_backend is None: + return False + + return True + + +@lru_cache +def choose_natten_multi_dim_backend( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + deterministic: bool = False, + raise_error: bool = False, +) -> str | None: + """ + Chooses an FNA backend in NATTEN 
(cutlass-fna, hopper-fna, blackwell-fna) for the current + use case based on features needed and current GPU architecture. + + Using tensor shapes, it infers whether MLA (head_dim_value != head_dim_qk) or + GQA/MQA (heads_kv != heads_q) are required. + Using device, it infers GPU architecture and compatible backends. + Using the deterministic flag and the features inferred above, it picks the best + available backend. + + It is possible for no backend to be selected, if the combination of features is not available in + any one of the NATTEN backends, in which case it will return None. + + Parameters: + query_shape (torch.Size): Shape of 4-D, 5-D, or 6-D query tensor (`[batch, *token_layout_shape, heads, head_dim]`). + + key_shape (torch.Size): Shape of 4-D, 5-D, or 6-D key tensor (`[batch, *token_layout_shape, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D, 5-D, or 6-D value tensor (`[batch, *token_layout_shape, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + deterministic (bool): Deterministic backward pass required. + + raise_error (bool): whether to raise an error if no backend is selected, instead of just + returning None. Default is False. + + Returns: + backend (str | None): selected NATTEN backend, if any compatible. + + """ + + # Reuse choose_natten_backend instead of duplicating code + # NATTEN specifically makes sure the FNA counterparts cover all the features the FMHA kernels + # do.
+ fmha_backend = choose_natten_backend( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + device=device, + requires_grad=requires_grad, + is_causal=False, # causal masking is supported across all multi-dim (FNA) backends + is_varlen=False, # varlen is undefined (so far) for multi-dim + deterministic=deterministic, + requires_fna=True, + raise_error=raise_error, + ) + + natten_fmha_backend_to_fna_backend = { + "cutlass-fmha": "cutlass-fna", + "hopper-fmha": "hopper-fna", + "blackwell-fmha": "blackwell-fna", + } + + # choose_natten_backend returns None when no backend is compatible; propagate it as documented. + return None if fmha_backend is None else natten_fmha_backend_to_fna_backend[fmha_backend] + + +def natten_multi_dim_attention_check( + query_shape: torch.Size, + key_shape: torch.Size, + value_shape: torch.Size, + dtype: torch.dtype, + device: torch.device, + requires_grad: bool, + deterministic: bool = False, + raise_error: bool = False, +) -> bool: + """ + Input validation function for the NATTEN multi-dimensional backend. + Runs the common checks in addition to trying to find a compatible NATTEN backend. If any checks + fail, or no compatible backend is found in NATTEN, returns False. + + Parameters: + query_shape (torch.Size): Shape of 4-D, 5-D, or 6-D query tensor (`[batch, *token_layout_shape, heads, head_dim]`). + + key_shape (torch.Size): Shape of 4-D, 5-D, or 6-D key tensor (`[batch, *token_layout_shape, heads_kv, head_dim]`). + + value_shape (torch.Size): Shape of 4-D, 5-D, or 6-D value tensor (`[batch, *token_layout_shape, heads_kv, head_dim_v]`). + + dtype (torch.dtype): Data type of tensors. + + device (torch.device): Device of tensors. + + requires_grad (bool): Whether tensors require gradients (training vs inference). + + deterministic (bool): Deterministic backward pass required. + + raise_error (bool): whether to raise an error if any checks fail or no backend is selected, + instead of just returning False. Default is False.
+ + Returns: + success (bool): whether use case is compatible with NATTEN backend. + + """ + target_fn = partial(log_or_raise_error, raise_error=raise_error) + + if not NATTEN_SUPPORTED: + target_fn( + "NATTEN is not supported in this environment. Run with debug logs to find out why, or choose another backend.", + exception=RuntimeError, + ) + return False + + arch_tag = get_arch_tag(device) + fwd_dtypes = get_fwd_dtypes(arch_tag) + bwd_dtypes = get_bwd_dtypes(arch_tag) + if not multi_dim_attention_tensor_checks( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + requires_grad=requires_grad, + supported_dtypes_forward=fwd_dtypes, + supported_dtypes_backward=bwd_dtypes, + supports_mla=True, + supports_gqa_mqa=True, + raise_error=raise_error, + backend_name="NATTEN Multi-Dimensional Attention", + ): + target_fn("NATTEN does not support the given inputs.", exception=RuntimeError) + return False + + natten_backend = choose_natten_multi_dim_backend( + query_shape=query_shape, + key_shape=key_shape, + value_shape=value_shape, + dtype=dtype, + device=device, + requires_grad=requires_grad, + deterministic=deterministic, + raise_error=raise_error, + ) + + if natten_backend is None: + return False + + return True diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/functions.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/functions.py new file mode 100644 index 00000000..c1141986 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/functions.py @@ -0,0 +1,420 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +NATTEN Backend: intermediate APIs +Only safe to import when NATTEN_SUPPORTED is True. +""" + +from contextlib import nullcontext + +from natten.context import set_memory_usage_preference, use_kv_parallelism_in_fused_na +from natten.functional import attention as _natten_attention +from natten.functional import neighborhood_attention_generic as _natten_multi_dim_attention +from torch import Tensor + +from cosmos3._src.imaginaire.attention.checks import ( + assert_universal_tensor_checks, + multi_dim_attention_param_checks, + multi_dim_attention_param_filter, +) +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.imaginaire.attention.natten import NATTEN_VARLEN_MULTI_DIM_VERSION, natten_version_satisfies +from cosmos3._src.imaginaire.attention.natten.checks import ( + choose_natten_backend, + choose_natten_multi_dim_backend, + natten_attention_check, + natten_multi_dim_attention_check, +) +from cosmos3._src.imaginaire.attention.utils import torch_deterministic_mode +from cosmos3._src.imaginaire.attention.utils.environment import is_torch_compiling + +set_memory_usage_preference("unrestricted") +use_kv_parallelism_in_fused_na(True) + + +def natten_attention( + query: Tensor, + key: Tensor, + value: Tensor, + is_causal: bool = False, + causal_type: CausalType | None = None, + scale: float | None = None, + cumulative_seqlen_Q: Tensor | None = None, + cumulative_seqlen_KV: Tensor | None = None, + max_seqlen_Q: int | None = None, + 
max_seqlen_KV: int | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + """ + Runs NATTEN Attention on given operands (Q, K, V) with the heads-last contiguous layout + (`[batch, seqlen, heads, head_dim]`). + + Parameters: + query (Tensor): 4-D query tensor, with the heads-last contiguous layout + (`[batch, seqlen, heads, head_dim]`) + + key (Tensor): 4-D key tensor, with the heads-last contiguous layout + (`[batch, seqlen_kv, heads_kv, head_dim]`) + + value (Tensor): 4-D value tensor, with heads-last contiguous layout + (`[batch, seqlen_kv, heads_kv, head_dim_v]`) + + is_causal (bool): whether or not causal masking is enabled. Default is False. + + causal_type (CausalType): causal masking mode. Choices: `CausalType.TopLeft`, + `CausalType.BottomRight`. Required when `is_causal = True`. + + scale (float | None): Dot product scale (attention scale). Defaults to head_dim ** -0.5. + + cumulative_seqlen_Q (Tensor | None): (varlen) Optional 1-D tensor with size `batch + 1` + indicating the cumulative sum of number of query tokens in each batch, with an + additional 0 element in the beginning. Must be passed together with + `cumulative_seqlen_KV` and `max_seqlen_{Q,KV}`. + + cumulative_seqlen_KV (Tensor | None): (varlen) Optional 1-D tensor with size `batch + 1` + indicating the cumulative sum of number of key/value tokens in each batch, with an + additional 0 element in the beginning. Must be passed together with + `cumulative_seqlen_Q` and `max_seqlen_{Q,KV}`. + + max_seqlen_Q (int | None): (varlen) Optional integer indicating the maximum query + sequence length in all batches. Must be passed together with `cumulative_seqlen_{Q,KV}` + and `max_seqlen_KV`. + + max_seqlen_KV (int | None): (varlen) Optional integer indicating the maximum key/value + sequence length in all batches. Must be passed together with `cumulative_seqlen_{Q,KV}` + and `max_seqlen_Q`. 
+ + Other Parameters: + return_lse (bool): Whether to return the logsumexp values. Default is False. + + backend_kwargs (dict | None): Key-value pair for passing arguments specific to NATTEN's + attention operator, if any. + + deterministic (bool): Deterministic backward pass required. + + Returns: + output (Tensor): 4-D output tensor, with the heads-last contiguous layout + (`[batch, seqlen, heads, head_dim_v]`). + + logsumexp (Tensor): logsumexp tensor, with the heads-last contiguous layout + (`[batch, seqlen, heads, 1]`). Only returned when return_lse is True. + """ + + is_varlen = cumulative_seqlen_Q is not None + assert_universal_tensor_checks(query, key, value) + assert natten_attention_check( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, + device=query.device, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + is_causal=is_causal, + causal_type=causal_type, + is_varlen=is_varlen, + deterministic=deterministic, + raise_error=True, + ) + + # This check introduces recompiles + if not is_torch_compiling(): + if is_varlen and max_seqlen_Q == max_seqlen_KV == 0 and not natten_version_satisfies("0.21.6.dev7"): + raise NotImplementedError( + "You're trying to use varlen attention with the NATTEN backend and " + "an empty batch, which is only supported since version " + "0.21.6.dev7. Please upgrade NATTEN." 
+ ) + + scale = scale if scale is not None else query.shape[-1] ** -0.5 + + backend_kwargs = backend_kwargs.copy() if backend_kwargs is not None else {} + + natten_backend = None + if "backend" in backend_kwargs: + natten_backend = backend_kwargs["backend"] + del backend_kwargs["backend"] + else: + natten_backend = choose_natten_backend( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, + device=query.device, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + is_causal=is_causal, + is_varlen=is_varlen, + deterministic=deterministic, + raise_error=True, + ) + + assert natten_backend is not None + + # Override NATTEN's default delta reduction method: using PyTorch + # is more accurate, but slightly slower. + # Only affects NATTEN's "cutlass-fmha" backend (Ampere kernels) + backward_use_pt_reduction = True + if "backward_use_pt_reduction" in backend_kwargs: + backward_use_pt_reduction = backend_kwargs["backward_use_pt_reduction"] + del backend_kwargs["backward_use_pt_reduction"] + + with torch_deterministic_mode() if deterministic else nullcontext(): + return _natten_attention( + query=query, + key=key, + value=value, + is_causal=is_causal, + scale=scale, + cumulative_seqlen_Q=cumulative_seqlen_Q, + cumulative_seqlen_KV=cumulative_seqlen_KV, + max_seqlen_Q=max_seqlen_Q, + max_seqlen_KV=max_seqlen_KV, + return_lse=return_lse, + backend=natten_backend, + backward_use_pt_reduction=backward_use_pt_reduction, + **backend_kwargs, + ) + + +def natten_multi_dim_attention( + query: Tensor, + key: Tensor, + value: Tensor, + window_size: tuple | int = -1, + stride: tuple | int = 1, + dilation: tuple | int = 1, + is_causal: tuple | bool = False, + scale: float | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + """ + Runs NATTEN's Multi-Dimensional Attention on given operands (Q, K, V) with the 
heads-last + contiguous layout (`[batch, *, heads, head_dim]`). Supports up to and including 3 dimensions: + * 1-D: `[batch, X, heads, head_dim]`, with masking arguments expecting tuples of size 1. + * 2-D: `[batch, X, Y, heads, head_dim]`, with masking arguments expecting tuples of size 2. + * 3-D: `[batch, X, Y, Z, heads, head_dim]`, with masking arguments expecting tuples of size 3. + + Parameters: + query (Tensor): 4-D, 5-D, or 6-D query tensor, with the heads-last contiguous layout + (`[batch, *token_layout_shape, heads, head_dim]`) + + key (Tensor): 4-D, 5-D, or 6-D key tensor, with the heads-last contiguous layout + (`[batch, *token_layout_shape, heads_kv, head_dim]`) + + value (Tensor): 4-D, 5-D, or 6-D value tensor, with heads-last contiguous layout + (`[batch, *token_layout_shape, heads_kv, head_dim_v]`) + + window_size (tuple | int): Attention window (kernel) size / shape. If an + integer, it will be repeated for all dimensions. For example `window_size=3`, when + `len(token_layout_shape) == 3`, is interpreted as `window_size=(3, 3, 3)`. + `-1`s are replaced with the corresponding `token_layout_shape`. + Final window size must satisfy `2 <= window_size <= token_layout_shape`. + Default is -1 (no sparsity). + + stride (tuple | int): Sliding window step size/shape. If an integer, it will be repeated + for all dimensions. For example `stride=2`, when `len(token_layout_shape) == 3`, is + interpreted as `stride=(2, 2, 2)`. + Final stride must satisfy `1 <= stride <= window_size`. + Default is 1. + + dilation (tuple | int): Dilation step size/shape. If an integer, it will be repeated for + all dimensions. For example `dilation=4`, when `len(token_layout_shape) == 3`, is + interpreted as `dilation=(4, 4, 4)`. + Final dilation must satisfy `2 <= dilation * window_size <= token_layout_shape`. + Default is 1. + + is_causal (tuple | bool): Toggle causal masking. If a boolean, it will be repeated for all + dimensions. 
For example `is_causal=True`, when `len(token_layout_shape) == 3`, is + interpreted as `is_causal=(True, True, True)`. + Default is False. + + scale (float | None): Dot product scale (attention scale). Defaults to head_dim ** -0.5. + + Other Parameters: + return_lse (bool): Whether to return the logsumexp values. Default is False. + + backend_kwargs (dict | None): Key-value pair for passing arguments specific to NATTEN's + multi-dim / sparse attention operator, if any. + + deterministic (bool): Deterministic backward pass required. + + Returns: + output (Tensor): 4-D, 5-D, or 6-D output tensor, with the heads-last contiguous layout + (`[batch, *token_layout_shape, heads, head_dim_v]`). + + logsumexp (Tensor): logsumexp tensor, with the heads-last contiguous layout + (`[batch, *token_layout_shape, heads, 1]`). Only returned when return_lse is True. + """ + + assert_universal_tensor_checks(query, key, value) + assert natten_multi_dim_attention_check( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, + device=query.device, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + deterministic=deterministic, + raise_error=True, + ) + + token_layout, window_size, stride, dilation, is_causal = multi_dim_attention_param_filter( + query, + window_size=window_size, + stride=stride, + dilation=dilation, + is_causal=is_causal, + ) + + multi_dim_attention_param_checks( + query, + window_size=window_size, + stride=stride, + dilation=dilation, + is_causal=is_causal, + ) + + scale = scale if scale is not None else query.shape[-1] ** -0.5 + + backend_kwargs = backend_kwargs.copy() if backend_kwargs is not None else {} + + natten_backend = None + if "backend" in backend_kwargs: + natten_backend = backend_kwargs["backend"] + del backend_kwargs["backend"] + else: + natten_backend = choose_natten_multi_dim_backend( + query_shape=query.shape, + key_shape=key.shape, + value_shape=value.shape, + dtype=query.dtype, 
+ device=query.device, + requires_grad=query.requires_grad or key.requires_grad or value.requires_grad, + deterministic=deterministic, + raise_error=True, + ) + + assert natten_backend is not None + + # Override NATTEN's default delta reduction method: using PyTorch + # is more accurate, but slightly slower. + # Only affects NATTEN's "cutlass-fmha" backend (Ampere kernels) + backward_use_pt_reduction = True + if "backward_use_pt_reduction" in backend_kwargs: + backward_use_pt_reduction = backend_kwargs["backward_use_pt_reduction"] + del backend_kwargs["backward_use_pt_reduction"] + + with torch_deterministic_mode() if deterministic else nullcontext(): + return _natten_multi_dim_attention( + query=query, + key=key, + value=value, + kernel_size=window_size, + stride=stride, + dilation=dilation, + is_causal=is_causal, + scale=scale, + backend=natten_backend, + backward_use_pt_reduction=backward_use_pt_reduction, + return_lse=return_lse, + **backend_kwargs, + ) + + +def natten_multi_dim_attention_varlen( + query: Tensor, + key: Tensor, + value: Tensor, + metadata: dict, + scale: float | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + """ + Runs NATTEN's Variable-Length Multi-Dimensional Attention on given operands (Q, K, V) with + sequence-packed layout (`[batch=1, seqlen, heads, head_dim]`). + + This operation is used for sparse/multi-dimensional attention on variable-length sequences, + where tokens from different batches with different spatial layouts are concatenated along + the sequence dimension. 
+ + **Requires NATTEN >= 0.21.9.dev0** + + Parameters: + query (Tensor): 4-D query tensor, with the sequence-packed layout + (`[1, seqlen_total, heads, head_dim]`) + + key (Tensor): 4-D key tensor, with the sequence-packed layout + (`[1, seqlen_total, heads_kv, head_dim]`) + + value (Tensor): 4-D value tensor, with sequence-packed layout + (`[1, seqlen_total, heads_kv, head_dim_v]`) + + metadata (dict): Pre-computed varlen metadata from `generate_multi_dim_varlen_parameters`. + + scale (float | None): Dot product scale (attention scale). Defaults to head_dim ** -0.5. + + Other Parameters: + return_lse (bool): Whether to return the logsumexp values. Default is False. + + backend_kwargs (dict | None): Additional backend-specific arguments. + + deterministic (bool): Deterministic backward pass required. + + Returns: + output (Tensor): 4-D output tensor, with the sequence-packed layout + (`[1, seqlen_total, heads, head_dim_v]`). + + logsumexp (Tensor): logsumexp tensor, with the sequence-packed layout + (`[1, seqlen_total, heads]`). Only returned when return_lse is True. + """ + # Check if NATTEN version supports varlen features + if not natten_version_satisfies(NATTEN_VARLEN_MULTI_DIM_VERSION): + raise RuntimeError( + f"NATTEN's varlen/varsized attention requires NATTEN >= {NATTEN_VARLEN_MULTI_DIM_VERSION}. " + "Please upgrade NATTEN to use this feature." + ) + + # Import NATTEN's varlen function (only available in NATTEN_VARLEN_MULTI_DIM_VERSION+) + from natten.varlen import neighborhood_attention_varlen + + backend_kwargs = backend_kwargs.copy() if backend_kwargs is not None else {} + + if deterministic: + raise ValueError( + "Deterministic mode is not supported for varlen multi-dimensional attention. " + "This operation requires Hopper / Blackwell FNA backends which do not support deterministic mode." 
+ ) + + # Parameter mapping: NATTEN uses kernel_size instead of window_size + outputs = neighborhood_attention_varlen( + query=query, + key=key, + value=value, + metadata=metadata, + scale=scale, + return_lse=return_lse, + **backend_kwargs, + ) + + return outputs diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/meta.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/meta.py new file mode 100644 index 00000000..15d7d1a1 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/meta.py @@ -0,0 +1,67 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +NATTEN Backend: metadata +Always safe to import (as long as torch is available.) +""" + +import torch + +from cosmos3._src.imaginaire.attention.utils.safe_ops import log + + +def get_fwd_dtypes(arch_tag: int) -> list[torch.dtype]: + """ + Returns data type choices for forward pass according to arch tag (attention.utils.get_arch_tag). + + Parameters: + arch_tag (int): Arch tag for the current CUDA device. Example: 80 for A100, 90 for H100. + + Returns: + data_type_choices (list): a list of PyTorch data types. Empty if device is not supported. 
+ + """ + + if arch_tag < 75: + log.debug("NATTEN is not supported because compute capability is below the minimum (7.5).") + return [] + + if arch_tag in [100, 103]: + return [torch.float32, torch.float16, torch.bfloat16, torch.float8_e5m2, torch.float8_e4m3fn] + + return [torch.float32, torch.float16, torch.bfloat16] + + +def get_bwd_dtypes(arch_tag: int) -> list[torch.dtype]: + """ + Returns data type choices for backward pass according to arch tag (attention.utils.get_arch_tag). + + Parameters: + arch_tag (int): Arch tag for the current CUDA device. Example: 80 for A100, 90 for H100. + + Returns: + data_type_choices (list): a list of PyTorch data types. Empty if device is not supported. + + """ + + if arch_tag < 75: + log.debug("NATTEN is not supported because compute capability is below the minimum (7.5).") + return [] + + return [torch.float32, torch.float16, torch.bfloat16] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/stubs.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/stubs.py new file mode 100644 index 00000000..bf2ac8bb --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/natten/stubs.py @@ -0,0 +1,82 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. 
+ +NATTEN Backend: intermediate API stubs +Always safe to import (as long as torch is available.) +""" + +from torch import Tensor + +from cosmos3._src.imaginaire.attention.masks import CausalType + + +def natten_attention( + query: Tensor, + key: Tensor, + value: Tensor, + is_causal: bool = False, + causal_type: CausalType | None = None, + scale: float | None = None, + cumulative_seqlen_Q: Tensor | None = None, + cumulative_seqlen_KV: Tensor | None = None, + max_seqlen_Q: int | None = None, + max_seqlen_KV: int | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + raise RuntimeError( + "Tried to run NATTEN attention, but it is not supported / available. " + "Try running with debug logs enabled to see why." + ) + + +def natten_multi_dim_attention( + query: Tensor, + key: Tensor, + value: Tensor, + window_size: tuple | int = -1, + stride: tuple | int = 1, + dilation: tuple | int = 1, + is_causal: tuple | bool = False, + scale: float | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + raise RuntimeError( + "Tried to run NATTEN's Multi-Dimensional attention, but it is not supported / available. " + "Try running with debug logs enabled to see why." + ) + + +def natten_multi_dim_attention_varlen( + query: Tensor, + key: Tensor, + value: Tensor, + metadata: dict, + scale: float | None = None, + return_lse: bool = False, + backend_kwargs: dict | None = None, + deterministic: bool = False, +) -> Tensor | tuple[Tensor, Tensor]: + raise RuntimeError( + "Tried to run NATTEN's variable-length/size Multi-Dimensional attention, but it is not supported / available. " + "Try running with debug logs enabled to see why." 
+ ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/__init__.py new file mode 100644 index 00000000..6c1c97b5 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/__init__.py @@ -0,0 +1,85 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Utilities: compute capability detection, helpers, and more. +""" + +from typing import Any + +import torch + +from cosmos3._src.imaginaire.attention.utils.determinism import torch_deterministic_mode +from cosmos3._src.imaginaire.attention.utils.environment import is_torch_compiling +from cosmos3._src.imaginaire.attention.utils.safe_ops import log + + +def get_arch_tag(device: torch.device | None = None) -> int: + """ + Returns the compute capability of a given torch device if it's a CUDA device, otherwise returns 0. + + Args: + device (torch.device | None): torch device. Uses default device if None. + + Returns: + device_cc (int): compute capability in the SmXXX format (e.g. 90 for Hopper).
+ """ + if torch.cuda.is_available() and torch.version.cuda and (device is None or device.type == "cuda"): + major, minor = torch.cuda.get_device_capability(device) + return major * 10 + minor + return 0 + + +def log_or_raise_error(msg: str, raise_error: bool = False, exception: Any = RuntimeError): + if raise_error: + raise exception(msg) + else: + log.debug(msg) + + +def is_full(dtype: torch.dtype) -> bool: + return dtype == torch.float32 + + +def is_half(dtype: torch.dtype) -> bool: + return dtype in [torch.float16, torch.bfloat16] + + +def is_fp8(dtype: torch.dtype) -> bool: + return dtype in [torch.float8_e5m2, torch.float8_e4m3fn] + + +def is_hopper(device: torch.device | None = None) -> bool: + return get_arch_tag(device) == 90 + + +def is_blackwell_dc(device: torch.device | None = None) -> bool: + return get_arch_tag(device) in [100, 103] + + +__all__ = [ + "get_arch_tag", + "log_or_raise_error", + "is_full", + "is_half", + "is_fp8", + "is_hopper", + "is_blackwell_dc", + "is_torch_compiling", + "torch_deterministic_mode", +] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/determinism.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/determinism.py new file mode 100644 index 00000000..14e39ad9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/determinism.py @@ -0,0 +1,38 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Utilities: deterministic mode helpers. +""" + +from contextlib import contextmanager + +import torch + + +@contextmanager +def torch_deterministic_mode(): + """Context manager that enables ``torch.use_deterministic_algorithms`` and restores the + previous state on exit (including the ``warn_only`` flag).""" + prev_mode = torch.are_deterministic_algorithms_enabled() + prev_warn_only = torch.is_deterministic_algorithms_warn_only_enabled() + torch.use_deterministic_algorithms(True) + try: + yield + finally: + torch.use_deterministic_algorithms(prev_mode, warn_only=prev_warn_only) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/environment.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/environment.py new file mode 100644 index 00000000..73b5361b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/environment.py @@ -0,0 +1,134 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Environment-related utilities. 
+""" + +import os + +import torch + +from cosmos3._src.imaginaire.utils import log + + +# Controls all regions guarded against torch compile +# Logs, and certain assertions cause graph breaks. +def is_torch_compiling() -> bool: + try: + return torch.compiler.is_compiling() + except Exception as e: + log.exception(f"Exception occurred checking whether in torch compiled region: {e}") + # Assume too old to support torch compile + return False + + +def parse_backend_filter(env_var_value: str | None, default_backends: list[str]) -> list[str]: + """ + Parse backend filter from environment variable value. + + Logic: + - If env_var_value is None or empty: return default_backends (no filtering) + - If all items start with "-": start with default_backends and remove specified backends (ban-list) + - Otherwise: only include specified backends (allow-list) + - Mixing bans and allows raises a ValueError + - For ban-list: invalid backend names only issue a warning (may not be available on this GPU) + - For allow-list: invalid backend names raise a ValueError (explicit request for unavailable backend) + + Parameters: + env_var_value (str | None): The environment variable value (comma-separated list) + default_backends (list[str]): The default list of backends + + Returns: + list[str]: Filtered list of backends + + Raises: + ValueError: If the list mixes bans and allows, or if any backend name in allow-list is invalid + """ + if not env_var_value: + return default_backends + + # Parse comma-separated list + items = [item.strip() for item in env_var_value.split(",") if item.strip()] + + if not items: + return default_backends + + # Check if items are bans or allows + bans = [item for item in items if item.startswith("-")] + allows = [item for item in items if not item.startswith("-")] + + # Validate: cannot mix bans and allows + if bans and allows: + raise ValueError( + f"Cannot mix ban-list (items starting with '-') and allow-list (items without '-') in backend filter. 
" + f"Got bans: {bans}, allows: {allows}. " + f"Either specify all bans (e.g., '-flash2,-cudnn') or all allows (e.g., 'natten,flash2')." + ) + + default_backends_set = set(default_backends) + + if bans: + # Ban-list mode: start with defaults and remove specified backends + ban_backends = {item[1:] for item in bans} # Remove "-" prefix + + # Warn about banned backends that don't exist in default list + # (they may not be available on this GPU, which is fine) + invalid_bans = ban_backends - default_backends_set + if invalid_bans: + log.warning( + f"Attempting to ban backend(s) that are not in the available list: {sorted(invalid_bans)}. " + f"Available backends are: {default_backends}. " + f"This may be expected if these backends are not supported on this GPU." + ) + + filtered = [b for b in default_backends if b not in ban_backends] + log.debug(f"Backend filter (ban-list): removing {ban_backends} from {default_backends}, result: {filtered}") + return filtered + else: + # Allow-list mode: only include specified backends + allow_backends = set(allows) + + # Validate: all allowed backends must exist in default list + invalid_allows = allow_backends - default_backends_set + if invalid_allows: + raise ValueError( + f"Invalid backend(s) in allow-list: {sorted(invalid_allows)}. " + f"Available backends are: {default_backends}. " + f"Check for typos in the backend names." 
+ ) + + # Preserve order from default_backends for consistency + filtered = [b for b in default_backends if b in allow_backends] + log.debug(f"Backend filter (allow-list): allowing {allow_backends} from {default_backends}, result: {filtered}") + return filtered + + +def filter_attention_backends(default_backends: list[str]) -> list[str]: + env_var_value = os.environ.get("I4_ATTN_BACKENDS") + return parse_backend_filter(env_var_value, default_backends) + + +def filter_multi_dim_attention_backends(default_backends: list[str]) -> list[str]: + env_var_value = os.environ.get("I4_ATTN_BACKENDS_MULTIDIM") + return parse_backend_filter(env_var_value, default_backends) + + +def filter_attention_merge_backends(default_backends: list[str]) -> list[str]: + env_var_value = os.environ.get("I4_ATTN_BACKENDS_MERGE") + return parse_backend_filter(env_var_value, default_backends) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/safe_ops/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/safe_ops/__init__.py new file mode 100644 index 00000000..1fdaf020 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/safe_ops/__init__.py @@ -0,0 +1,26 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Safe operations for torch.compile: operations that should be disabled or modified +when inside a torch-compiled region. +""" + +from cosmos3._src.imaginaire.attention.utils.safe_ops import functools, log + +__all__ = ["log", "functools"] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/safe_ops/functools.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/safe_ops/functools.py new file mode 100644 index 00000000..6a498ee8 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/safe_ops/functools.py @@ -0,0 +1,68 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +torch.compile-safe functools wrappers (specifically lru_cache). +""" + +import functools + + +def lru_cache(maxsize=128, typed=False): + """ + A torch.compile-safe wrapper around functools.lru_cache. + + This decorator automatically disables caching when inside a torch-compiled region. + torch.compile ignores lru_cache and raises warnings; since torch.compile acts as a + higher-level cache itself, lru_cache becomes redundant and we disable it to avoid warnings.
+ + When not in a torch-compiled region, behaves exactly like functools.lru_cache. + When in a torch-compiled region it's a no-op. + """ + + def decorator(func): + # Create the cached version using lru_cache + cached_func = functools.lru_cache(maxsize=maxsize, typed=typed)(func) + + @functools.wraps(func) + def wrapper(*args, **kwargs): + # Check if we're in a torch-compiled region + from cosmos3._src.imaginaire.attention.utils.environment import is_torch_compiling + + if is_torch_compiling(): + # Bypass cache during compilation + return func(*args, **kwargs) + else: + # Use cached version normally + return cached_func(*args, **kwargs) + + # Expose cache_clear and cache_info methods when not compiling + wrapper.cache_clear = cached_func.cache_clear + wrapper.cache_info = cached_func.cache_info + wrapper.__wrapped__ = func + + return wrapper + + # Support both @lru_cache and @lru_cache() syntax + # If called without parentheses (maxsize is actually the function) + if callable(maxsize): + func = maxsize + maxsize = 128 + return decorator(func) + + return decorator diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/safe_ops/log.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/safe_ops/log.py new file mode 100644 index 00000000..a2b3332e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/safe_ops/log.py @@ -0,0 +1,64 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +torch.compile-safe log wrappers. +""" + +from cosmos3._src.imaginaire.attention.utils.environment import is_torch_compiling +from cosmos3._src.imaginaire.utils import log + + +def trace(message: str, rank0_only: bool = True) -> None: + if not is_torch_compiling(): + log.trace(message=message, rank0_only=rank0_only) + + +def debug(message: str, rank0_only: bool = True) -> None: + if not is_torch_compiling(): + log.debug(message=message, rank0_only=rank0_only) + + +def info(message: str, rank0_only: bool = True) -> None: + if not is_torch_compiling(): + log.info(message=message, rank0_only=rank0_only) + + +def success(message: str, rank0_only: bool = True) -> None: + if not is_torch_compiling(): + log.success(message=message, rank0_only=rank0_only) + + +def warning(message: str, rank0_only: bool = True) -> None: + if not is_torch_compiling(): + log.warning(message=message, rank0_only=rank0_only) + + +def error(message: str, rank0_only: bool = True) -> None: + if not is_torch_compiling(): + log.error(message=message, rank0_only=rank0_only) + + +def critical(message: str, rank0_only: bool = True) -> None: + if not is_torch_compiling(): + log.critical(message=message, rank0_only=rank0_only) + + +def exception(message: str, rank0_only: bool = True) -> None: + if not is_torch_compiling(): + log.exception(message=message, rank0_only=rank0_only) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/version.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/version.py new file mode 100644 index 00000000..57da5547 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/utils/version.py @@ -0,0 +1,50 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Utilities: version checking helpers shared across backends. +""" + +from packaging.version import Version + + +def parse_version(version_str: str) -> Version | None: + """Parse a version string into a ``packaging.version.Version``, returning ``None`` on failure.""" + try: + return Version(version_str) + except Exception: + return None + + +def version_at_least(version_str: str, min_version: str) -> bool: + """Return ``True`` if *version_str* >= *min_version*. Returns ``False`` on parse failure.""" + v = parse_version(version_str) + m = parse_version(min_version) + if v is None or m is None: + return False + return v >= m + + +def version_in_range(version_str: str, min_version: str, max_version: str) -> bool: + """Return ``True`` if *min_version* <= *version_str* <= *max_version*. 
Returns ``False`` on parse failure.""" + v = parse_version(version_str) + lo = parse_version(min_version) + hi = parse_version(max_version) + if v is None or lo is None or hi is None: + return False + return lo <= v <= hi diff --git a/cosmos-inference/cosmos3/_src/imaginaire/attention/varlen.py b/cosmos-inference/cosmos3/_src/imaginaire/attention/varlen.py new file mode 100644 index 00000000..596887b9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/attention/varlen.py @@ -0,0 +1,225 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Imaginaire4 Attention Subpackage: +Unified implementation for all Attention implementations. + +Varlen utilities +""" + +import torch +from torch import Tensor + +from cosmos3._src.imaginaire.attention.utils import is_torch_compiling + + +def generate_varlen_parameters( + query: Tensor, # [1,S_total_Q,H,D] + key: Tensor, # [1,S_total_KV,H_KV,D] + value: Tensor, # [1,S_total_KV,H_KV,D_V] + seqlens_Q: Tensor | None = None, # [B] + seqlens_KV: Tensor | None = None, # [B] +) -> ( + tuple[None, None, int, int] | tuple[Tensor, Tensor, int, int] +): # (cumseqlen_Q[B+1], cumseqlen_KV[B+1], max_seqlen_Q, max_seqlen_KV) + + # which we launch the varlen kernel) and not device tensors. + # .item() introduces control flow and breaks the graph. 
+ # It is also inefficient to repeat this per-op, and mostly there for convenience. + # generate_varlen_parameters should ideally always be called by the user ahead of model + # forward / backward. + if is_torch_compiling(): + raise RuntimeError( + "Running 'generate_varlen_parameters' in a torch-compiled region is disallowed as it " + "results in graph breaks. Please consider calling ahead of time and pass " + "'cumulative_seqlen_{Q,KV}' and 'max_seqlen_{Q,KV}' instead of 'seqlens_{Q,KV}' to " + "'attention'. " + ) + + if query.shape[0] != key.shape[0] or query.shape[0] != value.shape[0]: + raise ValueError( + f"Q, K, and V must match in batch size, got {query.shape[0]=}, {key.shape[0]=}, {value.shape[0]=}." + ) + + if (seqlens_Q is None) ^ (seqlens_KV is None): + raise ValueError( + "Variable length Attention requires both of seqlens_Q and seqlens_KV to be set, got " + f"{seqlens_Q=}, {seqlens_KV=}." + ) + + if seqlens_Q is None and seqlens_KV is None: + # Not varlen + return None, None, 0, 0 + + assert seqlens_Q is not None + assert seqlens_KV is not None + + if not isinstance(seqlens_Q, Tensor) or not isinstance(seqlens_KV, Tensor): + raise ValueError("seqlens_Q and seqlens_KV must both be tensors.") + + if seqlens_Q.device != query.device or seqlens_KV.device != query.device: + raise ValueError( + "seqlens_Q and seqlens_KV must be on the same device as QKV, but " + f"{seqlens_Q.device=}, {seqlens_KV.device=}, {query.device=}." + ) + + if seqlens_Q.dtype != torch.int32 or seqlens_KV.dtype != torch.int32: + raise ValueError( + f"seqlens_Q and seqlens_KV must both be torch.int32 tensors, got {seqlens_Q.dtype=}, {seqlens_KV.dtype=}." + ) + + if seqlens_Q.dim() != 1 or seqlens_KV.dim() != 1: + raise ValueError( + f"seqlens_Q and seqlens_KV must both be 1-D tensors, got {seqlens_Q.dim()=}, {seqlens_KV.dim()=}." 
+ ) + + if seqlens_Q.shape[0] != seqlens_KV.shape[0]: + raise ValueError(f"seqlens_Q and seqlens_KV must match in size, got {seqlens_Q.shape=}, {seqlens_KV.shape=}.") + + if seqlens_Q.shape[0] < 1: + raise ValueError( + f"seqlens_Q and seqlens_KV must contain at least one element, got {seqlens_Q.shape=}, {seqlens_KV.shape=}." + ) + + if query.shape[0] != 1: + raise ValueError( + f"Variable length attention only supports sequence-packed memory layout (batch = 1), got {query.shape[0]=}." + ) + + assert seqlens_Q.dim() == seqlens_KV.dim() == 1 + assert seqlens_Q.shape[0] == seqlens_KV.shape[0] >= 1 + assert seqlens_Q.dtype == seqlens_KV.dtype == torch.int32 + + max_seqlen_Q = seqlens_Q.max().item() # type: ignore + max_seqlen_KV = seqlens_KV.max().item() # type: ignore + + if max_seqlen_Q < 0 or max_seqlen_KV < 0: + raise ValueError(f"max_seqlen_Q and max_seqlen_KV cannot be negative, got {max_seqlen_Q=}, {max_seqlen_KV=}.") + + + # This feature may require support in the backends themselves; see NATTEN PR: + # https://github.com/SHI-Labs/NATTEN/pull/327 + if (max_seqlen_Q == 0) != (max_seqlen_KV == 0): + raise ValueError( + "max_seqlen_Q and max_seqlen_KV must either both be 0 or both be positive, " + f"but computed {max_seqlen_Q=}, {max_seqlen_KV=} from provided seqlens." 
+ ) + + + z = torch.tensor([0], dtype=torch.int32, device=seqlens_Q.device) # [1] + cumulative_seqlen_Q = torch.cat([z, seqlens_Q.cumsum(0).to(torch.int32)], dim=0) # [B+1] + cumulative_seqlen_KV = torch.cat([z, seqlens_KV.cumsum(0).to(torch.int32)], dim=0) # [B+1] + + assert isinstance(max_seqlen_Q, int) + assert isinstance(max_seqlen_KV, int) + + return ( + cumulative_seqlen_Q, + cumulative_seqlen_KV, + max_seqlen_Q, + max_seqlen_KV, + ) + + +def generate_multi_dim_varlen_parameters( + token_layout_list: list, + head_dim: int, + device: torch.device, + dtype: torch.dtype, + requires_grad: bool, + window_size_list: list | None = None, + stride_list: list | None = None, + dilation_list: list | None = None, + is_causal: tuple | bool = False, + *args, + **kwargs, +) -> dict: + """ + Configures metadata for variable-length multi-dimensional attention operations. + + This function prepares the metadata needed for varlen/varsized sparse attention, + including backend selection and tile configurations. The metadata should be generated + ahead of time (outside of torch.compile regions) and reused across forward/backward passes. + + **Requires NATTEN >= 0.21.9.dev0** + + Parameters: + token_layout_list (list): List of token layout tuples describing the spatial arrangement + of tokens for each sequence. For example, for 2D attention with two sequences of + sizes (H1, W1) and (H2, W2), pass [(H1, W1), (H2, W2)]. + + head_dim (int): Attention head dimension. + + device (torch.device): Target device for runtime. + + dtype (torch.dtype): Tensor element type. + + requires_grad (bool): Whether tensors will require backward pass. + + window_size_list (list | None): Per-sequence window sizes for variable kernel sizes. + + stride_list (list | None): Per-sequence stride values for variable strides. + + dilation_list (list | None): Per-sequence dilation values for variable dilations. + + is_causal (tuple | bool): Toggle causal masking. Default is False. 
+ + Returns: + dict: Runtime metadata for varlen operations. This dict should be passed to + `natten_multi_dimensional_attention_varlen` as the `metadata` parameter. + """ + # For now, NATTEN is the only backend that supports varlen multi-dimensional attention + + from cosmos3._src.imaginaire.attention.natten import NATTEN_VARLEN_MULTI_DIM_VERSION, natten_supported, natten_version_satisfies + + if not natten_supported(): + raise RuntimeError("generate_multi_dim_varlen_parameters requires NATTEN.") + + if not natten_version_satisfies(NATTEN_VARLEN_MULTI_DIM_VERSION): + raise RuntimeError( + f"generate_multi_dim_varlen_parameters requires NATTEN >= {NATTEN_VARLEN_MULTI_DIM_VERSION}. " + "Please upgrade NATTEN to use varlen/varsized attention features." + ) + + from natten.varlen import configure_varlen + + # Map -1s in window size list to full attention + if window_size_list is None: + window_size_list_filtered = [token_layout for token_layout in token_layout_list] + else: + window_size_list_filtered = [] + for window_size, token_layout in zip(window_size_list, token_layout_list): + window_size_filtered = tuple(k if k > 0 else x for k, x in zip(window_size, token_layout)) + window_size_list_filtered.append(window_size_filtered) + + metadata = configure_varlen( + token_layout_list=token_layout_list, + head_dim=head_dim, + device=device, + dtype=dtype, + requires_grad=requires_grad, + is_causal=is_causal, + kernel_size=None, + stride=None, + dilation=None, + kernel_size_list=window_size_list_filtered, + stride_list=stride_list, + dilation_list=dilation_list, + *args, + **kwargs, + ) + + return metadata diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. 
All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
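Reviewer note on `generate_varlen_parameters` (added earlier in this diff): the cumulative-offset layout it returns can be sketched without torch. This is a minimal illustration, assuming plain Python lists in place of the int32 tensors; `varlen_offsets` is a hypothetical helper and not part of the patch.

```python
from itertools import accumulate


def varlen_offsets(seqlens: list[int]) -> tuple[list[int], int]:
    """Hypothetical pure-Python analogue of the cumsum step (not part of the patch)."""
    if not seqlens:
        raise ValueError("seqlens must contain at least one element.")
    # Prepend 0 so that sequence i occupies tokens [offsets[i], offsets[i + 1])
    # in the sequence-packed (batch = 1) layout.
    offsets = [0, *accumulate(seqlens)]
    return offsets, max(seqlens)


offsets, max_len = varlen_offsets([3, 5, 2])
print(offsets, max_len)  # [0, 3, 8, 10] 5
```

The resulting list has one more entry than `seqlens`, which matches the `[B+1]` shape noted in the patch comments.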
diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/blocklist.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/blocklist.py new file mode 100644 index 00000000..53e3259a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/blocklist.py @@ -0,0 +1,248 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +import argparse +import os +import re +import string +from difflib import SequenceMatcher + +import nltk +from better_profanity import profanity + +from cosmos3._src.imaginaire.auxiliary.guardrail.blocklist.utils import read_keyword_list_from_dir, to_ascii +from cosmos3._src.imaginaire.auxiliary.guardrail.common.core import ( + GUARDRAIL1_CHECKPOINT, + ContentSafetyGuardrail, + GuardrailRunner, +) +from cosmos3._src.imaginaire.utils import log, misc + +CENSOR = misc.Color.red("*") + + +class Blocklist(ContentSafetyGuardrail): + def __init__( + self, + guardrail_partial_match_min_chars: int = 6, + guardrail_partial_match_letter_count: float = 0.4, + ) -> None: + """Blocklist model for text filtering safety check. + Keyword lists are read from the "blocklist" directory of the downloaded GUARDRAIL1_CHECKPOINT. + + Args: + guardrail_partial_match_min_chars (int, optional): Minimum number of characters in a word to check for partial match. Defaults to 6. + guardrail_partial_match_letter_count (float, optional): Maximum allowed difference in characters for partial match. Defaults to 0.4. 
+ """ + self.checkpoint_dir = os.path.join(GUARDRAIL1_CHECKPOINT.download(), "blocklist") + nltk.data.path.append(os.path.join(self.checkpoint_dir, "nltk_data")) + self.lemmatizer = nltk.WordNetLemmatizer() + self.profanity = profanity + self.guardrail_partial_match_min_chars = guardrail_partial_match_min_chars + self.guardrail_partial_match_letter_count = guardrail_partial_match_letter_count + + # Load blocklist and whitelist keywords + self.blocklist_words = read_keyword_list_from_dir(os.path.join(self.checkpoint_dir, "custom")) + self.whitelist_words = read_keyword_list_from_dir(os.path.join(self.checkpoint_dir, "whitelist")) + self.exact_match_words = read_keyword_list_from_dir(os.path.join(self.checkpoint_dir, "exact_match")) + + self.profanity.load_censor_words(custom_words=self.blocklist_words, whitelist_words=self.whitelist_words) + log.debug(f"Loaded {len(self.blocklist_words)} words/phrases from blocklist") + log.debug(f"Whitelisted {len(self.whitelist_words)} words/phrases from whitelist") + log.debug(f"Loaded {len(self.exact_match_words)} exact match words/phrases from blocklist") + + def uncensor_whitelist(self, input_prompt: str, censored_prompt: str) -> str: + """Explicitly uncensor words that are in the whitelist.""" + input_words = input_prompt.split() + censored_words = censored_prompt.split() + whitelist_words = set(self.whitelist_words) + for i, token in enumerate(input_words): + if token.strip(string.punctuation).lower() in whitelist_words: + censored_words[i] = token + censored_prompt = " ".join(censored_words) + return censored_prompt + + def censor_prompt(self, input_prompt: str) -> tuple[bool, str]: + """Censor the prompt using the blocklist with better-profanity fuzzy matching. 
+ + Args: + input_prompt: input prompt to censor + + Returns: + bool: True if the prompt is blocked, False otherwise + str: A message indicating why the prompt was blocked + """ + censored_prompt = self.profanity.censor(input_prompt, censor_char=CENSOR) + # Uncensor whitelisted words that were censored from blocklist fuzzy matching + censored_prompt = self.uncensor_whitelist(input_prompt, censored_prompt) + if CENSOR in censored_prompt: + return True, f"Prompt blocked by censorship: Censored Prompt: {censored_prompt}" + return False, "" + + @staticmethod + def check_partial_match( + normalized_prompt: str, normalized_word: str, guardrail_partial_match_letter_count: float + ) -> tuple[bool, str]: + """ + Check robustly if normalized word and the matching target have a difference of up to guardrail_partial_match_letter_count characters. + + Args: + normalized_prompt: a string with many words + normalized_word: a string with one or multiple words, its length is smaller than normalized_prompt + guardrail_partial_match_letter_count: maximum allowed difference in characters (float to allow partial characters) + + Returns: + bool: True if a match is found, False otherwise + str: A message indicating why the prompt was blocked + """ + prompt_words = normalized_prompt.split() + word_length = len(normalized_word.split()) + max_similarity_ratio = (len(normalized_word) - float(guardrail_partial_match_letter_count)) / float( + len(normalized_word) + ) + + seq_matcher = SequenceMatcher(None) + seq_matcher.set_seq2(normalized_word) + + for i in range(len(prompt_words) - word_length + 1): + # Extract a substring from the prompt with the same number of words as the normalized_word + substring = " ".join(prompt_words[i : i + word_length]) + seq_matcher.set_seq1(substring) + + # real_quick_ratio and quick_ratio are faster than ratio and both serve as upper bound for similarity ratio. 
+ # If they are less than max_similarity_ratio, it means that also the ratio will be less than max_similarity_ratio and we can skip the expensive ratio computation. + # This saves a lot of time because in practice the tested words are usually dissimilar. + # For details see: https://docs.python.org/3/library/difflib.html#difflib.SequenceMatcher + if ( + seq_matcher.real_quick_ratio() < max_similarity_ratio + or seq_matcher.quick_ratio() < max_similarity_ratio + ): + continue + + similarity_ratio = seq_matcher.ratio() + if similarity_ratio >= max_similarity_ratio: + return ( + True, + f"Prompt blocked by partial match blocklist: Prompt: {normalized_prompt}, Partial Match Word: {normalized_word}", + ) + + return False, "" + + @staticmethod + def check_against_whole_word_blocklist( + prompt: str, + blocklist: list[str], + guardrail_partial_match_min_chars: int = 6, + guardrail_partial_match_letter_count: float = 0.4, + ) -> tuple[bool, str]: + """ + Check if the prompt contains any whole words from the blocklist. + The match is case insensitive and robust to multiple spaces between words. 
+ + Args: + prompt: input prompt to check + blocklist: list of words to check against + guardrail_partial_match_min_chars: minimum number of characters in a word to check for partial match + guardrail_partial_match_letter_count: maximum allowed difference in characters for partial match + + Returns: + tuple[bool, str]: (True if a match is found, False otherwise), message indicating why the prompt was blocked + """ + # Normalize spaces and convert to lowercase + normalized_prompt = re.sub(r"\s+", " ", prompt).strip().lower() + + normalized_words_cache = set() + + for word in blocklist: + # Normalize spaces and convert to lowercase for each blocklist word + normalized_word = re.sub(r"\s+", " ", word).strip().lower() + + if normalized_word in normalized_words_cache: + continue + + normalized_words_cache.add(normalized_word) + + # Use word boundaries to ensure whole word match + if re.search(r"\b" + re.escape(normalized_word) + r"\b", normalized_prompt): + return True, f"Prompt blocked by exact match blocklist: Prompt: {prompt}, Exact Match Word: {word}" + + # Roughly 3/4 of the time this function requires is spent on partial matching. + # We could use just one for loop to check both exact and partial matches but doing it in two loops is faster in practice + # because it delays the partial matching as long as possible with a chance of early exit due to exact match. + # Above we cache the normalized words and here we reuse them in the second loop for partial matching. 
+ + for normalized_word in normalized_words_cache: + # Check for partial match if the word is long enough + if len(normalized_word) >= guardrail_partial_match_min_chars: + match, message = Blocklist.check_partial_match( + normalized_prompt, normalized_word, guardrail_partial_match_letter_count + ) + if match: + return True, message + + return False, "" + + def is_safe(self, input_prompt: str = "") -> tuple[bool, str]: + """Check if the input prompt is safe using the blocklist.""" + # Check if the input is empty + if not input_prompt: + return False, "Input is empty" + input_prompt = to_ascii(input_prompt) + + # Check full sentence for censored words + censored, message = self.censor_prompt(input_prompt) + if censored: + return False, message + + # Check lemmatized words for censored words + tokens = nltk.word_tokenize(input_prompt) + lemmas = [self.lemmatizer.lemmatize(token) for token in tokens] + lemmatized_prompt = " ".join(lemmas) + censored, message = self.censor_prompt(lemmatized_prompt) + if censored: + return False, message + + # Check for exact match blocklist words + censored, message = self.check_against_whole_word_blocklist( + input_prompt, + self.exact_match_words, + self.guardrail_partial_match_min_chars, + self.guardrail_partial_match_letter_count, + ) + if censored: + return False, message + + # If all these checks pass, the input is safe + return True, "Input is safe" + + +def parse_args(): + parser = argparse.ArgumentParser() + parser.add_argument("--prompt", type=str, required=True, help="Input prompt") + return parser.parse_args() + + +def main(args): + blocklist = Blocklist() + runner = GuardrailRunner(safety_models=[blocklist]) + with misc.timer("blocklist safety check"): + safety, message = runner.run_safety_check(args.prompt) + log.info(f"Input is: {'SAFE' if safety else 'UNSAFE'}") + log.info(f"Message: {message}") if not safety else None + + +if __name__ == "__main__": + args = parse_args() + main(args) diff --git 
a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/profile_blocklist.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/profile_blocklist.py new file mode 100644 index 00000000..ad03cd5b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/profile_blocklist.py @@ -0,0 +1,59 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import time + +from cosmos3._src.imaginaire.auxiliary.guardrail.blocklist.blocklist import Blocklist +from cosmos3._src.imaginaire.auxiliary.guardrail.common.core import GuardrailRunner + +# Test prompts +prompts = [ + "Create a beautiful landscape", + "Generate a detailed cinematic scene of a bustling medieval marketplace with merchants selling colorful spices, fabrics hanging from wooden stalls, cobblestone streets filled with people in period clothing, warm golden hour lighting filtering through ancient stone buildings, and smoke rising from food vendors cooking over open fires", + "Create an image of Adele", + "lot of words" * 100 + "Create an image of Adele", + "lot of words" * 100 + "Create an image of Adeele", + "Make an image of someone getting hurt badly", + "Generate a disturbing scene with graphic violence showing people fighting brutally with weapons, blood splattered everywhere, gruesome injuries visible in high detail, screaming faces contorted in pain and terror, dark atmosphere with dramatic lighting emphasizing the carnage and destruction, bodies lying motionless on the ground surrounded by pools of crimson liquid", + "Show me a peaceful garden with flowers blooming in spring sunlight", + "Design a futuristic space station orbiting Earth with advanced technology, sleek metallic surfaces reflecting starlight, astronauts in cutting-edge spacesuits conducting research, multiple docking bays with various spacecraft, solar panels gleaming in the cosmic void, and Earth's blue marble visible in the background through massive observation windows", +] + +# Blocklist() takes no checkpoint_dir argument; it downloads GUARDRAIL1_CHECKPOINT itself. 
+ +# Initialize +blocklist = Blocklist() +runner = GuardrailRunner(safety_models=[blocklist]) + +# Warm up +_ = runner.run_safety_check(prompts[0]) + + +times = [] +for prompt in prompts: + start = time.perf_counter() + safe, message = runner.run_safety_check(prompt) + end = time.perf_counter() + + elapsed = end - start + 
times.append(elapsed) + + print(f"Prompt: '{prompt[:50]}...'") + print(f"Safe: {safe}, Time: {elapsed:.4f}s") + if message: + print(f"Message: {message}") + print("-" * 40) + +print(f"\nAverage time: {sum(times) / len(times):.4f}s") diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/utils.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/utils.py new file mode 100644 index 00000000..9e72484d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/blocklist/utils.py @@ -0,0 +1,45 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +import re + +from cosmos3._src.imaginaire.utils import log + + +def read_keyword_list_from_dir(folder_path: str) -> list[str]: + """Read keyword list from all files in a folder.""" + output_list = [] + file_list = [] + # Get list of files in the folder + for file in os.listdir(folder_path): + if os.path.isfile(os.path.join(folder_path, file)): + file_list.append(file) + + # Process each file + for file in file_list: + file_path = os.path.join(folder_path, file) + try: + with open(file_path) as f: + output_list.extend([line.strip() for line in f.readlines()]) + except Exception as e: + log.error(f"Error reading file {file}: {e!s}") + + return output_list + + +def to_ascii(prompt: str) -> str: + """Convert prompt to ASCII.""" + return re.sub(r"[^\x00-\x7F]+", " ", prompt) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
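Reviewer note on `Blocklist.check_partial_match` (in `blocklist.py` above): the similarity threshold may be easier to see in isolation. The sketch below, assuming only the standard-library `difflib`, applies the same threshold formula to a single candidate substring. `is_partial_match` is a hypothetical name, and the sliding window over prompt words is omitted.

```python
from difflib import SequenceMatcher


def is_partial_match(candidate: str, blocked: str, letter_count: float = 0.4) -> bool:
    """Hypothetical single-substring version of the check (not part of the patch)."""
    # Same threshold formula as in the patch: (len - letter_count) / len.
    threshold = (len(blocked) - letter_count) / len(blocked)
    matcher = SequenceMatcher(None, candidate, blocked)
    # real_quick_ratio() and quick_ratio() are cheap upper bounds on ratio(),
    # so a value below the threshold lets us skip the expensive ratio() call.
    if matcher.real_quick_ratio() < threshold or matcher.quick_ratio() < threshold:
        return False
    return matcher.ratio() >= threshold


print(is_partial_match("adele", "adele"))        # True: identical strings
print(is_partial_match("adeele", "adele"))       # False: ratio 10/11 is below 0.92
print(is_partial_match("adeele", "adele", 1.0))  # True: threshold drops to 0.8
```

With the default `letter_count` of 0.4 the threshold sits very close to 1.0, so in this sketch a single inserted letter is already enough to fall below it; a larger value loosens the match.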
diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/core.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/core.py new file mode 100644 index 00000000..c6bb336f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/core.py @@ -0,0 +1,79 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Any + +import numpy as np + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.checkpoint_db import ( + CheckpointDirHf, +) + +GUARDRAIL1_CHECKPOINT = CheckpointDirHf( + repository="nvidia/Cosmos-Guardrail1", + revision="d6d4bfa899a71454a700907664f3e88f503950cf", +) + + +class ContentSafetyGuardrail: + def is_safe(self, **kwargs) -> tuple[bool, str]: + raise NotImplementedError("Child classes must implement the is_safe method") + + +class PostprocessingGuardrail: + def postprocess(self, frames: np.ndarray) -> np.ndarray: + raise NotImplementedError("Child classes must implement the postprocess method") + + +class GuardrailRunner: + def __init__( + self, + safety_models: list[ContentSafetyGuardrail] | None = None, + generic_block_msg: str = "", + generic_safe_msg: str = "", + postprocessors: list[PostprocessingGuardrail] | None = None, + ): + self.safety_models = safety_models + self.generic_block_msg = generic_block_msg + self.generic_safe_msg = generic_safe_msg if generic_safe_msg else "Prompt is safe" + self.postprocessors = postprocessors + + def run_safety_check(self, input: Any) -> tuple[bool, str]: + """Run the safety check on the input.""" + if not self.safety_models: + log.warning("No safety models found, returning safe") + return True, self.generic_safe_msg + + for guardrail in self.safety_models: + guardrail_name = str(guardrail.__class__.__name__).upper() + log.debug(f"Running guardrail: {guardrail_name}") + safe, message = guardrail.is_safe(input) + if not safe: + reasoning = self.generic_block_msg if self.generic_block_msg else f"{guardrail_name}: {message}" + return False, reasoning + return True, self.generic_safe_msg + + def postprocess(self, frames: np.ndarray) -> np.ndarray: + """Run the postprocessing on the video frames.""" + if not self.postprocessors: + log.warning("No postprocessors found, returning original frames") + return frames + + for guardrail in self.postprocessors: + 
guardrail_name = str(guardrail.__class__.__name__).upper() + log.debug(f"Running guardrail: {guardrail_name}") + frames = guardrail.postprocess(frames) + return frames diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/io_utils.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/io_utils.py new file mode 100644 index 00000000..8a2183ef --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/io_utils.py @@ -0,0 +1,78 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import glob +from dataclasses import dataclass + +import imageio +import numpy as np + +from cosmos3._src.imaginaire.utils import log + + +@dataclass +class VideoData: + frames: np.ndarray # Shape: [B, H, W, C] + fps: int + duration: int # in seconds + + +def get_video_filepaths(input_dir: str) -> list[str]: + """Get a list of filepaths for all videos in the input directory.""" + paths = glob.glob(f"{input_dir}/**/*.mp4", recursive=True) + paths += glob.glob(f"{input_dir}/**/*.avi", recursive=True) + paths += glob.glob(f"{input_dir}/**/*.mov", recursive=True) + paths = sorted(paths) + log.debug(f"Found {len(paths)} videos") + return paths + + +def read_video(filepath: str) -> VideoData: + """Read a video file and extract its frames and metadata.""" + try: + reader = imageio.get_reader(filepath, "ffmpeg") + except Exception as e: + raise ValueError(f"Failed to read video file: {filepath}") from e + + # Extract metadata from the video file + try: + metadata = reader.get_meta_data() + fps = metadata.get("fps") + duration = metadata.get("duration") + except Exception as e: + reader.close() + raise ValueError(f"Failed to extract metadata from video file: {filepath}") from e + + # Extract frames from the video file + try: + frames = np.array([frame for frame in reader]) + except Exception as e: + raise ValueError(f"Failed to extract frames from video file: {filepath}") from e + finally: + reader.close() + + return VideoData(frames=frames, fps=fps, duration=duration) + + +def save_video(filepath: str, frames: np.ndarray, fps: int) -> None: + """Save a video file from a sequence of frames.""" + # The context manager closes the writer on any error, and avoids calling close() + # on an unbound name when get_writer itself raises. 
+ try: + with imageio.get_writer(filepath, fps=fps, macro_block_size=1) as writer: + for frame in frames: + writer.append_data(frame) + except Exception as e: + raise ValueError(f"Failed to save video file to {filepath}") from e diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/presets.py 
b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/presets.py new file mode 100644 index 00000000..1a6a7647 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/common/presets.py @@ -0,0 +1,78 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import numpy as np + +from cosmos3._src.imaginaire.auxiliary.guardrail.blocklist.blocklist import Blocklist +from cosmos3._src.imaginaire.auxiliary.guardrail.common.core import GuardrailRunner +from cosmos3._src.imaginaire.auxiliary.guardrail.face_blur_filter.face_blur_filter import RetinaFaceFilter +from cosmos3._src.imaginaire.auxiliary.guardrail.qwen3guard.qwen3guard import Qwen3Guard +from cosmos3._src.imaginaire.auxiliary.guardrail.video_content_safety_filter.video_content_safety_filter import ( + VideoContentSafetyFilter, +) +from cosmos3._src.imaginaire.utils import log + + +def create_text_guardrail_runner(offload_model_to_cpu: bool = False) -> GuardrailRunner: + """Create the text guardrail runner.""" + return GuardrailRunner( + safety_models=[ + Blocklist(), + Qwen3Guard(offload_model_to_cpu=offload_model_to_cpu), + ] + ) + + +def create_video_guardrail_runner(offload_model_to_cpu: bool = False) -> GuardrailRunner: + """Create the video guardrail runner.""" + return GuardrailRunner( + 
safety_models=[VideoContentSafetyFilter(offload_model_to_cpu=offload_model_to_cpu)], + postprocessors=[RetinaFaceFilter(offload_model_to_cpu=offload_model_to_cpu)], + ) + + +def run_text_guardrail(prompt: str, guardrail_runner: GuardrailRunner) -> bool: + """Run the text guardrail on the prompt, checking for content safety. + + Args: + prompt: The text prompt. + guardrail_runner: The text guardrail runner. + + Returns: + bool: Whether the prompt is safe. + """ + is_safe, message = guardrail_runner.run_safety_check(prompt) + if not is_safe: + log.critical(f"GUARDRAIL BLOCKED: {message}") + return is_safe + + +def run_video_guardrail(frames: np.ndarray, guardrail_runner: GuardrailRunner) -> np.ndarray | None: + """Run the video guardrail on the frames, checking for content safety and applying face blur. + + Args: + frames: The frames of the generated video. + guardrail_runner: The video guardrail runner. + + Returns: + The processed frames if safe, otherwise None. + """ + is_safe, message = guardrail_runner.run_safety_check(frames) + if not is_safe: + log.critical(f"GUARDRAIL BLOCKED: {message}") + return None + + frames = guardrail_runner.postprocess(frames) + return frames diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/blur_utils.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/blur_utils.py new file mode 100644 index 00000000..d52f69d2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/blur_utils.py @@ -0,0 +1,35 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import cv2 +import numpy as np + + +def pixelate_face(face_img: np.ndarray, blocks: int = 5) -> np.ndarray: + """ + Pixelate a face region by reducing resolution and then upscaling. 
+ + Args: + face_img: Face region to pixelate + blocks: Number of blocks to divide the face into (in each dimension) + + Returns: + Pixelated face region + """ + h, w = face_img.shape[:2] + # Shrink the image and scale back up to create pixelation effect + temp = cv2.resize(face_img, (blocks, blocks), interpolation=cv2.INTER_LINEAR) + pixelated = cv2.resize(temp, (w, h), interpolation=cv2.INTER_NEAREST) + return pixelated diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/face_blur_filter.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/face_blur_filter.py new file mode 100644 index 00000000..657cc99f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/face_blur_filter.py @@ -0,0 +1,242 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import argparse +import os +import warnings + +import numpy as np +import torch +from retinaface.data import cfg_re50 +from retinaface.layers.functions.prior_box import PriorBox +from retinaface.models.retinaface import RetinaFace +from tqdm import tqdm + +from cosmos3._src.imaginaire.auxiliary.guardrail.common.core import ( + GUARDRAIL1_CHECKPOINT, + GuardrailRunner, + PostprocessingGuardrail, +) +from cosmos3._src.imaginaire.auxiliary.guardrail.common.io_utils import ( + get_video_filepaths, + read_video, + save_video, +) +from cosmos3._src.imaginaire.auxiliary.guardrail.face_blur_filter.blur_utils import pixelate_face +from cosmos3._src.imaginaire.auxiliary.guardrail.face_blur_filter.retinaface_utils import ( + decode_batch, + filter_detected_boxes, + load_model, +) +from cosmos3._src.imaginaire.utils import log, misc + +# RetinaFace model constants from https://github.com/biubug6/Pytorch_Retinaface/blob/master/detect.py +TOP_K = 5_000 +KEEP_TOP_K = 750 +NMS_THRESHOLD = 0.4 + + +class RetinaFaceFilter(PostprocessingGuardrail): + def __init__( + self, + batch_size: int = 1, + confidence_threshold: float = 0.7, + offload_model_to_cpu: bool = True, + ) -> None: + """ + Initialize the RetinaFace model for face detection and blurring. + + Args: + checkpoint: Path to the RetinaFace checkpoint file + batch_size: Batch size for RetinaFace inference and processing + confidence_threshold: Minimum confidence score to consider a face detection + offload_model_to_cpu (bool, optional): Whether to offload the model to CPU. Defaults to True. 
+ """ + self.checkpoint = f"{GUARDRAIL1_CHECKPOINT.download()}/face_blur_filter/Resnet50_Final.pth" + self.cfg = cfg_re50 + self.batch_size = batch_size + self.confidence_threshold = confidence_threshold + self.dtype = torch.float32 + self.offload_model = offload_model_to_cpu + + # Disable loading ResNet pretrained weights + self.cfg["pretrain"] = False + with warnings.catch_warnings(): + warnings.simplefilter("ignore") + self.net = RetinaFace(cfg=self.cfg, phase="test") + + # Load from RetinaFace pretrained checkpoint + if not offload_model_to_cpu: + self.net = load_model(self.net, self.checkpoint, False) + self.net.to("cuda", dtype=self.dtype).eval() + log.debug("Moved face blur filter to GPU") + else: + self.net = load_model(self.net, self.checkpoint, True) + self.net.to("cpu", dtype=self.dtype).eval() + log.debug("Moved face blur filter to CPU") + + def preprocess_frames(self, frames: np.ndarray) -> torch.Tensor: + """Preprocess a sequence of frames for face detection. + + Args: + frames: Input frames + + Returns: + Preprocessed frames tensor + """ + with torch.no_grad(): + frames_tensor = torch.from_numpy(frames).to("cuda", dtype=self.dtype) # Shape: [T, H, W, C] + frames_tensor = frames_tensor.permute(0, 3, 1, 2) # Shape: [T, C, H, W] + frames_tensor = frames_tensor[:, [2, 1, 0], :, :] # RGB to BGR to match RetinaFace model input + means = torch.tensor([104.0, 117.0, 123.0], device="cuda", dtype=self.dtype).view(1, 3, 1, 1) + frames_tensor = frames_tensor - means # Subtract mean BGR values for each channel + return frames_tensor + + def blur_detected_faces( + self, + frames: np.ndarray, + batch_loc: torch.Tensor, + batch_conf: torch.Tensor, + prior_data: torch.Tensor, + scale: torch.Tensor, + min_size: tuple[int] = (20, 20), + ) -> list[np.ndarray]: + """Blur detected faces in a batch of frames using RetinaFace predictions. 
+ + Args: + frames: Input frames + batch_loc: Batched location predictions + batch_conf: Batched confidence scores + prior_data: Prior boxes for the video + scale: Scale factor for resizing detections + min_size: Minimum size of a detected face region in pixels + + Returns: + Processed frames with pixelated faces + """ + with torch.no_grad(): + batch_boxes = decode_batch(batch_loc, prior_data, self.cfg["variance"]) + batch_boxes = batch_boxes * scale + + blurred_frames = [] + for i, boxes in enumerate(batch_boxes): + boxes = boxes.detach().cpu().numpy() + scores = batch_conf[i, :, 1].detach().cpu().numpy() + + filtered_boxes = filter_detected_boxes( + boxes, + scores, + confidence_threshold=self.confidence_threshold, + nms_threshold=NMS_THRESHOLD, + top_k=TOP_K, + keep_top_k=KEEP_TOP_K, + ) + + frame = frames[i] + for box in filtered_boxes: + x1, y1, x2, y2 = map(int, box) + # Ignore bounding boxes smaller than the minimum size + if x2 - x1 < min_size[0] or y2 - y1 < min_size[1]: + continue + max_h, max_w = frame.shape[:2] + face_roi = frame[max(y1, 0) : min(y2, max_h), max(x1, 0) : min(x2, max_w)] + blurred_face = pixelate_face(face_roi) + frame[max(y1, 0) : min(y2, max_h), max(x1, 0) : min(x2, max_w)] = blurred_face + blurred_frames.append(frame) + + return blurred_frames + + def postprocess(self, frames: np.ndarray) -> np.ndarray: + """Blur faces in a sequence of frames. 
+ + Args: + frames: Input frames + + Returns: + Processed frames with pixelated faces + """ + if self.offload_model: + self.net = self.net.to("cuda") + log.debug("Move face blur filter to GPU") + + num_frames = len(frames) + processed_batches = [] + prior_data, scale = None, None + + for i in range(0, num_frames, self.batch_size): + # Get batch of frames from numpy array (stays on CPU) + batch_frames = frames[i : i + self.batch_size] + + # Preprocess just this batch on GPU + batch_tensor = self.preprocess_frames(batch_frames) + h, w = batch_tensor.shape[-2:] + + with torch.no_grad(): + # Generate priors for the video + if prior_data is None: + priorbox = PriorBox(self.cfg, image_size=(h, w)) + priors = priorbox.forward() + priors = priors.to("cuda", dtype=self.dtype) + prior_data = priors.data + + # Get scale for resizing detections + if scale is None: + scale = torch.Tensor([w, h, w, h]) + scale = scale.to("cuda", dtype=self.dtype) + + batch_loc, batch_conf, _ = self.net(batch_tensor) + + # Blur detected faces in this batch + processed_batches.append(self.blur_detected_faces(batch_frames, batch_loc, batch_conf, prior_data, scale)) + + # Free GPU memory for this batch + del batch_tensor, batch_loc, batch_conf + + processed_frames = [frame for batch in processed_batches for frame in batch] + if self.offload_model: + self.net = self.net.to("cpu") + log.debug("Offload face blur filter to CPU") + return np.array(processed_frames) + + +def parse_args(): + parser = argparse.ArgumentParser() + parser.add_argument("--input_dir", type=str, required=True, help="Path containing input videos") + parser.add_argument("--output_dir", type=str, required=True, help="Path for saving processed videos") + return parser.parse_args() + + +def main(args): + filepaths = get_video_filepaths(args.input_dir) + if not filepaths: + log.error(f"No video files found in directory: {args.input_dir}") + return + + face_blur = RetinaFaceFilter() + postprocessing_runner = 
GuardrailRunner(postprocessors=[face_blur]) + os.makedirs(args.output_dir, exist_ok=True) + + for filepath in tqdm(filepaths): + video_data = read_video(filepath) + with misc.timer("face blur filter"): + frames = postprocessing_runner.postprocess(video_data.frames) + + output_path = os.path.join(args.output_dir, os.path.basename(filepath)) + save_video(output_path, frames, video_data.fps) + + +if __name__ == "__main__": + args = parse_args() + main(args) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/retinaface_utils.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/retinaface_utils.py new file mode 100644 index 00000000..a8531c6b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/face_blur_filter/retinaface_utils.py @@ -0,0 +1,117 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import numpy as np +import torch +from retinaface.utils.nms.py_cpu_nms import py_cpu_nms + +from cosmos3._src.imaginaire.utils import log + + +# Adapted from https://github.com/biubug6/Pytorch_Retinaface/blob/master/detect.py +def filter_detected_boxes(boxes, scores, confidence_threshold, nms_threshold, top_k, keep_top_k): + """Filter boxes based on confidence score and remove overlapping boxes using NMS.""" + # Keep detections with confidence above threshold + inds = np.where(scores > confidence_threshold)[0] + boxes = boxes[inds] + scores = scores[inds] + + # Sort by confidence and keep top K detections + order = scores.argsort()[::-1][:top_k] + boxes = boxes[order] + scores = scores[order] + + # Run non-maximum-suppression (NMS) to remove overlapping boxes + dets = np.hstack((boxes, scores[:, np.newaxis])).astype(np.float32, copy=False) + keep = py_cpu_nms(dets, nms_threshold) + dets = dets[keep, :] + dets = dets[:keep_top_k, :] + boxes = dets[:, :-1] + return boxes + + +# Adapted from https://github.com/biubug6/Pytorch_Retinaface/blob/master/utils/box_utils.py to handle batched inputs +def decode_batch(loc, priors, variances): + """Decode batched locations from predictions using priors and variances. + + Args: + loc (tensor): Batched location predictions for loc layers. + Shape: [batch_size, num_priors, 4] + priors (tensor): Prior boxes in center-offset form. + Shape: [num_priors, 4] + variances: (list[float]): Variances of prior boxes. 
+ + Return: + Decoded batched bounding box predictions + Shape: [batch_size, num_priors, 4] + """ + batch_size = loc.size(0) + priors = priors.unsqueeze(0).expand(batch_size, -1, -1) + + boxes = torch.cat( + ( + priors[:, :, :2] + loc[:, :, :2] * variances[0] * priors[:, :, 2:], + priors[:, :, 2:] * torch.exp(loc[:, :, 2:] * variances[1]), + ), + dim=2, + ) + + boxes[:, :, :2] -= boxes[:, :, 2:] / 2 + boxes[:, :, 2:] += boxes[:, :, :2] + return boxes + + +# Adapted from https://github.com/biubug6/Pytorch_Retinaface/blob/master/detect.py +def _check_keys(model, pretrained_state_dict): + ckpt_keys = set(pretrained_state_dict.keys()) + model_keys = set(model.state_dict().keys()) + used_pretrained_keys = model_keys & ckpt_keys + unused_pretrained_keys = ckpt_keys - model_keys + missing_keys = model_keys - ckpt_keys + log.debug(f"Missing keys:{len(missing_keys)}") + log.debug(f"Unused checkpoint keys:{len(unused_pretrained_keys)}") + log.debug(f"Used keys:{len(used_pretrained_keys)}") + assert len(used_pretrained_keys) > 0, "load NONE from pretrained checkpoint" + return True + + +# Adapted from https://github.com/biubug6/Pytorch_Retinaface/blob/master/detect.py +def _remove_prefix(state_dict, prefix): + """Old version of the model is stored with all names of parameters sharing common prefix 'module.'""" + log.debug(f"Removing prefix '{prefix}'") + + def f(x): + return x.split(prefix, 1)[-1] if x.startswith(prefix) else x + + return {f(key): value for key, value in state_dict.items()} + + +# Adapted from https://github.com/biubug6/Pytorch_Retinaface/blob/master/detect.py +def load_model(model, pretrained_path, load_to_cpu): + log.debug(f"Loading pretrained model from {pretrained_path}") + if load_to_cpu: + pretrained_dict = torch.load(pretrained_path, map_location=lambda storage, loc: storage, weights_only=True) + else: + device = torch.cuda.current_device() + pretrained_dict = torch.load( + pretrained_path, map_location=lambda storage, loc: storage.cuda(device), 
weights_only=True + ) + if "state_dict" in pretrained_dict.keys(): + pretrained_dict = _remove_prefix(pretrained_dict["state_dict"], "module.") + else: + pretrained_dict = _remove_prefix(pretrained_dict, "module.") + _check_keys(model, pretrained_dict) + model.load_state_dict(pretrained_dict, strict=False) + return model diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/llamaGuard3/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/llamaGuard3/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/llamaGuard3/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/llamaGuard3/categories.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/llamaGuard3/categories.py new file mode 100644 index 00000000..f8d5a95d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/llamaGuard3/categories.py @@ -0,0 +1,31 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +UNSAFE_CATEGORIES = { + "S1": "Violent Crimes.", + "S2": "Non-Violent Crimes.", + "S3": "Sex Crimes.", + "S4": "Child Exploitation.", + "S5": "Defamation.", + "S6": "Specialized Advice.", + "S7": "Privacy.", + "S8": "Intellectual Property.", + "S9": "Indiscriminate Weapons.", + "S10": "Hate.", + "S11": "Self-Harm.", + "S12": "Sexual Content.", + "S13": "Elections.", + "S14": "Code Interpreter Abuse.", +} diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/llamaGuard3/llamaGuard3.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/llamaGuard3/llamaGuard3.py new file mode 100644 index 00000000..e0e1b8cd --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/llamaGuard3/llamaGuard3.py @@ -0,0 +1,130 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and +# limitations under the License. + +import argparse + +import torch +from transformers import AutoModelForCausalLM, AutoTokenizer + +from cosmos3._src.imaginaire.auxiliary.guardrail.common.core import ContentSafetyGuardrail, GuardrailRunner +from cosmos3._src.imaginaire.auxiliary.guardrail.llamaGuard3.categories import UNSAFE_CATEGORIES +from cosmos3._src.imaginaire.utils import log, misc + +SAFE = misc.Color.green("SAFE") +UNSAFE = misc.Color.red("UNSAFE") + + +class LlamaGuard3(ContentSafetyGuardrail): + def __init__( + self, + offload_model_to_cpu: bool = True, + ) -> None: + """Llama Guard 3 model for text filtering safety check. + + Args: + checkpoint_dir (str): Path to the checkpoint directory. + offload_model_to_cpu (bool, optional): Whether to offload the model to CPU. Defaults to True. + """ + self.offload_model = offload_model_to_cpu + self.dtype = torch.bfloat16 + + model_id = "meta-llama/Llama-Guard-3-8B" + + self.model = AutoModelForCausalLM.from_pretrained(model_id) + self.tokenizer = AutoTokenizer.from_pretrained(model_id) + + # Move model to GPU unless offload_model_to_cpu is True + if not offload_model_to_cpu: + self.model = self.model.to("cuda", dtype=self.dtype).eval() + log.debug("Moved llamaGuard3 model to GPU") + else: + self.model = self.model.to("cpu", dtype=self.dtype).eval() + log.debug("Moved llamaGuard3 model to CPU") + + def get_llamaGuard3_block_message(self, moderation_output: str) -> str: + """Extract the blocked category from the Llama Guard 3 model output.""" + block_msg = "Prompt blocked by Llama Guard 3." 
+ try: + lines = moderation_output.splitlines() + categories_detected = [] + for line in lines[1:]: + line_stripped = line.split("<|eot_id|>")[0].strip() + for category in line_stripped.split(","): + category = category.strip() + if category not in UNSAFE_CATEGORIES: + log.warning(f"Unrecognized category from moderation output: {category}") + else: + categories_detected.append(category) + if len(categories_detected) > 0: + blocked_categories = ", ".join([UNSAFE_CATEGORIES[category][:-1] for category in categories_detected]) + block_msg = f"{block_msg} Violations: {blocked_categories}." + except Exception as e: + log.warning(f"Unable to extract blocked category from Llama Guard 3 output: {e}") + return block_msg + + def filter_llamaGuard3_output(self, prompt: str) -> tuple[bool, str]: + """Filter the Llama Guard 3 model output and return the safety status and message.""" + conversation = [{"role": "user", "content": prompt}] + if self.offload_model: + self.model = self.model.to("cuda") + log.debug("Move llamaGuard3 model to GPU") + input_ids = self.tokenizer.apply_chat_template( + conversation, categories=UNSAFE_CATEGORIES, return_tensors="pt" + ).to("cuda") + prompt_len = input_ids.shape[1] + output = self.model.generate( + input_ids=input_ids, + max_new_tokens=100, + return_dict_in_generate=True, + pad_token_id=0, + ) + generated_tokens = output.sequences[:, prompt_len:] + moderation_output = self.tokenizer.decode(generated_tokens[0], skip_special_tokens=False).strip() + if self.offload_model: + self.model = self.model.to("cpu") + log.debug("Offload llamaGuard3 model to CPU") + if "unsafe" in moderation_output.lower(): + block_msg = self.get_llamaGuard3_block_message(moderation_output) + return False, block_msg + else: + return True, "" + + def is_safe(self, prompt: str) -> tuple[bool, str]: + """Check if the input prompt is safe according to the Llama Guard 3 model.""" + try: + return self.filter_llamaGuard3_output(prompt) + except Exception as e: +
log.error(f"Unexpected error occurred when running Llama Guard 3 guardrail: {e}") + return True, "Unexpected error occurred when running Llama Guard 3 guardrail." + + +def parse_args(): + parser = argparse.ArgumentParser() + parser.add_argument("--prompt", type=str, required=True, help="Input prompt") + return parser.parse_args() + + +def main(args): + llamaGuard3 = LlamaGuard3() + runner = GuardrailRunner(safety_models=[llamaGuard3]) + with misc.timer("Llama Guard 3 safety check"): + safety, message = runner.run_safety_check(args.prompt) + log.info(f"Input is: {'SAFE' if safety else 'UNSAFE'}") + log.info(f"Message: {message}") if not safety else None + + +if __name__ == "__main__": + args = parse_args() + main(args) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/qwen3guard/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/qwen3guard/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/qwen3guard/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/qwen3guard/categories.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/qwen3guard/categories.py new file mode 100644 index 00000000..b990489f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/qwen3guard/categories.py @@ -0,0 +1,23 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +UNSAFE_CATEGORIES = { + "S1": "Violent", + "S2": "Non-violent Illegal Acts", + "S3": "Sexual Content or Sexual Acts", + "S4": "Suicide & Self-Harm", + "S5": "Unethical Acts", + "S6": "Jailbreak", +} diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/qwen3guard/qwen3guard.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/qwen3guard/qwen3guard.py new file mode 100644 index 00000000..e6078bb2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/qwen3guard/qwen3guard.py @@ -0,0 +1,102 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import argparse +import re + +import torch +from transformers import AutoModelForCausalLM, AutoTokenizer + +from cosmos3._src.imaginaire.auxiliary.guardrail.common.core import ContentSafetyGuardrail, GuardrailRunner +from cosmos3._src.imaginaire.auxiliary.guardrail.qwen3guard.categories import UNSAFE_CATEGORIES +from cosmos3._src.imaginaire.utils import log, misc + +SAFE = misc.Color.green("SAFE") +UNSAFE = misc.Color.red("UNSAFE") + + +class Qwen3Guard(ContentSafetyGuardrail): + def __init__( + self, + offload_model_to_cpu: bool = True, + ) -> None: + """Qwen3Guard model for text filtering safety check. + + Args: + checkpoint_dir (str): Path to the checkpoint directory. + offload_model_to_cpu (bool, optional): Whether to offload the model to CPU. Defaults to True.
+ """ + self.offload_model = offload_model_to_cpu + self.dtype = torch.bfloat16 + + model_id = "Qwen/Qwen3Guard-Gen-0.6B" + + self.model = AutoModelForCausalLM.from_pretrained(model_id) + self.tokenizer = AutoTokenizer.from_pretrained(model_id) + + # Move model to GPU unless offload_model_to_cpu is True + if not offload_model_to_cpu: + self.model = self.model.to("cuda", dtype=self.dtype).eval() + log.debug("Moved Qwen3Guard model to GPU") + else: + self.model = self.model.to("cpu", dtype=self.dtype).eval() + log.debug("Moved Qwen3Guard model to CPU") + + def extract_label_and_categories(self, prompt): + safe_pattern = r"Safety: (Safe|Unsafe|Controversial)" + category_pattern = r"(" + "|".join(UNSAFE_CATEGORIES.values()) + ")" + messages = [{"role": "user", "content": prompt}] + + text = self.tokenizer.apply_chat_template(messages, tokenize=False) + model_inputs = self.tokenizer([text], return_tensors="pt").to(self.model.device) + generated_ids = self.model.generate(**model_inputs, max_new_tokens=128) + output_ids = generated_ids[0][len(model_inputs.input_ids[0]) :].tolist() + content = self.tokenizer.decode(output_ids, skip_special_tokens=True) + + safe_label_match = re.search(safe_pattern, content) + label = safe_label_match.group(1) if safe_label_match else None + categories = re.findall(category_pattern, content) + if label.lower() == "unsafe": + return False, f"Prompt blocked by Qwen3Guard. Safety: {label}, Categories: {categories}" + else: + return True, "" + + def is_safe(self, prompt: str) -> tuple[bool, str]: + """Check if the input prompt is safe according to the Qwen3Guard model.""" + try: + return self.extract_label_and_categories(prompt) + except Exception as e: + log.error(f"Unexpected error occurred when running Qwen3Guard guardrail: {e}") + return True, "Unexpected error occurred when running Qwen3Guard guardrail."
+ + +def parse_args(): + parser = argparse.ArgumentParser() + parser.add_argument("--prompt", type=str, required=True, help="Input prompt") + return parser.parse_args() + + +def main(args): + qwen3guard = Qwen3Guard() + runner = GuardrailRunner(safety_models=[qwen3guard]) + with misc.timer("Qwen3Guard safety check"): + safety, message = runner.run_safety_check(args.prompt) + log.info(f"Input is: {'SAFE' if safety else 'UNSAFE'}") + log.info(f"Message: {message}") if not safety else None + + +if __name__ == "__main__": + args = parse_args() + main(args) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/model.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/model.py new file mode 100644 index 00000000..ecb03832 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/model.py @@ -0,0 +1,60 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import attrs +import torch +import torch.nn as nn + +from cosmos3._src.imaginaire.config import make_freezable + + +@make_freezable +@attrs.define(slots=False) +class ModelConfig: + input_size: int = 1152 + num_classes: int = 7 + + +class SafetyClassifier(nn.Module): + def __init__(self, input_size: int = 1024, num_classes: int = 2): + super().__init__() + self.input_size = input_size + self.num_classes = num_classes + self.layers = nn.Sequential( + nn.Linear(self.input_size, 512), + nn.BatchNorm1d(512), + nn.ReLU(), + nn.Linear(512, 256), + nn.BatchNorm1d(256), + nn.ReLU(), + nn.Linear(256, self.num_classes), + # Note: No activation function here; CrossEntropyLoss expects raw logits + ) + + def forward(self, x): + return self.layers(x) + + +class VideoSafetyModel(nn.Module): + def __init__(self, config: ModelConfig) -> None: + super().__init__() + self.config = config + self.num_classes = config.num_classes + self.network = SafetyClassifier(input_size=config.input_size, num_classes=self.num_classes) + + @torch.inference_mode() + def forward(self, data_batch: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]: + logits = self.network(data_batch["data"].cuda()) + return {"logits": logits} diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/video_content_safety_filter.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/video_content_safety_filter.py new file mode 100644 index 00000000..fcfe6ab1 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/video_content_safety_filter.py @@ -0,0 +1,208 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import argparse +import json +import os +from collections.abc import Iterable + +import torch +from PIL import Image + +from cosmos3._src.imaginaire.auxiliary.guardrail.common.core import ( + GUARDRAIL1_CHECKPOINT, + ContentSafetyGuardrail, + GuardrailRunner, +) +from cosmos3._src.imaginaire.auxiliary.guardrail.common.io_utils import get_video_filepaths, read_video +from cosmos3._src.imaginaire.auxiliary.guardrail.video_content_safety_filter.model import ModelConfig, VideoSafetyModel +from cosmos3._src.imaginaire.auxiliary.guardrail.video_content_safety_filter.vision_encoder import SigLIPEncoder +from cosmos3._src.imaginaire.utils import log, misc + +# Define the class index to class name mapping for multi-class classification +CLASS_IDX_TO_NAME = { + 0: "Safe", + 1: "Sexual_Content", + 3: "Drugs", + 4: "Child_Abuse", + 5: "Hate_and_Harassment", + 6: "Self-Harm", +} + +CUTOFF_UNSAFE_FRAMES_PERCENT = 10 # 10% of frames are unsafe, then the video is unsafe + + +class VideoContentSafetyFilter(ContentSafetyGuardrail): + def __init__( + self, + offload_model_to_cpu: bool = True, + ) -> None: + """Video content safety filter model. + + Args: + checkpoint_dir (str): Path to the checkpoint directory. + offload_model_to_cpu (bool, optional): Whether to offload the model to CPU. Defaults to True. 
+ """ + self.offload_model = offload_model_to_cpu + self.dtype = torch.float32 + self.checkpoint_dir = os.path.join(GUARDRAIL1_CHECKPOINT.download(), "video_content_safety_filter") + + # Use ModelConfig directly for inference configuration + model_config = ModelConfig(input_size=1152, num_classes=7) + + # Load the multi-class classifier and initialize the SigLIP encoder + self.model = VideoSafetyModel(model_config) + safety_filter_local_path = os.path.join(self.checkpoint_dir, "safety_filter.pt") + checkpoint = torch.load(safety_filter_local_path, map_location=torch.device("cpu"), weights_only=True) + self.model.load_state_dict(checkpoint["model"]) + self.encoder = SigLIPEncoder(device="cuda", dtype=self.dtype) + if offload_model_to_cpu: + self.encoder.to("cpu") + self.model = self.model.to("cpu", dtype=self.dtype).eval() + log.debug("Moved video content safety filter to CPU") + else: + self.encoder.to("cuda") + self.model = self.model.to("cuda", dtype=self.dtype).eval() + log.debug("Moved video content safety filter to GPU") + + @torch.inference_mode() + def __infer(self, pil_image: Image.Image) -> int: + """Infer the class of the image.""" + image_embs = self.encoder.encode_image(pil_image) + logits = self.model.network(image_embs) + probabilities = torch.nn.functional.softmax(logits, dim=-1) + predicted_class = int(torch.argmax(probabilities, dim=-1).item()) + return predicted_class + + def _to_cuda_if_offload(self): + if self.offload_model: + self.encoder = self.encoder.to("cuda") + self.model = self.model.to("cuda") + log.debug("Move video content safety filter to GPU") + + def _to_cpu_if_offload(self): + if self.offload_model: + self.encoder = self.encoder.to("cpu") + self.model = self.model.to("cpu") + log.debug("Offload video content safety filter to CPU") + + def is_safe_file(self, filepath: str) -> bool: + """Check if the video file is safe.""" + video_data = read_video(filepath) + + # Sample frames at 2 FPS + sample_rate = 2 # frames per second + 
frame_interval = max(1, int(video_data.fps / sample_rate))  # guard against a zero step when fps < sample_rate + frame_numbers = list(range(0, int(video_data.fps * video_data.duration), frame_interval)) + frames = [video_data.frames[frame_number] for frame_number in frame_numbers] + return self.is_safe_frames(frames) + + def is_safe_frames(self, frames: Iterable) -> bool: + """Check if video frames are safe. Populates ``self.last_diagnostics`` as a side effect; + single-instance serial use only (not thread-safe).""" + is_safe = True + frame_scores: list[dict] = [] + unsafe_frame_count = 0 + total_frame_count = 0 + + self._to_cuda_if_offload() + for frame_number, frame in enumerate(frames): + total_frame_count += 1 + try: + pil_image = Image.fromarray(frame) + predicted_class = self.__infer(pil_image) + class_name = CLASS_IDX_TO_NAME.get(predicted_class, "Unknown") + frame_scores.append({"frame_number": frame_number, "class": class_name}) + + # Count the frame as unsafe if its class is a known unsafe category; the video is flagged only when the unsafe ratio exceeds the cutoff + if class_name != "Safe" and class_name in CLASS_IDX_TO_NAME.values(): + log.warning(f"Unsafe frame detected in frame_number {frame_number}: {class_name}") + unsafe_frame_count += 1 + + except Exception as e: + log.warning(f"Warning: Failed to run safety classifier on frame_number {frame_number}. Exception: {e}") + continue + + unsafe_ratio = unsafe_frame_count / total_frame_count if total_frame_count else 0.0 + if unsafe_ratio > (CUTOFF_UNSAFE_FRAMES_PERCENT / 100): + is_safe = False + log.warning( + f"Unsafe frame count {unsafe_frame_count} is greater than {CUTOFF_UNSAFE_FRAMES_PERCENT}% of total frames {total_frame_count}" + ) + + # .get(..., "Safe") guards against future callers appending partial entries; "Safe" is filtered out. 
+ unsafe_categories = sorted({s.get("class", "Safe") for s in frame_scores if s.get("class", "Safe") != "Safe"}) + self.last_diagnostics: dict = { + "unsafe_frames": unsafe_frame_count, + "total_frames": total_frame_count, + "unsafe_ratio": unsafe_ratio, + "unsafe_categories": unsafe_categories, + "cutoff_percent": CUTOFF_UNSAFE_FRAMES_PERCENT, + } + + video_data = { + "is_safe": is_safe, + "frame_scores": frame_scores, + } + self._to_cpu_if_offload() + log.debug(f"Frames data: {json.dumps(video_data, indent=4)}") + return is_safe + + def _format_block_message(self) -> str: + """Build a diagnostic message for the most recent unsafe classification.""" + d = getattr(self, "last_diagnostics", None) + if not d: + return "unsafe content detected" + return ( + f"unsafe content detected: " + f"{d['unsafe_frames']}/{d['total_frames']} frames " + f"({d['unsafe_ratio']:.1%}) exceed the {d['cutoff_percent']}% cutoff; " + f"categories={d['unsafe_categories']}" + ) + + def is_safe(self, input: str | Iterable) -> tuple[bool, str]: + if isinstance(input, str): + is_safe = self.is_safe_file(input) + return is_safe, "safe video detected" if is_safe else self._format_block_message() + elif isinstance(input, Iterable): + is_safe = self.is_safe_frames(input) + return is_safe, "safe frames detected" if is_safe else self._format_block_message() + else: + raise ValueError(f"Input type {type(input)} not supported.") + + +def parse_args(): + parser = argparse.ArgumentParser() + parser.add_argument("--input_dir", type=str, required=True, help="Path containing input videos") + return parser.parse_args() + + +def main(args): + filepaths = get_video_filepaths(args.input_dir) + if not filepaths: + log.error(f"No video files found in directory: {args.input_dir}") + return + + video_filter = VideoContentSafetyFilter() + runner = GuardrailRunner(safety_models=[video_filter], generic_safe_msg="Video is safe") + + for filepath in filepaths: + with misc.timer("video content safety filter"): + _ = 
runner.run_safety_check(filepath) + + +if __name__ == "__main__": + args = parse_args() + main(args) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/vision_encoder.py b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/vision_encoder.py new file mode 100644 index 00000000..283446e3 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/auxiliary/guardrail/video_content_safety_filter/vision_encoder.py @@ -0,0 +1,42 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import torch +from PIL import Image +from transformers import SiglipModel, SiglipProcessor + + +class SigLIPEncoder(torch.nn.Module): + def __init__( + self, + device="cuda" if torch.cuda.is_available() else "cpu", # noqa: B008 + dtype=torch.float32, + ) -> None: + super().__init__() + self.device = device + self.dtype = dtype + model_id = "google/siglip-so400m-patch14-384" + self.model = SiglipModel.from_pretrained(model_id) + self.processor = SiglipProcessor.from_pretrained(model_id) + self.model.to(self.device, dtype=self.dtype).eval() + + @torch.inference_mode() + def encode_image(self, input_img: Image.Image) -> torch.Tensor: + """Encode an image into a feature vector.""" + with torch.no_grad(): + inputs = self.processor(images=input_img, return_tensors="pt").to(self.device, dtype=self.dtype) + image_features = self.model.get_image_features(**inputs) + image_features /= image_features.norm(dim=-1, keepdim=True) + return image_features diff --git a/cosmos-inference/cosmos3/_src/imaginaire/callbacks/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/callbacks/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/callbacks/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/imaginaire/callbacks/every_n.py b/cosmos-inference/cosmos3/_src/imaginaire/callbacks/every_n.py new file mode 100644 index 00000000..490de61c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/callbacks/every_n.py @@ -0,0 +1,85 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from abc import abstractmethod +from typing import Optional + +import torch + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.callback import Callback + + +class EveryN(Callback): + def __init__( + self, + every_n: Optional[int] = None, + step_size: int = 1, + barrier_after_run: bool = True, + run_at_start: bool = False, + ) -> None: + """Constructor for `EveryN`. + + Args: + every_n (int): Frequency with which callback is run during training. + step_size (int): Size of iteration step count. Default 1. + barrier_after_run (bool): Whether to have a distributed barrier after each execution. Default True, to avoid timeouts. + run_at_start (bool): Whether to run at the beginning of training. Default False. + """ + self.every_n = every_n + if self.every_n == 0: + log.warning( + f"every_n is set to 0. 
Callback {self.__class__.__name__} will be invoked only once at the beginning of training. Calls in on_training_step_end will be skipped." + ) + + self.step_size = step_size + self.barrier_after_run = barrier_after_run + self.run_at_start = run_at_start + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + # every_n = 0 is a special case: every_n_impl is called only once at the beginning of training + if self.every_n != 0: + trainer = self.trainer + global_step = iteration // self.step_size + should_run = (iteration == 1 and self.run_at_start) or ( + global_step % self.every_n == 0 + ) # (self.every_n - 1) + if should_run: + log.debug(f"Callback {self.__class__.__name__} fired on train_batch_end step {global_step}") + self.every_n_impl(trainer, model, data_batch, output_batch, loss, iteration) + log.debug(f"Callback {self.__class__.__name__} finished on train_batch_end step {global_step}") + # add necessary barrier to avoid timeout + if self.barrier_after_run: + distributed.barrier() + + @abstractmethod + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: ... diff --git a/cosmos-inference/cosmos3/_src/imaginaire/callbacks/image_grad_clip.py b/cosmos-inference/cosmos3/_src/imaginaire/callbacks/image_grad_clip.py new file mode 100644 index 00000000..7d6e3a62 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/callbacks/image_grad_clip.py @@ -0,0 +1,79 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import List, Optional + +import torch +import wandb +from torch.distributed.fsdp import FullyShardedDataParallel as FSDP + +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.imaginaire.utils.callback import Callback + + +@torch.jit.script +def _fused_nan_to_num(params: List[torch.Tensor]): + for param in params: + torch.nan_to_num(param, nan=0.0, posinf=0.0, neginf=0.0, out=param) + + +class GradClip(Callback): + def __init__( + self, clip_norm=1.0, force_finite: bool = True, model_key: Optional[str] = None, fsdp_enabled: bool = False + ): + self.clip_norm = clip_norm + self.force_finite = force_finite + self.model_key = model_key + self.fsdp_enabled = fsdp_enabled + + def on_before_optimizer_step( + self, + model_ddp: distributed.DistributedDataParallel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int = 0, + ) -> None: + del optimizer, scheduler + if isinstance(model_ddp, distributed.DistributedDataParallel): + model = model_ddp.module + else: + model = model_ddp + + # select sub-network if specified + if self.model_key is not None: + items = self.model_key.split(".") + for item in items: + model = getattr(model, item) + + + if self.force_finite: + params = [] + for param in model.parameters(): + if param.grad is not None: + params.append(param.grad) + # 
torch.nan_to_num(param.grad, nan=0, posinf=0, neginf=0, out=param.grad) + _fused_nan_to_num(params) + + # check if FSDP is used + if isinstance(model, FSDP) and self.fsdp_enabled: + total_norm = model.clip_grad_norm_(self.clip_norm) + else: + total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), self.clip_norm, foreach=True) + + # log + if iteration % self.config.trainer.logging_iter == 0: + if wandb.run: + wandb.log({"clip_grad_norm": total_norm.item()}, step=iteration) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/callbacks/manual_gc.py b/cosmos-inference/cosmos3/_src/imaginaire/callbacks/manual_gc.py new file mode 100644 index 00000000..ee47db5d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/callbacks/manual_gc.py @@ -0,0 +1,50 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import gc + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.utils import log + + +class ManualGarbageCollection(EveryN): + """ + Disable auto gc and manually trigger garbage collection every N iterations + It is super useful for large scale training to reduce gpu sync time! + Can reach 50% speedup. + + It is important to note that this callback only disables gc in main process and have auto gc enabled in subprocesses. 
+ + We disable gc only after warm_up invocations of this callback, to avoid disabling gc in subprocesses such as dataloader workers, which can cause OOM + """ + + def __init__(self, *args, warm_up: int = 5, gc_level: int = 1, **kwargs): + kwargs["barrier_after_run"] = False + super().__init__(*args, **kwargs) + + self.counter = 0 + self.warm = warm_up + self.gc_level = gc_level + + def every_n_impl(self, trainer, model, data_batch, output_batch, loss, iteration): + del trainer, model, data_batch, output_batch, loss + self.counter += 1 + if self.counter < self.warm: + return + if self.counter == self.warm: + gc.disable() + log.critical("Garbage collection disabled") + + gc.collect(self.gc_level) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/base.py b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/base.py new file mode 100644 index 00000000..80b9c2d2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/base.py @@ -0,0 +1,185 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +from abc import ABC, abstractmethod +from typing import Optional + +import torch + +from cosmos3._src.imaginaire.config import CheckpointConfig, JobConfig +from cosmos3._src.imaginaire.flags import INTERNAL +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import callback +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +class AbstractCheckpointer(ABC): + """The checkpointer class. Supports checkpoint saving/loading to both local disk or object store.""" + + def __init__( + self, + config_checkpoint: CheckpointConfig, + config_job: JobConfig, + callbacks: Optional[callback.CallBackGroup] = None, + ): + """Constructor of the checkpointer. + + Args: + config_checkpoint (CheckpointConfig): The config object for the checkpointer. + """ + self.config_checkpoint = config_checkpoint + # Set the callback functions. 
+ self.callbacks = callbacks + self.save_to_object_store = config_checkpoint.save_to_object_store.enabled + self.load_from_object_store = config_checkpoint.load_from_object_store.enabled + + # Set checkpoint directories for local and object store paths + self._local_dirname = os.path.join(config_job.path_local, "checkpoints") + self._object_store_dirname = os.path.join(config_job.path, "checkpoints") + + self.strict_resume = config_checkpoint.strict_resume + load_path = config_checkpoint.load_path or None + if not INTERNAL: + from cosmos3._src.imaginaire.utils.checkpoint_db import download_checkpoint_v2 + + if load_path: + load_path = download_checkpoint_v2(load_path) + self.load_path = load_path + self.load_training_state = config_checkpoint.load_training_state + self.only_load_scheduler_state = config_checkpoint.only_load_scheduler_state + self.save_thread = None + self.verbose = config_checkpoint.verbose + self.keys_not_to_resume = config_checkpoint.keys_not_to_resume + self.keys_to_skip_loading = getattr(config_checkpoint, "keys_to_skip_loading", []) + self.broadcast_via_filesystem = config_checkpoint.broadcast_via_filesystem + # Create the object store client interface. 
+ if config_checkpoint.load_from_object_store.enabled: + self.load_s3_backend_key = "_ckpt_s3_loader" + easy_io.set_s3_backend( + key="_ckpt_s3_loader", + backend_args={ + "backend": "s3", + "path_mapping": { + "s3://ckpt/": f"s3://{config_checkpoint.load_from_object_store.bucket}/", + }, + "s3_credential_path": config_checkpoint.load_from_object_store.credentials, + }, + ) + else: + self.load_s3_backend_key = None + + if config_checkpoint.save_to_object_store.enabled: + self.save_s3_backend_key = "_ckpt_s3_saver" + easy_io.set_s3_backend( + key="_ckpt_s3_saver", + backend_args={ + "backend": "s3", + "path_mapping": { + "s3://ckpt/": f"s3://{config_checkpoint.save_to_object_store.bucket}/", + }, + "s3_credential_path": config_checkpoint.save_to_object_store.credentials, + }, + ) + else: + self.save_s3_backend_key = None + + @abstractmethod + def save( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int, + ) -> None: + pass + + @abstractmethod + def load( + self, + model: ImaginaireModel, + optimizer: Optional[torch.optim.Optimizer] = None, + scheduler: Optional[torch.optim.lr_scheduler.LRScheduler] = None, + grad_scaler: Optional[torch.amp.GradScaler] = None, + ) -> int: + pass + + @property + def save_bucket(self): + """Get the bucket name for saving checkpoints.""" + return self.config_checkpoint.save_to_object_store.bucket if self.save_to_object_store else None + + @property + def load_bucket(self): + """Get the bucket name for loading checkpoints.""" + return self.config_checkpoint.load_from_object_store.bucket if self.load_from_object_store else None + + @property + def save_dirname(self): + return ( + f"s3://{self.save_bucket}/{self._object_store_dirname}" + if self.save_to_object_store + else self._local_dirname + ) + + @property + def load_dirname(self): + return ( + f"s3://{self.load_bucket}/{self._object_store_dirname}" + if 
self.load_from_object_store + else self._local_dirname + ) + + def finalize(self) -> None: + """Finalize the checkpointer.""" + if self.save_thread: + self.save_thread.join() + + def _read_latest_checkpoint_file(self) -> str | None: + """Get the file name of the latest saved checkpoint. If it doesn't exist, return None. + + Returns: + checkpoint_file (str | None): file name of the latest saved checkpoint. + """ + checkpoint_file = None + checkpoint_path = os.path.join(self.load_dirname, "latest_checkpoint.txt") + if easy_io.exists(f"{checkpoint_path}", backend_key=self.load_s3_backend_key): + checkpoint_file = easy_io.load(f"{checkpoint_path}", backend_key=self.load_s3_backend_key).strip() + + return checkpoint_file + + def _write_latest_checkpoint_file(self, checkpoint_file: str) -> None: + """Track the file name of the latest saved checkpoint. + + Args: + checkpoint_file (str): file name of the latest saved checkpoint. + """ + content = f"{checkpoint_file}\n" + checkpoint_path = os.path.join(self.save_dirname, "latest_checkpoint.txt") + easy_io.dump( + content, + checkpoint_path, + backend_key=self.save_s3_backend_key, + ) + + def _check_checkpoint_exists(self, checkpoint_path: str) -> None: + """If the file checkpoint_path does not exist, raise an error. + + Args: + checkpoint_path (str): full path to the checkpoint. + """ + if not easy_io.exists(f"{checkpoint_path}", backend_key=self.load_s3_backend_key): + raise FileNotFoundError(f"File not found (object store): {checkpoint_path}") diff --git a/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/ddp.py b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/ddp.py new file mode 100644 index 00000000..5306cdf8 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/ddp.py @@ -0,0 +1,457 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import threading +from collections import namedtuple +from typing import Any, Dict, Optional, Set, Tuple, Union + +import torch +import torch.distributed +from megatron.core import parallel_state +from torch.distributed import ProcessGroup, get_process_group_ranks + +from cosmos3._src.imaginaire.checkpointer.base import AbstractCheckpointer +from cosmos3._src.imaginaire.checkpointer.safe_broadcast import broadcast_object +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed, log, misc +from cosmos3._src.imaginaire.utils.easy_io import easy_io + +StateDictItemPath = namedtuple("StateDictItemPath", ["state_dict", "save_path"]) + + +class Checkpointer(AbstractCheckpointer): + """ + Checkpointer for DDP. + Note: This implementation only supports local filesystem. + """ + + KEYS_TO_SAVE = ["model", "optim", "scheduler", "trainer"] + KEYS_TO_POSTFIX = { + "model": "model", + "optim": "optim", + "scheduler": "scheduler", + "trainer": "", + } + + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + pp_world_size = parallel_state.get_pipeline_model_parallel_world_size() + ep_world_size = parallel_state.get_expert_model_parallel_world_size() + assert pp_world_size < 2, "Pipeline Parallelism (PP) is not tested yet." + assert ep_world_size < 2, "Expert Parallelism (EP) is not tested yet." 
+ self.mp_world_size = parallel_state.get_model_parallel_group().size() + if self.mp_world_size > 1 and self.__class__ == Checkpointer: + raise NotImplementedError( + "Model Parallelism (MP) is enabled - you should use TensorParallel Checkpointer instead of DDP Checkpointer." + ) + # DDP rank (with context parallelism considered) + self.rank_dp_w_cp = parallel_state.get_data_parallel_rank(with_context_parallel=True) + # Context parallelism rank + self.cp_rank = parallel_state.get_context_parallel_rank() + # Model parallelism rank (including Tensor+Pipeline+Expert Parallelisms) + self.mp_rank = parallel_state.get_model_parallel_group().rank() + + # self.mp_rank = parallel_state.get_model_parallel_group(with_expert_parallel=ep_world_size > 1).rank() + if self.broadcast_via_filesystem: + log.info("Broadcasting checkpoint data via the local filesystem.") + if not self.strict_resume: + log.warning("Strict resume mode is off. Some model parameters may not be loaded.") + + # collect ranks of all model parallel groups + all_ranks = [None for _ in range(distributed.get_world_size())] + torch.distributed.all_gather_object( + all_ranks, get_process_group_ranks(parallel_state.get_model_parallel_group()) + ) + all_ranks = list(set(tuple(rank) if isinstance(rank, list) else rank for rank in all_ranks)) + for ranks in all_ranks: + group = torch.distributed.new_group(list(ranks), backend="gloo") + if distributed.get_rank() in ranks: + self.mp_gloo_pg = group + + self.print("Checkpointer Initialized.") + + def print(self, message: str): + """ + Print message to the console. Include the parallelism rank information when verbose is set to True. 
+ """ + if self.verbose: + log.info( + f"[Parallelism Rank: DP-{self.rank_dp_w_cp}, TP-{self.mp_rank}, CP-{self.cp_rank}]: {message}", + rank0_only=False, + ) + else: + log.info(message, rank0_only=True) + + def add_type_postfix_to_checkpoint_path(self, key: str, checkpoint_path: str, model: ImaginaireModel) -> str: + del model + assert key in self.KEYS_TO_SAVE + post_fix = self.KEYS_TO_POSTFIX[key] + + if post_fix: + _ckpt_path = checkpoint_path.replace(".pt", f"_{post_fix}.pt") + else: + _ckpt_path = checkpoint_path + return _ckpt_path + + def save( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int, + ) -> None: + """Save network weights, optimizer parameters, scheduler parameters to a checkpoint. + + Args: + model (ImaginaireModel): The PyTorch model. + optimizer (torch.optim.Optimizer): The model optimizer. + scheduler (torch.optim.lr_scheduler.LRScheduler): The optimization scheduler. + grad_scaler (torch.amp.GradScaler): The gradient scaler (for mixed precision training). + iteration (int): Current iteration number. + """ + self.callbacks.on_save_checkpoint_start(model, iteration) + + checkpoint_file = self.format_checkpoint_filename(model, iteration) + state_dict = self.generate_save_state_dict(model, optimizer, scheduler, grad_scaler, iteration) + state_dict = self._map_state_dict_path_during_save(state_dict, checkpoint_file, model) + if state_dict: + # Wait for previous saver thread to end. + if self.save_thread: + self.save_thread.join() + # Run the checkpoint saver in a separate thread. + self.save_thread = threading.Thread( + target=self._save_worker, + daemon=False, + args=(state_dict, checkpoint_file, distributed.get_rank()), + ) + self.save_thread.start() + + # Note: Checkpoints are saved on a separate thread and this callback is not accurate. 
+ # Please check logs from on_save_checkpoint_success() for better accuracy + self.callbacks.on_save_checkpoint_end(model=None, iteration=iteration) + + def _map_state_dict_path_during_save(self, state_dict, checkpoint_file, model) -> dict[str, StateDictItemPath]: + new_dict = {} + for key, _state_dict in state_dict.items(): + _ckpt_path = self.add_type_postfix_to_checkpoint_path(key, checkpoint_file, model) + checkpoint_path = os.path.join(self.save_dirname, _ckpt_path) + new_dict[key] = StateDictItemPath(_state_dict, checkpoint_path) + return new_dict + + @misc.timer("checkpoint saving") + def _save_worker(self, state_dict: dict[str, StateDictItemPath], checkpoint_file: str, rank: int = 0) -> None: + """Worker to upload checkpoint to object store, spawned with a child thread (in parallel with the training). + + Args: + state_dict (dict[str, StateDictItemPath]): The state dict of the model/optimizer/scheduler. + checkpoint_file (str): The file name of the model checkpoint. + rank (int): GPU device (default: 0). 
+ """ + try: + for key, item in state_dict.items(): + self.print(f"Saving {key} to {item.save_path}") + try: + easy_io.dump( + item.state_dict, + item.save_path, + fast_backend=True, # optional for fast backend, cpu heavy + backend_key=self.save_s3_backend_key, + ) + self.print(f"Saved {key} to {item.save_path}") + except Exception as e: + self.print(f"Failed to save {key} to {item.save_path}: {str(e)}") + raise # Re-raise the exception after logging + + # Synchronize only rank 0 of each model parallel group + if self.mp_world_size > 1: + torch.distributed.barrier(group=self.mp_gloo_pg) + + # Only rank 0 of MP group and rank 0 of DP with CP updates latest_checkpoint.txt + if self.mp_rank == 0 and self.rank_dp_w_cp == 0: + self._write_latest_checkpoint_file(checkpoint_file) + + if distributed.get_rank() == 0: # only rank 0 saves trained_data_record + if "trained_data_record" in state_dict["model"].state_dict: + self._write_trained_data_record( + checkpoint_file, state_dict["model"].state_dict["trained_data_record"] + ) + + iteration = int(checkpoint_file.replace("iter_", "").replace(".pt", "")) + self.callbacks.on_save_checkpoint_success(iteration=iteration) + except Exception as e: # noqa: BLE001 + log.exception(f"Checkpoint failed to upload: {e}", rank0_only=not self.verbose) + + def format_checkpoint_filename(self, model: ImaginaireModel, iteration: int) -> str: + """Generate the checkpoint file name. + + Args: + iteration (int): The current iteration number. + + Returns: + checkpoint_file (str): The checkpoint file name. 
+ """ + del self, model + return f"iter_{iteration:09}.pt" + + @misc.timer("generate saving state dict") + def generate_save_state_dict( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int, + ) -> Optional[Dict[str, Any]]: + state_dict = {} + + if self.rank_dp_w_cp == 0: + trainer_state = dict( + grad_scaler=grad_scaler.state_dict(), + iteration=iteration, + ) + model_state = model.state_dict() + optim_state = optimizer.state_dict() + scheduler_state = scheduler.state_dict() + self.callbacks.on_save_checkpoint(model, state_dict=trainer_state) + + trainer_state, model_state, optim_state, scheduler_state = misc.to( + [trainer_state, model_state, optim_state, scheduler_state], device="cpu" + ) + + state_dict = { + "model": model_state, + "optim": optim_state, + "scheduler": scheduler_state, + } + if distributed.get_rank() == 0: # only rank 0 saves trainer state + state_dict["trainer"] = trainer_state + return state_dict + return state_dict + + def load_broadcast_state_dict( + self, checkpoint_path: str, model: ImaginaireModel, resume_keys: Set + ) -> dict[str, Any]: + """ + Load state_dict and broadcast. + + The main steps are: + 1. Download TP-rank-specific checkpoints for every GPU of DDP-rank 0 and CP-rank 0. + 2. Each rank loads its corresponding checkpoint from the local cache or receives it via broadcast. + + This approach ensures that each MP rank loads its specific part of the model, which is + crucial for Model Parallelism where different parts of the model are distributed across + multiple GPUs. + + When using Model Parallelism (e.g., Tensor Parallelism), the `broadcast_via_filesystem` option can + be set to True. This allows each rank to load its specific checkpoint from the local filesystem + instead of receiving it via network broadcast, which could be more efficient in some cases. 
+ + For standard DDP without TP, `broadcast_via_filesystem` should remain False (default). + + Args: + checkpoint_path (str): The base path of the checkpoint. + model (ImaginaireModel): The model being loaded. + resume_keys (Set): Set of keys to resume from the checkpoint. + + Returns: + dict[str, Any]: A dictionary containing the loaded state for each resumed key. + """ + state_dict = {} + sorted_resume_keys = sorted(resume_keys) + # Step 1: Download TP-rank-specific checkpoints for every GPU of DDP-rank 0 and CP-rank 0. + if self.rank_dp_w_cp == 0: + for key in sorted_resume_keys: + _ckpt_path = self.add_type_postfix_to_checkpoint_path(key, checkpoint_path, model) + local_cache_path = os.path.join(self.load_dirname, os.path.basename(_ckpt_path)) + if os.path.exists(local_cache_path): + # If the local checkpoint exists, we can directly load it + self.print(f"Checkpoint is already in local cache: {local_cache_path}. Loading...") + _state_dict = easy_io.load(local_cache_path, fast_backend=True) + else: + + _state_dict = easy_io.load(_ckpt_path, fast_backend=True, backend_key=self.load_s3_backend_key) + self.print(f"Downloading checkpoint from: {_ckpt_path}") + if self.broadcast_via_filesystem: + # Save the checkpoint to the local filesystem + easy_io.dump(_state_dict, local_cache_path, fast_backend=True) + state_dict[key] = _state_dict + # Ensure all ranks wait for the download to complete + distributed.barrier() + + # Step 2: Broadcast checkpoint data + log.info( + "Start broadcasting checkpoint from the source rank to all other ranks in the same DDP group.", + rank0_only=True, + ) + for key in sorted_resume_keys: + if self.broadcast_via_filesystem: + # Load the checkpoint from the local filesystem for other ranks + if self.rank_dp_w_cp != 0: + _ckpt_path = self.add_type_postfix_to_checkpoint_path(key, checkpoint_path, model) + local_cache_path = os.path.join(self.load_dirname, os.path.basename(_ckpt_path)) + self.print(f"Loading checkpoint from: 
{local_cache_path}") + state_dict[key] = easy_io.load(local_cache_path, fast_backend=True) + else: + # Broadcast the checkpoint to all GPUs of the current DDP rank + group: ProcessGroup = parallel_state.get_data_parallel_group(with_context_parallel=True) + min_rank = min(get_process_group_ranks(group)) + + _state_dict = broadcast_object( + state_dict[key] if self.rank_dp_w_cp == 0 else None, + min_rank, + group=group, + device=torch.device(torch.cuda.current_device()), + ) + if self.rank_dp_w_cp == 0: + self.print(f'Broadcasted checkpoint["{key}"] to all other ranks in the same DDP group.') + else: + state_dict[key] = _state_dict + self.print(f'Received checkpoint["{key}"] from source rank {min_rank}.') + + return state_dict + + def keys_to_resume_during_load(self) -> Tuple[Set, Union[str, None]]: + latest_checkpoint_file = self._read_latest_checkpoint_file() + + resume_keys = [] + + if latest_checkpoint_file is not None: + # 1. Resume training from latest_checkpoint.txt under the same name. + checkpoint_path = os.path.join(self.load_dirname, latest_checkpoint_file) + resume_keys.extend(self.KEYS_TO_SAVE) + else: + if self.load_path: + # 2. Load the module weights specified by config_checkpoint.path. 
+ checkpoint_path = self.load_path + if self.load_s3_backend_key: + checkpoint_path = f"s3://ckpt/{checkpoint_path}" + if self.load_training_state: + resume_keys.extend(self.KEYS_TO_SAVE) + else: + resume_keys.append("model") + if self.only_load_scheduler_state: + resume_keys.append("scheduler") + else: + checkpoint_path = None + if len(self.keys_not_to_resume) > 0: + for key in self.keys_not_to_resume: + assert key in self.KEYS_TO_SAVE, f"Invalid key to resume: {key} not in {self.KEYS_TO_SAVE}" + resume_keys = [key for key in resume_keys if key not in self.keys_not_to_resume] + return set(resume_keys), checkpoint_path + + @misc.timer("checkpoint loading") + def load( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer | None = None, + scheduler: torch.optim.lr_scheduler.LRScheduler | None = None, + grad_scaler: torch.amp.GradScaler | None = None, + ) -> int: + """Load network weights and optimizer states from a checkpoint in a single process. + + The priority of the checkpoint loading logic is: + 1. Attempt to resume training if possible by looking for latest_checkpoint.txt under the same name. + 2. If no latest checkpoint were found, it loads the model weights specified by config_checkpoint.path. + - This is typically used for inference mode. + - If config_checkpoint.load_optimizer_state is True, then also load the optimizer and scheduler states. + 3. If none of the above, randomly initialize the model parameters and train from scratch. + + Args: + model (ImaginaireModel): The PyTorch model. + optimizer (torch.optim.Optimizer | None): The model optimizer (default: None). + scheduler (torch.optim.lr_scheduler.LRScheduler | None): The optimization scheduler (default: None). + grad_scaler (torch.amp.GradScaler | None): The gradient scaler (for mixed precision training). + + Returns: + iteration (int): the iteration number to start/resume from. 
+ """ + self.callbacks.on_load_checkpoint_start(model) + + resume_keys, checkpoint_path = self.keys_to_resume_during_load() + + iteration = 0 + + # Load checkpoint. + if checkpoint_path is not None: + self._check_checkpoint_exists(checkpoint_path) + state_dict = self.load_broadcast_state_dict(checkpoint_path, model, set(resume_keys)) + + if "trainer" in state_dict: + trainer_state = state_dict["trainer"] + log.critical(state_dict.keys(), rank0_only=False) + log.critical(trainer_state, rank0_only=False) + log.info("- Loading the gradient scaler...") + grad_scaler.load_state_dict(trainer_state["grad_scaler"]) + self.callbacks.on_load_checkpoint(model, state_dict=trainer_state) + iteration = trainer_state["iteration"] + if "optim" in state_dict: + assert optimizer + optimizer_state = state_dict["optim"] + log.info("- Loading the optimizer...") + optimizer.load_state_dict(optimizer_state) + if "scheduler" in state_dict: + assert scheduler + scheduler_state = state_dict["scheduler"] + log.info("- Loading the scheduler...") + scheduler.load_state_dict(scheduler_state) + scheduler.last_epoch = iteration + if "model" in state_dict: + model_state = state_dict["model"] + log.info("- Loading the model...") + # Filter out keys_to_skip_loading before loading + if len(self.keys_to_skip_loading) > 0: + filtered_state = {k: v for k, v in model_state.items() if k not in self.keys_to_skip_loading} + skipped_keys = [k for k in model_state.keys() if k in self.keys_to_skip_loading] + if skipped_keys: + log.info(f"\t Skipping {len(skipped_keys)} keys: {skipped_keys}") + model_state = filtered_state + # model.load_state_dict(model_state) + if self.strict_resume: + log.info("\t Strict resume mode is on.") + else: + log.info("\t Strict resume mode is off.") + model_load_info = model.load_state_dict(model_state, strict=self.strict_resume) + log.info(f"\t {model_load_info}") + if not model_load_info.missing_keys and not model_load_info.unexpected_keys: + log.info("\t Checkpoint weights 
loaded successfully (all keys matched).") + else: + log.warning("\t Checkpoint weights loaded; review missing_keys/unexpected_keys above.") + self.print(f"Loaded checkpoint from {checkpoint_path} in iteration {iteration}") + else: + log.info("Training from scratch.") + torch.cuda.empty_cache() + + self.callbacks.on_load_checkpoint_end(model, iteration=iteration, checkpoint_path=checkpoint_path) + + return iteration + + def _write_trained_data_record(self, checkpoint_file: str, trained_data_record: dict[str, int]) -> None: + """Write json file to save number of seen samples and number of iterations. + + Args: + checkpoint_file (str): iteration number for the saved checkpoint + trained_data_record (dict[str, int]): example {"image": 0, "video": 0, "iteration": 0}. + """ + # filename: iter_xxxxxxxxx_trained_data_record.json + checkpoint_path = os.path.join( + self.save_dirname, f"{checkpoint_file.replace('.pt', '')}_trained_data_record.json" + ) + easy_io.dump( + trained_data_record, + checkpoint_path, + backend_key=self.save_s3_backend_key, + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/dummy.py b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/dummy.py new file mode 100644 index 00000000..ef71842c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/dummy.py @@ -0,0 +1,47 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +import torch +import torch.distributed + +from cosmos3._src.imaginaire.checkpointer.base import AbstractCheckpointer +from cosmos3._src.imaginaire.model import ImaginaireModel + + +class Checkpointer(AbstractCheckpointer): + """ + A dummy checkpointer that does not save or load anything. This is useful for debugging jobs or sharing workloads with collaborators. + """ + + def save( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int, + ) -> None: + pass + + def load( + self, + model: ImaginaireModel, + optimizer: Optional[torch.optim.Optimizer] = None, + scheduler: Optional[torch.optim.lr_scheduler.LRScheduler] = None, + grad_scaler: Optional[torch.amp.GradScaler] = None, + ) -> int: + return 0 diff --git a/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/s3_filesystem.py b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/s3_filesystem.py new file mode 100644 index 00000000..dbf1bc22 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/s3_filesystem.py @@ -0,0 +1,330 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License.
+ +import io +import os +import time +from contextlib import contextmanager +from typing import Generator, Union +from urllib.parse import urlparse + +from botocore.exceptions import ClientError +from torch.distributed.checkpoint import FileSystemReader, FileSystemWriter +from torch.distributed.checkpoint.filesystem import FileSystemBase + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +class S3Stream(io.BytesIO): + """ + Workaround for PyTorch manually closing the stream before we can upload it to S3. We override the close() as noop + and instead call our own _true_close() method to close the stream after we are done using it. + The commit at fault is https://github.com/pytorch/pytorch/commit/9c909bf3bb122db2cce95e2eb7459bbe50dfa15a + """ + + def close(self): + self.flush() + # No close + + def _true_close(self): + super().close() + + +class S3FileSystem(FileSystemBase): + """Implementation of FileSystemBase for AWS S3 storage.""" + + def __init__( + self, + credential_path: str, + max_attempts: int = 20, + initial_backoff: float = 1.0, + max_backoff: float = 30.0, + backoff_factor: float = 2.0, + enable_gcs_patch_in_boto3: bool = False, + ) -> None: + """ + Initialize S3FileSystem with retry configuration. 
+ + Args: + credential_path: Path to AWS credentials JSON file + max_attempts: Maximum number of retry attempts + initial_backoff: Initial backoff time in seconds + max_backoff: Maximum backoff time in seconds + backoff_factor: Multiplicative factor for backoff time + enable_gcs_patch_in_boto3: Whether to enable GCS patch in boto3 + """ + self.easy_io_backend = easy_io.get_file_backend( + backend_args={ + "backend": "s3", + "s3_credential_path": credential_path, + "path_mapping": None, + } + ) + self.max_attempts = max_attempts + self.initial_backoff = initial_backoff + self.max_backoff = max_backoff + self.backoff_factor = backoff_factor + self.enable_gcs_patch_in_boto3 = enable_gcs_patch_in_boto3 + if enable_gcs_patch_in_boto3: + log.info("enable_gcs_patch_in_boto3: True") + + def _retry_with_backoff(self, operation_func, *args, **kwargs): + """ + Execute an operation with exponential backoff retry logic. + + Args: + operation_func: Function to execute + *args: Positional arguments for the function + **kwargs: Keyword arguments for the function + + Returns: + Result of the operation function + + Raises: + Exception: If all retry attempts fail + """ + last_exception = None + backoff = self.initial_backoff + + for attempt in range(self.max_attempts): + try: + return operation_func(*args, **kwargs) + except ClientError as e: + error_code = e.response.get("Error", {}).get("Code", "") + log.info(f"S3 Filesystem: Received ClientError: {error_code}", rank0_only=False) + + # Handle specific error cases + if error_code in ["SlowDown", "ThrottlingException", "RequestLimitExceeded", "InternalError"]: + last_exception = e + if attempt < self.max_attempts - 1: # Don't sleep on last attempt + current_backoff = min(backoff, self.max_backoff) + log.info(f"S3 Filesystem: Retrying in {current_backoff} seconds", rank0_only=False) + time.sleep(current_backoff) + backoff *= self.backoff_factor + continue + # For other client errors, raise immediately + raise + except Exception as e: 
+ log.info(f"S3 Filesystem: Received Exception: {str(e)}", rank0_only=False) + last_exception = e + if attempt < self.max_attempts - 1: + current_backoff = min(backoff, self.max_backoff) + log.info(f"S3 Filesystem: Retrying in {current_backoff} seconds", rank0_only=False) + time.sleep(current_backoff) + backoff *= self.backoff_factor + continue + + # pyrefly: ignore [bad-raise] + raise last_exception + + @contextmanager + def create_stream(self, path: Union[str, os.PathLike], mode: str) -> Generator[io.IOBase, None, None]: + """Create a stream for reading from or writing to S3 with retry logic.""" + path_str = str(path) + bucket, key = self._parse_s3_uri(path_str) + log.info(f"S3 Filesystem: Creating stream for {key} in bucket {bucket}", rank0_only=False) + + if mode == "rb": + stream = io.BytesIO() + try: + + def download_operation(): + stream.write(self.easy_io_backend.get(filepath=path_str)) + stream.seek(0) + + log.info(f"S3 Filesystem: Downloading {key} from bucket {bucket}", rank0_only=False) + self._retry_with_backoff(download_operation) + log.info("S3 Filesystem: Download complete", rank0_only=False) + yield stream + finally: + stream.close() + elif mode == "wb": + stream = S3Stream() + try: + yield stream + + def upload_operation(): + stream.seek(0) + self.easy_io_backend.put(obj=stream, filepath=path_str) + + log.info(f"S3 Filesystem: Uploading {key} to bucket {bucket}", rank0_only=False) + self._retry_with_backoff(upload_operation) + log.info("S3 Filesystem: Upload complete", rank0_only=False) + finally: + stream._true_close() + else: + raise ValueError(f"Unsupported mode: {mode}") + + def concat_path(self, path: Union[str, os.PathLike], suffix: str) -> Union[str, os.PathLike]: + """Concatenate S3 path with suffix.""" + path_str = str(path) + if path_str.endswith("/"): + return f"{path_str}{suffix}" + return f"{path_str}/{suffix}" + + def init_path(self, path: Union[str, os.PathLike]) -> Union[str, os.PathLike]: + """Initialize and validate S3 path.""" + 
path_str = str(path) + if not path_str.startswith("s3://"): + raise ValueError(f"Invalid S3 URI: {path_str}. Must start with 's3://'") + return path_str + + def rename(self, path: Union[str, os.PathLike], new_path: Union[str, os.PathLike]) -> None: + """Rename (move) an object in S3 with retry logic.""" + src_path = str(path) + dst_path = str(new_path) + + def copy_operation(): + self.easy_io_backend.copyfile(src=src_path, dst=dst_path) + + self._retry_with_backoff(copy_operation) + + def delete_operation(): + self.easy_io_backend.remove(filepath=src_path) + + self._retry_with_backoff(delete_operation) + + def mkdir(self, path: Union[str, os.PathLike]) -> None: + """ + Create a "directory" in S3. + + Note: S3 doesn't have real directories, but we can create an empty object + with a trailing slash to simulate a directory. + """ + # Creating same buckets from different ranks can cause rate limit issues in GCP. + # In object store, we don't need to create a directory. + pass + + def ls(self, path: Union[str, os.PathLike]) -> list[str]: + """List objects under the given S3 path (prefix) and return s3:// URIs.""" + path_str = str(path) + return [ + f"{path_str.removesuffix('/')}/{obj_suffix}" + for obj_suffix in self.easy_io_backend.list_dir_or_file(dir_path=path_str, list_dir=False, list_file=True) + ] + + @classmethod + def validate_checkpoint_id(cls, checkpoint_id: Union[str, os.PathLike]) -> bool: + """Validate if the checkpoint_id is a valid S3 URI.""" + checkpoint_id_str = str(checkpoint_id) + try: + if not checkpoint_id_str.startswith("s3://"): + return False + parsed = urlparse(checkpoint_id_str) + return bool(parsed.netloc and parsed.path) # Must have bucket and key + except Exception: + return False + + def exists(self, path: Union[str, os.PathLike]) -> bool: + """Check if an object exists in S3 with retry logic.""" + try: + + def head_operation() -> bool: + return self.easy_io_backend.exists(filepath=str(path)) + + return 
self._retry_with_backoff(head_operation) + except ClientError as e: + if e.response.get("Error", {}).get("Code", "") == "404": + return False + raise + + def rm_file(self, path: Union[str, os.PathLike]) -> None: + """Remove a file from S3 with retry logic.""" + + def delete_operation(): + self.easy_io_backend.remove(filepath=str(path)) + + self._retry_with_backoff(delete_operation) + + def _parse_s3_uri(self, uri: str) -> tuple[str, str]: + """ + Parse an S3 URI into bucket and key. + + Args: + uri: S3 URI in the format s3://bucket-name/key + + Returns: + Tuple of (bucket_name, key) + + Raises: + ValueError: If the URI is invalid + """ + uri = uri if isinstance(uri, str) else str(uri) + if not uri.startswith("s3://"): + raise ValueError(f"Invalid S3 URI: {uri}. Must start with 's3://'") + + parsed = urlparse(uri) + bucket = parsed.netloc + + # Remove leading slash from key + key = parsed.path.lstrip("/") + + if not bucket: + raise ValueError(f"Invalid S3 URI: {uri}. No bucket specified") + + return bucket, key + + +class S3StorageWriter(FileSystemWriter): + def __init__( + self, + credential_path: str, + path: str, + enable_gcs_patch_in_boto3: bool = False, + **kwargs, + ) -> None: + """ + Initialize an S3 writer for distributed checkpointing. + + Args: + credential_path (str): Path to the AWS credentials JSON file. + path (str): The S3 URI to write checkpoints to. + kwargs (dict): Keyword arguments to pass to the parent :class:`FileSystemWriter`.
+ enable_gcs_patch_in_boto3 (bool): Whether to enable GCS patch in boto3 + """ + super().__init__( + path=path, + sync_files=False, + **kwargs, + ) + self.fs = S3FileSystem(credential_path, enable_gcs_patch_in_boto3=enable_gcs_patch_in_boto3) # type: ignore + self.path = self.fs.init_path(path) + + @classmethod + def validate_checkpoint_id(cls, checkpoint_id: Union[str, os.PathLike]) -> bool: + return S3FileSystem.validate_checkpoint_id(checkpoint_id) + + +class S3StorageReader(FileSystemReader): + def __init__( + self, credential_path: str, path: Union[str, os.PathLike], enable_gcs_patch_in_boto3: bool = False + ) -> None: + """ + Initialize an S3 reader for distributed checkpointing. + + Args: + credential_path (str): Path to the AWS credentials JSON file. + path (Union[str, os.PathLike]): The S3 path to read checkpoints from. + enable_gcs_patch_in_boto3 (bool): Whether to enable GCS patch in boto3 + """ + super().__init__(path) + self.fs = S3FileSystem(credential_path, enable_gcs_patch_in_boto3=enable_gcs_patch_in_boto3) # type: ignore + self.path = self.fs.init_path(path) + self.sync_files = False + + @classmethod + def validate_checkpoint_id(cls, checkpoint_id: Union[str, os.PathLike]) -> bool: + return S3FileSystem.validate_checkpoint_id(checkpoint_id) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/safe_broadcast.py b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/safe_broadcast.py new file mode 100644 index 00000000..717ffed9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/safe_broadcast.py @@ -0,0 +1,96 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import collections +import io +import pickle +from typing import Any + +import torch +import torch.distributed as dist + + +# https://github.com/pytorch/pytorch/blob/main/torch/distributed/optim/zero_redundancy_optimizer.py#L29 + +def broadcast_object( + obj: Any, + src_rank: int, + group: object = dist.group.WORLD, + device: torch.device = torch.device("cpu"), +) -> Any: + r""" + Broadcasts an object to the given group. + + It will be sending the object if called from the source rank and receiving + the object otherwise. + + Arguments: + obj: object to broadcast; only used if called on the source rank. + src_rank (int): source rank. + group (``ProcessGroup``, optional): group used for the broadcast + (default: ``dist.group.WORLD``). + device (``torch.device``, optional): device to send from or receive + to (default: ``torch.device("cpu")``). + + Returns: + The broadcasted object. 
+ """ + if dist.get_rank() == src_rank: + # Send the object + buffer = io.BytesIO() + torch.save(obj, buffer, pickle_protocol=pickle.HIGHEST_PROTOCOL) + data = bytearray(buffer.getbuffer()) + length_tensor = torch.LongTensor([len(data)]).to(device) + data_send_tensor = torch.ByteTensor(data).to(device) + dist.broadcast(length_tensor, src=src_rank, group=group, async_op=False) + dist.broadcast(data_send_tensor, src=src_rank, group=group, async_op=False) + else: + # Receive the object + length_tensor = torch.LongTensor([0]).to(device) + dist.broadcast(length_tensor, src=src_rank, group=group, async_op=False) + data_recv_tensor = torch.empty([int(length_tensor.item())], dtype=torch.uint8, device=device) + dist.broadcast(data_recv_tensor, src=src_rank, group=group, async_op=False) + buffer = io.BytesIO(data_recv_tensor.cpu().numpy()) + obj = torch.load(buffer, map_location=device, weights_only=False) + return obj + + +def _recursive_copy_to_device( + value: Any, + non_blocking: bool, + device: torch.device, +) -> Any: + r""" + Recursively searches lists, tuples, dicts and copies tensors to device if possible. + + Non-tensor values are passed as-is in the result. + + .. note: These are all copies, so if there are two objects that reference + the same object, then after this call, there will be two different objects + referenced on the device. 
+ """ + if isinstance(value, torch.Tensor): + return value.to(device, non_blocking=non_blocking) + + if isinstance(value, (list, tuple)): + values = [_recursive_copy_to_device(val, non_blocking=non_blocking, device=device) for val in value] + return values if isinstance(value, list) else tuple(values) + + if isinstance(value, collections.abc.Mapping): + return { + key: _recursive_copy_to_device(val, non_blocking=non_blocking, device=device) for key, val in value.items() + } + + return value diff --git a/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/tp.py b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/tp.py new file mode 100644 index 00000000..b118dbbe --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/tp.py @@ -0,0 +1,42 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from cosmos3._src.imaginaire.checkpointer.ddp import Checkpointer as DDPCheckpointer +from cosmos3._src.imaginaire.model import ImaginaireModel + + +class Checkpointer(DDPCheckpointer): + """ + Checkpointer class for Tensor Parallelism (TP) in distributed training. + + This implementation supports the combination of Tensor Parallelism (TP) and Data Parallel Processing (DDP), with optional Context Parallelism (CP). + + Note: + - Fully Sharded Data Parallelism (FSDP) is not supported by this checkpointer. 
+ - In principle, this implementation is also compatible with Pipeline Parallelism (PP) and Expert Parallelism (EP), which are other forms of model parallelism. However, PP and EP have not been tested yet. + """ + + def add_type_postfix_to_checkpoint_path(self, key: str, checkpoint_path: str, model: ImaginaireModel) -> str: + """ + Override the `add_type_postfix_to_checkpoint_path` method of the base class (DDP checkpointer) + to append the model-parallel (MP) rank postfix to the checkpoint path. + """ + checkpoint_path = super().add_type_postfix_to_checkpoint_path(key, checkpoint_path, model) + if key == "trainer": + return checkpoint_path + else: + checkpoint_path = checkpoint_path.replace(".pt", f"_mp_{self.mp_rank}.pt") + + return checkpoint_path diff --git a/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/tp_ema.py b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/tp_ema.py new file mode 100644 index 00000000..be3a357d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/checkpointer/tp_ema.py @@ -0,0 +1,94 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License.
+ +from typing import Any, Dict, Optional + +import torch +from megatron.core import parallel_state + +from cosmos3._src.imaginaire.checkpointer.tp import Checkpointer as BaseCheckpointer +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import misc + + +class Checkpointer(BaseCheckpointer): + KEYS_TO_SAVE = ["model", "optim", "trainer", "scheduler", "ema"] + KEYS_TO_POSTFIX = { + "model": "model", + "optim": "optim", + "ema": "ema", + "scheduler": "scheduler", + "trainer": "", + } + + @misc.timer("generate saving state dict") + def generate_save_state_dict( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int, + ) -> Optional[Dict[str, Any]]: + state_dict = {} + if parallel_state.get_data_parallel_rank() == 0: + trainer_state = dict( + grad_scaler=grad_scaler.state_dict(), + iteration=iteration, + ) + model_state = model.state_dict() + optim_state = optimizer.state_dict() + scheduler_state = scheduler.state_dict() + self.callbacks.on_save_checkpoint(model, state_dict=trainer_state) + + trainer_state, model_state, optim_state, scheduler_state = misc.to( + [trainer_state, model_state, optim_state, scheduler_state], device="cpu" + ) + + state_dict = { + "trainer": trainer_state, + "model": model_state, + "optim": optim_state, + "scheduler": scheduler_state, + } + + if parallel_state.get_data_parallel_rank() < 3: + ema_state = model.ema.state_dict() + state_dict["ema"] = ema_state + + return state_dict + + def add_type_postfix_to_checkpoint_path(self, key: str, checkpoint_path: str, model: ImaginaireModel) -> str: + + # we need to get which ema should be saved + assert key in self.KEYS_TO_SAVE + post_fix = self.KEYS_TO_POSTFIX[key] + + if post_fix: + checkpoint_path = checkpoint_path.replace(".pt", f"_{post_fix}.pt") + else: + checkpoint_path = checkpoint_path + + if key == "ema": + dp_rank = 
parallel_state.get_data_parallel_rank() + checkpoint_path = checkpoint_path.replace(".pt", f"{dp_rank}.pt") + + if key == "trainer": + return checkpoint_path + else: + mp_rank = parallel_state.get_model_parallel_group().rank() + checkpoint_path = checkpoint_path.replace(".pt", f"_mp_{mp_rank}.pt") + + return checkpoint_path diff --git a/cosmos-inference/cosmos3/_src/imaginaire/config.py b/cosmos-inference/cosmos3/_src/imaginaire/config.py new file mode 100644 index 00000000..c57428ab --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/config.py @@ -0,0 +1,600 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +"""Training config system for Imaginaire4""" + +from __future__ import annotations + +import importlib +import os +import time +from typing import Any, Dict, Optional, Type, TypeVar, Union + +import attrs +import torch +from loguru import logger as logging + +from cosmos3._src.imaginaire.flags import TRAINING + +try: + from megatron.core import ModelParallelConfig + + USE_MEGATRON = True +except ImportError: + USE_MEGATRON = False + +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.imaginaire.lazy_config import LazyDict +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.imaginaire.utils.misc import Color + +T = TypeVar("T") + + +def _is_attrs_instance(obj: object) -> bool: + """ + Helper function to check if an object is an instance of an attrs-defined class. + + Args: + obj: The object to check. + + Returns: + bool: True if the object is an instance of an attrs-defined class, False otherwise. + """ + return hasattr(obj, "__attrs_attrs__") + + +def make_freezable(cls: T) -> T: + """ + A decorator that adds the capability to freeze instances of an attrs-defined class. + + NOTE: This requires the wrapped attrs to be defined with attrs.define(slots=False) because we need + to hack on a "_is_frozen" attribute. + + This decorator enhances an attrs-defined class with the ability to be "frozen" at runtime. + Once an instance is frozen, its attributes cannot be changed. It also recursively freezes + any attrs-defined objects that are attributes of the class. + + Usage: + @make_freezable + @attrs.define(slots=False) + class MyClass: + attribute1: int + attribute2: str + + obj = MyClass(1, 'a') + obj.freeze() # Freeze the instance + obj.attribute1 = 2 # Raises AttributeError + + Args: + cls: The class to be decorated. + + Returns: + The decorated class with added freezing capability. + """ + + if not hasattr(cls, "__dict__"): + raise TypeError( + "make_freezable cannot be used with classes that do not define __dict__.
Make sure that the wrapped " + "class was defined with `@attrs.define(slots=False)`" + ) + + original_setattr = cls.__setattr__ + + def setattr_override(self, key, value) -> None: # noqa: ANN001 + """ + Override __setattr__ to allow modifications during initialization + and prevent modifications once the instance is frozen. + """ + if hasattr(self, "_is_frozen") and self._is_frozen and key != "_is_frozen": + raise AttributeError("Cannot modify frozen instance") + original_setattr(self, key, value) # type: ignore + + cls.__setattr__ = setattr_override # type: ignore + + def freeze(self: object) -> None: + """ + Freeze the instance and all its attrs-defined attributes. + """ + for _, value in attrs.asdict(self, recurse=False).items(): + if _is_attrs_instance(value) and hasattr(value, "freeze"): + value.freeze() + self._is_frozen = True # type: ignore + + cls.freeze = freeze # type: ignore + + return cls + + +def _pretty_print_attrs_instance(obj: object, indent: int = 0, use_color: bool = False) -> str: + """ + Recursively pretty prints attrs objects with color. + """ + + assert attrs.has(obj.__class__) + + lines: list[str] = [] + for attribute in attrs.fields(obj.__class__): + value = getattr(obj, attribute.name) + if attrs.has(value.__class__): + if use_color: + lines.append(" " * indent + Color.cyan("* ") + Color.green(attribute.name) + ":") + else: + lines.append(" " * indent + "* " + attribute.name + ":") + lines.append(_pretty_print_attrs_instance(value, indent + 1, use_color)) + else: + if use_color: + lines.append( + " " * indent + Color.cyan("* ") + Color.green(attribute.name) + ": " + Color.yellow(value) + ) + else: + lines.append(" " * indent + "* " + attribute.name + ": " + str(value)) + return "\n".join(lines) + + +def pretty_print_overrides(overrides: Optional[list[str]] = None, use_color: bool = False) -> str: + """ + Pretty prints overrides. 
+ """ + + lines: list[str] = [] + lines.append(Color.cyan("* ") + Color.green("overrides") + ": ") + for override in overrides or []: + if override == "--": + continue + if override.startswith("~"): + attribute_name = override[1:] + attribute_value = None + else: + attribute_name, attribute_value = override.split("=", 1) + if use_color: + lines.append(" " + Color.cyan("* ") + Color.green(attribute_name) + ": " + Color.yellow(attribute_value)) + else: + lines.append(" " + "* " + attribute_name + ": " + str(attribute_value)) + + return "\n".join(lines) + + +@make_freezable +@attrs.define(slots=False) # slots=False is required for make_freezable. See the make_freezable notes for more info. +class ObjectStoreConfig: + # Whether the file I/O is from object store instead of local disk. + enabled: bool = False + # Path to the object store credentials file. + credentials: str = "" + # Object store bucket to read from / write to the objects. + bucket: str = "" + + +@make_freezable +@attrs.define(slots=False) +class HFExportConfig: + # Whether to enable HuggingFace safetensors export after each DCP checkpoint. + enabled: bool = False + # HuggingFace Hub repo ID to push exported weights to (e.g. "nvidia/cosmos3-qwen3-8b"). + # None means local/S3 only. + hf_repo_id: Optional[str] = None + # Object store for uploading exports. If not enabled, upload is skipped. + # To reuse the DCP checkpoint bucket, copy checkpoint.save_to_object_store here. + upload_to_object_store: ObjectStoreConfig = attrs.field(factory=ObjectStoreConfig) + # Export every N DCP checkpoints. Must be >= 1 and ideally a multiple of checkpoint.save_iter. + export_every_n: int = attrs.field(default=1, validator=attrs.validators.ge(1)) + + +@make_freezable +@attrs.define(slots=False) +class JobConfig: + # Project name. + project: str = "" + # Experiment name. + group: str = "" + # Run/job name. + name: str = "" + # W&B mode, can be "online", or "disabled".
+ wandb_mode: str = "online" + # Cluster configuration (optional, for cluster-specific settings). + cluster: Optional[Any] = None + + @property + def path(self) -> str: + return f"{self.project}/{self.group}/{self.name}" + + @property + def path_local(self) -> str: + local_root = os.environ.get("IMAGINAIRE_OUTPUT_ROOT", "/tmp/imaginaire4-output") + return f"{local_root}/{self.path}" + + +@make_freezable +@attrs.define(slots=False) +class EMAConfig: + # Enable tracking a set of exponential moving average (EMA) weights. + enabled: bool = False + # EMA decay rate. + beta: float = 0.9999 + # Enable removing "_orig_mod-" from buffer names that is added by torch.compile + torch_compile_buffer_renaming: bool = False + + +@make_freezable +@attrs.define(slots=False) +class PowerEMAConfig: + # Enable tracking a set of exponential moving average (EMA) weights. + enabled: bool = False + # EDM2 paper EMA decay rate. + s: float = 0.1 + # Enable removing "_orig_mod-" from buffer names that is added by torch.compile + torch_compile_buffer_renaming: bool = False + + +@make_freezable +@attrs.define(slots=False) +class DDPConfig: + # Traverse the computation graph to find parameters that don't receive gradients. + find_unused_parameters: bool = False + # Set to True if the computation graph does not change during the whole training loop. + static_graph: bool = True + # Set to True if we want to synchronize buffers. Set to False if the sync is going to be handled elsewhere. + broadcast_buffers: bool = True + + +@make_freezable +@attrs.define(slots=False) +class CuDNNConfig: + # Set to True for better reproducibility of the results (only using deterministic cudnn functions). + deterministic: bool = False + # If set to True, cudnn will benchmark several algorithms and pick the fastest one. + benchmark: bool = True + + +@make_freezable +@attrs.define(slots=False) +class JITConfig: + # Enable exporting a JIT compiled model. 
+ enabled: bool = False + # Input tensor shape, for example input. + input_shape: Union[list[int], None] = None + # Device to compile onto. + device: str = "cuda" + # # Data type to compile onto. + dtype: str = "bfloat16" + # Strict mode for PyTorch JIT. + strict: bool = True + + +@make_freezable +@attrs.define(slots=False) +class CheckpointConfig: + # possible checkpoint class + type: Optional[Dict] = None + + # for dcp, whether to use async mode + dcp_async_mode_enabled: bool = False + + # Configs for saving the checkpoints to object store. + save_to_object_store: ObjectStoreConfig = attrs.field(factory=ObjectStoreConfig) + + # Save the checkpoint every N iterations. + save_iter: int = 999999999 + + # Load state_dict to the models in strict mode. If True, `allow_partial_load` in dcp + # planner will be set to False. DCP will raise an error if there are missing keys. + # If False, `allow_partial_load` in dcp planner will be set to True. DCP will not + # raise an error if there are missing keys. + strict_resume: bool = True + + # Configs for loading the checkpoints from object store. + load_from_object_store: ObjectStoreConfig = attrs.field(factory=ObjectStoreConfig) + + # Path of model weights to resume the checkpoint from. + load_path: str = "" + + # The following 3 flags (load_training_state, only_load_scheduler_state, keys_to_skip_loading) + # only take effect when the checkpoints are loaded from `load_path`. If loading happens from + # the previous checkpoint of the same model, these flags are ignored. + + # Whether to load the training states (optimizer/scheduler/grad-scaler) from the checkpoint path. + load_training_state: bool = False + + # Whether to load the scheduler state only from the checkpoint path. If + # load_training_state is True, this will be ignored. + only_load_scheduler_state: bool = False + + # When loading checkpoints from `load_path`, this list serves as a filter + # to bypass the loading for specific model parameters. 
A key is considered + # a match—and thus its loading is skipped—if it contains any element of this + # list as a substring. This mechanism allows for broad suppression of entire + # modules or parameter groups without requiring the specification of fully + # qualified names (FQNs). Skipping loading of keys is useful when the new model + # has a different component architecture, e.g. different RoPE embeddings than + # the current model. + keys_to_skip_loading: list[str] = [] + + # Configs for JIT compiling EMA model. + jit: JITConfig = attrs.field(factory=JITConfig) + + # Print detailed information during checkpoint saving/loading. + verbose: bool = True + + # Keys not to resume from the checkpoint, choices: ["model", "optim", "scheduler", "trainer", "dataloader"] + keys_not_to_resume: list[str] = [] + + # Whether to use the local filesystem for broadcasting checkpoint data (used for Tensor Parallel Checkpointer). + broadcast_via_filesystem: bool = False + load_ema_to_reg: bool = False + + # Enable GCS patch in boto3 for loading/saving checkpoints from/to GCS + enable_gcs_patch_in_boto3: bool = False + + # Config for exporting HuggingFace-compatible safetensors after each DCP checkpoint. + hf_export: HFExportConfig = attrs.field(factory=HFExportConfig) + + +@make_freezable +@attrs.define(slots=False) +class NVTXConfig: + """Config for NVTX ranges used in the main training loop. + + See tutorials/nanogpt for more details on how to integrate profiling into your model.""" + + # Enable the NVTX ranges. + enabled: bool = False + # Synchronize everything in each NVTX range. + cuda_synchronize: bool = False + + +@make_freezable +@attrs.define(slots=False) +class StragglerDetectionConfig: + """Config for Straggler detection tool: https://gitlab-master.nvidia.com/dl/gwe/fault_tolerance_related/straggler/-/tree/cupti?ref_type=heads""" + + # Enable the Straggler Detection. + enabled: bool = False + # How frequently should the Straggler reports be generated. 
+ report_freq: int = 100 + # How frequently iterations should be profiled + profile_freq: int = 1 + # What is the maximum relative difference between GPUs after they are considered stragglers + max_diff: float = 2.0 + # Should the error be raised when straggler is detected + raise_error: bool = True + # Analyze kernels in the forward pass. + analyze_forward: bool = True + # Analyze kernels in the backward pass. + analyze_backward: bool = True + # Analyze kernels in the optimizer. + analyze_optimizer: bool = True + # Analyze dataloading time. + analyze_dataloading: bool = True + # Whether to save logs to S3 + save_s3: bool = False + + +@make_freezable +@attrs.define(slots=False) +class Profiling: + # Torch profiler: set this True to dump chrome traces. + enable_profiling: bool = False + # Nsight Systems: set this True AND launch under `nsys profile --capture-range=cudaProfilerApi`. + enable_nsys: bool = False + # CUDA memory snapshot: set this True to dump allocator snapshots. + enable_memory_snapshot: bool = False + save_s3: bool = False + profile_freq: int = 1 + # Number of warmup iterations before the active profile iteration. + profile_warmup: int = 3 + # Target ranks for profiling, each entry must be >=0 and < world_size. + target_ranks: list[int] = list(range(8)) + # The options below apply only to the torch profiler (enable_profiling). + # Set `record_shape` and `profile_memory` to False to reduce profile size. + record_shape: bool = False + profile_memory: bool = False + with_stack: bool = True + with_modules: bool = True + + +@make_freezable +@attrs.define(slots=False) +class CompileConfig: + """ + torch.compile config options passed to set_torch_compile_options function. 
+ """ + + recompile_limit: int = 8 + use_duck_shape: bool = True + + +@make_freezable +@attrs.define(slots=False) +class TrainerConfig: + if TRAINING: + from cosmos3._src.imaginaire.trainer import ImaginaireTrainer + from cosmos3._src.imaginaire.utils import callback + + type: Type[ImaginaireTrainer] = ImaginaireTrainer + # Set the callback class. + # Defaults to the callbacks below. + callbacks: LazyDict = LazyDict( + dict( + ema=L(callback.EMAModelCallback)(), + progress_bar=L(callback.ProgressBarCallback)(), + wandb=L(callback.WandBCallback)(), + ) + ) + + # distributed parallelism strategy + distributed_parallelism: str = "ddp" + # Distributed data parallel configs. + ddp: DDPConfig = attrs.field(factory=DDPConfig) + # cuDNN configs. + cudnn: CuDNNConfig = attrs.field(factory=CuDNNConfig) + # Set the random seed. + seed: int = 0 + # Gradient scaler arguments (for torch.amp.GradScaler). + grad_scaler_args: dict = attrs.field(factory=lambda: dict(enabled=False)) + # Maximum number of iterations to train the model. + max_iter: int = 999999999 + # Maximum number of iterations to validate the model. If None, validate on the entire dataset. + max_val_iter: int | None = None + # How often we log the training stats. + logging_iter: int = 100 + # Whether we want to run the validation routines. + run_validation: bool = True + # How often we evaluate on the validation set. + validation_iter: int = 999999999 + # Whether to run the validation on the start of the training. + run_validation_on_start: bool = False + # Kill the process after N seconds since the last iteration (usually means dead job). + timeout_period: int = 999999999 + # Tensor memory organization format. + memory_format: torch.memory_format = torch.preserve_format + # Gradient accumulation (update step every N iteration). 
+ grad_accum_iter: int = 1 + # Straggler Detection config + straggler_detection: StragglerDetectionConfig = attrs.field(factory=StragglerDetectionConfig) + # Profiling config + profiling: Profiling = attrs.field(factory=Profiling) + compile_config: CompileConfig = attrs.field(factory=CompileConfig) + + # Whether to save the checkpoint at iteration 0. + save_zero_checkpoint: bool = False + + +@make_freezable +@attrs.define(slots=False) +class Config: + """Config for an imaginaire4 job. + + See /README.md/Configuration System for more info. + """ + + # Model configs. + model: LazyDict + # Optimizer configs. + optimizer: LazyDict + # Scheduler configs. + scheduler: LazyDict + # Training data configs. + dataloader_train: LazyDict | None + # Validation data configs. + dataloader_val: LazyDict | None + + # Training job configs. + job: JobConfig = attrs.field(factory=JobConfig) + + # Trainer configs. + trainer: TrainerConfig = attrs.field(factory=TrainerConfig) + + if USE_MEGATRON: + # Megatron-Core configs + model_parallel: ModelParallelConfig = attrs.field(factory=ModelParallelConfig) + else: + model_parallel: None = None + + # Checkpointer configs. + checkpoint: CheckpointConfig = attrs.field(factory=CheckpointConfig) + + # enable upload reproducible setup to s3 + upload_reproducible_setup: bool = False + + def pretty_print(self, use_color: bool = False) -> str: + return _pretty_print_attrs_instance(self, 0, use_color) + + def to_dict(self) -> dict[str, Any]: + return attrs.asdict(self) + + def model_init_kwargs(self) -> dict[str, Any]: + """Live root-level sub-configs to pass into instantiate(self.model). + + Override in subclasses whose model __init__ expects fully-composed + top-level configs (e.g. policy/checkpoint/train) rather than the stale + LazyCall snapshot stored under config.model.* at make_config() time. 
+ """ + return {} + + + def validate(self) -> None: + """Validate that the config has all required fields.""" + + # broadcast job.name across all ranks to make sure it is consistent + # otherwise, unaligned job names leads unaligned path to save checkpoints + job_name_tensor = torch.ByteTensor(bytearray(self.job.name, "utf-8")).cuda() + distributed.broadcast(job_name_tensor, 0) + self.job.name = job_name_tensor.cpu().numpy().tobytes().decode("utf-8") + + + assert self.job.project != "" + assert self.job.group != "" + assert self.job.name != "" + + +def load_config(config_path: str, opts: list[str], enable_one_logger: bool = False) -> Config: + from cosmos3._src.imaginaire.serialization import from_yaml, load_callable + + t1 = time.monotonic_ns() + if config_path.endswith(".yaml"): + config = from_yaml(config_path) + # for registration of dataloaders, etc. + _ = load_callable(config.__module__).make_config() + + from cosmos3._src.imaginaire.utils.config_helper import override + + config = override(config, opts, remove_defaults=True) + else: + config = _load_py_config(config_path, opts, validate=False) + + if enable_one_logger: + try: + # pyrefly: ignore # missing-import + from cosmos3._src.imaginaire.utils.one_logger.one_logger_override_utils import override_one_logger_callback + + ol_t1 = time.monotonic_ns() + config = override_one_logger_callback(config) + ol_t2 = time.monotonic_ns() + logging.debug(f"override_one_logger_callback: took {(ol_t2 - ol_t1) / 1e6:.2f}ms") + except ImportError: + pass + + t2 = time.monotonic_ns() + logging.debug(f"total time to load config: {(t2 - t1) / 1e6:.2f}ms") + return config + + +def _load_py_config(config_path: str, opts: list[str], validate: bool = True) -> Config: + + from cosmos3._src.imaginaire.utils.config_helper import get_config_module, override + + t1 = time.monotonic_ns() + config_module = get_config_module(config_path) + t2 = time.monotonic_ns() + logging.debug(f"get_config_module: took {(t2 - t1) / 1e6:.2f}ms") + + t1 
= time.monotonic_ns() + config = importlib.import_module(config_module).make_config() + t2 = time.monotonic_ns() + logging.debug(f"importlib.import_module: took {(t2 - t1) / 1e6:.2f}ms") + + t1 = time.monotonic_ns() + config = override(config, opts) + t2 = time.monotonic_ns() + logging.debug(f"override: took {(t2 - t1) / 1e6:.2f}ms") + + if validate: + t1 = time.monotonic_ns() + config.validate() + t2 = time.monotonic_ns() + logging.debug(f"config.validate: took {(t2 - t1) / 1e6:.2f}ms") + + return config diff --git a/cosmos-inference/cosmos3/_src/imaginaire/configs/lr_scheduler.py b/cosmos-inference/cosmos3/_src/imaginaire/configs/lr_scheduler.py new file mode 100644 index 00000000..c81bd5a6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/configs/lr_scheduler.py @@ -0,0 +1,26 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from cosmos3._src.imaginaire.functional.lr_scheduler import LambdaLinearScheduler +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.imaginaire.lazy_config import LazyDict + +LambdaLinearSchedulerConfig: LazyDict = L(LambdaLinearScheduler)( + warm_up_steps=[1000], + cycle_lengths=[10000000000000], + f_start=[1.0e-6], + f_max=[1.0], + f_min=[1.0], +) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/augmentors/merge_datadict.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/augmentors/merge_datadict.py new file mode 100644 index 00000000..a5ee4863 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/augmentors/merge_datadict.py @@ -0,0 +1,54 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +class DataDictMerger(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Merge the dictionary associated with the input keys into data_dict. Only keys in output_keys are merged. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with dictionary associated with the input keys merged. 
+ """ + for key in self.input_keys: + if key not in data_dict: + log.warning( + f"DataDictMerger dataloader error: missing {key}, {data_dict['__url__']}, {data_dict['__key__']}", + rank0_only=False, + ) + return None + key_dict = data_dict.pop(key) + if key == "depth" and "depth" in self.output_keys: + data_dict["depth"] = key_dict + if key == "human_annotation" and "human_annotation" in self.output_keys: + data_dict["human_annotation"] = key_dict + elif key == "segmentation" and "segmentation" in self.output_keys: + data_dict["segmentation"] = key_dict + elif key == "canny" and "canny" in self.output_keys: + data_dict["canny"] = key_dict + for sub_key in key_dict: + if sub_key in self.output_keys and sub_key not in data_dict: + data_dict[sub_key] = key_dict[sub_key] + del key_dict + return data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/augmentors/v3_text_transforms.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/augmentors/v3_text_transforms.py new file mode 100644 index 00000000..3adc1041 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/augmentors/v3_text_transforms.py @@ -0,0 +1,213 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import random +from typing import Optional + +import numpy as np +import torch + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + + +def pad_and_resize( + arr_np: np.ndarray, ntokens: int, is_mask_all_ones: bool = False +) -> tuple[torch.Tensor, torch.Tensor]: + r"""Function for padding and resizing a numpy array. + Args: + arr (np.ndarray): Input array + ntokens (int): Number of output tokens after padding + is_mask_all_ones (bool): if true, set mask to ones + Returns: + arr_padded (torch.Tensor): Padded output tensor + mask (torch.Tensor): Padding mask + """ + + if isinstance(arr_np, np.ndarray): + arr = torch.from_numpy(arr_np) + elif isinstance(arr_np, torch.Tensor): + arr = arr_np.clone().detach() + else: + raise TypeError("`arr_np` should be a numpy array or torch tensor.") + embed_dim = arr.shape[1] + + arr_padded = torch.zeros(ntokens, embed_dim, device=arr.device, dtype=torch.float32) + + # If the input text is larger than num_text_tokens, clip it. + if arr.shape[0] > ntokens: + arr = arr[0:ntokens] + + mask = torch.LongTensor(ntokens).zero_() + if len(arr.shape) > 1: + mask[0 : arr.shape[0]] = 1 + + if len(arr.shape) > 1: + arr_padded[0 : arr.shape[0]] = arr + + if is_mask_all_ones: + mask.fill_(1) + + return arr_padded, mask + + +def _obtain_embeddings(cfg: dict, embeddings_captions: dict[str, list], caption_idx: int) -> dict: + r"""Function for obtaining text embeddings and text mask. 
+ Args: + cfg (dict): Config dict + embeddings_captions (dict[str, list]): Caption embeddings, keyed by encoder name + caption_idx (int): Caption index + Returns: + Dictionary containing embeddings and mask + """ + out_dict = dict() + is_mask_all_ones = cfg["is_mask_all_ones"] + if "byt5_tokens" in cfg: + out_byt5_text, out_byt5_text_mask = pad_and_resize( + embeddings_captions["byt5_fp8"][caption_idx], + cfg["byt5_tokens"]["num"], + is_mask_all_ones=is_mask_all_ones, + ) + out_dict["byt5_text_embeddings"] = out_byt5_text + out_dict["byt5_text_mask"] = out_byt5_text_mask + + if "t5_tokens" in cfg: + out_t5, out_t5_mask = pad_and_resize( + embeddings_captions["t5_xxl_fp8"][caption_idx], + cfg["t5_tokens"]["num"], + is_mask_all_ones=is_mask_all_ones, + ) + out_dict["t5_text_embeddings"] = out_t5 + out_dict["t5_text_mask"] = out_t5_mask + + return out_dict + + +def obtain_data_dict_from_mixed_gt_and_ai_captions(data_dict: dict, input_keys: list, args: Optional[dict] = None): + out_pkl_dict = dict() + + captions_gt = data_dict[input_keys[0]] + decoded_captions_ai = data_dict[input_keys[1]] + embeddings_captions_gt = data_dict[input_keys[2]] + embeddings_captions_ai = data_dict[input_keys[3]] + + assert args is not None, "Please specify args in augmentation" + probabilities = [args["caption_probs"]["ground_truth"], args["caption_probs"]["vfc_fidelity"]] + valid_captions_indices = list(range(len(probabilities))) + caption_idx = random.choices(valid_captions_indices, weights=probabilities, k=1)[0] + + # If VFC Fidelity caption is not valid, we will use the ground truth caption + if caption_idx == 1 and decoded_captions_ai["had_parse_issue"]: + caption_idx = 0 + + # Merging GT and AI caption raw text + captions = captions_gt["text"] + [decoded_captions_ai["captions"]["vfc_fidelity"]] + + # Merging GT and AI caption embeddings + gt_embeddings = [] + for key in ["ground_truth_headline", "ground_truth"]: + if key in embeddings_captions_gt: + if embeddings_captions_gt[key] is not None: +
gt_embeddings.append(embeddings_captions_gt[key]) + + # Randomly select one of the GT embeddings + gt_embedding = random.choice(gt_embeddings) + embeddings_captions = {} + for key in embeddings_captions_ai["vfc_fidelity"]["embeddings"].keys(): + embeddings_captions[key] = [ + gt_embedding["embeddings"][key], + embeddings_captions_ai["vfc_fidelity"]["embeddings"][key], + ] + + # Sampling raw caption and embeddings + raw_captions = captions[caption_idx] + data_dict["raw_captions"] = raw_captions + + embeddings_dict = _obtain_embeddings( + cfg=args, + embeddings_captions=embeddings_captions, + caption_idx=caption_idx, + ) + out_pkl_dict.update(embeddings_dict) + + data_dict.update(out_pkl_dict) + for key in input_keys: + del data_dict[key] + + return data_dict + + +class TextTransform(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs camera transformation. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with camera attributes added + """ + return obtain_data_dict_from_mixed_gt_and_ai_captions(data_dict, self.input_keys, self.args) + + +class TextTransformAIOnly(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs text transform for datasets where there are only AI captions (ex., NVCC). 
+ + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with camera attributes added + """ + + out_pkl_dict = dict() + decoded_captions_ai = data_dict[self.input_keys[0]] + embeddings_captions_ai = data_dict[self.input_keys[1]] + + assert self.args is not None, "Please specify args in augmentation" + + raw_captions = decoded_captions_ai["captions"]["vfc"] + embeddings_captions = {} + + if decoded_captions_ai["had_parse_issue"]: + raw_captions = decoded_captions_ai["captions"]["kosmos_2"] + _embeddings_captions = embeddings_captions_ai["kosmos2"] + else: + raw_captions = decoded_captions_ai["captions"]["vfc"] + _embeddings_captions = embeddings_captions_ai["vfc_fidelity"] + + for key in _embeddings_captions["embeddings"].keys(): + embeddings_captions[key] = [ + _embeddings_captions["embeddings"][key], + ] + + # Sampling raw caption and embeddings + data_dict["raw_captions"] = raw_captions + embeddings_dict = _obtain_embeddings( + cfg=self.args, + embeddings_captions=embeddings_captions, + caption_idx=0, + ) + out_pkl_dict.update(embeddings_dict) + + data_dict.update(out_pkl_dict) + for key in self.input_keys: + del data_dict[key] + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/json_loader.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/json_loader.py new file mode 100644 index 00000000..b1aec2be --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/json_loader.py @@ -0,0 +1,33 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import json +import re +from typing import Optional + + +def json_decoder(key: str, data: bytes) -> Optional[dict]: + r""" + Function to decode a json file. + Args: + key: Data key. + data: Raw bytes of the json file. 
+ """ + extension = re.sub(r".*[.]", "", key) + if extension == "json": + data_dict = json.loads(data) + return data_dict + else: + return None diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/pkl_loader.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/pkl_loader.py new file mode 100644 index 00000000..0cae27f0 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/pkl_loader.py @@ -0,0 +1,33 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import pickle +import re +from typing import Optional + + +def pkl_decoder(key: str, data: bytes) -> Optional[dict]: + r""" + Function to decode a pkl file. + Args: + key: Data key. + data: Raw bytes of the pkl file. + """ + extension = re.sub(r".*[.]", "", key) + if extension == "pkl": + data_dict = pickle.loads(data) + return data_dict + else: + return None diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/video_decoder.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/video_decoder.py new file mode 100644 index 00000000..f00a364d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/decoders/video_decoder.py @@ -0,0 +1,781 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import io +import math +import re +from random import randint +from typing import Callable, List, Tuple + +import numpy as np +import torch +from PIL import Image + +from cosmos3._src.imaginaire.utils import log + +Image.MAX_IMAGE_PIXELS = 933120000 +_VIDEO_EXTENSIONS = "mp4 avi webm mov".split() + +VIDEO_DECODER_OPTIONS = {} + + +def video_decoder_register(key): + def decorator(func): + VIDEO_DECODER_OPTIONS[key] = func + return func + + return decorator + + +@video_decoder_register("video_decoder_metadata") +def video_decoder_metadata(num_threads, **kwargs): + """ + Video decoder using the video's native fps + """ + import decord + + def video_decoder(key: str, data: bytes): + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _VIDEO_EXTENSIONS: + return None + video_buffer = io.BytesIO(data) + reader = decord.VideoReader(video_buffer, num_threads=num_threads) + num_frames = len(reader) + video_fps = int(np.round(reader.get_avg_fps())) + length_in_s = float(num_frames) / float(video_fps) + bitrate = video_buffer.getbuffer().nbytes * 8 / length_in_s + video_frames = reader.get_batch([0]).asnumpy() + video_frames = torch.from_numpy(video_frames).permute(3, 0, 1, 2) # (T, H, W, C) -> (C, T, H, W) + return video_frames, {"fps": video_fps, "num_frames": num_frames, "bitrate": bitrate} + + return video_decoder + + 
+@video_decoder_register("video_decoder_w_controlled_fps") +def video_decoder_w_controlled_fps( + sequence_length: int = 34, + chunk_size: int = 0, + use_fps_control: bool = False, + min_fps_thres: int = 4, + max_fps_thres: int = 30, + sampling_reweighting: bool = False, + sampling_reweighting_factor: int = 1, + num_threads=4, + limit_fps_range: bool = False, + save_raw: bool = False, +): + """ + Video decoder with fps control. + This function samples videos with fps in the range [min_fps_thres, max_fps_thres]. + We adjust the fps range if min and max fps cannot be supported to get the sequence length with desired chunk size. + + Parameters: + - sequence_length (int) : Number of frames returned by the function + - chunk_size (int): How the video is divided into chunks. Only return frames within a chunk. chunk_size=0 means we use full video length. Defaults to 0. + - min_fps_thres (int): Minimum fps threshold to sample from. + - max_fps_thres (int): Maximum fps threshold to sample from. + - sampling_reweighting (bool): If False, sample fps weights uniformly. If True, reweight sampling distribution. + - sampling_reweighting_factor (int): The fps sampling distribution reweighting factor. If sampling_reweighting_factor > 1, sample more on lower fps side. + - num_threads (int): Number of threads for decord. + - save_raw (bool): If True, will also return entire raw video in data_dict key "video_raw_bytes", alongside the video frames. Only enable this for visualization and debug. 
+ """ + import decord + + def video_decoder( + key: str, + data: bytes, + ): + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _VIDEO_EXTENSIONS: + return None + + video_buffer = io.BytesIO(data) + video_reader = decord.VideoReader(video_buffer, num_threads=num_threads) + num_target_frames = sequence_length if sequence_length > 0 else len(video_reader) + num_orig_frames = len(video_reader) + + # Obtain the number of chunks + if chunk_size == 0: + curr_chunk_size = num_orig_frames + else: + curr_chunk_size = chunk_size + num_chunks = max(num_orig_frames // curr_chunk_size, 1) + + # Checks to ensure that number of target frames we need is present in the video / chunk. + if num_target_frames > curr_chunk_size: + raise ValueError( + f"Specified sequence_length {num_target_frames} exceeds curr_chunk_size {curr_chunk_size}, num_orig_frames={num_orig_frames}, chunk_size={chunk_size}" + ) + + if num_target_frames > num_orig_frames: + raise ValueError( + f"Specified sequence_length {num_target_frames} exceeds num frames in video {num_orig_frames}." + ) + + # Now obtain min and max fps that we can use within this chunk + video_fps = int(np.round(video_reader.get_avg_fps())) + + if video_fps < 1: + raise ValueError("Video fps lower than 1, skipping") + if limit_fps_range: + if video_fps < min_fps_thres: + raise ValueError(f"Video fps {video_fps} lower than {min_fps_thres}, skipping") + if video_fps > max_fps_thres: + raise ValueError(f"Video fps {video_fps} larger than {max_fps_thres}, skipping") + + # Check if the last chunk has separate window + # This happens only if remainder frames >= curr_chunk_size / 2 [data annotation was done this way] + # Else this is used as a part of previous window. 
+ num_frames_in_last_chunk = num_orig_frames - num_chunks * curr_chunk_size + if num_frames_in_last_chunk >= int(0.5 * curr_chunk_size): + if num_frames_in_last_chunk > num_target_frames: + num_chunks += 1 + + # Sample which chunk to use + chunk_index = randint(0, num_chunks - 1) + + if chunk_index == num_chunks - 1: + # For the last chunk, use all of the remaining frames + num_samples_in_chunk = num_orig_frames - chunk_index * curr_chunk_size + else: + # Else use only the chunk size + num_samples_in_chunk = curr_chunk_size + + if use_fps_control: + # When fps control is provided, sample random fps. + min_fps = max(min_fps_thres, math.ceil(video_fps * float(num_target_frames) / float(num_samples_in_chunk))) + max_fps = min(max_fps_thres, video_fps) + + # Randomly sample a target fps in the range of (min_fps, max_fps) + if max_fps > min_fps: + fps_selections = list(range(min_fps, max_fps + 1)) + + # Sample reweighting favors the smaller fps more + if sampling_reweighting: + dist = [1 / (float(pp) ** sampling_reweighting_factor) for pp in fps_selections] + target_fps = np.random.choice(fps_selections, 1, p=[pp / sum(dist) for pp in dist]) + else: + target_fps = np.random.choice(fps_selections, 1) + else: + target_fps = max_fps + + else: + # If not, use native fps + target_fps = video_fps + + # stride used for subsampling video + stride = int(video_fps / target_fps) + + # This is the actual target fps we obtain after subsampling + target_fps = video_fps / stride + + # Select the frame start index and frame end index + chunk_frame_start = chunk_index * curr_chunk_size + if num_samples_in_chunk <= num_target_frames * stride: + raise ValueError( + f"Decoded video not long enough, num_samples_in_chunk={num_samples_in_chunk}, num_target_frames={num_target_frames}, stride={stride}, video_fps={video_fps}, target_fps={target_fps}, min_fps_thres={min_fps_thres}, max_fps_thres={max_fps_thres}, use_fps_control={use_fps_control}" + ) + # Start index is randomly selected in the 
chunk + frame_start = chunk_frame_start + int( + np.random.choice(num_samples_in_chunk - int(num_target_frames * stride), 1) + ) + frame_end = frame_start + num_target_frames * stride + + # Subsample the frames + if "depth" in key: + frame_start = video_decoder.frame_start + frame_end = video_decoder.frame_end + stride = video_decoder.stride + chunk_index = video_decoder.chunk_index + else: + video_decoder.frame_start = frame_start + video_decoder.frame_end = frame_end + video_decoder.stride = stride + video_decoder.chunk_index = chunk_index + video_frames = video_reader.get_batch(np.arange(frame_start, frame_end, stride).tolist()).asnumpy() + + # Return the frames and metadata + if num_target_frames is not None and video_frames.shape[0] < num_target_frames: + raise ValueError("Decoded video not long enough, skipping") + video_frames = torch.from_numpy(video_frames).permute(3, 0, 1, 2) # (T, H, W, C) -> (C, T, H, W) + video_reader.seek(0) # set video reader point back to 0 to clean up cache + del video_reader # delete the reader to avoid memory leak + + ret_dict = { + "video": video_frames, + "fps": float(target_fps), + "num_frames": video_frames.shape[1], + "chunk_index": chunk_index, + "frame_start": frame_start, + "frame_end": frame_end, + "stride": stride, + "orig_num_frames": num_orig_frames, + } + if save_raw: + ret_dict["video_raw_bytes"] = data + return ret_dict + + return video_decoder + + +@video_decoder_register("video_decoder_for_kd_dataset") +def video_decoder_for_kd_dataset( + sequence_length: int = 34, + num_threads: int = 4, + save_raw: bool = False, + **kwargs, +): + """ + Video decoder for Knowledge Distillation dataset. + This function reads in the raw video frames, without any fps control. + + Parameters: + - sequence_length (int) : Number of frames returned by the function + - num_thread (int): Number of threads for decord. 
+ - save_raw (bool): If True, will also return entire raw video in data_dict key "video_raw_bytes", alongside with the video frames. Only enable this for visualization and debug. + """ + import decord + + def video_decoder( + key: str, + data: bytes, + ): + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _VIDEO_EXTENSIONS: + return None + + video_buffer = io.BytesIO(data) + video_reader = decord.VideoReader(video_buffer, num_threads=num_threads) + num_target_frames = sequence_length if sequence_length > 0 else len(video_reader) + num_orig_frames = len(video_reader) + assert num_target_frames == num_orig_frames, ( + "Number of target frames must be equal to the number of original frames" + ) + + # Now obtain min and max fps that we can use within this chunk + video_fps = int(np.round(video_reader.get_avg_fps())) + assert video_fps == 24, "Generated video FPS should be 24" + + # Sample which chunk to use + chunk_index = 0 + frame_start = 0 + stride = 1 + frame_end = frame_start + num_target_frames * stride + video_frames = video_reader.get_batch(np.arange(frame_start, frame_end, stride).tolist()).asnumpy() + + # Return the frames and metadata + if num_target_frames is not None and video_frames.shape[0] < num_target_frames: + raise ValueError("Decoded video not long enough, skipping") + video_frames = torch.from_numpy(video_frames).permute(3, 0, 1, 2) # (T, H, W, C) -> (C, T, H, W) + video_reader.seek(0) # set video reader point back to 0 to clean up cache + del video_reader # delete the reader to avoid memory leak + + ret_dict = { + "video": video_frames, + "fps": float(video_fps), + "num_frames": video_frames.shape[1], + "chunk_index": chunk_index, + "frame_start": frame_start, + "frame_end": frame_end, + "stride": stride, + "orig_num_frames": num_orig_frames, + } + if save_raw: + ret_dict["video_raw_bytes"] = data + return ret_dict + + return video_decoder + + +@video_decoder_register("video_decoder_basic") +def video_decoder_basic( + 
sequence_length: int = 25, + use_fps_control: bool = False, + min_fps_thres: int = 4, + max_fps_thres: int = 30, + num_threads=4, + **kwargs, +) -> Callable[[str, bytes], dict[str, torch.Tensor | int]]: + """Basic video decoder for a specified sequence length. + + If loaded video has fewer frames than requested, temporally pads with the last frame. + Optionally, allows subsampling video with a variable FPS in [`min_fps_thres` .. `max_fps_thres`]. + + Args: + sequence_length (int) : The number of frames to sample from the loaded video. + use_fps_control (bool) : Controls whether to temporally subsample. + min_fps_thres (int): Minimum FPS threshold to sample from. + max_fps_thres (int): Maximum FPS threshold to sample from. + num_thread (int): Number of threads for the decord. + + Returns: + Returns a callable that returns a dictionary of: + - The sampled video(torch.Tensor, torch.uint8), layout (C, T, H, W). + - The FPS (int) of the sample. + """ + import decord + + def video_decoder( + key: str, + data: bytes, + ) -> dict[str, torch.Tensor | int]: + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _VIDEO_EXTENSIONS: + return None + + video_buffer = io.BytesIO(data) + video_reader = decord.VideoReader(video_buffer, num_threads=num_threads) + + # video and request metadata. + num_target_frames = sequence_length if sequence_length > 0 else len(video_reader) + num_orig_frames = len(video_reader) + assert num_orig_frames > 0, "Video has no frames." + video_fps = max(1, int(video_reader.get_avg_fps() + 0.5)) + + if use_fps_control: + # When fps control is provided, sample random fps. 
+ min_fps = max(min_fps_thres, math.ceil(video_fps * float(num_target_frames) / float(num_orig_frames))) + max_fps = min(max_fps_thres, video_fps) + + # If frame range is valid, sample random fps in the range of (min_fps, max_fps) + if max_fps > min_fps: + fps_selections = list(range(min_fps, max_fps + 1)) + target_fps = np.random.choice(fps_selections, 1) + else: + target_fps = max_fps + else: + target_fps = video_fps + + # This is the actual target fps we obtain after subsampling. + stride = int(video_fps / target_fps) + target_fps = video_fps / stride + num_target_stride_frames = int(num_target_frames * stride) + + # Start index is randomly selected in the + valid_length = max(num_orig_frames - num_target_stride_frames, 1) + frame_start = np.random.choice(valid_length, 1) + frame_end = min(frame_start + num_target_stride_frames, num_orig_frames) + frame_indices = np.arange(frame_start, frame_end, stride).tolist() + + # Grab the frames. + video_frames = video_reader.get_batch(frame_indices).asnumpy() + + # If sampled frames are less than requested, pad with the last frame via replication + if video_frames.shape[0] < num_target_frames: + pad_size = num_target_frames - video_frames.shape[0] + video_frames = np.pad(video_frames, ((0, pad_size), (0, 0), (0, 0), (0, 0)), mode="edge") + + video_frames = torch.from_numpy(video_frames).permute(3, 0, 1, 2) # (T, H, W, C) -> (C, T, H, W) + video_reader.seek(0) # set video reader point back to 0 to clean up cache + del video_reader # delete the reader to avoid memory leak + return { + "video": video_frames, + "fps": float(target_fps), + } + + return video_decoder + + +@video_decoder_register("video_decoder_still_padding") +def video_decoder_still_padding( + sequence_length: int = 25, + use_fps_control: bool = False, + min_fps_thres: int = 4, + max_fps_thres: int = 30, + num_threads=4, + sampling_reweighting: bool = False, + sampling_reweighting_factor: int = 1, + limit_fps_range: bool = False, + **kwargs, +) -> 
Callable[[str, bytes], dict[str, torch.Tensor | int]]: + """Video decoder for a specified sequence length. + + If loaded video has fewer frames than requested, temporally pads with the last frame. + Optionally, allows subsampling video with a variable FPS in [`min_fps_thres` .. `max_fps_thres`]. + + Args: + sequence_length (int) : The number of frames to sample from the loaded video. + use_fps_control (bool) : Controls whether to temporally subsample. + min_fps_thres (int): Minimum FPS threshold to sample from. + max_fps_thres (int): Maximum FPS threshold to sample from. + num_thread (int): Number of threads for the decord. + + Returns: + Returns a callable that returns a dictionary of: + - The sampled video(torch.Tensor, torch.uint8), layout (C, T, H, W). + - number of video frames + - frame_start + - frame_end + """ + import decord + + def video_decoder( + key: str, + data: bytes, + ) -> dict[str, torch.Tensor | int]: + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _VIDEO_EXTENSIONS: + return None + + video_buffer = io.BytesIO(data) + video_reader = decord.VideoReader(video_buffer, num_threads=num_threads) + + # video and request metadata. + num_target_frames = sequence_length if sequence_length > 0 else len(video_reader) + num_orig_frames = len(video_reader) + assert num_orig_frames > 0, "Video has no frames." + + if num_target_frames > num_orig_frames: + log.warning( + f"Specified sequence_length {num_target_frames} exceeds num frames in video {num_orig_frames}. Padding last frame" + ) + # Grab the frames. 
+ video_frames = video_reader.get_batch(range(num_orig_frames)).asnumpy() + + # Pad with the last frame via replication + pad_size = num_target_frames - video_frames.shape[0] + video_frames = np.pad(video_frames, ((0, pad_size), (0, 0), (0, 0), (0, 0)), mode="edge") + + video_frames = torch.from_numpy(video_frames).permute(3, 0, 1, 2) # (T, H, W, C) -> (C, T, H, W) + video_reader.seek(0) # set video reader point back to 0 to clean up cache + del video_reader # delete the reader to avoid memory leak + return { + "video": video_frames, + "frame_start": 0, + "frame_end": num_orig_frames, + "num_frames": video_frames.shape[1], + } + + video_fps = max(1, int(video_reader.get_avg_fps() + 0.5)) + + if video_fps < 1: + raise ValueError("Video fps lower than 1, skipping") + if limit_fps_range: + if video_fps < min_fps_thres: + raise ValueError(f"Video fps {video_fps} lower than {min_fps_thres}, skipping") + if video_fps > max_fps_thres: + raise ValueError(f"Video fps {video_fps} larger than {max_fps_thres}, skipping") + + if use_fps_control: + # When fps control is provided, sample random fps. + min_fps = max(min_fps_thres, math.ceil(video_fps * float(num_target_frames) / float(num_orig_frames))) + max_fps = min(max_fps_thres, video_fps) + + # If frame range is valid, sample random fps in the range of (min_fps, max_fps) + if max_fps > min_fps: + fps_selections = list(range(min_fps, max_fps + 1)) + + # Sample reweighting favors the smaller fps more + if sampling_reweighting: + dist = [1 / (float(pp) ** sampling_reweighting_factor) for pp in fps_selections] + target_fps = np.random.choice(fps_selections, 1, p=[pp / sum(dist) for pp in dist]) + else: + target_fps = np.random.choice(fps_selections, 1) + else: + target_fps = max_fps + else: + target_fps = video_fps + + # This is the actual target fps we obtain after subsampling. 
+ stride = int(video_fps / target_fps) + target_fps = video_fps / stride + num_target_stride_frames = int(num_target_frames * stride) + + # Start index is randomly selected in the + valid_length = max(num_orig_frames - num_target_stride_frames, 1) + frame_start = np.random.choice(valid_length, 1) + frame_end = min(frame_start + num_target_stride_frames, num_orig_frames) + frame_indices = np.arange(frame_start, frame_end, stride).tolist() + + # Grab the frames. + video_frames = video_reader.get_batch(frame_indices).asnumpy() + + # If sampled frames are less than requested, pad with the last frame via replication + if video_frames.shape[0] < num_target_frames: + pad_size = num_target_frames - video_frames.shape[0] + video_frames = np.pad(video_frames, ((0, pad_size), (0, 0), (0, 0), (0, 0)), mode="edge") + + video_frames = torch.from_numpy(video_frames).permute(3, 0, 1, 2) # (T, H, W, C) -> (C, T, H, W) + video_reader.seek(0) # set video reader point back to 0 to clean up cache + del video_reader # delete the reader to avoid memory leak + return { + "video": video_frames, + "frame_start": frame_start, + "frame_end": frame_end, + "num_frames": video_frames.shape[1], + } + + return video_decoder + + +def video_decoder_w_lower_fps_get_indices( + num_orig_frames: int, + video_fps: int, + min_fps_thres: int, + max_fps_thres: int, + sequence_length: int, +) -> Tuple[List[int], float]: + """Generates frame indices for video sampling with FPS control. + + This function determines valid stride lengths for sampling frames from a video, + preferring lower FPS (larger strides) when multiple options are available. + It returns both the selected frame indices and the resulting FPS. + + Args: + num_orig_frames: Total number of frames in the original video. + video_fps: Original video frames per second. + min_fps_thres: Minimum allowed frames per second. + max_fps_thres: Maximum allowed frames per second. + sequence_length: Number of frames to sample. 
+ + Returns: + A tuple containing: + - list[int]: Frame indices to sample from the original video. + - float: The resulting frames per second after sampling. + + Raises: + ValueError: If no valid stride options are available given the constraints. + ValueError: If input parameters are invalid (e.g., negative values). + """ + # Validate input parameters + if num_orig_frames <= 0: + raise ValueError("num_orig_frames must be positive") + if video_fps <= 0: + raise ValueError("video_fps must be positive") + if min_fps_thres <= 0: + raise ValueError("min_fps_thres must be positive") + if max_fps_thres < min_fps_thres: + raise ValueError("max_fps_thres must be greater than or equal to min_fps_thres") + if sequence_length <= 1: + raise ValueError("sequence_length must be greater than 1") + if sequence_length > num_orig_frames: + raise ValueError("sequence_length cannot be greater than num_orig_frames") + + # Calculate stride range + min_stride = 1 + max_stride = (num_orig_frames - 1) // (sequence_length - 1) + + valid_strides = [] + for stride in range(min_stride, max_stride + 1): + # Check if we can get sequence_length frames with this stride + if (num_orig_frames - stride * (sequence_length - 1)) > 0: + new_fps = video_fps / stride + if min_fps_thres <= new_fps <= max_fps_thres: + valid_strides.append(stride) + + if not valid_strides: + raise ValueError( + f"No valid stride options available for the given constraints. 
" + f"stride range = [{min_stride}, {max_stride}]; " + f"original FPS = {video_fps}; " + f"sequence_length = {sequence_length}; " + f"min_fps_thres = {min_fps_thres}; " + f"max_fps_thres = {max_fps_thres}; " + f"original num_frames = {num_orig_frames}" + ) + + # Select stride with weighted probability + if len(valid_strides) >= 2: + stride_choices = valid_strides[-2:] # Taking last two as they're the largest + weights = [0.01, 0.99] # [smaller_stride, larger_stride] + selected_stride = np.random.choice(stride_choices, p=weights) + else: + selected_stride = valid_strides[0] + + # Calculate the maximum valid start index and random start frame + max_start_idx = num_orig_frames - (sequence_length - 1) * selected_stride + frame_start = np.random.randint(0, max_start_idx) + + # Generate frame indices + frame_indices = [frame_start + i * selected_stride for i in range(sequence_length)] + return frame_indices, video_fps / selected_stride + + +@video_decoder_register("video_decoder_w_lower_fps") +def video_decoder_w_lower_fps( + chunk_size: int = 0, + sequence_length: int = 34, + min_fps_thres: int = 4, + max_fps_thres: int = 30, + num_threads: int = 4, + return_frame_indices: bool = False, + **kwargs, +) -> dict: + """ + Simplified video decoder with FPS control and frame sampling. 
+ + Args: + key: Video file name/key + data: Video binary data + min_fps_thres: Minimum FPS threshold + max_fps_thres: Maximum FPS threshold + sequence_length: Number of frames to return + num_threads: Number of threads for decord + limit_fps_range: Whether to enforce FPS limits + return_frame_indices: Whether to return frame indices + + Returns: + dict with video frames tensor and target FPS + """ + import decord + + del kwargs # Unused + + def video_decoder( + key: str, + data: bytes, + ) -> dict[str, torch.Tensor | int]: + # Check video extension + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _VIDEO_EXTENSIONS: + return None + + # Read video + video_buffer = io.BytesIO(data) + video_reader = decord.VideoReader(video_buffer, num_threads=num_threads) + num_target_frames = sequence_length if sequence_length > 0 else len(video_reader) + + # Get video metadata + num_orig_frames = len(video_reader) + video_fps = int(np.round(video_reader.get_avg_fps())) + + # Basic validations + # Obtain the number of chunks + if chunk_size == 0: + curr_chunk_size = num_orig_frames + else: + curr_chunk_size = chunk_size + num_chunks = max(num_orig_frames // curr_chunk_size, 1) + + # Checks to ensure that number of target frames we need is present in the video / chunk. + if num_target_frames > curr_chunk_size: + raise ValueError("Specified sequence_length exceeds curr_chunk_size.") + + if num_target_frames > num_orig_frames: + raise ValueError( + f"Specified sequence_length {num_target_frames} exceeds num frames in video {num_orig_frames}." + ) + + if video_fps < 1: + raise ValueError("Video fps lower than 1, skipping") + if video_fps < min_fps_thres: + raise ValueError(f"Video fps {video_fps} lower than {min_fps_thres}, skipping") + + # Check if the last chunk has separate window + # This happens only if remainder frames >= curr_chunk_size / 2 [data annotation was done this way] + # Else this is used as a part of previous window. 
+ num_frames_in_last_chunk = num_orig_frames - num_chunks * curr_chunk_size + if num_frames_in_last_chunk >= int(0.5 * curr_chunk_size): + if num_frames_in_last_chunk > num_target_frames: + num_chunks += 1 + + # Sample which chunk to use + chunk_index = randint(0, num_chunks - 1) + + if chunk_index == num_chunks - 1: + # For the last chunk, use all of the remaining frames + num_samples_cur_chunk = num_orig_frames - chunk_index * curr_chunk_size + else: + # Else use only the chunk size + num_samples_cur_chunk = curr_chunk_size + idx_first_in_cur_chunk = chunk_index * curr_chunk_size + + frame_indices, adjusted_fps = video_decoder_w_lower_fps_get_indices( + num_orig_frames=num_samples_cur_chunk, + video_fps=video_fps, + min_fps_thres=min_fps_thres, + max_fps_thres=max_fps_thres, + sequence_length=num_target_frames, + ) + frame_indices = [idx_first_in_cur_chunk + idx for idx in frame_indices] + + # Sample frames + video_frames = video_reader.get_batch(frame_indices).asnumpy() + video_frames = torch.from_numpy(video_frames).permute(3, 0, 1, 2) # (T, H, W, C) -> (C, T, H, W) + + # Clean up + video_reader.seek(0) + del video_reader + + output = { + "video": video_frames, + "fps": float(adjusted_fps), + "orig_fps": video_fps, + "frame_start": frame_indices[0], + "frame_end": frame_indices[-1], + "num_frames": video_frames.shape[1], + "orig_num_frames": num_orig_frames, + "chunk_index": chunk_index, + } + if return_frame_indices: + output["frame_indices"] = frame_indices + return output + + return video_decoder + + +@video_decoder_register("video_naive_bytes") +def video_naive_bytes(*args, **kwargs): + """ + do nothing, just return the video bytes + """ + del args, kwargs + + def video_decoder( + key: str, + data: bytes, + ): + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _VIDEO_EXTENSIONS: + return None + + return data + + return video_decoder + + +def construct_video_decoder( + video_decoder_name: str = "video_decoder_w_controlled_fps", + 
sequence_length: int = 34, + chunk_size: int = 0, + use_fps_control: bool = False, + min_fps_thres: int = 4, + max_fps_thres: int = 24, + sampling_reweighting: bool = False, + sampling_reweighting_factor: int = 1, + num_threads=4, + limit_fps_range: bool = False, + # if true, video decoder will additionally save the raw video (alongside with processed frames) to the data_dict + # set to true for inference/debugging + save_raw: bool = False, +): + return VIDEO_DECODER_OPTIONS[video_decoder_name]( + sequence_length=sequence_length, + chunk_size=chunk_size, + use_fps_control=use_fps_control, + min_fps_thres=min_fps_thres, + max_fps_thres=max_fps_thres, + sampling_reweighting=sampling_reweighting, + sampling_reweighting_factor=sampling_reweighting_factor, + num_threads=num_threads, + limit_fps_range=limit_fps_range, + save_raw=save_raw, + ) + + +def construct_video_decoder_metadata( + num_threads=4, +): + return VIDEO_DECODER_OPTIONS["video_decoder_metadata"](num_threads=num_threads) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/joint_training.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/joint_training.py new file mode 100644 index 00000000..255c6dc0 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/joint_training.py @@ -0,0 +1,105 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +"""Utility functions to use joint dataloader for training.""" + +from typing import Dict, Iterator # For multiview training + +import torch + +import cosmos3._src.imaginaire.config +import cosmos3._src.imaginaire.datasets.webdataset.dataloader +from cosmos3._src.imaginaire.config import Config +from cosmos3._src.imaginaire.lazy_config import instantiate +from cosmos3._src.imaginaire.utils import log + + +def create_dataloader_dict( + config: Config, dataloader_train: cosmos3._src.imaginaire.datasets.webdataset.dataloader.DataLoader +) -> Dict: + """Create the dataloader dictionary. + + Example config: + + ``` + config: + joint_train: + data_sample_prob: + dataloader_train: 0.5 # sampling probability for default dataloader + dataloader_1: 0.2 # sampling probability for dataloader_1 + dataloader_2: 0.3 # sampling probability for dataloader_2 + dataloader_1: + ... # dataloader config for dataloader_1 + dataloader_2: + ... # dataloader config for dataloader_2 + ``` + + Args: + config (Config): The config object for the Imaginaire codebase. + + Returns: + dict: The dataloader dictionary. 
+ """ + dataloader_list = list(config.joint_train.data_sample_prob.keys()) + + dataloader_dict = {} + for dataloader_name in dataloader_list: + if dataloader_name == "dataloader_train": + continue + log.info( + f"Creating dataloader: {dataloader_name}, sampling probability: {config.joint_train.data_sample_prob[dataloader_name]}" + ) + dataloader_dict[dataloader_name] = iter(instantiate(getattr(config.joint_train, dataloader_name))) + dataloader_dict["dataloader_train"] = iter(dataloader_train) + return dataloader_dict + + +def data_batch_iterator(dataloader_dict: Dict, data_sample_prob: Dict) -> Iterator[Dict]: + """Sample data batches continuously from the dataloader dictionary based on sampling probabilities.""" + dataloader_list = list(data_sample_prob.keys()) + + while True: + selected_dataloader_id = torch.multinomial( + torch.tensor([data_sample_prob[dataloader_name] for dataloader_name in dataloader_list]), 1 + ).item() + selected_dataloader_name = dataloader_list[selected_dataloader_id] + selected_dataloader = dataloader_dict[selected_dataloader_name] + + try: + data_batch = next(selected_dataloader) + except StopIteration: + # Reinitialize the iterator for the selected dataloader once it is exhausted + dataloader_dict[dataloader_list[selected_dataloader_id]] = iter(selected_dataloader) + data_batch = next(dataloader_dict[dataloader_list[selected_dataloader_id]]) + data_batch["dataloader_name"] = selected_dataloader_name + yield data_batch + + +def init_and_wrap_data_loaders(config: Config, dataloader_train: torch.utils.data.DataLoader) -> Dict: + """Wrap the dataloaders for multiview training. + + Args: + config (Config): The config object for the Imaginaire codebase. + dataloader_train (torch.utils.data.DataLoader): The training data loader. + + Returns: + dict: The dataloader dictionary. 
+ """ + # Create the dataloader dictionary with multiple dataloaders + dataloader_dict = create_dataloader_dict(config, dataloader_train) + + # Create the data batch iterator sample from the dataloader dictionary based on sampling probabilities + dataloader_train = data_batch_iterator(dataloader_dict, config.joint_train.data_sample_prob) + return dataloader_train diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/mock_dataset.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/mock_dataset.py new file mode 100644 index 00000000..8d0c2d7d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/mock_dataset.py @@ -0,0 +1,186 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Copied from jam_data by Qinsheng Zhang, with unknown license. +""" + +import inspect +from typing import Any, Callable, Dict + +import torch +from torch.utils.data import Dataset + +MAX_LENGTH = 1 << 15 + + +class LambdaDataset(torch.utils.data.Dataset): + """ + A dataset that generates items by applying a function. This allows for creating + dynamic datasets where the items are the result of function calls. The function can optionally + accept an index argument. + + Attributes: + length (int): The total number of items in the dataset. + fn (Callable): The function to generate dataset items. 
+ is_index_in_params (bool): Flag to determine whether 'index' should be passed + to the function `fn`. + """ + + def __init__(self, fn: Callable, length: int = MAX_LENGTH) -> None: + """ + Initializes the LambdaDataset with a function and the total length. + + Args: + fn (Callable): A function that returns a dataset item. It can optionally accept an + index argument to generate data items based on their index. + length (int): The total number of items in the dataset, defaults to MAX_LENGTH. + """ + self.length = length + self.fn = fn + + try: + # Attempt to inspect the function signature to determine if it accepts an 'index' parameter. + signature = inspect.signature(fn) + self.is_index_in_params = "index" in signature.parameters + except ValueError: + # If the function signature is not inspectable, assume 'index' is not a parameter. + self.is_index_in_params = False + + def __len__(self) -> int: + """ + Returns the total length of the dataset. + + Returns: + int: The number of items in the dataset. + """ + return self.length + + def __getitem__(self, index: int) -> Any: + """ + Retrieves an item at a specific index from the dataset by calling the function `fn`. + Passes the index to `fn` if `fn` is designed to accept an index. + + Args: + index (int): The index of the item to retrieve. + + Returns: + Any: The item returned by the function `fn`. + """ + if self.is_index_in_params: + return self.fn(index) # Call fn with index if it accepts an index parameter. + return self.fn() # Call fn without any parameters if it does not accept the index. + + +class RepeatDataset(torch.utils.data.Dataset): + """ + A dataset wrapper that allows repeating access to items from an underlying dataset. + + This dataset can be used to create an artificial extension of the underlying dataset + to a specified `length`. Only the first `num_item` items of the underlying dataset are + served: indices wrap around modulo `num_item`, and fetched items are cached. 
+ + Attributes: + length (int): The total length of the dataset to be exposed. + dataset (Dataset): The original dataset. + num_item (int): Number of distinct items (the first `num_item` of `dataset`) that are cycled through. + cache_item (dict): Cache to store accessed items to avoid recomputation. + """ + + def __init__(self, dataset: Dataset, length: int = MAX_LENGTH, num_item: int = 1) -> None: + """ + Initializes the RepeatDataset with a dataset, length, and number of distinct items to cycle through. + + Args: + dataset (Dataset): The dataset to repeat. + length (int): The total length of the dataset to be exposed. Defaults to MAX_LENGTH. + num_item (int): The number of leading items of `dataset` to cycle through. Defaults to 1. + """ + self.length = length + self.dataset = dataset + self.num_item = num_item + self.cache_item = {} + + def __len__(self) -> int: + return self.length + + def __getitem__(self, index: int) -> Any: + index = index % self.num_item + if index not in self.cache_item: + self.cache_item[index] = self.dataset[index] + return self.cache_item[index] + + +class CombinedDictDataset(torch.utils.data.Dataset): + """ + A dataset that wraps multiple PyTorch datasets and returns a dictionary of data items from each dataset for a given index. + This dataset ensures that all constituent datasets have the same length by setting the length to the minimum length of the datasets provided. + + Parameters: + ----------- + **datasets : Dict[str, Dataset] + A dictionary where keys are string identifiers for the datasets and values are the dataset instances themselves. + + Attributes: + ----------- + datasets : Dict[str, Dataset] + Stores the input datasets. + max_length : int + The minimum length among all provided datasets, determining the length of this combined dataset. 
+ + Examples: + --------- + >>> dataset1 = torch.utils.data.TensorDataset(torch.randn(100, 3, 32, 32)) + >>> dataset2 = torch.utils.data.TensorDataset(torch.randn(100, 3, 32, 32)) + >>> combined_dataset = CombinedDictDataset(dataset1=dataset1, dataset2=dataset2) + >>> print(len(combined_dataset)) + 100 + >>> data = combined_dataset[50] + >>> print(data.keys()) + dict_keys(['dataset1', 'dataset2']) + """ + + def __init__(self, **datasets: Dict[str, Dataset]) -> None: + """ + Initializes the CombinedDictDataset with multiple datasets. + + Args: + **datasets (Dict[str, Dataset]): Key-value pairs where keys are dataset names and values + are dataset instances. Each key-value pair adds a dataset + under the specified key. + """ + self.datasets = datasets + self.max_length = min([len(dataset) for dataset in datasets.values()]) + + def __len__(self) -> int: + return self.max_length + + def __getitem__(self, index: int) -> Dict[str, Any]: + """ + Retrieves an item from each dataset at the specified index, combines them into a dictionary, + and returns the dictionary. Each key in the dictionary corresponds to one of the dataset names provided + during initialization, and its value is the item from that dataset at the given index. + + Args: + index (int): The index of the items to retrieve across all datasets. + + Returns: + Dict[str, Any]: A dictionary containing data items from all datasets for the given index. + Each key corresponds to a dataset name, and its value is the data item from that dataset. 
+ """ + data = {} + for key, dataset in self.datasets.items(): + data[key] = dataset[index] + return data diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/augmentor.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/augmentor.py new file mode 100644 index 00000000..cb4a7bcc --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/augmentor.py @@ -0,0 +1,64 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from collections.abc import Iterable +from typing import Any, Generator, Optional + + +class Augmentor: + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + r"""Base augmentor class + + Args: + input_keys (list): List of input keys + output_keys (list): List of output keys + args (dict): Arguments associated with the augmentation + """ + self.input_keys = input_keys + self.output_keys = output_keys + self.args = args + + def __call__(self, *args: Any, **kwds: Any) -> Any: + raise ValueError("Augmentor not implemented") + + +class IterableAugmentor: + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + r"""Base augmentor class + + Args: + input_keys (list): List of input keys + output_keys (list): List of output keys + args (dict): Arguments associated with the augmentation + """ + self.input_keys = input_keys + self.output_keys = output_keys + self.args = args + self.is_generator = True + + def __call__(self, data: Iterable) -> Generator: + r"""Example usage: + + for data_dict in data: + # Do something to data_dict + data_dict["input"] = data_dict["raw_sequence"][:, :-1] + data_dict["target"] = data_dict["raw_sequence"][:, 1:] + # Skip sample if needed + if data_dict["input"].shape[1] < 64: + continue + # Construct a generator + yield data_dict + """ + raise ValueError("Augmentor not implemented") diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/geometry/camera.py 
b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/geometry/camera.py new file mode 100644 index 00000000..5e45f039 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/geometry/camera.py @@ -0,0 +1,184 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Camera parameter augmentors for webdataset.""" + +from typing import Optional + +import torch + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.modules.camera import Camera + + +class CameraParamDecoder(Augmentor): + """Decodes camera parameters from text files. + + The text file format is: fx fy cx cy qx qy qz qw tx ty tz + where: + - fx, fy: focal lengths + - cx, cy: principal points + - qx, qy, qz, qw: quaternion rotation (world to camera) + - tx, ty, tz: translation vector (world to camera) + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + """Initialize the camera parameter decoder. 
+ + Args: + input_keys: List of input keys (typically ['camera']) + output_keys: List of output keys (typically ['intrinsics', 'world_to_cam']) + args: Additional arguments (not used) + """ + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + """Decode camera parameters from text data. + + Args: + data_dict: Input data dictionary containing camera text data + + Returns: + data_dict: Output data dictionary with decoded camera parameters + """ + # Get the camera text data + camera_text = data_dict[self.input_keys[0]] + + # Convert text to string if it's bytes + if isinstance(camera_text, bytes): + camera_text = camera_text.decode("utf-8") + + # Parse the camera parameters + parts = list(map(float, camera_text.strip().split())) + if len(parts) != 11: + raise ValueError(f"Invalid camera parameter format. Expected 11 values, got {len(parts)}") + + # Extract parameters + fx, fy, cx, cy = parts[0:4] # focal lengths and principal points + quat = parts[4:8] # qx, qy, qz, qw + trans = parts[8:11] # tx, ty, tz + + # Convert intrinsics to 3x3 matrix via helper + intrinsics = Camera.intrinsic_params_to_matrices(torch.tensor([fx, fy, cx, cy], dtype=torch.float32)) + + # Convert quaternion + translation to 4x4 World->Cam matrix via helper + qxyzw_t = torch.tensor([*quat, *trans], dtype=torch.float32) + w2c_3x4 = Camera.extrinsic_params_to_matrices(qxyzw_t) + world_to_cam = torch.eye(4, dtype=torch.float32) + world_to_cam[:3, :] = w2c_3x4 + + # Convert to torch tensors + intrinsics = intrinsics.float() + world_to_cam = world_to_cam.float() + + # Store in output dictionary + data_dict[self.output_keys[0]] = intrinsics + data_dict[self.output_keys[1]] = world_to_cam + + # Remove the original camera text data + data_dict.pop(self.input_keys[0]) + + return data_dict + + +class CameraParamListDecoder(Augmentor): + """Decodes a list of camera parameters from text files. 
+ + The text file format is multiple lines, where each line contains: + fx fy cx cy qx qy qz qw tx ty tz + where: + - fx, fy: focal lengths + - cx, cy: principal points + - qx, qy, qz, qw: quaternion rotation (world to camera) + - tx, ty, tz: translation vector (world to camera) + + Each line corresponds to one frame's camera parameters. + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + """Initialize the camera parameter list decoder. + + Args: + input_keys: List of input keys (typically ['camera']) + output_keys: List of output keys (typically ['intrinsics', 'world_to_cam']) + args: Additional arguments (not used) + """ + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + """Decode a list of camera parameters from text data. + + Args: + data_dict: Input data dictionary containing camera text data + + Returns: + data_dict: Output data dictionary with decoded camera parameters as lists + """ + # Get the camera text data + camera_text = data_dict[self.input_keys[0]] + + # Convert text to string if it's bytes + if isinstance(camera_text, bytes): + camera_text = camera_text.decode("utf-8") + + # Split into lines and parse each line + lines = camera_text.strip().split("\n") + num_frames = len(lines) + + if num_frames == 0: + raise ValueError("Empty camera parameter file") + + # Initialize lists to store camera parameters + intrinsics_list = [] + world_to_cam_list = [] + + # Parse each line + for i, line in enumerate(lines): + line = line.strip() + if not line: # Skip empty lines + continue + + parts = list(map(float, line.split())) + if len(parts) != 11: + raise ValueError( + f"Invalid camera parameter format at line {i + 1}. 
Expected 11 values, got {len(parts)}" + ) + + # Extract parameters + fx, fy, cx, cy = parts[0:4] # focal lengths and principal points + quat = parts[4:8] # qx, qy, qz, qw + trans = parts[8:11] # tx, ty, tz + + # Convert intrinsics and extrinsics via helpers + intrinsics = Camera.intrinsic_params_to_matrices(torch.tensor([fx, fy, cx, cy], dtype=torch.float32)) + qxyzw_t = torch.tensor([*quat, *trans], dtype=torch.float32) + w2c_3x4 = Camera.extrinsic_params_to_matrices(qxyzw_t) + world_to_cam = torch.eye(4, dtype=torch.float32) + world_to_cam[:3, :] = w2c_3x4 + + intrinsics_list.append(intrinsics) + world_to_cam_list.append(world_to_cam) + + # Convert lists to torch tensors with batch dimension + intrinsics_tensor = torch.stack(intrinsics_list).float() # T x 3 x 3 + world_to_cam_tensor = torch.stack(world_to_cam_list).float() # T x 4 x 4 + + # Store in output dictionary + data_dict[self.output_keys[0]] = intrinsics_tensor + data_dict[self.output_keys[1]] = world_to_cam_tensor + + # Remove the original camera text data + data_dict.pop(self.input_keys[0]) + return data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/geometry/depth.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/geometry/depth.py new file mode 100644 index 00000000..adc47764 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/geometry/depth.py @@ -0,0 +1,184 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Depth augmentors for webdataset.""" + +from typing import Optional + +import torch + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + + +class DepthMask(Augmentor): + """Generates a binary mask for valid depth values. + + This augmentor takes a depth image and generates a binary mask indicating + which pixels have valid depth values. A pixel is considered valid if: + 1. Its depth value is greater than min_depth + 2. Its depth value is less than max_depth + 3. Its depth value is not NaN or infinite + 4. Its depth value is not larger than median_multiplier times the median depth + + Args: + min_depth (float): Minimum valid depth value + max_depth (float): Maximum valid depth value + median_multiplier (float): Maximum allowed depth as a multiple of median depth + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + """Initialize the depth mask generator. 
+ + Args: + input_keys: List of input keys (typically ['depth']) + output_keys: List of output keys (typically ['depth_mask']) + args: Additional arguments including: + - min_depth (float): Minimum valid depth value + - max_depth (float): Maximum valid depth value + - median_multiplier (float): Maximum allowed depth as a multiple of median depth + """ + super().__init__(input_keys, output_keys, args) + self.min_depth = args.get("min_depth", 0.1) if args else 0.1 + self.max_depth = args.get("max_depth", 100.0) if args else 100.0 + self.median_multiplier = args.get("median_multiplier", 10) if args else 10 + + def __call__(self, data_dict: dict) -> dict: + """Generate depth mask. + + Args: + data_dict: Input data dictionary containing depth image + + Returns: + data_dict: Output data dictionary with depth mask + """ + # Get depth image + depth = data_dict[self.input_keys[0]] # H x W + + # Create mask for valid depth values + mask = torch.ones_like(depth, dtype=torch.bool) + + # Check for minimum depth + mask = mask & (depth > self.min_depth) + + # Check for maximum depth + mask = mask & (depth < self.max_depth) + + # Check for NaN and infinite values + mask = mask & torch.isfinite(depth) & (~torch.isnan(depth)) + + # Compute median depth from currently valid depths + if mask.any(): + valid_depths = depth[mask] + median_depth = torch.median(valid_depths) + + # Filter out depths larger than median_multiplier times the median + max_allowed_depth = self.median_multiplier * median_depth + mask = mask & (depth <= max_allowed_depth) + + # Store in output dictionary + data_dict[self.output_keys[0]] = mask + data_dict[self.input_keys[0]][~mask] = self.max_depth + return data_dict + + +class ConsecutiveFrameSampler(Augmentor): + """Randomly samples N consecutive frames from a video sequence. + + This augmentor takes a video sequence and randomly samples N consecutive frames + starting from a random position within the valid range. 
+ + Args: + num_frames (int): Number of consecutive frames to sample + """ + + def __init__( + self, + input_keys: list, + output_keys: Optional[list] = None, + random_sample: bool = True, + args: Optional[dict] = None, + ) -> None: + """Initialize the consecutive frame sampler. + + Args: + input_keys: List of input keys (typically ['depth', 'points', etc.]) + output_keys: List of output keys (same as input_keys) + args: Additional arguments including: + - num_frames (int): Number of consecutive frames to sample + """ + super().__init__(input_keys, output_keys, args) + self.num_frames = args.get("num_frames", 25) if args else 25 + self.random_sample = random_sample + + def __call__(self, data_dict: dict) -> dict: + """Sample consecutive frames from video sequences. + + Args: + data_dict: Input data dictionary containing video sequences + + Returns: + data_dict: Output data dictionary with sampled frames, or None if the video has fewer than num_frames frames + """ + + # Get the first input key to determine the temporal dimension + first_key = self.input_keys[0] + video_tensor = data_dict[first_key] + + if video_tensor.dim() == 4: # CxTxHxW + total_frames = video_tensor.shape[1] + elif video_tensor.dim() == 3: # TxHxW + total_frames = video_tensor.shape[0] + else: + raise ValueError(f"Expected 3D (TxHxW) or 4D (CxTxHxW) tensor, got {video_tensor.dim()}D") + + # Calculate valid start indices + max_start_idx = max(0, total_frames - self.num_frames) + if self.num_frames > total_frames: + return None + + if max_start_idx == 0: + # Video has exactly the requested number of frames (shorter videos returned None above); use all of them + start_idx = 0 + actual_num_frames = total_frames + else: + if self.random_sample: + # Randomly sample start index + start_idx = torch.randint(0, max_start_idx + 1, size=(1,)).item() + else: + start_idx = 0 + actual_num_frames = self.num_frames + + # Sample frames for all input keys + for input_key, output_key in zip(self.input_keys, self.output_keys): + tensor = data_dict[input_key] + + if tensor.dim() == 4: # CxTxHxW + 
sampled_tensor = tensor[:, start_idx : start_idx + actual_num_frames, :, :] + assert sampled_tensor.shape[1] == actual_num_frames, ( + f"Sampled tensor {input_key} has {sampled_tensor.shape[1]} frames, expected {actual_num_frames}" + ) + elif tensor.dim() == 3: # TxHxW + sampled_tensor = tensor[start_idx : start_idx + actual_num_frames, :, :] + assert sampled_tensor.shape[0] == actual_num_frames, ( + f"Sampled tensor {input_key} has {sampled_tensor.shape[0]} frames, expected {actual_num_frames}" + ) + else: + raise ValueError(f"Expected 3D (TxHxW) or 4D (CxTxHxW) tensor for {input_key}, got {tensor.dim()}D") + + data_dict[output_key] = sampled_tensor + data_dict["frame_start"] = start_idx + data_dict["frame_end"] = start_idx + actual_num_frames + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/geometry/pointcloud.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/geometry/pointcloud.py new file mode 100644 index 00000000..9d57829f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/geometry/pointcloud.py @@ -0,0 +1,390 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +"""Point cloud augmentors for webdataset.""" + +from typing import Optional + +import torch +from einops import rearrange + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.modules.camera import Camera + + +class DepthToPointcloud(Augmentor): + """Converts depth images to point clouds using camera intrinsics. + + This augmentor takes a depth image and camera intrinsics to generate a point cloud. + The depth image should be in meters and the intrinsics should be a 3x3 matrix. + + Args: + to_world_coords (bool): If True, uses the first frame as the coordinate frame for video sequences + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + """Initialize the depth to point cloud converter. + + Args: + input_keys: List of input keys (typically ['depth', 'intrinsics', 'world_to_cam']) + output_keys: List of output keys (typically ['points']) + args: Additional arguments including: + - to_world_coords (bool): Whether to use first frame as coordinate frame + """ + self.to_world_coords = args.get("to_world_coords", False) if args else False # must be set before the asserts below read it + assert "depth" in input_keys, "Depth image is required for point cloud conversion" + assert "intrinsics" in input_keys, "Intrinsics are required for point cloud conversion" + assert "world_to_cam" in input_keys or not self.to_world_coords, ( + "World to camera matrix is required for point cloud conversion" + ) + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + """Convert depth image to point cloud. 
+ + Args: + data_dict: Input data dictionary containing depth image and camera intrinsics + + Returns: + data_dict: Output data dictionary with point cloud + """ + # Get depth image and intrinsics + depth = data_dict[self.input_keys[0]] # [T,H,W] or [H,W] + intrinsics = data_dict[self.input_keys[1]] # [T,3,3] or [3,3] + + # Check if we're dealing with video sequences (temporal dimension) + if depth.dim() == 3 and intrinsics.dim() == 3: + # Video sequence: T x H x W and T x 3 x 3 + T, H, W = depth.shape + + # Create pixel coordinates (same for all frames) + y, x = torch.meshgrid( + torch.arange(H, device=depth.device), torch.arange(W, device=depth.device), indexing="ij" + ) + pixels = torch.stack([x, y, torch.ones_like(x)], dim=-1).float() # [H,W,3] + pixels_hw3 = pixels.reshape(-1, 3) # [H*W,3] + + # Back-project to camera space using Camera.image2camera + pixels_batched = pixels_hw3.unsqueeze(0).expand(T, -1, -1) # [T,H*W,3] + points_cam = Camera.image2camera(pixels_batched, intrinsics) # [T,H*W,3] + depth_flat = depth.reshape(T, -1) # [T,H*W] + points_cam = points_cam * depth_flat.unsqueeze(-1) # [T,H*W,3] + + # Transform to first frame coordinate system if requested + if self.to_world_coords: + world_to_cam = data_dict[self.input_keys[2]] # [T,4,4] + w2c = world_to_cam[:, :3, :] # [T,3,4] + # relative pose from cam_t to cam_0: rel = w2c_0 ∘ c2w_t + w2c0 = w2c[0] # [3,4] + c2w = Camera.invert_pose(w2c) # [T,3,4] + w2c0_exp = w2c0.unsqueeze(0).expand_as(c2w) # [T,3,4] + rel = Camera.compose_poses([w2c0_exp, c2w]) # [T,3,4] + points = Camera.world2camera(points_cam, rel) # [T,H*W,3] + else: + points = points_cam # [T,H*W,3] + + # Reshape to 3 x T x H x W + points = rearrange(points, "t (h w) c -> c t h w", h=H, w=W, c=3) # [3,T,H,W] + + else: + # Single frame: H x W and 3 x 3 + H, W = depth.shape[-2:] + + # Create pixel coordinates + y, x = torch.meshgrid( + torch.arange(H, device=depth.device), torch.arange(W, device=depth.device), indexing="ij" + ) + + # Create 
homogeneous coordinates and convert to float + pixels = torch.stack([x, y, torch.ones_like(x)], dim=-1).float() # [H,W,3] + pixels_hw3 = pixels.reshape(-1, 3) # [H*W,3] + depth_flat = depth.reshape(-1) # [H*W] + + # Back-project to camera space + points_cam = Camera.image2camera(pixels_hw3, intrinsics) # [H*W,3] + points_cam = points_cam * depth_flat.unsqueeze(-1) # [H*W,3] + + # For single frame, just use camera coordinates or transform to world coords as before + if self.to_world_coords: + world_to_cam = data_dict[self.input_keys[2]] # [4,4] + w2c = world_to_cam[:3, :] # [3,4] + points = Camera.camera2world(points_cam, w2c) # [H*W,3] + else: + points = points_cam # [H*W,3] + + # Reshape to 3 x H x W + points = rearrange(points, "(h w) c -> c h w", h=H, w=W, c=3) # [3,H,W] + + # Store in output dictionary + data_dict[self.output_keys[0]] = points + + return data_dict + + +class PointcloudRescale(Augmentor): + """Rescales point clouds to have a mean distance of 1 from the origin. + + This augmentor takes a point cloud and rescales it so that the mean distance + of all points from the origin is 1. It also adjusts the world-to-camera + transformation matrix accordingly. + + Args: + input_keys: List of input keys (typically ['points', 'world_to_cam']) + output_keys: List of output keys (typically ['points', 'world_to_cam']) + """ + + def __init__( + self, + input_keys: list, + output_keys: Optional[list] = None, + mask_key: Optional[str] = None, + args: Optional[dict] = None, + ) -> None: + """Initialize the point cloud rescaler. 
+ + Args: + input_keys: List of input keys (typically ['points', 'world_to_cam']) + output_keys: List of output keys (typically ['points', 'world_to_cam']) + args: Additional arguments (not used in this augmentor) + """ + assert "points" in input_keys, "Points are required for rescaling" + assert "world_to_cam" in input_keys, "World to camera matrix is required for rescaling" + super().__init__(input_keys, output_keys, args) + self.mask_key = mask_key + + def __call__(self, data_dict: dict) -> dict: + """Rescale point cloud and adjust world-to-camera transformation. + + This augmentor computes the average Euclidean distance of all 3D points to the origin + and uses this scale to normalize both the camera translations and point cloud. + + Args: + data_dict: Input data dictionary containing points and world_to_cam + + Returns: + data_dict: Output data dictionary with rescaled points and adjusted world_to_cam + """ + # Get points and world_to_cam + points = data_dict[self.input_keys[0]] # [3,T,H,W] or [3,H,W] + world_to_cam = data_dict[self.input_keys[1]] # [T,4,4] or [4,4] + + # Check if we're dealing with video sequences (temporal dimension) + if points.dim() == 4 and world_to_cam.dim() == 3: + # Video sequence: 3 x T x H x W and T x 4 x 4 + T = world_to_cam.shape[0] + + # Reshape points to T x N x 3 for easier computation + points_flat = points.permute(1, 0, 2, 3).reshape(T, 3, -1).transpose(1, 2) # [T,N,3] + + # Compute average Euclidean distance to origin across all frames + if self.mask_key is not None: + # Get mask and reshape to match points + mask = data_dict[self.mask_key] # [T,H,W] + mask_flat = mask.reshape(T, -1) # [T,N] + + # Only compute average over valid points across all frames + # Compute squared distances for all frames at once + squared_distances = torch.sum(points_flat**2, dim=2) # [T,N] + + # Apply mask and compute mean across all frames + valid_distances = torch.sqrt(squared_distances[mask_flat]) # [N_valid] + avg_dist = valid_distances.mean() 
# scalar + else: + # Compute average Euclidean distance to origin for all points across all frames + avg_dist = torch.sqrt(torch.sum(points_flat**2, dim=2)).mean() # scalar + + # Compute scale factor to achieve average distance of 1 across all frames + scale = 1.0 / avg_dist # scalar + + # Rescale points for all frames at once + points_scaled = points * scale # [3,T,H,W] + + # Adjust world_to_cam matrix for all frames at once + # We need to scale the translation component by the same factor + world_to_cam_scaled = world_to_cam.clone() + world_to_cam_scaled[:, :3, 3] *= scale # [T,4,4] + + # Scale depth for all frames at once + depth = data_dict[self.input_keys[2]] # [T,H,W] + depth_scaled = depth * scale # [T,H,W] + else: + # Single frame: 3 x H x W and 4 x 4 + # Reshape points to N x 3 for easier computation + points_flat = points.reshape(3, -1).T # [N,3] + + # Compute average Euclidean distance to origin + if self.mask_key is not None: + # Get mask and reshape to match points + mask = data_dict[self.mask_key] # [H,W] + mask_flat = mask.reshape(-1) # [N] + + # Only compute average over valid points + valid_points = points_flat[mask_flat] # [N_valid,3] + # Compute average Euclidean distance to origin + avg_dist = torch.sqrt(torch.sum(valid_points**2, dim=1)).mean() # scalar + else: + # Compute average Euclidean distance to origin for all points + avg_dist = torch.sqrt(torch.sum(points_flat**2, dim=1)).mean() # scalar + + # Compute scale factor to achieve average distance of 1 + scale = 1.0 / avg_dist # scalar + + # Rescale points + points_scaled = points * scale # [3,H,W] + + # Adjust world_to_cam matrix + # We need to scale the translation component by the same factor + world_to_cam_scaled = world_to_cam.clone() + world_to_cam_scaled[:3, 3] *= scale # [4,4] + + # Scale depth + depth = data_dict[self.input_keys[2]] # [H,W] + depth_scaled = depth * scale # [H,W] + + # Store in output dictionary + data_dict[self.output_keys[0]] = points_scaled + 
data_dict[self.output_keys[1]] = world_to_cam_scaled + data_dict[self.output_keys[2]] = depth_scaled + return data_dict + + +class PointcloudMaskFill(Augmentor): + """Fills point cloud values with fill_value (default 0) where the point cloud mask is False. + + This augmentor takes a point cloud and a point cloud mask, and sets point cloud values to fill_value + wherever the mask is False. This is useful for cleaning up point clouds by + removing invalid or unreliable point measurements. + + Args: + input_keys: List of input keys (typically ['points', 'pcd_mask']) + output_keys: List of output keys (typically ['points']) + """ + + def __init__( + self, input_keys: list, output_keys: Optional[list] = None, fill_value: float = 0.0, args: Optional[dict] = None + ) -> None: + """Initialize the point cloud mask filler. + + Args: + input_keys: List of input keys (typically ['points', 'pcd_mask']) + output_keys: List of output keys (typically ['points']) + args: Additional arguments (not used in this augmentor) + """ + super().__init__(input_keys, output_keys, args) + self.fill_value = fill_value + + def __call__(self, data_dict: dict) -> dict: + """Fill point cloud values with fill_value where point cloud mask is False. 
+ + Args: + data_dict: Input data dictionary containing point cloud and point cloud mask + + Returns: + data_dict: Output data dictionary with masked point cloud + """ + # Get point cloud and point cloud mask + points = data_dict[self.input_keys[0]] # [3,T,H,W] or [3,H,W] + depth_mask = data_dict[self.input_keys[1]] # [T,H,W] or [H,W] + + # Check if we're dealing with video sequences (temporal dimension) + if points.dim() == 4 and depth_mask.dim() == 3: + # Video sequence: 3 x T x H x W and T x H x W + # Create a copy of the point cloud + points_filled = points.clone() # [3,T,H,W] + + # Expand mask to match points dimensions: 3 x T x H x W + mask_expanded = depth_mask.unsqueeze(0).expand(3, -1, -1, -1) # [3,T,H,W] + + # Set point cloud values to fill_value where mask is False for all channels at once + points_filled[~mask_expanded] = self.fill_value + + else: + # Single frame: 3 x H x W and H x W + # Create a copy of the point cloud + points_filled = points.clone() # [3,H,W] + + # Expand mask to match points dimensions: 3 x H x W + mask_expanded = depth_mask.unsqueeze(0).expand(3, -1, -1) # [3,H,W] + + # Set point cloud values to fill_value where mask is False for all channels at once + points_filled[~mask_expanded] = self.fill_value + + # Store in output dictionary + data_dict[self.output_keys[0]] = points_filled + + return data_dict + + +def verify_backprojection(data_dict: dict, scale: float) -> bool: + """Verify that backprojection of rescaled depth and camera poses matches rescaled point cloud. + + This function checks if the backprojection of the rescaled depth image using + the rescaled camera poses produces the same point cloud as the rescaled point cloud. 
+ + Args: + data_dict: Dictionary containing: + - points_scaled: Rescaled point cloud (3 x H x W) + - depth_scaled: Rescaled depth image (H x W) + - world_to_cam_scaled: Rescaled world to camera matrix (4 x 4) + - intrinsics: Camera intrinsics matrix (3 x 3) + scale: The scale factor used for rescaling + + Returns: + bool: True if backprojection matches rescaled point cloud within tolerance + """ + # Get required data + points_scaled = data_dict["points"] # [3,H,W] + depth_scaled = data_dict["depth"] # [H,W] + world_to_cam_scaled = data_dict["world_to_cam"] # [4,4] + intrinsics = data_dict["intrinsics"] # [3,3] + + # Get image dimensions + H, W = depth_scaled.shape[-2:] + + # Create pixel coordinates + y, x = torch.meshgrid( + torch.arange(H, device=depth_scaled.device), torch.arange(W, device=depth_scaled.device), indexing="ij" + ) + + # Create homogeneous coordinates + pixels = torch.stack([x, y, torch.ones_like(x)], dim=-1).float() # [H,W,3] + + # Reshape for batch processing + pixels = pixels.reshape(-1, 3) # [H*W,3] + depth_flat = depth_scaled.reshape(-1) # [H*W] + + # Get inverse of intrinsics + intrinsics_inv = torch.inverse(intrinsics) # [3,3] + + # Back-project to camera space + points_cam = (intrinsics_inv @ pixels.T).T # [H*W,3] + points_cam = points_cam * depth_flat.unsqueeze(-1) # [H*W,3] + + # Convert to world coordinates + cam_to_world = torch.inverse(world_to_cam_scaled) # [4,4] + points_cam_h = torch.cat([points_cam, torch.ones_like(points_cam[:, :1])], dim=-1) # [H*W,4] + points_world_h = (cam_to_world @ points_cam_h.T).T # [H*W,4] + points_world = points_world_h[:, :3] # [H*W,3] + + # Reshape back to image dimensions + points_world = points_world.reshape(H, W, 3) # [H,W,3] + points_world = points_world.permute(2, 0, 1) # [3,H,W] + + # Compare with rescaled point cloud + # Use a small tolerance for floating point comparison + tolerance = 1e-6 + is_close = torch.allclose(points_world, points_scaled, rtol=tolerance, atol=tolerance) + + return 
is_close diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/cropping.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/cropping.py new file mode 100644 index 00000000..4881f9db --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/cropping.py @@ -0,0 +1,118 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +import torch +import torchvision.transforms.functional as transforms_F +from loguru import logger as logging + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.image.misc import obtain_augmentation_size, obtain_image_size + + +class CenterCrop(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs center crop. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict where images are center cropped. + We also save the cropping parameters in the aug_params dict + so that it will be used by other transforms. + """ + assert (self.args is not None) and ("size" in self.args), "Please specify size in args" + + img_size = obtain_augmentation_size(data_dict, self.args) + width, height = img_size + + orig_w, orig_h = obtain_image_size(data_dict, self.input_keys) + for key in self.input_keys: + data_dict[key] = transforms_F.center_crop(data_dict[key], [height, width]) + + # We also add the aug params we use. 
This will be useful for other transforms + crop_x0 = (orig_w - width) // 2 + crop_y0 = (orig_h - height) // 2 + cropping_params = { + "resize_w": orig_w, + "resize_h": orig_h, + "crop_x0": crop_x0, + "crop_y0": crop_y0, + "crop_w": width, + "crop_h": height, + } + + if "aug_params" not in data_dict: + data_dict["aug_params"] = dict() + + data_dict["aug_params"]["cropping"] = cropping_params + return data_dict + + +class RandomCrop(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs random crop. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict where images are randomly cropped. + We also save the cropping parameters in the aug_params dict + so that it will be used by other transforms. + """ + + img_size = obtain_augmentation_size(data_dict, self.args) + width, height = img_size + + orig_w, orig_h = obtain_image_size(data_dict, self.input_keys) + # Obtaining random crop coords + try: + crop_x0 = int(torch.randint(0, orig_w - width + 1, size=(1,)).item()) + crop_y0 = int(torch.randint(0, orig_h - height + 1, size=(1,)).item()) + except Exception: + logging.warning( + f"Random crop failed. Performing center crop, original_size(wxh): {orig_w}x{orig_h}, random_size(wxh): {width}x{height}" + ) + # Fall back to center-crop coordinates only. The shared crop applied to every input key + # after this block does the actual cropping; cropping here as well would crop the data twice. + crop_x0 = (orig_w - width) // 2 + crop_y0 = (orig_h - height) // 2 + + # We also add the aug params we use. 
This will be useful for other transforms + cropping_params = { + "resize_w": orig_w, + "resize_h": orig_h, + "crop_x0": crop_x0, + "crop_y0": crop_y0, + "crop_w": width, + "crop_h": height, + } + + if "aug_params" not in data_dict: + data_dict["aug_params"] = dict() + + data_dict["aug_params"]["cropping"] = cropping_params + + # We must perform same random cropping for all input keys + for key in self.input_keys: + data_dict[key] = transforms_F.crop(data_dict[key], crop_y0, crop_x0, height, width) + return data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/flip.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/flip.py new file mode 100644 index 00000000..ba6f22af --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/flip.py @@ -0,0 +1,44 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Optional + +import torch +import torchvision.transforms.functional as transforms_F + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + + +class HorizontalFlip(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs horizontal flipping. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict where images are horizontally flipped with probability "prob". + """ + flip_enabled = self.args.get("enabled", True) if self.args else True # .get works for dict and DictConfig + if flip_enabled: + p = self.args.get("prob", 0.5) if self.args else 0.5 + coin_flip = torch.rand(1).item() < p # flip with probability p + for key in self.input_keys: + if coin_flip: + data_dict[key] = transforms_F.hflip(data_dict[key]) + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/misc.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/misc.py new file mode 100644 index 00000000..90bddaf7 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/misc.py @@ -0,0 +1,63 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Union + +import torch +from PIL import Image + + +def obtain_image_size(data_dict: dict, input_keys: list) -> tuple[int, int]: + r"""Function for obtaining the image size from the data dict. + + Args: + data_dict (dict): Input data dict + input_keys (list): List of input keys + Returns: + width (int): Width of the input image + height (int): Height of the input image + """ + + data1 = data_dict[input_keys[0]] + if isinstance(data1, Image.Image): + width, height = data1.size + elif isinstance(data1, torch.Tensor): + height, width = data1.size()[-2:] + else: + raise ValueError("data to random crop should be PIL Image or tensor") + + return width, height + + +def obtain_augmentation_size(data_dict: dict, augmentor_cfg: dict) -> Union[int, tuple]: + r"""Function for obtaining size of the augmentation. + When dealing with multi-aspect ratio dataloaders, we need to + find the augmentation size from the aspect ratio of the data. + If data_dict contains "_res_size_map" (e.g. from resolution sampling), + that map is used instead of augmentor_cfg["size"]. 
+ + Args: + data_dict (dict): Input data dict + augmentor_cfg (dict): Augmentor config + Returns: + aug_size (int): Size of augmentation + """ + if "__url__" in data_dict and "aspect_ratio" in data_dict["__url__"].meta.opts: + aspect_ratio = data_dict["__url__"].meta.opts["aspect_ratio"] + else: # Non-webdataset format + aspect_ratio = data_dict["aspect_ratio"] + if "_res_size_map" in data_dict: + return data_dict["_res_size_map"][aspect_ratio] + return augmentor_cfg["size"][aspect_ratio] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/normalize.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/normalize.py new file mode 100644 index 00000000..629b8519 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/normalize.py @@ -0,0 +1,48 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Optional + +import torch +import torchvision.transforms.functional as transforms_F + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + + +class Normalize(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs data normalization. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict where images are scaled to [0, 1] and normalized with mean and std. + """ + assert self.args is not None, "Please specify args" + + mean = self.args["mean"] + std = self.args["std"] + + for key in self.input_keys: + if isinstance(data_dict[key], torch.Tensor): + data_dict[key] = data_dict[key].to(dtype=torch.get_default_dtype()).div(255) + else: + data_dict[key] = transforms_F.to_tensor(data_dict[key]) # division by 255 is applied in to_tensor() + + data_dict[key] = transforms_F.normalize(tensor=data_dict[key], mean=mean, std=std) + return data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/padding.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/padding.py new file mode 100644 index 00000000..009b4983 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/padding.py @@ -0,0 +1,72 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +import omegaconf +import torch +import torchvision.transforms.functional as transforms_F + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.image.misc import obtain_augmentation_size, obtain_image_size + + +class ReflectionPadding(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs reflection padding. Also stores the target and original sizes in data_dict["image_size"]. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict where images are padded to the target size. 
+ """ + + assert self.args is not None, "Please specify args in augmentation" + if self.output_keys is None: + self.output_keys = self.input_keys + + # Obtain image and augmentation sizes + orig_w, orig_h = obtain_image_size(data_dict, self.input_keys) + target_size = obtain_augmentation_size(data_dict, self.args) + + assert isinstance(target_size, (tuple, omegaconf.listconfig.ListConfig)), "Please specify target size as tuple" + target_w, target_h = target_size + + target_w = int(target_w) + target_h = int(target_h) + + # One-sided padding (bottom and right only, content stays at top-left) + padding_right = target_w - orig_w + padding_bottom = target_h - orig_h + padding_vals = [0, 0, padding_right, padding_bottom] + + for inp_key, out_key in zip(self.input_keys, self.output_keys): + if max(padding_vals[0], padding_vals[2]) >= orig_w or max(padding_vals[1], padding_vals[3]) >= orig_h: + # In this case, we can't perform reflection padding. This is because padding values + # are larger than the image size. So, perform edge padding instead. + data_dict[out_key] = transforms_F.pad(data_dict[inp_key], padding_vals, padding_mode="edge") + else: + # Perform reflection padding + data_dict[out_key] = transforms_F.pad(data_dict[inp_key], padding_vals, padding_mode="reflect") + + if out_key != inp_key: + del data_dict[inp_key] + + data_dict["image_size"] = torch.tensor([target_h, target_w, orig_h, orig_w], dtype=torch.float) + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/resize.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/resize.py new file mode 100644 index 00000000..79f022cc --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/augmentors/image/resize.py @@ -0,0 +1,187 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +import omegaconf +import torchvision.transforms.functional as transforms_F + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.image.misc import obtain_augmentation_size, obtain_image_size + + +class ResizeSmallestSide(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs resizing to smaller side + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict where images are resized + """ + + if self.output_keys is None: + self.output_keys = self.input_keys + assert self.args is not None, "Please specify args in augmentations" + + for inp_key, out_key in zip(self.input_keys, self.output_keys): + out_size = obtain_augmentation_size(data_dict, self.args) + assert isinstance(out_size, int), "Arg size in resize should be an integer" + data_dict[out_key] = transforms_F.resize( + data_dict[inp_key], + size=out_size, # type: ignore + interpolation=getattr(self.args, "interpolation", transforms_F.InterpolationMode.BICUBIC), + antialias=True, + ) + if out_key != inp_key: + del data_dict[inp_key] + return data_dict + + +class ResizeLargestSide(Augmentor): 
+ def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs resizing to larger side + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict where images are resized + """ + + if self.output_keys is None: + self.output_keys = self.input_keys + assert self.args is not None, "Please specify args in augmentations" + + for inp_key, out_key in zip(self.input_keys, self.output_keys): + out_size = obtain_augmentation_size(data_dict, self.args) + assert isinstance(out_size, int), "Arg size in resize should be an integer" + orig_w, orig_h = obtain_image_size(data_dict, self.input_keys) + + scaling_ratio = min(out_size / orig_w, out_size / orig_h) + target_size = [int(scaling_ratio * orig_h), int(scaling_ratio * orig_w)] + + data_dict[out_key] = transforms_F.resize( + data_dict[inp_key], + size=target_size, + interpolation=getattr(self.args, "interpolation", transforms_F.InterpolationMode.BICUBIC), + antialias=True, + ) + if out_key != inp_key: + del data_dict[inp_key] + return data_dict + + +class ResizeSmallestSideAspectPreserving(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs aspect-ratio preserving resizing. + Image is resized to the dimension which has the smaller ratio of (size / target_size). + First we compute (w_img / w_target) and (h_img / h_target) and resize the image + to the dimension that has the smaller of these ratios. 
+ + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict where images are resized + """ + + if self.output_keys is None: + self.output_keys = self.input_keys + assert self.args is not None, "Please specify args in augmentations" + + img_size = obtain_augmentation_size(data_dict, self.args) + assert isinstance(img_size, (tuple, omegaconf.listconfig.ListConfig)), ( + f"Arg size in resize should be a tuple, get {type(img_size)}, {img_size}" + ) + img_w, img_h = img_size + + orig_w, orig_h = obtain_image_size(data_dict, self.input_keys) + scaling_ratio = max((img_w / orig_w), (img_h / orig_h)) + target_size = (int(scaling_ratio * orig_h + 0.5), int(scaling_ratio * orig_w + 0.5)) + + assert target_size[0] >= img_h and target_size[1] >= img_w, ( + f"Resize error. orig {(orig_w, orig_h)} desire {img_size} compute {target_size}" + ) + + for inp_key, out_key in zip(self.input_keys, self.output_keys): + data_dict[out_key] = transforms_F.resize( + data_dict[inp_key], + size=target_size, # type: ignore + interpolation=( + self.args["interpolation"] + if "interpolation" in self.args + else transforms_F.InterpolationMode.BICUBIC + ), + antialias=True, + ) + + if out_key != inp_key: + del data_dict[inp_key] + return data_dict + + +class ResizeLargestSideAspectPreserving(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs aspect-ratio preserving resizing. + Image is resized to the dimension which has the larger ratio of (size / target_size). + First we compute (w_img / w_target) and (h_img / h_target) and resize the image + to the dimension that has the larger of these ratios. 
+ + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict where images are resized + """ + + if self.output_keys is None: + self.output_keys = self.input_keys + assert self.args is not None, "Please specify args in augmentations" + + img_size = obtain_augmentation_size(data_dict, self.args) + assert isinstance(img_size, (tuple, omegaconf.listconfig.ListConfig)), ( + f"Arg size in resize should be a tuple, get {type(img_size)}, {img_size}" + ) + img_w, img_h = img_size + + orig_w, orig_h = obtain_image_size(data_dict, self.input_keys) + scaling_ratio = min((img_w / orig_w), (img_h / orig_h)) + target_size = (int(scaling_ratio * orig_h + 0.5), int(scaling_ratio * orig_w + 0.5)) + + assert target_size[0] <= img_h and target_size[1] <= img_w, ( + f"Resize error. orig {(orig_w, orig_h)} desire {img_size} compute {target_size}" + ) + + for inp_key, out_key in zip(self.input_keys, self.output_keys): + data_dict[out_key] = transforms_F.resize( + data_dict[inp_key], + size=target_size, # type: ignore + interpolation=getattr(self.args, "interpolation", transforms_F.InterpolationMode.BICUBIC), + antialias=True, + ) + + if out_key != inp_key: + del data_dict[inp_key] + return data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/config/schema.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/config/schema.py new file mode 100644 index 00000000..968e7ab3 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/config/schema.py @@ -0,0 +1,96 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional, Type + +import attrs +from torch.utils.data import IterableDataset + +from cosmos3._src.imaginaire import config +from cosmos3._src.imaginaire.config import make_freezable +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + + +@make_freezable +@attrs.define(slots=False) +class DatasetInfo: + object_store_config: config.ObjectStoreConfig # Object store config + wdinfo: list[str] # List of wdinfo files + opts: dict = attrs.Factory(dict) # Additional dataset info args + per_dataset_keys: list[str] = attrs.Factory(list) # List of keys per dataset + source: str = "" # data source + + +@make_freezable +@attrs.define(slots=False) +class SampleInfo: + dataset_name: str + sample_rank: int = 0 + sample_worker_id: int = 0 + sample_epoch: int = 0 + sample_index: int = 0 + + +@make_freezable +@attrs.define(slots=False) +class TarSample: + path: str # Path to the sample + root: str # Root folder + keys: list # List of keys to be loaded from the webdataset + meta: DatasetInfo # Metadata + dset_id: str # Dataset id + num_samples: int = 0 # Number of samples in this tar file (data_list_key_count from wdinfo) + sample_keys_full_list: Optional[str] = None # Path to the file containing full sample keys for the tar file + sample_meta: Optional[SampleInfo] = None + + +@make_freezable +@attrs.define(slots=False) +class Wdinfo: + tar_files: list[TarSample] # List of all tar samples + total_key_count: int # Total number of elements present in the dataset + chunk_size: int # Number of elements present in each tar + +
+@make_freezable +@attrs.define(slots=False) +class AugmentorConfig: + # Type of augmentor + type: Type[Augmentor] + # Input keys used by the augmentor + input_keys: list[str] + # Output keys returned by the augmentor + output_keys: Optional[list[str]] = None + # Additional arguments used by the augmentor + args: Optional[dict] = None + + def make_instance(self) -> Augmentor: + return self.type(input_keys=self.input_keys, output_keys=self.output_keys, args=self.args) + + +@make_freezable +@attrs.define(slots=False) +class DatasetConfig: + keys: list[str] # List of keys used + buffer_size: int # Buffer size used by each worker + dataset_info: list[DatasetInfo] # List of dataset info files, one for each dataset + distributor: IterableDataset # Iterator for returning list of tar files + decoders: list # List of decoder functions for decoding bytestream + augmentation: dict[str, AugmentorConfig] # Dictionary containing all augmentations + streaming_download: bool = True # Whether to use streaming loader + remove_extension_from_keys: bool = True # True: objects will have a key of data_type; False: data_type.extension + sample_keys_full_list_path: Optional[str] = ( + None # Path to the file containing all keys present in the dataset, e.g., "index" + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/dataloader.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/dataloader.py new file mode 100644 index 00000000..fec30b40 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/dataloader.py @@ -0,0 +1,72 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os + +import webdataset + +import cosmos3._src.imaginaire.datasets.webdataset.webdataset +from cosmos3._src.imaginaire.utils.distributed import get_world_size + + +class Sampler: + r""" + A sampler function for setting the epoch number and iteration number. + In webdataset, information is propagated using environment flags. + In our case, + WDS_EPOCH_NUM: Epoch number + WDS_START_INDEX: Start index in this epoch. + """ + + def __init__(self, mode: str): + self.mode = mode + assert self.mode in ["train", "val"] + + def set_epoch(self, epoch: int): + if self.mode == "train": + os.environ["WDS_EPOCH_NUM"] = str(epoch) + else: + pass + + def set_iteration(self, start_index: int): + # start_index should be iters * batch_size + # It is the number of samples that have been seen by one GPU + if self.mode == "train": + os.environ["WDS_START_INDEX"] = str(start_index) + else: + pass + + +class DataLoader(webdataset.WebLoader): + r""" + This class is a wrapper on webloader class with a len attribute. + len function is needed in Imaginaire dataloaders. + """ + + def __init__(self, dataset: cosmos3._src.imaginaire.datasets.webdataset.webdataset.Dataset, batch_size: int = 1, *args, **kw): # type: ignore + # Setting data length. Webdataset is an iterable dataset, so it does not have data_len attr. + # So, we compute it from dataset and set it. 
+ dataset_obj = dataset.build_dataset() + world_size = get_world_size() + if dataset_obj.total_images < world_size * batch_size: # type: ignore + data_length = 1 + else: + data_length = dataset_obj.total_images // (world_size * batch_size) # type: ignore + self.data_len = data_length + + super().__init__(dataset_obj, batch_size, *args, **kw) + + def __len__(self) -> int: + return self.data_len diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/depth.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/depth.py new file mode 100644 index 00000000..a77e7287 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/depth.py @@ -0,0 +1,153 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Depth decoders for EXR and NPZ files.""" + +import re +from io import BytesIO + +import numpy as np +import torch + +_EXR_EXTENSIONS = ("exr",) +MAX_DEPTH = 100000 +_NPZ_EXTENSIONS = ("npz",) + + +def exr_loader(key, data): + """Load depth data from EXR file. + + Args: + key (str): Key of the data + data (bytes): Raw EXR file data + + Returns: + torch.Tensor: Depth map as a 1xHxW tensor, or None if the key is not an EXR file + """ + # pyrefly: ignore # import-error + import OpenEXR + + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _EXR_EXTENSIONS: + return None + + # Convert bytes to BytesIO for OpenEXR + exr_file = OpenEXR.InputFile(BytesIO(data)) + + # Get the header information + header = exr_file.header() + dw = header["dataWindow"] + w = dw.max.x - dw.min.x + 1 + h = dw.max.y - dw.min.y + 1 + + # Read the depth data from 'R' channel; NaN never compares equal to itself, so use np.isnan + depth = np.frombuffer(exr_file.channel("R"), dtype=np.float32).reshape((h, w)) + mask = np.isnan(depth) + depth = depth.copy() + depth[mask] = MAX_DEPTH + + # Convert to a float tensor (values are not rescaled) + depth = torch.from_numpy(depth).float() + + depth = depth.unsqueeze(0) + return depth + + +def npz_loader(key, data): + """Load depth data from NPZ file.""" + + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _NPZ_EXTENSIONS: + return None + + # Convert bytes to BytesIO for np.load + npz_file = BytesIO(data) + + # Load the NPZ file + with np.load(npz_file) as
npz_data: + # Assuming the depth data is stored in the first array + # You may need to adjust this based on your specific NPZ file structure + depth_array = npz_data[list(npz_data.keys())[0]] + # Convert to a float tensor (values are not rescaled) + depth = torch.from_numpy(depth_array).float() + + return depth + + +def construct_videodepth_decoder(): + """Construct videodepth decoder. + + Note: + Frame count filtering is not implemented: no minimum frame count is accepted and no sample is skipped. + + Returns: + callable: Videodepth decoder function that returns the decoded depth tensor + """ + + def videodepth_decoder(key, data): + """Decode depth video data from NPZ file. + + Args: + key (str): Key of the data + data (bytes): Raw NPZ file data + + Returns: + torch.Tensor: Depth video tensor, or None if the key is not an NPZ file + """ + # Load the depth data using npz_loader + depth = npz_loader(key, data) + if depth is None: + return None + + # Determine the temporal dimension (total_frames is currently unused) + if depth.dim() == 4: # CxTxHxW + total_frames = depth.shape[1] + elif depth.dim() == 3: # TxHxW + total_frames = depth.shape[0] + else: + # For 2D depth maps (single frame), return as-is + return depth + + return depth + + return videodepth_decoder + + +def construct_depth_decoder(sequence_length: int = 0): + """Construct depth decoder. + + Args: + sequence_length (int): Number of frames to decode. Set to 0 for single frame. + + Returns: + callable: Depth decoder function + """ + + def depth_decoder(key, sample): + """Decode depth data from sample.
+ + Args: + key (str): Key of the data + sample (dict): Sample dictionary containing depth data + + Returns: + dict: Sample dictionary with decoded depth data + """ + depth = exr_loader(key, sample) + if depth is None: + return None + return depth + + return depth_decoder diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/image.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/image.py new file mode 100644 index 00000000..0b63e5fa --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/image.py @@ -0,0 +1,45 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import io +import re +from typing import Optional + +from PIL import Image + +Image.MAX_IMAGE_PIXELS = 933120000 +_IMG_EXTENSIONS = "jpg jpeg png ppm pgm pbm pnm webp".split() + + +def pil_loader(key: str, data: bytes) -> Optional[Image.Image]: + r""" + Function to load an image. + If the image is corrupt, it returns a black image. + Args: + key (str): Image key. + data (bytes): Image data stream. 
+ Returns: + PIL image + """ + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _IMG_EXTENSIONS: + return None + + with io.BytesIO(data) as stream: + img = Image.open(stream) + img.load() + img = img.convert("RGB") + + return img diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/pickle.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/pickle.py new file mode 100644 index 00000000..622fedaa --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/decoders/pickle.py @@ -0,0 +1,33 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import pickle +import re +from typing import Optional + + +def pkl_decoder(key: str, data: bytes) -> Optional[dict]: + r""" + Function to decode a pkl file. + Args: + key: Data key. + data: Data dict. 
+ """ + extension = re.sub(r".*[.]", "", key) + if extension == "pkl" or extension == "pickle": + data_dict = pickle.loads(data) + return data_dict + else: + return None diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/__init__.py new file mode 100644 index 00000000..f426cf6c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/__init__.py @@ -0,0 +1,26 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from cosmos3._src.imaginaire.datasets.webdataset.distributors.basic import ShardlistBasic +from cosmos3._src.imaginaire.datasets.webdataset.distributors.multi_aspect_ratio import ShardlistMultiAspectRatio +from cosmos3._src.imaginaire.datasets.webdataset.distributors.multi_aspect_ratio_v2 import ShardlistMultiAspectRatioInfinite +from cosmos3._src.imaginaire.datasets.webdataset.distributors.weighted_multi_aspect_ratio import WeightedShardlistMultiAspectRatio + +distributors_list = { + "basic": ShardlistBasic, + "multi_aspect_ratio": ShardlistMultiAspectRatio, + "multi_aspect_ratio_infinite": ShardlistMultiAspectRatioInfinite, + "weighted_multi_aspect_ratio": WeightedShardlistMultiAspectRatio, +} diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/basic.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/basic.py new file mode 100644 index 00000000..4c7c4cf9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/basic.py @@ -0,0 +1,158 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +import random +import time + +from webdataset.pytorch import IterableDataset +from webdataset.utils import pytorch_worker_info + +from cosmos3._src.imaginaire.datasets.webdataset.config.schema import TarSample +from cosmos3._src.imaginaire.datasets.webdataset.utils.misc import repeat_list +from cosmos3._src.imaginaire.utils import log + + +class ShardlistBasic(IterableDataset): + r""" + An iterable dataset that parses and yields tar files. + The dataset can be resumed from an epoch number and a start index. + """ + + def __init__( + self, + shuffle: bool = True, + split_by_node: bool = True, + split_by_worker: bool = True, + resume_flag: bool = True, + verbose: bool = False, + is_infinite_loader: bool = False, + max_epochs: int = 100000, + repeat_url: bool = True, + ): + r"""Create a ShardList. + Args: + shuffle (bool): shuffle shards before iterating. + split_by_node (bool): split shards by node if True + split_by_worker (bool): split shards by worker if True + resume_flag (bool): If enabled, resumes from a specific iteration and epoch number. + verbose (bool): Prints some logs if true + is_infinite_loader (bool): If true, creates an infinite dataloader. + So, the dataset will be only one epoch and will not terminate. + max_epochs (int): Infinite dataloader is created with max_epochs number of epochs. + Should be a very large number. + repeat_url (bool): If true, each worker will receive the same number of batches by repeating urls.
+ """ + super().__init__() + + self.verbose = verbose + if self.verbose: + log.info("ShardListWithResumes init") + self.epoch = 0 + self.start_index = 0 + self.shuffle = shuffle + self.split_by_node = split_by_node + self.split_by_worker = split_by_worker + self.resume_flag = resume_flag + self.is_infinite_loader = is_infinite_loader + self.max_epochs = max_epochs + self.repeat_url = repeat_url + + def set_urls(self, urls: list[TarSample]): + """Set urls + + Args: + urls (list[TarSample]): a list of tar files along with their metadata + """ + self.urls = urls + + def set_chunk_size(self, chunk_size: int): + """Set chunk size + + Args: + chunk_size (int): chunk size used in webdataset creation + """ + self.chunk_size = chunk_size + + def set_epoch(self, epoch: int, start_index: int): + r"""Set the current epoch. Used for per-node shuffling. + Args: + epoch (int): Epoch number + start_index (int): iteration number + """ + self.epoch = epoch + self.start_index = start_index + + def obtain_url_list(self): + r"""Return the list of shards for the current rank and worker.""" + + rank, world_size, worker_id, num_workers = pytorch_worker_info() + + # Setting epoch and start index + if self.resume_flag: + self.epoch = int(os.environ.get("WDS_EPOCH_NUM", 0)) + # This tells us the number of chunks that have been seen by one GPU + self.start_index = int(os.environ.get("WDS_START_INDEX", 0)) // self.chunk_size + + urls = self.urls + num_urls = len(urls) + + if self.repeat_url: + # Extending urls so that each worker receives the same number of batches. + # This serves the job of ddp_equalize. + nworkers_all = world_size * num_workers + num_urls_per_process = (num_urls + nworkers_all - 1) // nworkers_all + extended_url_list_size = num_urls_per_process * nworkers_all + urls = repeat_list(urls, extended_url_list_size) + + # Splits the urls by node and worker id. This ensures each worker sees different urls.
+ if self.split_by_node: + urls = urls[rank::world_size] + if self.split_by_worker: + urls = urls[worker_id::num_workers] + + if self.verbose: + log.info("List of urls (before shuffle)") + log.info(urls[0:10]) + + if self.shuffle: + # Shuffle based on the world worker id. + random.Random(rank * num_workers + worker_id).shuffle(urls) + + # This tells us the number of chunks seen by one worker. + # Do not iterate over the seen chunks. + start_index_per_worker = self.start_index // num_workers + if not self.is_infinite_loader: + urls = urls[start_index_per_worker:] + + if self.verbose: + log.info("List of urls (after shuffle)") + log.info(urls[0:10]) + log.info(f"PytorchShardList got {len(urls)} urls") + + return urls + + def __iter__(self): + url_list = self.obtain_url_list() + + if self.is_infinite_loader: + for _ in range(self.max_epochs): + cur_time = int(time.time()) + random.Random(cur_time).shuffle(url_list) + for url in url_list: + yield dict(url=url) + else: + for url in url_list: + yield dict(url=url) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/multi_aspect_ratio.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/multi_aspect_ratio.py new file mode 100644 index 00000000..fd28da33 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/multi_aspect_ratio.py @@ -0,0 +1,285 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +# This script contains the code for multi-aspect ratio shard iterator + +import math +import os +import random +import time +from collections import defaultdict +from copy import deepcopy + +import torch +from webdataset.pytorch import IterableDataset +from webdataset.utils import pytorch_worker_info + +from cosmos3._src.imaginaire.datasets.webdataset.config.schema import TarSample +from cosmos3._src.imaginaire.datasets.webdataset.utils.misc import repeat_list +from cosmos3._src.imaginaire.utils import log + + +class ShardlistMultiAspectRatio(IterableDataset): + r""" + An iterable dataset that parses and yields tar files. + This distributor handles the multi-aspect ratio case. For the dataloader to be successful, + each worker should load only one aspect ratio. Else, there can be a batch where two + aspect ratios would be present which would raise an error in collate function. + So, we design data distribution strategy so that each worker sees only one aspect ratio. + """ + + def __init__( + self, + shuffle: bool = True, + split_by_node: bool = True, + split_by_worker: bool = True, + chunk_size: int = 1, + resume_flag: bool = True, + verbose: bool = False, + is_infinite_loader: bool = False, + ): + r"""Create a multi-aspect ratio ShardList. + Args: + urls (list[TarSample]): a list of tar files along with their metadata + epoch_shuffle (bool): Shuffles the whole epoch. If disabled, each node will see the same set of urls. + shuffle (bool): shuffle samples before iterating. + split_by_node (bool): split shards by node if True + split_by_worker (bool): split shards by worker if True + chunk_size (int): chunk size used in webdataset creation + resume_flag (bool): If enabled, resumes from a specific iteration and epoch number. + verbose (bool): Prints some logs if true + is_infinite_loader (bool): If true, creates an infinite dataloader. 
+ So, the dataset will be only one epoch and will not terminate. + """ + super().__init__() + + self.verbose = verbose + if self.verbose: + log.info("ShardListWithResumes init") + self.epoch = 0 + self.start_index = 0 + self._iter_epoch = 0 + self.shuffle = shuffle + self.split_by_node = split_by_node + self.split_by_worker = split_by_worker + self.chunk_size = chunk_size + self.resume_flag = resume_flag + self.is_infinite_loader = is_infinite_loader + + def set_urls(self, urls: list[TarSample]): + self.urls = urls + self._split_urls_by_aspect_ratio() + + def set_chunk_size(self, chunk_size: int): + """Set chunk size + + Args: + chunk_size (int): chunk size used in webdataset creation + """ + self.chunk_size = chunk_size + + def set_epoch(self, epoch: int, start_index: int): + r"""Set the current epoch. Used for per-node shuffling. + Args: + epoch (int): Epoch number + start_index (int): iteration number + """ + self.epoch = epoch + self.start_index = start_index + + def _split_urls_by_aspect_ratio(self): + r"""Function for splitting urls by aspect ratio. + We assume that urls are grouped by dataset_id. That is, all data belonging to + one dataset_id should share the same aspect ratio. + """ + + url_aspect_split = defaultdict(list) + + for url in self.urls: + dset_info = url.meta + if "aspect_ratio" not in dset_info.opts: + raise ValueError("aspect_ratio should be specified in dataset_info when using multi aspect distributor") + aspect_ratio = dset_info.opts["aspect_ratio"] + url_aspect_split[aspect_ratio].append(url) + + # In deterministic mode, sort keys so rank-to-AR assignment is independent of URL ordering.
+ if torch.are_deterministic_algorithms_enabled(): + url_aspect_split = dict(sorted(url_aspect_split.items())) + + aspect_ratio_with_most_elems = -1 + aspect_ratio_with_least_elems = -1 + max_aspect_ratio_count = -1 + min_aspect_ratio_count = 1000000000 + + for aspect_ratio in url_aspect_split: + # Sort the url list + url_aspect_split[aspect_ratio] = sorted( + url_aspect_split[aspect_ratio], key=lambda tar: (tar.path, tar.root) + ) + + # Finding max and min tar counts per aspect ratio + if len(url_aspect_split[aspect_ratio]) > max_aspect_ratio_count: + aspect_ratio_with_most_elems = aspect_ratio + max_aspect_ratio_count = len(url_aspect_split[aspect_ratio]) + if len(url_aspect_split[aspect_ratio]) < min_aspect_ratio_count: + aspect_ratio_with_least_elems = aspect_ratio + min_aspect_ratio_count = len(url_aspect_split[aspect_ratio]) + + self.url_aspect_split = url_aspect_split + self.aspect_ratio_with_most_elems = aspect_ratio_with_most_elems + self.aspect_ratio_with_least_elems = aspect_ratio_with_least_elems + + def _ddp_equalize( + self, url_aspect_split: dict[str, list[TarSample]], nworkers_all: int + ) -> tuple[dict[str, list[TarSample]], int]: + r"""This function performs tar file equalization. That is, we repeat the tars in each aspect + ratio so that when the tars are split across workers, each worker receives the same number of tars. + This function is important for ddp to terminate well at the end of each epoch.
+ + Args: + url_aspect_split (dict[str, list[TarSample]]): TarSample split by aspect ratio + nworkers_all (int): Total number of dataloader workers + + Returns: + url_aspect_split (dict[str, list[TarSample]]): TarSample split after DDP equalization + num_urls_per_worker (int): Number of tars in each worker + """ + betas = [] + n_total = sum([len(url_aspect_split[aspect_ratio]) for aspect_ratio in url_aspect_split]) + + # Initial assignment: betas[i] is the number of workers allotted to the i-th aspect ratio + aspect_ind_with_most_elems = 0 + for i, aspect_ratio in enumerate(url_aspect_split): + betas.append(math.ceil((len(url_aspect_split[aspect_ratio]) / n_total) * nworkers_all)) + if aspect_ratio == self.aspect_ratio_with_most_elems: + aspect_ind_with_most_elems = i + + # Constraint that total number of workers is fixed + betas[aspect_ind_with_most_elems] += nworkers_all - sum(betas) + + # Rebalance the number of urls + num_urls_per_worker = math.ceil(n_total / sum(betas)) + for i, aspect_ratio in enumerate(url_aspect_split): + url_aspect_split[aspect_ratio] = repeat_list(url_aspect_split[aspect_ratio], betas[i] * num_urls_per_worker) + + return url_aspect_split, num_urls_per_worker + + def _obtain_node_worker_url_mapping( + self, + url_aspect_split: dict[str, list[TarSample]], + num_urls_per_worker: int, + rank: int, + world_size: int, + worker_id: int, + num_workers: int, + ): + r"""This function obtains the worker-URL mapping. It assigns the tar list seen by + each worker.
+ + Args: + url_aspect_split (dict[str, list[TarSample]]): TarSample split by aspect ratio + num_urls_per_worker (int): Number of tar files seen by each worker + rank (int): Rank of the current GPU + world_size (int): Total number of GPUs + worker_id (int): ID for the current worker in the dataloader + num_workers (int): Total number of workers in the dataloader + + Returns: + URL list for the current worker + """ + assert self.split_by_node is True and self.split_by_worker is True + + # First chunk the tars + chunk_mappings = [] + for aspect_ratio in url_aspect_split: + samples_asp = url_aspect_split[aspect_ratio] + nchunks_asp = int(len(samples_asp) / num_urls_per_worker) + for chunk_id in range(nchunks_asp): + chunk_mappings.append((aspect_ratio, samples_asp[chunk_id::nchunks_asp])) + + # Split by rank and workers + chunk_mappings = chunk_mappings[rank::world_size] + chunk_mappings = chunk_mappings[worker_id::num_workers] + + assert len(chunk_mappings) == 1 + return chunk_mappings[0][1] + + def obtain_url_list(self): + r"""Return the list of shards for the current rank and worker.""" + + rank, world_size, worker_id, num_workers = pytorch_worker_info() + + # Setting epoch and start index + if self.resume_flag: + self.epoch = int(os.environ.get("WDS_EPOCH_NUM", 0)) + + # This tells us the number of chunks that have been seen by one GPU + self.start_index = int(os.environ.get("WDS_START_INDEX", 0)) // self.chunk_size + + urls = deepcopy(self.urls) + url_aspect_split = deepcopy(self.url_aspect_split) + + # Splitting the shards by worker and node + if self.verbose: + log.info(f"PytorchShardList rank {rank} of {world_size}") + log.info(f"PytorchShardList worker {worker_id} of {num_workers}") + + nworkers_all = world_size * num_workers + + # Perform DDP equalization + url_aspect_split, num_urls_per_worker = self._ddp_equalize(url_aspect_split, nworkers_all) + + # Form a mapping of url_aspect_split to node and workers + urls = self._obtain_node_worker_url_mapping( + url_aspect_split,
num_urls_per_worker, rank, world_size, worker_id, num_workers + ) + + if self.verbose: + log.info("List of urls (before shuffle)") + log.info(urls[0:10]) + + if self.shuffle: + random.Random(rank * num_workers + worker_id).shuffle(urls) + + # This tells us the number of chunks seen by one worker. + # Do not iterate over the seen chunks. + start_index_per_worker = self.start_index // num_workers + if not self.is_infinite_loader: + urls = urls[start_index_per_worker:] + + if self.verbose: + log.info("List of urls (after shuffle)") + log.info(urls[0:10]) + log.info(f"PytorchShardList got {len(urls)} urls") + + return urls + + def __iter__(self): + url_list = self.obtain_url_list() + + if self.is_infinite_loader: + rank, _, worker_id, num_workers = pytorch_worker_info() + while True: + if torch.are_deterministic_algorithms_enabled(): + seed = self._iter_epoch * 65536 + rank * num_workers + worker_id + else: + seed = time.time_ns() + random.Random(seed).shuffle(url_list) + self._iter_epoch += 1 + for url in url_list: + yield dict(url=url) + else: + for url in url_list: + yield dict(url=url) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/multi_aspect_ratio_v2.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/multi_aspect_ratio_v2.py new file mode 100644 index 00000000..fae6b327 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/multi_aspect_ratio_v2.py @@ -0,0 +1,252 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# This script contains the code for multi-aspect ratio shard iterator + +import random +import time +from collections import defaultdict + +import numpy as np +from webdataset.pytorch import IterableDataset +from webdataset.utils import pytorch_worker_info + +from cosmos3._src.imaginaire.datasets.webdataset.config.schema import TarSample +from cosmos3._src.imaginaire.utils import log + + +class ShardlistMultiAspectRatioInfinite(IterableDataset): + r""" + An iterable dataset that parses and yields tar files. + This distributor handles the multi-aspect ratio case. For the dataloader to be successful, + each worker should load only one aspect ratio. Else, there can be a batch where two + aspect ratios would be present which would raise an error in collate function. + So, we design data distribution strategy so that each worker sees only one aspect ratio. + + This version only supports infinite loader mode. This enables a simpler code that is faster to initialize + and produces samples better matching the dataset distribution. + """ + + def __init__( + self, + shuffle: bool = True, + split_by_node: bool = True, + split_by_worker: bool = True, + chunk_size: int = 1, + resume_flag: bool = True, + verbose: bool = False, + is_infinite_loader: bool = True, + ): + r"""Create a multi-aspect ratio ShardList. + Args: + urls (list[TarSample]): a list of tar files along with their metadata + epoch_shuffle (bool): Shuffles the whole epoch. If disabled, each node will see the same set of urls. + shuffle (bool): shuffle samples before iterating. 
+ split_by_node (bool): split shards by node if True + split_by_worker (bool): split shards by worker if True + chunk_size (int): Ignored + resume_flag (bool): Ignored + verbose (bool): Prints some logs if true + is_infinite_loader (bool): If true, creates an infinite dataloader. + So, the dataset will be only one epoch and will not terminate. + """ + super().__init__() + + self.verbose = verbose + if self.verbose: + log.info("ShardlistMultiAspectRatioInfinite init") + self.shuffle = shuffle + self.split_by_node = split_by_node + self.split_by_worker = split_by_worker + self.chunk_size = chunk_size # Ignored + self.resume_flag = resume_flag # Ignored + assert is_infinite_loader is True + + def set_urls(self, urls: list[TarSample]): + self.url_aspect_split = self._split_urls_by_aspect_ratio(urls) + + def set_chunk_size(self, chunk_size: int): + """Set chunk size + For backward compatibility. Ignored. + + Args: + chunk_size (int): chunk size used in webdataset creation + """ + self.chunk_size = chunk_size + + def set_epoch(self, epoch: int, start_index: int): + r"""Set the current epoch. Used for per-node shuffling. + For backward compatibility. Ignored. + + Args: + epoch (int): Epoch number + start_index (int): iteration number + """ + self.epoch = epoch + self.start_index = start_index + + def _split_urls_by_aspect_ratio(self, urls): + r"""Function for splitting urls by aspect ratio. + We assume that urls are grouped by dataset_id. That is, data belonging to + one dataset_id should have all data in the same aspect ratio. 
+ """ + + url_aspect_split = defaultdict(list) + + for url in urls: + dset_info = url.meta + if "aspect_ratio" not in dset_info.opts: + raise ValueError("aspect_ratio should be specified in dataset_info when using multi aspect distributor") + aspect_ratio = dset_info.opts["aspect_ratio"] + url_aspect_split[aspect_ratio].append(url) + + for aspect_ratio in url_aspect_split: + # Sort the url list + url_aspect_split[aspect_ratio] = sorted( + url_aspect_split[aspect_ratio], key=lambda tar: (tar.path, tar.root) + ) + + return url_aspect_split + + def _allocate_workers_to_aspects( + self, url_aspect_split: dict[str, list[TarSample]], num_workers_all: int + ) -> list[tuple[str, int]]: + r"""Allocate workers to each aspect ratio so that: + 1. Each aspect ratio has at least one worker + 2. All the workers have jobs to do + + Args: + url_aspect_split (dict[list[TarSample]]): TarSample split by aspect ratio + num_workers_all (int): Total number of dataloader workers + + Returns: + aspect_worker_allocation (list): List of tuple containing (aspect_key, num_workers) + """ + if self.verbose: + log.info( + f"#URLs for each aspect ratio: {[len(url_aspect_split[aspect_ratio]) for aspect_ratio in url_aspect_split]}" + ) + + # Must have more global workers than the number of aspect ratios, as each global worker can only load a single + # aspect ratio. 
+ num_aspects = len(url_aspect_split) + assert num_workers_all >= num_aspects + + aspect_keys = list(url_aspect_split.keys()) + # Allocate at least one worker per aspect ratios + target_ratio = np.array([len(url_aspect_split[key]) for key in aspect_keys]) + target_ratio = target_ratio / target_ratio.sum() + aspect_worker_allocation = np.ones([num_aspects], dtype=np.int64) + for _i in range(num_workers_all - num_aspects): + current_ratio = aspect_worker_allocation / aspect_worker_allocation.sum() + aspect_worker_allocation[np.argmin(current_ratio - target_ratio)] += 1 + + if self.verbose: + log.info(f"Aspects: {aspect_keys}") + log.info(f"Target ratio: {target_ratio}") + log.info(f"Worker allocation: {aspect_worker_allocation}") + log.info(f"Discrepancy: {aspect_worker_allocation / aspect_worker_allocation.sum() / target_ratio}") + return [(k, v) for k, v in zip(aspect_keys, aspect_worker_allocation.tolist())] + + def _obtain_node_worker_url_mapping( + self, + url_aspect_split: dict[str, list[TarSample]], + aspect_worker_allocation: list[tuple[str, int]], + rank: int, + world_size: int, + worker_id: int, + num_workers: int, + ): + r"""This function obtains the worker-URL mapping. It assigns the tar list seen by + each workers. 
+ + Args: + url_aspect_split (dict[str, list[TarSample]]): TarSample split by aspect ratio + aspect_worker_allocation (list[tuple[str, int]]): (aspect_key, num_workers) pairs, one per aspect ratio + rank (int): Rank of the current GPU + world_size (int): Total number of GPUs + worker_id (int): ID for the current worker in the dataloader + num_workers (int): Total number of workers in the dataloader + + Returns: + URL list for the current worker + """ + assert self.split_by_node is True and self.split_by_worker is True + + # First determine the aspect ratio for the current worker + global_worker_id = rank * num_workers + worker_id + + cumulative = 0 + for aspect_key, worker_count in aspect_worker_allocation: + cumulative += worker_count + if global_worker_id < cumulative: + chunk_id = global_worker_id - cumulative + worker_count + break + + if self.verbose: + log.info(f"GID={global_worker_id}, aspect_key={aspect_key}, chunk_id={chunk_id}") + # chunk the urls for the target aspect ratio + urls_asp = url_aspect_split[aspect_key] + if len(urls_asp) >= worker_count: + url_chunk = urls_asp[chunk_id::worker_count] + else: + url_chunk = urls_asp[chunk_id % len(urls_asp) : chunk_id % len(urls_asp) + 1] + + return url_chunk + + def obtain_url_list(self): + r"""Return the list of shards for the current worker.""" + + rank, world_size, worker_id, num_workers = pytorch_worker_info() + + # Splitting the shards by worker and node + if self.verbose: + log.info(f"PytorchShardList rank {rank} of {world_size}") + log.info(f"PytorchShardList worker {worker_id} of {num_workers}") + + nworkers_all = world_size * num_workers + + # Assigning workers to process each aspect ratio + aspect_worker_allocation = self._allocate_workers_to_aspects(self.url_aspect_split, nworkers_all) + + # Form a mapping of url_aspect_split to node and workers + urls = self._obtain_node_worker_url_mapping( + self.url_aspect_split, aspect_worker_allocation, rank, world_size, worker_id, num_workers + ) + + if self.verbose: + log.info("List of urls (before 
shuffle)") + log.info(urls[0:10]) + + if self.shuffle: + global_worker_id = rank * num_workers + worker_id + random.Random(global_worker_id).shuffle(urls) + + if self.verbose: + log.info("List of urls (after shuffle)") + log.info(urls[0:10]) + log.info(f"PytorchShardList got {len(urls)} urls") + + return urls + + def __iter__(self): + url_list = self.obtain_url_list() + while True: + if self.shuffle: + cur_time = time.time_ns() + random.Random(cur_time).shuffle(url_list) + assert len(url_list) > 0, "No urls found" + for url in url_list: + yield dict(url=url) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/weighted_multi_aspect_ratio.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/weighted_multi_aspect_ratio.py new file mode 100644 index 00000000..f071d89e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/distributors/weighted_multi_aspect_ratio.py @@ -0,0 +1,173 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Weighted multi-aspect ratio shard distributor. +Subclass of ShardlistMultiAspectRatio that adds sampling weighted by data source +within each aspect-ratio partition. Preserves per-worker aspect-ratio assignment and DDP behavior. 
+""" + +import os +import random +import time +from datetime import datetime + +import torch +from webdataset.utils import pytorch_worker_info + +from cosmos3._src.imaginaire.datasets.webdataset.config.schema import TarSample +from cosmos3._src.imaginaire.datasets.webdataset.distributors.multi_aspect_ratio import ShardlistMultiAspectRatio +from cosmos3._src.imaginaire.utils import log + + +class WeightedShardlistMultiAspectRatio(ShardlistMultiAspectRatio): + r""" + Multi-aspect ratio shard list with weighted sampling by data source. + Each worker still receives URLs for a single aspect ratio (same as base class). + Within that set, URLs are sampled by datasource according to data_weight_dict. + """ + + def __init__( + self, + data_weight_dict: dict | None = None, + shuffle: bool = True, + split_by_node: bool = True, + split_by_worker: bool = True, + chunk_size: int = 1, + resume_flag: bool = True, + verbose: bool = False, + is_infinite_loader: bool = False, + dump_worker_category_distribution: bool = False, + ): + r"""Create a weighted multi-aspect ratio ShardList. + + Args: + data_weight_dict (dict | None): Mapping from data source name to weight. + If None, behaves like ShardlistMultiAspectRatio (no weighting). + shuffle (bool): Shuffle samples before iterating. + split_by_node (bool): Split shards by node if True. + split_by_worker (bool): Split shards by worker if True. + chunk_size (int): Chunk size used in webdataset creation. + resume_flag (bool): If enabled, resumes from WDS_EPOCH_NUM and WDS_START_INDEX. + verbose (bool): Print extra logs if True. + is_infinite_loader (bool): If True, dataloader runs indefinitely with weighted sampling. + dump_worker_category_distribution (bool): If True, dump the worker category distribution to one csv file per worker. 
+ """ + super().__init__( + shuffle=shuffle, + split_by_node=split_by_node, + split_by_worker=split_by_worker, + chunk_size=chunk_size, + resume_flag=resume_flag, + verbose=verbose, + is_infinite_loader=is_infinite_loader, + ) + self.data_weight_dict = data_weight_dict + self.dump_worker_category_distribution = dump_worker_category_distribution + if self.dump_worker_category_distribution: + self.weight_per_tar_csv_dir = f"outputs/weight_csvs_{datetime.now().strftime('%Y%m%d_%H%M%S')}" + os.makedirs(self.weight_per_tar_csv_dir, exist_ok=True) + + def set_urls(self, urls: list[TarSample]): + super().set_urls(urls) + if self.data_weight_dict: + # Count global *samples* per datasource *before* per-worker splitting so that + # each tar file can be assigned weight = datasource_weight * global_sample_count. + global_sample_counts: dict[str, int] = {} + for url in urls: + src = url.meta.source + global_sample_counts[src] = global_sample_counts.get(src, 0) + url.num_samples + self._global_datasource_sample_counts = global_sample_counts + for src in global_sample_counts: + log.info(f"Global counts for {src}: {global_sample_counts[src]} samples") + if self.verbose: + # Log aspect-ratio split from base class (ratio feature is used per-worker) + if hasattr(self, "url_aspect_split") and self.url_aspect_split: + ratio_summary = {ar: len(entries) for ar, entries in self.url_aspect_split.items()} + log.info( + f"WeightedShardlistMultiAspectRatio: aspect_ratio split (ratio feature active): {ratio_summary}" + ) + if self.data_weight_dict: + log.info(f"data_weight_dict: {self.data_weight_dict}") + + def __iter__(self): + url_list = self.obtain_url_list() + + # Group URLs by datasource within this worker's list + urls_by_datasource: dict[str, list[TarSample]] = {} + for url in url_list: + datasource = url.meta.source + if datasource not in self.data_weight_dict: + raise ValueError( + f"Datasource '{datasource}' from URL not found in data_weight_dict. 
" + f"Available: {list(self.data_weight_dict.keys())}" + ) + if datasource not in urls_by_datasource: + urls_by_datasource[datasource] = [] + urls_by_datasource[datasource].append(url) + + if self.verbose: + counts = {cat: len(u) for cat, u in urls_by_datasource.items()} + log.info( + f"WeightedShardlistMultiAspectRatio: weighted sampling active — " + f"URLs per datasource (this worker): {counts}, weights={self.data_weight_dict}" + ) + + datasource_names = list(urls_by_datasource.keys()) + + if self.is_infinite_loader: + rank, world_size, worker_id, num_workers = pytorch_worker_info() + # In deterministic mode seed by (_iter_epoch, rank, worker_id); otherwise time-based. + if torch.are_deterministic_algorithms_enabled(): + worker_seed = self._iter_epoch * 65536 + rank * num_workers + worker_id + else: + worker_seed = (rank * num_workers + worker_id) + int(time.time() * 10000) + self._iter_epoch += 1 + rng = random.Random(worker_seed) + + # Build a flat list of tar files with per-tar weights. + # Each tar from datasource C gets weight = data_weight_dict[C] * global_samples_C. 
+ flat_urls: list[TarSample] = [] + flat_weights: list[float] = [] + if self.dump_worker_category_distribution: + weight_csv_file = open( + os.path.join(self.weight_per_tar_csv_dir, f"_weight_per_tar_{rank * num_workers + worker_id}.csv"), + "w", + ) + weight_csv_file.write( + "datasource,wdinfo,path,weight,global_samples,data_list_key_count,data_weight_dict\n" + ) + + for datasource in datasource_names: + tars = urls_by_datasource[datasource] + global_samples = self._global_datasource_sample_counts[datasource] + for url in tars: + per_tar_weight = self.data_weight_dict[datasource] / global_samples + + flat_urls.append(url) + flat_weights.append(per_tar_weight) + if self.dump_worker_category_distribution: + weight_csv_file.write( + f"{datasource},{url.meta.wdinfo},{url.path},{per_tar_weight},{global_samples},{url.num_samples},{self.data_weight_dict[datasource]}\n" + ) + if self.dump_worker_category_distribution: + weight_csv_file.close() + + while True: + url = rng.choices(flat_urls, weights=flat_weights, k=1)[0] + yield dict(url=url) + else: + for url in url_list: + yield dict(url=url) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/iterators.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/iterators.py new file mode 100644 index 00000000..4b070d00 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/iterators.py @@ -0,0 +1,617 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import io +import os +import random +import sys +import time +from typing import IO, Any, BinaryIO, Callable, Dict, Iterable, Iterator, Optional, Tuple, Union +from urllib.parse import urlparse + +import boto3 +import botocore +import botocore.exceptions +import pandas as pd +import webdataset.gopen as gopen_webdata +import yaml +from webdataset import cache, filters, shardlists +from webdataset.compat import FluidInterface +from webdataset.handlers import reraise_exception +from webdataset.pipeline import DataPipeline +from webdataset.pytorch import IterableDataset +from webdataset.tariterators import group_by_keys, tar_file_iterator + +from cosmos3._src.imaginaire.datasets.webdataset.config.schema import TarSample +from cosmos3._src.imaginaire.datasets.webdataset.utils.stream import RetryingStream +from cosmos3._src.imaginaire.utils import log + +# Number of attempts to read s3 objects. +_NUM_OBJECT_STORE_READ_ATTEMPTS = 10 + + +def gopen(url: Tuple, mode: str = "rb", bufsize: int = 8192, **kw) -> Union[io.BytesIO, RetryingStream, BinaryIO, IO]: + r"""Open the URL. + This uses the `gopen_schemes` dispatch table to dispatch based + on scheme. + Support for the following schemes is built-in: pipe, file, + http, https, sftp, ftps, scp. + When no scheme is given the url is treated as a file. + You can use the OPEN_VERBOSE argument to get info about + files being opened. + Args: + url (tuple): (source URL, dataset id) + the source URL is join(TarSample.root, one of TarSample.keys, TarSample.path) + e.g. 
join("openx_short_cmu_playing_with_food_202505/v2.3/resolution_lt_720/aspect_ratio_4_3/duration_5_10/", "videos", "part_000000/000000.tar") + mode (str): the mode ("rb", "r") + bufsize (int): the buffer size + Returns: + Byte streams + """ + global fallback_gopen + verbose = int(os.environ.get("GOPEN_VERBOSE", 0)) + if verbose: + log.info("GOPEN", url, gopen_webdata.info, file=sys.stderr) + + assert mode in ["rb", "wb"], mode + if url == "-": + if mode == "rb": + return sys.stdin.buffer + elif mode == "wb": + return sys.stdout.buffer + else: + raise ValueError(f"unknown mode {mode}") + + # If we specify 'object_store' in keyword arguments, + # then we would load from s3. + if "object_store" in kw and kw["object_store"]: + assert isinstance(url, tuple) + return gopen_s3( + url, + s3_clients=kw["s3_client"], + s3_bucket_name=kw["s3_bucket_name"], + streaming_download=kw["streaming_download"], + ) + + # For all other gopen schemes, use the native webdataset gopen functions. + # pr = gopen_webdata.urlparse(url) + # this should be a path to an existing file on local machine + url = url[0] + assert isinstance(url, str) + pr = urlparse(url) + if pr.scheme == "": + bufsize = int(os.environ.get("GOPEN_BUFFER", -1)) + return open(url, mode, buffering=bufsize) + if pr.scheme == "file": + bufsize = int(os.environ.get("GOPEN_BUFFER", -1)) + return open(pr.path, mode, buffering=bufsize) + handler = gopen_webdata.gopen_schemes["__default__"] + handler = gopen_webdata.gopen_schemes.get(pr.scheme, handler) + return handler(url, mode, bufsize, **kw) # type: ignore + + +def gopen_s3( + url: tuple, + s3_clients: Dict[str, boto3.client], # type: ignore + s3_bucket_name: Dict[str, str], + streaming_download=True, +) -> Union[io.BytesIO, RetryingStream]: + r"""Gopen scheme for s3. 
+ Function for reading urls from s3 + Args: + url (tuple): (source URL, dataset id) + s3_clients (Dict[str, boto3.client]): Boto3 clients for downloading from S3, keyed by dataset id + s3_bucket_name (Dict[str, str]): Bucket names for the S3 data, keyed by dataset id + streaming_download (bool): If True, stream the object; otherwise download the entire file into memory + Returns: + Byte streams + """ + + attempt = 0 + + url_path = url[0] + dset_id = url[1] + s3_client = s3_clients[dset_id] + bucket = s3_bucket_name[dset_id] + + while attempt < _NUM_OBJECT_STORE_READ_ATTEMPTS: + try: + if streaming_download: + # Downloads in a streaming fashion + s3_stream = RetryingStream(s3_client, bucket=bucket, key=url_path) + return s3_stream + else: + # Downloads the entire file + buffer = io.BytesIO() + s3_client.download_fileobj(bucket, url_path, buffer) + buffer.seek(0) + return buffer + except botocore.exceptions.ClientError as e: + # If there is an exception (usually connectivity error or protocol error), read again + attempt += 1 + retry_interval = min( + 0.1 * 2**attempt + random.uniform(0, 1), 30 + ) # sleep workers randomly to avoid burst of requests + log.info( + f"Got an exception while downloading data {url_path}: attempt={attempt} - {e}. {type(e)}", + rank0_only=False, + ) + log.info(f"Retrying tar file download after {retry_interval}s", rank0_only=False) + time.sleep(retry_interval) + continue + raise ConnectionError("Unable to read {} from PBSS. {} attempts tried.".format(url, attempt)) + + +def url_opener(data: Iterable, handler: Callable = reraise_exception, **kw) -> Iterator[dict]: + r"""Given a stream of url names (packaged in `dict(url=url)`), yield opened streams. + + Args: + data (Iterable): Iterator of dictionaries containing url paths. + handler (Callable): Exception handler. + + Yields: + Dictionaries with this structure: + {"url": ... 
+ "stream": list[Union[io.BytesIO, RetryingStream]]} + """ + for sample in data: + assert isinstance(sample, dict), sample + assert "url" in sample + + url = sample["url"] + assert isinstance(url, TarSample), "URL should be of type TarSample" + try: + stream = [] + for data_key in url.keys: + url_path_full = os.path.join(url.root, data_key, url.path) + url_key = (url_path_full, url.dset_id) + stream.append(gopen(url_key, **kw)) + + sample.update(stream=stream) + yield sample + except Exception as exn: + log.info(f"Got an exception while opening urls - {exn}", rank0_only=False) + exn.args = exn.args + (url,) + if handler(exn): + continue + else: + break + + +def process_sample(sample, url, key_idx): + assert isinstance(sample, dict) and "data" in sample and "fname" in sample + # Edit the url entries + sample["__url__"] = url + # This is the folder name + data_key = url.keys[key_idx] + # Handle the case where data_key has "/" + data_key = data_key.replace("/", "_") + # Edit the fname to include the data_key + fname_splits = sample["fname"].split(".") + if len(fname_splits) == 2: + prefix, suffix = fname_splits # {sample_key}.{suffix} e.g. "id_1410095.json" + else: # if the fname here contains more than one dot, we replace all the dots except the last one with "-" + prefix = "-".join(fname_splits[:-1]) + suffix = fname_splits[-1] + + # e.g. "id_1410095.caption_ai_from_image.json" + sample["fname"] = f"{prefix}.{data_key}.{suffix}" + + return sample + + +def tar_file_expander( + data: Iterable[Dict[str, Any]], + handler: Callable[[Exception], bool] = reraise_exception, + select_files: Optional[Callable[[str], bool]] = None, + rename_files: Optional[Callable[[str], str]] = None, + s3_client: Optional[Dict[str, boto3.client]] = None, # type: ignore + s3_bucket_name: Optional[Dict[str, str]] = None, +) -> Iterator[Dict[str, Any]]: + """Expand tar files. + + Args: + data (Iterable[Iterable[Dict[str, Any]]]): iterator over opened tar file streams. 
+ handler (Callable[[Exception], bool]): exception handler. + select_files (Optional[Callable[[str], bool]]): select files from tarfiles by name (permits skipping files). + rename_files (Optional[Callable[[str], bool]]): Renaming tar files. + + Optional args if reading sample_keys_full_list: + s3_clients (Dict[str, boto3.client]): If loading from object store, specify S3 client. Keys is the dset_id, i.e. dataset id since different dataset could use different s3 client and bucket + s3_bucket_name (Dict[str, str]): If loading from object store, specify S3 bucket name. + + Yields: + a stream of samples. + """ + for source in data: + url = source["url"] + try: + assert isinstance(source, dict) + assert "stream" in source + tar_file_iterator_list = [] + for stream_id in range(len(source["stream"])): + tar_file_iterator_list.append( + tar_file_iterator( + source["stream"][stream_id], + handler=handler, + select_files=select_files, + rename_files=rename_files, + ) + ) + if url.sample_keys_full_list is None: # Original behavior + # tar_file_iterator_list is a list of iterator: [tar_file_iterator_0, tar_file_iterator_1, ... tar_file_iterator_N] + for sample in zip(*tar_file_iterator_list): + # Merging data from all streams + # sample is list of dictionaries, each dictionary contains data and fname + # sample [tar_file_iterator_0[0], tar_file_iterator_1[0], ... tar_file_iterator_N[0]], length = num_of_data_key + for key_idx, sample_key in enumerate(sample): + sample_key = process_sample(sample_key, url, key_idx) + yield sample_key + else: + # Read the index file from pbss + s3_client_cur = s3_client[url.dset_id] + bucket_cur = s3_bucket_name[url.dset_id] + sample_keys_full_list = read_sample_keys_full_list( + url.sample_keys_full_list, s3_client_cur, bucket_cur + ) # e.g. ["has_material_glb_from_obj_v4_1410095_0", "has_material_glb_from_obj_v4_1410095_1", ...] 
+ sample_keys_full_to_index = {element: index for index, element in enumerate(sample_keys_full_list)} + + # Start reading the tar files + target_index = 0 + last_index = [-1] * len(tar_file_iterator_list) # Keep track of the last index of each tar file + sample_list = [] # List of samples from each tar file + while True: # Exit until target_index reach the max value + skip_offset = False + for key_idx, iterator in enumerate(tar_file_iterator_list): + if last_index[key_idx] >= target_index: + # This tar is moving faster than others, skip it and wait for others + continue + + # Read the tar file until current_index >= target_index + sample, current_index = run_iterator_to_index( + iterator, + target_index, + sample_keys_full_to_index, + name=f"{url.sample_keys_full_list}.{url.keys[key_idx]}", + ) + if sample is None: # Iterator {key_idx} already reached the end, exit the for loop + if target_index < len(sample_keys_full_to_index): # Missing keys + missing_info = f"index_path={url.sample_keys_full_list} | id={target_index}, sample_key={sample_keys_full_list[target_index]};" + log.info( + f"[missing keys] found in tar file: data_key={url.keys[key_idx]} | {missing_info}", + rank0_only=False, + ) + sample_list = [] # Reset the sample_list + break + + # Update the last_index + last_index[key_idx] = current_index + + # Process sample dict + sample = process_sample(sample, url=url, key_idx=key_idx) + + # Now check if the current index is matched or ahead + if current_index == target_index: # Nice! 
+ sample_list.append(sample) + elif current_index > target_index: + # This means there are missing keys in this tar, so this tar is moving faster than the others + + # Log the missing info + missing_info = f"index_path={url.sample_keys_full_list} | " + for missing_idx in range(target_index, current_index): + missing_info += f" id={missing_idx}, sample_key={sample_keys_full_list[missing_idx]}; " + log.info( + f"[missing keys] found in tar file: data_key={url.keys[key_idx]} | {missing_info}", + rank0_only=False, + ) + + # Update the target_index to current_index, skipping the indices between the old target_index and current_index + target_index = current_index + + # Reset sample_list, save the sample from this tar into sample_list and wait for others + sample_list = [ + sample + ] # Attention: this will change the order of sample_list, we will put them in the right order later + skip_offset = True # Skip the offset of target_index, since we are waiting for others + break + elif current_index < target_index: + # This should not happen + raise ValueError( + "Invalid output from run_iterator_to_index function. 
current_index should be equal to or greater than target_index" + ) + + # Decide where to yield the samples + if len(sample_list) == len(tar_file_iterator_list): + # Only yield the samples if all the tars are present + all_prefix = [sample["fname"].split(".")[0] for sample in sample_list] + # Check that all the prefixes are the same + assert all(prefix == all_prefix[0] for prefix in all_prefix), ( + f"prefixes are not the same: {all_prefix}" + ) + # Correct the order of sample_list + sample_list = correct_order(sample_list, url.keys) + # Yield all the samples + for sample in sample_list: + assert isinstance(sample, dict) and "data" in sample and "fname" in sample + yield sample + sample_list = [] # Reset the sample_list + elif len(sample_list) > 1: + # Unexpected + raise ValueError(f"Unexpected length of sample_list: {len(sample_list)}") + elif len(sample_list) == 0 or len(sample_list) == 1: + # If the sample_list is empty, it means the tar file is exhausted + # If the sample_list has only one element, it means one tar file is moving faster than others + pass # Do nothing + + if not skip_offset: + # If sample_list has one element, we stay at current target_index until others catch up + target_index += 1 # Increase it by 1 + if target_index == len(sample_keys_full_to_index): + break # Reach the maximum index + # Make sure all the iterators are closed + for iterators in tar_file_iterator_list: + try: + next(iterators) + except StopIteration: + pass + + except Exception as exn: + log.info(f"Got an exception while expanding tars - {exn}", rank0_only=False) + exn.args = exn.args + (source.get("stream"), source.get("url")) + if handler(exn): + continue + else: + break + + +def correct_order(sample_list: list[Dict], expected_keys_order: list[str]) -> list[Dict]: + """Make sure the order of samples is the same as the url.keys order.""" + data_keys_per_sample = [sample["fname"].split(".")[1] for sample in sample_list] + expected_keys_order = [key.replace("/", "_") for key in 
expected_keys_order] + if data_keys_per_sample == expected_keys_order: # Correct order + return sample_list + # Order the sample_list based on the expected_keys_order + sample_list_ordered = [None] * len(expected_keys_order) + for data_key, sample in zip(data_keys_per_sample, sample_list): + idx = expected_keys_order.index(data_key) + sample_list_ordered[idx] = sample + return sample_list_ordered + + +def load_func_parquet(buffer): + data_list = pd.read_parquet(buffer).values.tolist() + names = [data[0] for data in data_list] + return names + + +def _read_sample_keys_full_list(key, s3_client: boto3.client, s3_bucket_name: str): + with io.BytesIO() as buffer: + s3_client.download_fileobj(Bucket=s3_bucket_name, Key=key, Fileobj=buffer) + buffer.seek(0) + sample_keys_full_list = load_func_parquet(buffer) + sample_keys_full_list = [key.split(".")[0] for key in sample_keys_full_list] + return sample_keys_full_list + + +def read_sample_keys_full_list(key: str, s3_client: boto3.client, s3_bucket_name: str, max_attempts=10): + for attempt in range(max_attempts): + try: + return _read_sample_keys_full_list(key, s3_client, s3_bucket_name) + except botocore.exceptions.ClientError as e: + retry_interval = min( + 0.1 * 2**attempt + random.uniform(0, 1), 30 + ) # sleep workers randomly to avoid burst of requests + log.exception( + f"Failed to read sample_keys_full_list {key}, attempt {attempt}. {e}. Retrying after {retry_interval}s." 
+ ) + if attempt < max_attempts - 1: + time.sleep(retry_interval) + raise ConnectionError(f"Unable to read sample_keys_full_list {key} after {max_attempts} attempts.") + + +def run_iterator_to_index(iterator, target_index: int, sample_keys_full_to_index: dict, name: str = ""): + """ + Iterates over samples from an iterator, comparing the index of the current sample (current_index) + against target_index, until + 1) the sample key corresponds to the target index, + or 2) the target index has been passed (i.e., the target keys are missing), + or 3) the iterator is exhausted. + + This function is designed to handle cases where there are unexpected, duplicated, or missing + sample keys based on the index mapping provided. + + Args: + iterator (iterator): An iterator yielding dictionaries that must include a key 'fname', + which contains the filename. The filename should be in the format 'prefix.suffix', + where 'prefix' will be used as the sample key. + target_index (int): The index of the sample to be retrieved according to the dictionary + mapping sample keys to indices. + sample_keys_full_to_index (dict): A dictionary mapping sample keys (extracted from the + 'fname' prefix of the iterator's samples) to their respective indices. This mapping + dictates the order in which samples are considered valid and should be found. + e.g. {"name_0": 0, "name_1": 1, "name_2": 2} + name (str): Name of the tar file, used to log the progress. + + Returns: + tuple: A tuple containing: + - sample (dict or None): The first sample whose index is equal to or greater than the + target index, or None if the iterator is exhausted before such a sample is found. + - current_index (int or None): The index of the returned sample according to the mapping, + or None if no sample is found. + + Note: + StopIteration from an exhausted iterator is caught internally and reported by returning + (None, None); it is never propagated to the caller. 
+ """ + sample, current_index = None, None + skip_count = 0 + while True: + try: + sample = next(iterator) + prefix, suffix = sample["fname"].split(".") + sample_key = prefix + + if sample_key not in sample_keys_full_to_index: # extra sample_key + log.info( + f"Skipping ({skip_count}) unexpected key {sample_key}; not found in the sample_keys_full_to_index {name} {sample_keys_full_to_index.keys()}" + ) + skip_count += 1 + continue + current_index = sample_keys_full_to_index[sample_key] # can be <,=,> target_index + if current_index < target_index: + # Note: current_index < target_index happens with duplicated keys or while this tar is still catching up + # e.g. [name_0, name_0, name_1] with target index = 1 + # Pointer at ^ + # Current index is 0, which is less than target index 1 + # In this case, we keep iterating + # log.info(f"[Skip] key {sample_key}; current_index={current_index} < target_index={target_index} {name}") + continue + elif current_index >= target_index: + # Note: current_index > target_index happens when there are missing keys + # e.g. [name_0, name_2, name_3] with target index 1 + # Pointer at ^ + # Current index is 2, which is greater than target index 1 + # In this case, we return the current_index, set the target_index to 2 and tell other tars to catch up. + # if current_index == target_index: # Matched! + # log.info(f"[Pass!] current_index={current_index} == target_index={target_index}") + # else: # Missing keys + # log.info(f"[Missing key detected!] 
current_index={current_index} > target_index={target_index} {name}") + break + + except StopIteration: + sample = None + current_index = None + break + return sample, current_index + + +def tarfile_samples( + src: Iterable, + handler: Callable = reraise_exception, + load_from_object_store: bool = False, + s3_client: Dict[str, boto3.client] = None, # type: ignore + s3_bucket_name: Optional[Dict[str, str]] = None, + streaming_download: bool = True, +) -> Iterator[Dict]: + r""" + Given an iterator of filenames, this function opens the URL streams + and groups data by keys. + + Args: + src (Iterable): Iterator of TarSample. + handler (Callable): Exception handler. + load_from_object_store (bool): A boolean flag to specify whether to load from + object store. + s3_client (boto3.client): If loading from object store, specify S3 client. + s3_bucket_name (str): If loading from object store, specify S3 bucket name. + streaming_download(bool): If enabled, performs streaming download. + """ + streams = url_opener( + src, + handler=handler, + object_store=load_from_object_store, + s3_client=s3_client, + s3_bucket_name=s3_bucket_name, + streaming_download=streaming_download, + ) + files = tar_file_expander(streams, handler=handler, s3_client=s3_client, s3_bucket_name=s3_bucket_name) + samples = group_by_keys(files, handler=handler) + return samples + + +tarfile_to_samples = filters.pipelinefilter(tarfile_samples) + + +class WebDataset(DataPipeline, FluidInterface): + r"""Webdataset class modified to support loading from object store.""" + + def __init__( + self, + urls: list[TarSample], + handler: Callable = reraise_exception, + resampled: bool = False, + shardshuffle: Optional[bool] = None, + cache_size: int = -1, + cache_dir: Optional[str] = None, + detshuffle: bool = False, + nodesplitter: Callable = shardlists.single_node_only, + verbose: bool = False, + load_from_object_store: bool = False, + s3_client: Dict[str, boto3.client] = None, # type: ignore + s3_bucket_name: 
Optional[Dict[str, str]] = None, + streaming_download: bool = True, + ): + r""" + Args: + urls (list[TarSample]): An iterator containing a list of url names. + handler (Callable): Exception handler. + resampled (bool): If true, sample shards from shard list with replacement. + shardshuffle (bool): If true, shuffles the entire shard list. + cache_size (int): Size of cache. + cache_dir (str): Path to store cache. + detshuffle (bool): Whether to use deterministic shuffling when shardshuffle is True. + nodesplitter (Callable): Function for splitting urls among nodes. + verbose (bool): If True, prints logs. + load_from_object_store (bool): A boolean flag to specify whether to load from + object store. + s3_client (boto3.client): If loading from object store, specify S3 client. + s3_bucket_name (str): If loading from object store, specify S3 bucket name. + streaming_download (bool): Whether to do streaming download or full object download. + """ + super().__init__() + if isinstance(urls, IterableDataset): + assert not resampled + self.append(urls) + elif isinstance(urls, str) and (urls.endswith(".yaml") or urls.endswith(".yml")): + with open(urls) as stream: + spec = yaml.safe_load(stream) + assert "datasets" in spec + self.append(shardlists.MultiShardSample(spec)) + elif isinstance(urls, dict): + assert "datasets" in urls + self.append(shardlists.MultiShardSample(urls)) + elif resampled: + self.append(shardlists.ResampledShards(urls)) + else: + self.append(shardlists.SimpleShardList(urls)) + self.append(nodesplitter) + self.append(shardlists.split_by_worker) + if shardshuffle is True: + shardshuffle = 100 # type: ignore + if shardshuffle is not None: + if detshuffle: + self.append(filters.detshuffle(shardshuffle)) + else: + self.append(filters.shuffle(shardshuffle)) + if cache_dir is None or cache_size == 0: + self.append( + tarfile_to_samples( + handler=handler, + load_from_object_store=load_from_object_store, + s3_client=s3_client, + s3_bucket_name=s3_bucket_name, + 
streaming_download=streaming_download, + ) + ) + else: + # Cache the shards on disk (cache_dir is set and cache_size is nonzero). + assert cache_size == -1 or cache_size > 0 + self.append( + cache.cached_tarfile_to_samples( + handler=handler, + verbose=verbose, + cache_size=cache_size, + cache_dir=cache_dir, + ) + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/misc.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/misc.py new file mode 100644 index 00000000..4ab1592b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/misc.py @@ -0,0 +1,101 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import os +from typing import Iterator + +import attrs + +from cosmos3._src.imaginaire.datasets.webdataset.config.schema import SampleInfo + + +def repeat_list(x: list, n: int) -> list: + r"""Function to repeat the list until it reaches a fixed length. + n is the desired length of the extended list. 
+ Args: + x (list): Input list + n (int): Desired length + Returns: + Extended list + """ + if n == 0: + return [] + assert len(x) > 0 + + x_extended = [] + while len(x_extended) < n: + x_extended = x_extended + x + x_extended = x_extended[0:n] + + return x_extended + + +def remove_extensions_from_keys(data: Iterator[dict]) -> Iterator[dict]: + r"""Function to remove the last extension from each key + Args: + data (Iterator[dict]): Iterator of input data dicts + Returns: + data dicts whose keys have their last extension removed + """ + + for data_dict in data: + data_dict_remapped = dict() + + for key in data_dict: + key_split = key.split(".") + if len(key_split) > 1: + key_new = ".".join(key_split[:-1]) + else: + key_new = key + data_dict_remapped[key_new] = data_dict[key] + + yield data_dict_remapped + + +def update_url(data: Iterator[dict]) -> Iterator[dict]: + r"""Function to update the URLs so that the TarSample is removed from data. + Instead, we replace the URL with a string. 
+ Args: + data (Iterator[dict]): Iterator of input data dicts + Returns: + data dicts with the URL replaced by a string + """ + for data_dict in data: + # unpack meta information from TarSample + url = data_dict["__url__"] + sample_meta = url.sample_meta + if sample_meta is not None: + assert isinstance(sample_meta, SampleInfo) + data_dict.update(attrs.asdict(sample_meta)) + + # unpack url + data_dict["__url__"] = os.path.join(url.root, url.path) + yield data_dict + + +def skip_keys(data: Iterator[dict]) -> Iterator[dict]: + r""" + Function to skip samples flagged with keys_to_skip + Args: + data (Iterator[dict]): Iterator of input data dicts + Returns: + data dicts, excluding those flagged with keys_to_skip + """ + + for data_dict in data: + if ("keys_to_skip" in data_dict) and (int(data_dict["keys_to_skip"]) == 1): + # Skip this sample if data_dict["keys_to_skip"] is 1 + continue + else: + yield data_dict diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/stream.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/stream.py new file mode 100644 index 00000000..b5cfa946 --- /dev/null +++ 
b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/stream.py @@ -0,0 +1,1042 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +# PBSS +import atexit +import json +import os +import sys +import threading +import time +import weakref +from dataclasses import dataclass, field +from http.client import IncompleteRead +from pathlib import Path +from typing import Optional + +import boto3 +from botocore.exceptions import ( + ClientError, + ConnectionClosedError, + EndpointConnectionError, + ResponseStreamingError, +) +from botocore.exceptions import ( + ReadTimeoutError as BotocoreReadTimeoutError, +) +from urllib3.exceptions import ProtocolError as URLLib3ProtocolError +from urllib3.exceptions import ReadTimeoutError as URLLib3ReadTimeoutError +from urllib3.exceptions import SSLError as URLLib3SSLError + +from cosmos3._src.imaginaire.utils import log + +# Public API - only these should be imported from this module +__all__ = [ + "RetryingStream", # Main class for S3 streaming with retries + "ENABLE_RETRY_STATS", # Flag to enable/disable statistics (used in tests/benchmarks) + "RETRY_STATS_LOG_INTERVAL", # Interval in seconds between periodic statistics logs + "ENABLE_THROUGHPUT_STATS", # Flag to enable/disable throughput statistics + "THROUGHPUT_STATS_LOG_INTERVAL", # Interval between periodic throughput statistics 
logs + "ENABLE_STREAM_WANDB", # Flag to enable/disable IPC file writes for wandb metrics + "WATCHDOG_ENABLED", # Enable/disable watchdog reconnects + "WATCHDOG_MIN_THROUGHPUT_MBPS", # Minimum throughput (MB/s) before watchdog reconnects + "RETRYABLE_EXCEPTIONS", # Tuple of exceptions that trigger retries + "collect_throughput_ipc_stats", # Main-process reader for worker throughput IPC files, for wandb logging +] + +# Flag to enable/disable statistics gathering (for performance testing) +# Set to False to disable all statistics overhead for maximum performance. +# When disabled, no thread-local tracking occurs and no logs are generated. +# +# Usage for benchmarking: +# import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +# stream_module.ENABLE_RETRY_STATS = False # Disable stats +# # ... run benchmark ... +# stream_module.ENABLE_RETRY_STATS = True # Re-enable +ENABLE_RETRY_STATS = False + +# Interval in seconds between periodic retry statistics logs +# Default is 300 seconds (5 minutes). Set to a lower value for more frequent logging +# or a higher value to reduce log verbosity. +# +# Usage: +# import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +# stream_module.RETRY_STATS_LOG_INTERVAL = 600 # Log every 10 minutes +RETRY_STATS_LOG_INTERVAL = 300.0 # 5 minutes + +# Flag to enable/disable throughput log messages (verbose per-worker logs). +# Does NOT affect IPC file writes (controlled by ENABLE_STREAM_WANDB). +# +# Usage: +# import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +# stream_module.ENABLE_THROUGHPUT_STATS = False +ENABLE_THROUGHPUT_STATS = False + +# Interval in seconds between periodic throughput statistics logs. +# Independent from retry stats log interval. 
+# +# Usage: +# import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +# stream_module.THROUGHPUT_STATS_LOG_INTERVAL = 600 # Log every 10 minutes +THROUGHPUT_STATS_LOG_INTERVAL = 300.0 # 5 minutes + +# Enable/disable IPC file writes for cross-worker metrics aggregation (wandb). +# This controls whether workers write cumulative stats to /tmp/throughput_stats/ +# for the main process to collect and log to wandb. Independent from verbose +# log messages (ENABLE_THROUGHPUT_STATS). +# +# Env var: export ENABLE_STREAM_WANDB=0 (to disable; default enabled) +ENABLE_STREAM_WANDB = os.environ.get("ENABLE_STREAM_WANDB", "1") != "0" + + +# Enable/disable the throughput watchdog (reconnects on sustained low throughput). +# Default: True (enabled) +# +# Env var: export RETRYING_STREAM_WATCHDOG=0 +# Python: import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +# stream_module.WATCHDOG_ENABLED = False +WATCHDOG_ENABLED = os.environ.get("RETRYING_STREAM_WATCHDOG", "1") != "0" + +# Minimum throughput (MB/s) before the watchdog triggers a reconnect. +# Default: 50.0 MB/s +# +# Env var: export RETRYING_STREAM_WATCHDOG_MIN_MBPS=50.0 +# Python: import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +# stream_module.WATCHDOG_MIN_THROUGHPUT_MBPS = 50.0 +WATCHDOG_MIN_THROUGHPUT_MBPS = float(os.environ.get("RETRYING_STREAM_WATCHDOG_MIN_MBPS", "50.0")) + + +@dataclass +class GlobalRetryStatistics: + """Per-process statistics aggregator for S3 retry operations. + + Aggregates statistics across all threads within this process (e.g., a DataLoader worker). + Each process maintains its own independent statistics - no cross-process communication. 
+ In distributed training with DataLoader workers: + - Each rank's main process has its own _global_retry_stats instance + - Each DataLoader worker process (spawned via multiprocessing) has its own instance + - Statistics are isolated per-process, ensuring accurate tracking without interference + + Uses WeakSet to track active instances - automatically handles cleanup + even if threads die or exceptions occur during construction. + + Tracks both per-thread and cumulative statistics: + - registered_threads: Per-thread counters (including PID and thread ID) for detailed breakdown + - cumulative_*: Process-local cumulative counters (never reset, for atexit log) + + Statistics terminology: + - operations_started: Number of S3 operations initiated (read/get_length/get_stream calls) + - failed_operations: Operations that failed at least once and required retry + - total_attempts: Sum of all attempts (initial + retries) + + Note: operations_started counts how many operations we started (each gets 1 initial attempt). + total_attempts >= operations_started because failed operations retry multiple times. 
+ + Thread safety: + - Per-thread counters are lock-free (threading.local() ensures isolation) + - Cumulative counters use a lock because += is not atomic in Python + - Lock overhead is negligible (< 0.1% from benchmarks) + - Single-threaded case: Lock acquisition is uncontended (instant, no blocking) + - Multi-threaded case: Lock contention is minimal (only during retries, which are rare) + """ + + registered_threads: dict[int, dict[str, int]] = field(default_factory=dict) + lock: threading.Lock = field(default_factory=threading.Lock) # Protects cumulative counters + last_log_time: float = field(default_factory=time.time) + active_instances: weakref.WeakSet = field(default_factory=weakref.WeakSet) # Tracks active RetryingStream instances + rank: int | None = None # Lazily initialized rank ID (None = not yet initialized) + pid: int | None = None # Lazily initialized process ID (None = not yet initialized, cached to avoid OS calls) + registered_pids: set[int] = field( + default_factory=set + ) # PIDs that have registered atexit handlers (for multiprocessing support) + + # Cumulative counters (never reset, for atexit final log) + # These require the lock because += is not atomic (3 bytecode operations: LOAD, ADD, STORE) + # Without the lock, concurrent increments can cause lost updates + cumulative_operations_started: int = 0 # Number of operations initiated + cumulative_failed_operations: int = 0 # Operations that failed and required retry + cumulative_attempts: int = 0 # Sum of all attempts (initial + retries) + + def get_rank(self) -> int: + """Get rank with lazy initialization. + + Rank is captured on first access, not at module import time. + This ensures torch.distributed is initialized before we try to read it. + + Falls back to RANK environment variable if torch.distributed is not initialized. + This handles DataLoader worker processes which inherit RANK from parent but + don't have torch.distributed initialized. 
+ + Returns: + The rank ID (0 if distributed not available/initialized and no RANK env var) + """ + if self.rank is None: + try: + import torch.distributed as dist + + if dist.is_available() and dist.is_initialized(): + self.rank = dist.get_rank() + else: + # Fallback to RANK environment variable (for DataLoader workers) + self.rank = int(os.environ.get("RANK", "0")) + except Exception: + self.rank = 0 # Fallback if distributed not available + return self.rank + + def get_pid(self) -> int: + """Get process ID with lazy caching (avoids repeated OS calls). + + Returns: + The current process ID (cached after first call) + """ + if self.pid is None: + self.pid = os.getpid() + return self.pid + + +# Conditionally create per-process statistics objects only if stats are enabled +# Each process maintains independent statistics (no cross-process communication) +if ENABLE_RETRY_STATS: + _global_retry_stats = GlobalRetryStatistics() + _thread_local_stats = threading.local() + + # Rank is lazily initialized on first access via get_rank() + # This ensures torch.distributed is initialized before we try to read it + + # Note: atexit handler registration is now done lazily per-process in _get_thread_stats() + # This ensures each DataLoader worker process (spawned via multiprocessing) automatically + # registers its own atexit handler, making multiprocessing support transparent +else: + _global_retry_stats = None # type: ignore + _thread_local_stats = None # type: ignore + + +@dataclass +class GlobalThroughputStatistics: + """Per-process statistics aggregator for shard throughput and watchdog events. + + Same aggregation pattern as GlobalRetryStatistics: each DataLoader worker process + maintains its own independent instance. No cross-process communication. + + Periodic logs show interval values (delta since last log). The atexit final + log shows lifetime cumulative totals. IPC files carry cumulative snapshots + (deltas computed in collect_throughput_ipc_stats). 
+ + Thread safety: Same lock pattern as GlobalRetryStatistics + """ + + lock: threading.Lock = field(default_factory=threading.Lock) + last_log_time: float = field(default_factory=time.time) + rank: int | None = None + pid: int | None = None + registered_pids: set[int] = field(default_factory=set) + + cumulative_bytes_read: int = 0 # log.warning + wandb (via IPC) + cumulative_total_read_time: float = 0.0 # log.warning + wandb (via IPC) + cumulative_watchdog_reconnects: int = 0 # log.warning + wandb (via IPC) + + _prev_bytes: int = 0 + _prev_read_time: float = 0.0 + _prev_watchdog: int = 0 + + def get_rank(self) -> int: + """Get rank with lazy initialization (same logic as GlobalRetryStatistics.get_rank).""" + if self.rank is None: + try: + import torch.distributed as dist + + if dist.is_available() and dist.is_initialized(): + self.rank = dist.get_rank() + else: + self.rank = int(os.environ.get("RANK", "0")) + except Exception: + self.rank = 0 + return self.rank + + def get_pid(self) -> int: + """Get process ID with lazy caching (avoids repeated OS calls).""" + if self.pid is None: + self.pid = os.getpid() + return self.pid + + +if ENABLE_THROUGHPUT_STATS or ENABLE_STREAM_WANDB: + _global_throughput_stats = GlobalThroughputStatistics() +else: + _global_throughput_stats = None # type: ignore + +# Exceptions that should trigger retries for S3 streaming operations +RETRYABLE_EXCEPTIONS = ( + URLLib3ReadTimeoutError, + URLLib3ProtocolError, + URLLib3SSLError, + IncompleteRead, + IOError, + ResponseStreamingError, + ConnectionClosedError, + BotocoreReadTimeoutError, +) + + +def _get_thread_stats() -> dict[str, int]: + """Get or initialize thread-local statistics (lock-free). + + Lazily registers atexit handler on first call per process, making multiprocessing + support transparent (each DataLoader worker process automatically gets its own handler). 
+ + Performance optimizations: + - PID cached in GlobalRetryStatistics.pid (avoids repeated os.getpid() syscalls) + - Thread ID cached in thread-local counters (constant per thread lifetime) + - Rank cached in GlobalRetryStatistics.rank (avoids repeated torch.distributed calls) + + Returns: + Dictionary with thread-local counters for operations and retries. + Returns empty dict if statistics are disabled. + """ + # No-op if statistics are disabled (for performance) + if not ENABLE_RETRY_STATS or _thread_local_stats is None: + return {} # Return empty dict as no-op + + if not hasattr(_thread_local_stats, "counters"): + # Cache PID and thread ID (constant for the lifetime of this thread) + pid = _global_retry_stats.get_pid() # Cached to avoid repeated os.getpid() syscalls + thread_id = threading.get_ident() # Already fast, but cached for consistency + + counters = { + "pid": pid, # Process ID (distinguishes DataLoader workers) + "thread_id": thread_id, # Thread ID within this process + "operations_started": 0, # Number of S3 operations initiated (read/get_length/get_stream) + "failed_operations": 0, # Operations that failed at least once and required retry + "total_attempts": 0, # Sum of all attempts (initial + retries) + } + _thread_local_stats.counters = counters + + # Register this thread's stats for aggregation (only once per thread) + with _global_retry_stats.lock: + _global_retry_stats.registered_threads[thread_id] = counters + + # Lazily register atexit handler once per process (not per thread) + # This provides best-effort final statistics logging when processes exit normally. + # Note: atexit is unreliable in multiprocessing.Process (known Python limitation), + # so tests/critical paths should explicitly call _log_retry_stats_internal(force=True). 
+ if pid not in _global_retry_stats.registered_pids: + _global_retry_stats.registered_pids.add(pid) + + # Register atexit handler with proper error handling + def _atexit_handler(): + try: + if _global_retry_stats: + _log_retry_stats_internal(force=True) + # Flush output to ensure atexit logs are captured + try: + sys.stdout.flush() + sys.stderr.flush() + except Exception: + pass + except Exception as e: + # Fallback: try to print error if logging infrastructure is torn down + try: + print(f"[PID {os.getpid()}] atexit handler error: {e}", flush=True) + except Exception: + pass # Silently fail if stdout is closed + + atexit.register(_atexit_handler) + + return _thread_local_stats.counters + + +def _log_retry_stats_internal(force: bool = False) -> None: + """Internal function to log retry statistics with per-thread breakdown and process-local totals. + + Statistics are aggregated across all threads within this process only. + Each process logs independently - no cross-process communication for zero overhead. + + Args: + force: If True, log cumulative lifetime stats (for atexit). + If False, log periodic snapshot of current stats (counters keep accumulating). 
+ """ + # No-op if statistics are disabled (for performance) + if not ENABLE_RETRY_STATS or _global_retry_stats is None: + return + + current_time = time.time() + + # Quick check without lock (small race condition is acceptable here) + if not force and current_time - _global_retry_stats.last_log_time < RETRY_STATS_LOG_INTERVAL: + return + + # Now acquire lock to read stats + with _global_retry_stats.lock: + # Double-check pattern for periodic logs (skip if time hasn't elapsed) + if not force and current_time - _global_retry_stats.last_log_time < RETRY_STATS_LOG_INTERVAL: + return + + # Get cumulative stats (for final log) or aggregate per-thread stats (for periodic) + if force: + # Final log: use cumulative counters (guaranteed monotonic) + total_ops = _global_retry_stats.cumulative_operations_started + failed_ops = _global_retry_stats.cumulative_failed_operations + total_attempts = _global_retry_stats.cumulative_attempts + per_thread_stats = None # Not needed for final log + else: + # Periodic log: aggregate per-thread stats (snapshot, not cumulative) + # Note: We track per-thread stats internally for correctness (handles rare multi-threaded + # cases and ensures accurate aggregation), but only log the per-process cumulative totals. + # In typical usage, each DataLoader worker process has a single thread doing I/O. 
+ per_thread_stats = {} + total_ops = 0 # Total operations started across all threads + failed_ops = 0 # Failed operations across all threads + total_attempts = 0 # Total attempts across all threads + + for thread_id, thread_stats in _global_retry_stats.registered_threads.items(): + pid = thread_stats["pid"] # Process ID (identifies DataLoader worker) + ops = thread_stats["operations_started"] # S3 operations started in this thread + failed = thread_stats["failed_operations"] # Operations that failed in this thread + attempts = thread_stats["total_attempts"] # All attempts (initial + retries) in this thread + + per_thread_stats[thread_id] = { + "pid": pid, + "thread_id": thread_id, + "operations_started": ops, + "failed_operations": failed, + "total_attempts": attempts, + } + + # Aggregate across all threads + total_ops += ops + failed_ops += failed + total_attempts += attempts + + if total_ops > 0: + failure_percentage = (failed_ops / total_ops) * 100 + avg_attempts_per_op = total_attempts / total_ops + + prefix = "[RetryingStream Stats - Final]" if force else "[RetryingStream Stats]" + # Include rank and PID in message (lazily cached to avoid repeated OS calls) + rank = _global_retry_stats.get_rank() + pid = _global_retry_stats.get_pid() + message = ( + f"{prefix} [Rank {rank}] [PID {pid}] PROCESS-LOCAL: {total_ops} total operations, " + f"{failed_ops} failed operations ({failure_percentage:.1f}%), " + f"avg {avg_attempts_per_op:.2f} attempts/operation" + ) + + # Always use logging infrastructure (with fallback for atexit edge cases) + try: + # Only log the cumulative per-process summary + # (Per-thread stats are still tracked internally for accuracy, just not printed) + log.warning(message, rank0_only=False) + except Exception: + # Fallback to print if logging is torn down (rare edge case during atexit) + try: + print(f"WARNING: {message}", flush=True) + except Exception: + pass # Silently fail if stdout is also closed (multiprocessing edge case) + + # Update 
last log time (only for periodic logs, not final) + if not force: + _global_retry_stats.last_log_time = current_time + + +def _maybe_log_retry_stats() -> None: + """Log process-local retry statistics if RETRY_STATS_LOG_INTERVAL seconds have elapsed since last log. + + Each process logs independently - no cross-process communication. + The log interval is configurable via the RETRY_STATS_LOG_INTERVAL module variable (default: 300 seconds). + """ + if not ENABLE_RETRY_STATS: + return + _log_retry_stats_internal(force=False) + + +# Throughput statistics helpers (mirrors the retry statistics helpers above) + + +def _register_throughput_atexit() -> None: + """Lazily register atexit handler once per process for final throughput log and IPC flush.""" + if _global_throughput_stats is None: + return + + pid = _global_throughput_stats.get_pid() + with _global_throughput_stats.lock: + if pid not in _global_throughput_stats.registered_pids: + _global_throughput_stats.registered_pids.add(pid) + + def _atexit_handler() -> None: + try: + if _global_throughput_stats: + _log_throughput_stats_internal(force=True) + try: + sys.stdout.flush() + sys.stderr.flush() + except Exception: + pass + except Exception as e: + try: + print(f"[PID {os.getpid()}] throughput atexit error: {e}", flush=True) + except Exception: + pass + + atexit.register(_atexit_handler) + + +def _log_throughput_stats_internal(force: bool = False) -> None: + """Log throughput statistics with process-local totals and write IPC files. 
+ + - force=False: periodic log (if ENABLE_THROUGHPUT_STATS) + IPC write (if ENABLE_STREAM_WANDB) + - force=True: cumulative lifetime stats for atexit + final IPC write + """ + if _global_throughput_stats is None: + return + + current_time = time.time() + + if not force and current_time - _global_throughput_stats.last_log_time < THROUGHPUT_STATS_LOG_INTERVAL: + return + + with _global_throughput_stats.lock: + if not force and current_time - _global_throughput_stats.last_log_time < THROUGHPUT_STATS_LOG_INTERVAL: + return + + s = _global_throughput_stats + if force: + bytes_read = s.cumulative_bytes_read + read_time = s.cumulative_total_read_time + watchdog = s.cumulative_watchdog_reconnects + else: + bytes_read = s.cumulative_bytes_read - s._prev_bytes + read_time = s.cumulative_total_read_time - s._prev_read_time + watchdog = s.cumulative_watchdog_reconnects - s._prev_watchdog + + s._prev_bytes = s.cumulative_bytes_read + s._prev_read_time = s.cumulative_total_read_time + s._prev_watchdog = s.cumulative_watchdog_reconnects + + if bytes_read > 0: + if ENABLE_THROUGHPUT_STATS: + mb_read = bytes_read / (1024**2) + avg_mbps = mb_read / read_time if read_time > 0 else 0 + + prefix = "[Throughput Stats - Final]" if force else "[Throughput Stats]" + rank = _global_throughput_stats.get_rank() + pid = _global_throughput_stats.get_pid() + watchdog_part = f", {watchdog} watchdog reconnects" if WATCHDOG_ENABLED else "" + message = ( + f"{prefix} [Rank {rank}] [PID {pid}] PROCESS-LOCAL: " + f"{mb_read:.2f}MB in {read_time:.3f}s " + f"({avg_mbps:.1f}MB/s avg){watchdog_part}" + ) + + try: + log.warning(message, rank0_only=False) + except Exception: + try: + print(f"WARNING: {message}", flush=True) + except Exception: + pass + + if ENABLE_STREAM_WANDB: + _write_throughput_ipc() + + if not force: + _global_throughput_stats.last_log_time = current_time + + +def _maybe_log_throughput_stats() -> None: + """Log process-local throughput statistics and/or write IPC files if interval has 
elapsed.""" + if not ENABLE_THROUGHPUT_STATS and not ENABLE_STREAM_WANDB: + return + _log_throughput_stats_internal(force=False) + + +def _write_throughput_ipc() -> None: + """Write cumulative throughput snapshot to a per-worker IPC file (for wandb logging).""" + if _global_throughput_stats is None: + return + try: + s = _global_throughput_stats + rank = s.get_rank() + ipc_dir = Path(f"/tmp/throughput_stats/rank_{rank}") + ipc_dir.mkdir(parents=True, exist_ok=True) + filepath = ipc_dir / f"worker_{os.getpid()}.json" + tmp = filepath.with_suffix(".tmp") + with open(tmp, "w") as f: + json.dump( + { + "bytes": s.cumulative_bytes_read, + "read_time": s.cumulative_total_read_time, + "watchdog": s.cumulative_watchdog_reconnects, + "ts": time.time(), + }, + f, + ) + tmp.rename(filepath) + except Exception: + pass + + +def _is_pid_alive(pid: int) -> bool: + """Check if a process with the given PID is still running (zero-cost signal 0).""" + try: + os.kill(pid, 0) + return True + except OSError: + return False + + +_ipc_prev_per_file: dict[str, dict[str, float]] = {} + + +def collect_throughput_ipc_stats(rank: int | None = None) -> dict[str, float]: + """Read per-worker IPC files and return accurate interval deltas for wandb. + + Deltas are tracked per file so that workers appearing (spawn) or + disappearing (death/respawn with persistent_workers=False) never corrupt + other workers' accounting. + + Dead-worker files are read first (to capture their final cumulative delta), + then deleted. Workers do NOT delete their own files on exit — they only + write a final flush via atexit. This avoids losing the last interval's data. + + Returns {"MBps": ..., "watchdog_reconnects": ...} if ENABLE_STREAM_WANDB is True, otherwise {}. + + Called by DataloadingMonitor callback once per logging window. 
+ """ + if rank is None: + rank = int(os.environ.get("RANK", "0")) + ipc_dir = Path(f"/tmp/throughput_stats/rank_{rank}") + if not ipc_dir.exists(): + return {} + + total_d_bytes = 0 + total_d_time = 0.0 + total_d_watchdog = 0 + seen: set[str] = set() + + for filepath in ipc_dir.glob("worker_*.json"): + try: + pid = int(filepath.stem.split("_", 1)[1]) + except (ValueError, IndexError): + continue + + alive = _is_pid_alive(pid) + + try: + with open(filepath) as f: + data = json.load(f) + except (json.JSONDecodeError, OSError): + if not alive: + try: + filepath.unlink(missing_ok=True) + except OSError: + pass + continue + + fname = filepath.name + seen.add(fname) + + cur = { + "bytes": data.get("bytes", 0), + "read_time": data.get("read_time", 0.0), + "watchdog": data.get("watchdog", 0), + } + prev = _ipc_prev_per_file.get(fname, {"bytes": 0, "read_time": 0.0, "watchdog": 0}) + + d_b = cur["bytes"] - prev["bytes"] + d_t = cur["read_time"] - prev["read_time"] + d_w = cur["watchdog"] - prev["watchdog"] + + if d_b < 0 or d_t < 0 or d_w < 0: + log.info( + f"[Stream IPC] PID reuse detected for {fname}, treating as fresh worker", + rank0_only=False, + ) + d_b, d_t, d_w = cur["bytes"], cur["read_time"], cur["watchdog"] + + total_d_bytes += d_b + total_d_time += d_t + total_d_watchdog += d_w + _ipc_prev_per_file[fname] = cur + + if not alive: + try: + filepath.unlink(missing_ok=True) + log.info( + f"[Stream IPC] Read final stats and removed file for dead worker PID {pid}: {fname}", + rank0_only=False, + ) + except OSError: + pass + + for fname in list(_ipc_prev_per_file): + if fname not in seen: + log.info(f"[Stream IPC] Purging stale tracking entry: {fname}", rank0_only=False) + del _ipc_prev_per_file[fname] + + result: dict[str, float] = { + "MBps": (total_d_bytes / (1024**2)) / total_d_time if total_d_time > 0 else 0, + } + if WATCHDOG_ENABLED: + result["watchdog_reconnects"] = float(total_d_watchdog) + return result + + +@dataclass +class WatchdogConfig: + 
"""Configuration for the throughput watchdog, which resets stream connections whose sustained throughput is too low. + + Attributes: + enabled: Master switch. Controlled by env var ``RETRYING_STREAM_WATCHDOG`` + (``"0"`` to disable; default enabled). + min_throughput_mbps: Sustained throughput threshold in MB/s. If the moving + window average drops below this, the connection is reset. Controlled by env var ``RETRYING_STREAM_WATCHDOG_MIN_MBPS`` (default ``50.0``). + min_window_seconds: Minimum accumulated read time (seconds) in the current + window before a throughput check is meaningful. Prevents premature resets. + check_interval: Number of ``read()`` calls between throughput checks; checking only every N reads keeps the per-read cost low. + """ + + enabled: bool = WATCHDOG_ENABLED + min_throughput_mbps: float = WATCHDOG_MIN_THROUGHPUT_MBPS + min_window_seconds: float = 5.0 + check_interval: int = 50 + + +class RetryingStream: + def __init__(self, client: boto3.client, bucket: str, key: str, retries: int = 10): # type: ignore + r"""Class for loading data in a streaming fashion. 
+ Args: + client (boto3.client): Boto3 client + bucket (str): Bucket where data is stored + key (str): Key to read + retries (int): Number of retries + """ + self.client = client + self.bucket = bucket + self.key = key + self.retries = retries + self.name = f"{bucket}/{key}" + + # Cache stats flag as instance variable to avoid module lookup overhead + self._enable_retry_stats = ENABLE_RETRY_STATS + self._enable_throughput_stats = ENABLE_THROUGHPUT_STATS + self._enable_tracking = ENABLE_THROUGHPUT_STATS or ENABLE_STREAM_WANDB + + # Get content length (with retries for transient failures) + self.content_size = self._retry_operation( + operation=self.get_length, + operation_name="get_length", + max_attempts=self.retries, + ) + + # Get initial stream (with retries for transient failures) + self.stream, _ = self._retry_operation( + operation=self.get_stream, + operation_name="get_stream", + max_attempts=self.retries, + ) + + self._amount_read = 0 + + # Per-shard read timing (accumulated across all read() calls) + self._stream_read_time = 0.0 + + self._watchdog = WatchdogConfig() + self._read_count = 0 + self._window_start_read_time: float = 0.0 + self._window_start_bytes: int = 0 + + if self._enable_retry_stats: + with _global_retry_stats.lock: + _global_retry_stats.active_instances.add(self) + + if self._enable_tracking: + _register_throughput_atexit() + + def __del__(self) -> None: + r"""Destructor for cleanup. + + Note: WeakSet automatically removes dead references, so no manual cleanup needed. + Final statistics are logged by the atexit handler when the program exits. + """ + # WeakSet handles cleanup automatically - no action needed + # Final stats logging happens via atexit handler, not destructor + pass + + def _watchdog_reset_stream_if_low_throughput(self, new_position: int) -> None: + """Reset the stream connection if sustained throughput drops below a threshold. 
+ + Cloud object-storage backends (especially GCS) occasionally serve individual connections far below their healthy capacity (a tail-latency problem). + + Because the DataLoader blocks on the slowest worker, a single degraded connection can + bottleneck the entire training step, which shows up as `dataloading spikes` in the training charts. + + This mitigation abandons the slow connection and opens a fresh one from the byte offset where the previous stream left off. + No bytes are lost, because the new ranged request starts at that same offset; the check is intended to be cheap, since it only runs periodically. + + The check runs every `WatchdogConfig.check_interval` read() calls. + It computes a moving-window throughput (bytes read / accumulated read time) and compares it against `WatchdogConfig.min_throughput_mbps`. + A minimum window read time (`WatchdogConfig.min_window_seconds`) prevents premature resets. After a reset attempt, the window counters restart from the current position. + + When disabled (`RETRYING_STREAM_WATCHDOG=0`), this method returns immediately without checking throughput. 
+ """ + wd = self._watchdog + if ( + not wd.enabled + or self._read_count % wd.check_interval != 0 + or self._read_count == 0 + or new_position >= self.content_size + ): + return + + window_read_time = self._stream_read_time - self._window_start_read_time + window_bytes = new_position - self._window_start_bytes + if window_read_time <= wd.min_window_seconds or window_bytes <= 0: + return + + throughput_mbps = (window_bytes / (1024 * 1024)) / window_read_time + if throughput_mbps >= wd.min_throughput_mbps: + return + + if self._enable_tracking: + with _global_throughput_stats.lock: + _global_throughput_stats.cumulative_watchdog_reconnects += 1 + + if self._enable_throughput_stats: + rank = _global_throughput_stats.get_rank() + pid = _global_throughput_stats.get_pid() + log.warning( + f"[Throughput Watchdog] [Rank {rank}] [PID {pid}] reconnecting: " + f"{throughput_mbps:.1f}MB/s < {wd.min_throughput_mbps}MB/s, " + f"read_time {window_read_time:.1f}s, " + f"{self.name} @ {new_position}/{self.content_size}", + rank0_only=False, + ) + + try: + self.stream, _ = self.get_stream(new_position) + except (EndpointConnectionError, ClientError) as e: + log.warning( + f"[Throughput Watchdog] reconnect failed: {e} {self.name}", + rank0_only=False, + ) + self._window_start_read_time = self._stream_read_time + self._window_start_bytes = new_position + + @staticmethod + def _exponential_backoff_sleep(attempt: int) -> None: + r"""Sleep with exponential backoff based on attempt number. + + Args: + attempt: Zero-indexed attempt number (0 for first retry) + """ + time.sleep(0.5 * 2**attempt) + + def _retry_operation(self, operation, operation_name: str, max_attempts: int = 3): + r"""Retry an operation with exponential backoff for transient failures. 
+ + Args: + operation: Callable to execute + operation_name: Name of operation for logging + max_attempts: Maximum number of attempts + + Returns: + Result of the operation + + Raises: + Exception from the operation if all retries fail + """ + # Track this operation in both thread-local and cumulative statistics + if self._enable_retry_stats: + _maybe_log_retry_stats() # Check if periodic log is due + + # Track this operation (lock-free thread-local counters) + stats = _get_thread_stats() + stats["operations_started"] += 1 # Count this S3 operation being started + stats["total_attempts"] += 1 # Count the initial attempt + + # Also update cumulative counters (requires lock because += is not atomic) + # Lock overhead is negligible: uncontended in single-threaded case, minimal contention in multi-threaded + with _global_retry_stats.lock: + _global_retry_stats.cumulative_operations_started += 1 + _global_retry_stats.cumulative_attempts += 1 + else: + stats = None + + # Include EndpointConnectionError for initialization operations + init_retryable = RETRYABLE_EXCEPTIONS + (EndpointConnectionError,) + + operation_had_retry = False # Track if this operation failed at least once + for attempt in range(max_attempts): + try: + return operation() + except init_retryable as e: + if attempt == max_attempts - 1: # Last attempt + raise + + # Track retry statistics + if stats is not None: + # Mark this operation as failed (only once per operation, lock-free) + if not operation_had_retry: + stats["failed_operations"] += 1 # This operation failed at least once + operation_had_retry = True + # Also update cumulative counter (lock needed because += is not atomic) + with _global_retry_stats.lock: + _global_retry_stats.cumulative_failed_operations += 1 + + # Count this retry attempt (lock-free) + stats["total_attempts"] += 1 # Each retry is an additional attempt + + # Also update cumulative counter (lock needed because += is not atomic) + with _global_retry_stats.lock: + 
_global_retry_stats.cumulative_attempts += 1 + + # Only log retries after the first one (attempt >= 1) + if attempt >= 1: + log.warning( + f"Transient error in {operation_name} for {self.name} " + f"(attempt {attempt + 1}/{max_attempts}): {type(e).__name__}: {e}", + rank0_only=False, + ) + self._exponential_backoff_sleep(attempt) + + def get_length(self) -> int: + r"""Function for obtaining length of the bytestream""" + head_obj = self.client.head_object(Bucket=self.bucket, Key=self.key) + length = int(head_obj["ContentLength"]) + return length + + def get_stream(self, start_range: int = 0, end_range: Optional[int] = None): + r"""Function for getting stream in a range + Args: + start_range (int): Start index for stream + end_range (int): End index for stream + Returns: + stream (bytes): Stream of data being read + content_size (int): Length of the bytestream read + """ + extra_args = {} + if start_range != 0 or end_range is not None: + # End range in S3 is inclusive + end_str = "" if end_range is None else str(end_range - 1) + extra_args["Range"] = f"bytes={start_range}-{end_str}" + + response = self.client.get_object(Bucket=self.bucket, Key=self.key, **extra_args) + + # FIX: Use the public 'Body' property (StreamingBody) + # It implements .read() and handles internal resource management + return response["Body"], int(response["ContentLength"]) + + def read(self, amt: Optional[int] = None) -> bytes: + r"""Function for reading data from the stream + Args: + amt (int): Amount of data to read + Returns: + chunk (bytes): Data read from the stream + """ + # Track this operation in both thread-local and cumulative statistics + if self._enable_retry_stats: + _maybe_log_retry_stats() # Check if periodic log is due + + # Track this read operation (lock-free thread-local counters) + stats = _get_thread_stats() + stats["operations_started"] += 1 # Count this read() call being started + stats["total_attempts"] += 1 # Count the initial attempt + + # Also update cumulative 
counters (requires lock) + with _global_retry_stats.lock: + _global_retry_stats.cumulative_operations_started += 1 + _global_retry_stats.cumulative_attempts += 1 + else: + stats = None + + operation_had_retry = False # Track if this read() failed at least once + for cur_retry_idx in range(self.retries): + try: + t_read_start = time.monotonic() + chunk = self.stream.read(amt) + read_dur = time.monotonic() - t_read_start + self._stream_read_time += read_dur # always: used by watchdog + if self._enable_tracking: + with _global_throughput_stats.lock: + _global_throughput_stats.cumulative_bytes_read += len(chunk) + _global_throughput_stats.cumulative_total_read_time += read_dur + self._read_count += 1 + # Check for unexpected end of stream + if amt is not None and amt > 0 and len(chunk) == 0 and self._amount_read != self.content_size: + raise IOError("Premature end of stream detected.") + + # Throughput watchdog + # Periodically check if sustained throughput is too low. + # If so, abandon the slow connection and open a fresh one from where we left off. 
+ new_position = self._amount_read + len(chunk) + self._watchdog_reset_stream_if_low_throughput(new_position) + + # Success: Update pointer and return + self._amount_read += len(chunk) + if self._enable_tracking: + _maybe_log_throughput_stats() + return chunk + + except RETRYABLE_EXCEPTIONS as e: + self._stream_read_time += time.monotonic() - t_read_start + # Track retry statistics + if stats is not None: + # Mark this operation as failed (only once per operation, lock-free) + if not operation_had_retry: + stats["failed_operations"] += 1 # This operation failed at least once + operation_had_retry = True + # Also update cumulative counter (lock needed because += is not atomic) + with _global_retry_stats.lock: + _global_retry_stats.cumulative_failed_operations += 1 + + # Count this retry attempt (lock-free) + stats["total_attempts"] += 1 # Each retry is an additional attempt + + # Also update cumulative counter (lock needed because += is not atomic) + with _global_retry_stats.lock: + _global_retry_stats.cumulative_attempts += 1 + + # Only log retries after the first one (cur_retry_idx >= 1) + if cur_retry_idx >= 1: + log.warning( + f"[read] {type(e).__name__}: {e} {self.name} retry: {cur_retry_idx + 1}/{self.retries}", + rank0_only=False, + ) + + if cur_retry_idx == self.retries - 1: + raise # Re-raise the last exception if all retries fail + + # Exponential backoff: 0.5s, 1s, 2s, 4s, 8s... 
+ self._exponential_backoff_sleep(cur_retry_idx) + + try: + # Close the old stream to prevent resource leaks + if hasattr(self.stream, "close"): + self.stream.close() + # Re-establish the stream from the last successful byte + self.stream, _ = self.get_stream(self._amount_read) + except RETRYABLE_EXCEPTIONS + (EndpointConnectionError,) as e_conn: + # Only log reconnection failures after the first retry + if cur_retry_idx >= 1: + log.warning( + f"Failed to reconnect on attempt {cur_retry_idx + 1}/{self.retries}: " + f"{type(e_conn).__name__}: {e_conn}", + rank0_only=False, + ) + # Loop continues, will retry the entire read operation (including get_stream) next iteration + # Note: self.stream may be in a bad state, but we'll create a fresh one on next iteration + + return b"" # Should theoretically not reach here due to the raise diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamDataLoaderTest.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamDataLoaderTest.py new file mode 100644 index 00000000..a657161e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamDataLoaderTest.py @@ -0,0 +1,231 @@ +# ----------------------------------------------------------------------------- +# Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. +# All rights reserved. +# ----------------------------------------------------------------------------- + +"""Test RetryingStream statistics with PyTorch DataLoader workers. + +This test demonstrates that RetryingStream statistics work correctly +with PyTorch DataLoader's multiprocessing workers, which is the typical +production usage pattern. + +Key points tested: +1. Each DataLoader worker process maintains independent statistics +2. Thread-local storage works correctly within each worker +3. Statistics are properly aggregated within each worker process +4. 
No cross-worker interference or shared state issues +""" + +import sys +import time +from http.client import IncompleteRead +from unittest.mock import MagicMock + +from torch.utils.data import DataLoader, Dataset + +import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +from cosmos3._src.imaginaire.datasets.webdataset.utils.stream import RetryingStream +from cosmos3._src.imaginaire.utils import log + +# Configure faster logging interval for tests (10 seconds instead of 5 minutes) +stream_module.RETRY_STATS_LOG_INTERVAL = 10.0 + + +class MockS3Dataset(Dataset): + """Mock dataset that uses RetryingStream to simulate S3 streaming.""" + + def __init__(self, num_samples: int, retry_rate: float = 0.2): + """Initialize mock dataset. + + Args: + num_samples: Number of samples in the dataset + retry_rate: Fraction of samples that will trigger a retry + """ + self.num_samples = num_samples + self.retry_rate = retry_rate + # Enable statistics + stream_module.ENABLE_RETRY_STATS = True + + def __len__(self) -> int: + return self.num_samples + + def __getitem__(self, idx: int): + """Get a sample, simulating S3 streaming with RetryingStream.""" + # Create mock S3 client + client = MagicMock() + test_data = b"X" * 1024 + + client.head_object.return_value = {"ContentLength": str(len(test_data))} + + # Simulate retry for some samples + mock_body = MagicMock() + if (idx % int(1 / self.retry_rate)) == 0: + # This sample will fail once then succeed + mock_body.read.side_effect = [IncompleteRead(b"partial"), test_data] + else: + # This sample succeeds immediately + mock_body.read.return_value = test_data + + client.get_object.return_value = {"Body": mock_body, "ContentLength": len(test_data)} + + # Use RetryingStream (this is what happens in production) + stream = RetryingStream(client, f"test-bucket", f"file-{idx}.tar", retries=5) + + try: + data = stream.read(1024) + return {"idx": idx, "data": data, "size": len(data)} + except Exception as e: + return 
{"idx": idx, "error": str(e)} + + +def test_dataloader_workers(): + """Test that statistics work correctly with PyTorch DataLoader workers.""" + print("\n" + "=" * 70) + print("DATALOADER WORKER TEST") + print("=" * 70) + + # Test configuration + num_samples = 100 + batch_size = 10 + num_workers = 4 # This creates 4 separate worker processes + retry_rate = 0.2 # 20% of samples will retry + + print(f"Configuration:") + print(f" Dataset size: {num_samples} samples") + print(f" Batch size: {batch_size}") + print(f" Num workers: {num_workers} (separate processes)") + print(f" Retry rate: {retry_rate * 100}%") + print("-" * 70) + + # Create dataset and dataloader + dataset = MockS3Dataset(num_samples=num_samples, retry_rate=retry_rate) + dataloader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers, shuffle=False, drop_last=False) + + print("\nStarting DataLoader iteration...") + print("(Each worker process maintains independent statistics)") + print("-" * 70) + + # Process all batches + start_time = time.time() + total_samples = 0 + errors = 0 + + for batch_idx, batch in enumerate(dataloader): + batch_size_actual = len(batch["idx"]) + total_samples += batch_size_actual + + # Check for errors + if "error" in batch: + for error in batch["error"]: + if error: + errors += 1 + + # Print progress every 5 batches + if (batch_idx + 1) % 5 == 0: + print(f" Processed {total_samples}/{num_samples} samples...") + + elapsed = time.time() - start_time + print(f"\n✓ Completed in {elapsed:.2f}s") + print(f" Total samples processed: {total_samples}") + print(f" Errors: {errors}") + print("-" * 70) + + # Calculate expected retries + expected_retries = sum(1 for i in range(num_samples) if i % int(1 / retry_rate) == 0) + + print("\nExpected behavior:") + print(f" Each worker process had its own _global_retry_stats instance") + print(f" Each worker independently tracked its subset of ~{num_samples // num_workers} samples") + print(f" Total retries across all workers: 
~{expected_retries}") + print(f" Per-worker retries: ~{expected_retries // num_workers}") + + print("\nNote: Statistics are logged per-worker-process during iteration.") + print(" Check the output above for '[RetryingStream Stats]' messages.") + print("=" * 70) + + # Verify no errors + if errors > 0: + print(f"\n❌ FAIL: {errors} errors occurred during processing") + return False + else: + print(f"\n✅ PASS: DataLoader workers processed all samples successfully") + print(" ✓ Each worker maintained independent statistics") + print(" ✓ No cross-worker interference") + print(" ✓ Thread-local storage worked correctly") + return True + + +def test_dataloader_workers_with_threading(): + """Test DataLoader with threading backend (less common, but valid).""" + print("\n" + "=" * 70) + print("DATALOADER THREADING BACKEND TEST") + print("=" * 70) + + # Note: torch.multiprocessing with threads is less common but supported + # This tests the threading.local() aggregation within a single process + num_samples = 50 + batch_size = 5 + retry_rate = 0.2 + + print(f"Configuration:") + print(f" Dataset size: {num_samples} samples") + print(f" Batch size: {batch_size}") + print(f" Threading backend (single process, multiple threads)") + print(f" Retry rate: {retry_rate * 100}%") + print("-" * 70) + + # Create dataset and dataloader with threading (num_workers=0 uses main thread) + dataset = MockS3Dataset(num_samples=num_samples, retry_rate=retry_rate) + + # Process in main thread (num_workers=0) + dataloader = DataLoader(dataset, batch_size=batch_size, num_workers=0, shuffle=False) + + print("\nProcessing in main thread...") + total_samples = 0 + for batch in dataloader: + total_samples += len(batch["idx"]) + + print(f"✓ Processed {total_samples}/{num_samples} samples") + + # Force a stats log + if stream_module.ENABLE_RETRY_STATS: + with stream_module._global_retry_stats.lock: + stream_module._global_retry_stats.last_log_time = 0 + stream_module._maybe_log_retry_stats() + + print("-" * 
70) + print("✅ PASS: Single-threaded DataLoader worked correctly") + print("=" * 70) + return True + + +if __name__ == "__main__": + # Initialize logging + log.init_loguru_stdout() + + # Run tests + test1_passed = test_dataloader_workers() + print("\n\n") + time.sleep(1) # Brief pause between tests + + test2_passed = test_dataloader_workers_with_threading() + + print("\n" + "=" * 70) + print("SUMMARY") + print("=" * 70) + print(f" DataLoader Workers (multiprocessing): {'✅ PASS' if test1_passed else '❌ FAIL'}") + print(f" DataLoader Threading (single process): {'✅ PASS' if test2_passed else '❌ FAIL'}") + print("=" * 70) + + if test1_passed and test2_passed: + print("\n✅ All DataLoader tests PASSED!") + print("\nVerified:") + print(" ✓ Statistics work with PyTorch DataLoader workers (multiprocessing)") + print(" ✓ Statistics work with single-threaded DataLoader") + print(" ✓ Each worker process maintains independent statistics") + print(" ✓ No cross-worker shared state issues") + print(" ✓ Thread-local aggregation works correctly") + sys.exit(0) + else: + print("\n❌ Some DataLoader tests FAILED") + sys.exit(1) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamMockTest.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamMockTest.py new file mode 100644 index 00000000..88505fa6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamMockTest.py @@ -0,0 +1,452 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from http.client import IncompleteRead +from unittest.mock import MagicMock, patch + +from botocore.exceptions import EndpointConnectionError, ResponseStreamingError +from urllib3.exceptions import ProtocolError as URLLib3ProtocolError +from urllib3.exceptions import ReadTimeoutError as URLLib3ReadTimeoutError +from urllib3.exceptions import SSLError as URLLib3SSLError + +import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +from cosmos3._src.imaginaire.datasets.webdataset.utils.stream import RetryingStream + +# Configure faster logging interval for tests (10 seconds instead of 5 minutes) +stream_module.RETRY_STATS_LOG_INTERVAL = 10.0 + +# Test 1: Simulate IncompleteRead and verify retry works +print("Test 1: Retry on IncompleteRead") +client = MagicMock() +expected_data = b"X" * 100 # 100 bytes of data +client.head_object.return_value = {"ContentLength": str(len(expected_data))} + +# Create mock streams +mock_body_1 = MagicMock() +mock_body_1.close = MagicMock() # Track if close() is called +mock_body_1.read.side_effect = IncompleteRead(b"partial") # First read fails + +mock_body_2 = MagicMock() +mock_body_2.read.return_value = expected_data # Second read succeeds with full data + +# Return different bodies on each get_object call +client.get_object.side_effect = [ + {"Body": mock_body_1, "ContentLength": len(expected_data)}, # First attempt + {"Body": mock_body_2, "ContentLength": len(expected_data)}, # Retry attempt +] + +stream = RetryingStream(client, "test-bucket", "test.tar", retries=3) + +# Mock time.sleep to skip 
waiting +with patch("time.sleep"): + data = stream.read(100) + +assert data == expected_data, f"Expected {len(expected_data)} bytes but got {len(data)} bytes" +assert len(data) == 100, f"Expected 100 bytes but got {len(data)}" +assert mock_body_1.close.called, "Old stream was not closed" +assert client.get_object.call_count == 2, f"Expected 2 calls but got {client.get_object.call_count}" +print(f"✓ Read succeeded after retry: {len(data)} bytes") +print(f"✓ Old stream was closed: {mock_body_1.close.called}") +print(f"✓ get_object called {client.get_object.call_count} times (initial + retry)") + +# Test 2: Multiple errors before success +print("\nTest 2: Multiple retries before success") +client2 = MagicMock() +expected_data2 = b"Y" * 200 # 200 bytes of data +client2.head_object.return_value = {"ContentLength": str(len(expected_data2))} + +# Create multiple failing streams and one success +failing_bodies = [] +for i in range(2): + body = MagicMock() + body.close = MagicMock() + body.read.side_effect = IncompleteRead(b"fail") + failing_bodies.append(body) + +success_body = MagicMock() +success_body.read.return_value = expected_data2 + +client2.get_object.side_effect = [ + {"Body": failing_bodies[0], "ContentLength": len(expected_data2)}, + {"Body": failing_bodies[1], "ContentLength": len(expected_data2)}, + {"Body": success_body, "ContentLength": len(expected_data2)}, +] + +stream2 = RetryingStream(client2, "test-bucket", "test.tar", retries=5) + +with patch("time.sleep"): + data2 = stream2.read(200) + +assert data2 == expected_data2, f"Expected {len(expected_data2)} bytes but got {len(data2)} bytes" +assert len(data2) == 200, f"Expected 200 bytes but got {len(data2)}" +assert failing_bodies[0].close.called, "First stream was not closed" +assert failing_bodies[1].close.called, "Second stream was not closed" +assert client2.get_object.call_count == 3, f"Expected 3 calls but got {client2.get_object.call_count}" +print(f"✓ Read succeeded after 
{client2.get_object.call_count - 1} retries: {len(data2)} bytes") +print(f"✓ First stream closed: {failing_bodies[0].close.called}") +print(f"✓ Second stream closed: {failing_bodies[1].close.called}") + +# Test 3: Max retries exceeded +print("\nTest 3: Max retries exceeded") +client3 = MagicMock() +expected_size3 = 150 +client3.head_object.return_value = {"ContentLength": str(expected_size3)} + +# Always fail +always_fail_body = MagicMock() +always_fail_body.close = MagicMock() +always_fail_body.read.side_effect = IncompleteRead(b"always fail") + +client3.get_object.return_value = {"Body": always_fail_body, "ContentLength": expected_size3} + +stream3 = RetryingStream(client3, "test-bucket", "test.tar", retries=2) + +exception_raised = False +try: + with patch("time.sleep"): + data3 = stream3.read(100) + print("✗ Should have raised exception!") + assert False, "Expected IncompleteRead exception to be raised" +except IncompleteRead: + exception_raised = True + print(f"✓ Correctly raised IncompleteRead after {stream3.retries} retries") + +assert exception_raised, "Exception was not raised when it should have been" + +# Test 4: Mix different error types +print("\nTest 4: Mixed error types (IncompleteRead + URLLib3ReadTimeoutError)") +client4 = MagicMock() +expected_data4 = b"Z" * 250 # 250 bytes +client4.head_object.return_value = {"ContentLength": str(len(expected_data4))} + +error_body_1 = MagicMock() +error_body_1.close = MagicMock() +error_body_1.read.side_effect = IncompleteRead(b"incomplete") + +error_body_2 = MagicMock() +error_body_2.close = MagicMock() +error_body_2.read.side_effect = URLLib3ReadTimeoutError(None, None, "timeout") + +success_body_2 = MagicMock() +success_body_2.read.return_value = expected_data4 + +client4.get_object.side_effect = [ + {"Body": error_body_1, "ContentLength": len(expected_data4)}, + {"Body": error_body_2, "ContentLength": len(expected_data4)}, + {"Body": success_body_2, "ContentLength": len(expected_data4)}, +] + +stream4 = 
RetryingStream(client4, "test-bucket", "test.tar", retries=5) + +with patch("time.sleep"): + data4 = stream4.read(250) + +assert data4 == expected_data4, f"Expected {len(expected_data4)} bytes but got {len(data4)} bytes" +assert len(data4) == 250, f"Expected 250 bytes but got {len(data4)}" +assert error_body_1.close.called, "First error stream was not closed" +assert error_body_2.close.called, "Second error stream was not closed" +assert client4.get_object.call_count == 3, f"Expected 3 calls but got {client4.get_object.call_count}" +print(f"✓ Recovered from mixed errors: {len(data4)} bytes") +print(f"✓ Both error streams were closed: {error_body_1.close.called and error_body_2.close.called}") + +# Test 5: URLLib3ProtocolError +print("\nTest 5: Retry on URLLib3ProtocolError") +client5 = MagicMock() +expected_data5 = b"A" * 128 +client5.head_object.return_value = {"ContentLength": str(len(expected_data5))} + +error_body_5 = MagicMock() +error_body_5.close = MagicMock() +error_body_5.read.side_effect = URLLib3ProtocolError("Connection broken") + +success_body_5 = MagicMock() +success_body_5.read.return_value = expected_data5 + +client5.get_object.side_effect = [ + {"Body": error_body_5, "ContentLength": len(expected_data5)}, + {"Body": success_body_5, "ContentLength": len(expected_data5)}, +] + +stream5 = RetryingStream(client5, "test-bucket", "test.tar", retries=3) + +with patch("time.sleep"): + data5 = stream5.read(128) + +assert data5 == expected_data5, f"Expected {len(expected_data5)} bytes but got {len(data5)} bytes" +assert error_body_5.close.called, "Error stream was not closed" +assert client5.get_object.call_count == 2, f"Expected 2 calls but got {client5.get_object.call_count}" +print(f"✓ Recovered from ProtocolError: {len(data5)} bytes") + +# Test 6: URLLib3SSLError +print("\nTest 6: Retry on URLLib3SSLError") +client6 = MagicMock() +expected_data6 = b"B" * 256 +client6.head_object.return_value = {"ContentLength": str(len(expected_data6))} + +error_body_6 = 
MagicMock() +error_body_6.close = MagicMock() +error_body_6.read.side_effect = URLLib3SSLError("SSL handshake failed") + +success_body_6 = MagicMock() +success_body_6.read.return_value = expected_data6 + +client6.get_object.side_effect = [ + {"Body": error_body_6, "ContentLength": len(expected_data6)}, + {"Body": success_body_6, "ContentLength": len(expected_data6)}, +] + +stream6 = RetryingStream(client6, "test-bucket", "test.tar", retries=3) + +with patch("time.sleep"): + data6 = stream6.read(256) + +assert data6 == expected_data6, f"Expected {len(expected_data6)} bytes but got {len(data6)} bytes" +assert error_body_6.close.called, "Error stream was not closed" +assert client6.get_object.call_count == 2, f"Expected 2 calls but got {client6.get_object.call_count}" +print(f"✓ Recovered from SSLError: {len(data6)} bytes") + +# Test 7: Generic IOError +print("\nTest 7: Retry on generic IOError") +client7 = MagicMock() +expected_data7 = b"C" * 512 +client7.head_object.return_value = {"ContentLength": str(len(expected_data7))} + +error_body_7 = MagicMock() +error_body_7.close = MagicMock() +error_body_7.read.side_effect = IOError("Generic IO error") + +success_body_7 = MagicMock() +success_body_7.read.return_value = expected_data7 + +client7.get_object.side_effect = [ + {"Body": error_body_7, "ContentLength": len(expected_data7)}, + {"Body": success_body_7, "ContentLength": len(expected_data7)}, +] + +stream7 = RetryingStream(client7, "test-bucket", "test.tar", retries=3) + +with patch("time.sleep"): + data7 = stream7.read(512) + +assert data7 == expected_data7, f"Expected {len(expected_data7)} bytes but got {len(data7)} bytes" +assert error_body_7.close.called, "Error stream was not closed" +assert client7.get_object.call_count == 2, f"Expected 2 calls but got {client7.get_object.call_count}" +print(f"✓ Recovered from IOError: {len(data7)} bytes") + +# Test 8: Premature end of stream detection +print("\nTest 8: Premature end of stream detection") +client8 = 
MagicMock() +expected_data8 = b"D" * 1024 +client8.head_object.return_value = {"ContentLength": str(len(expected_data8))} + +# First body returns empty when we expect data (premature end) +premature_body = MagicMock() +premature_body.close = MagicMock() +premature_body.read.return_value = b"" # Empty read when we expect data + +success_body_8 = MagicMock() +success_body_8.read.return_value = expected_data8 + +client8.get_object.side_effect = [ + {"Body": premature_body, "ContentLength": len(expected_data8)}, + {"Body": success_body_8, "ContentLength": len(expected_data8)}, +] + +stream8 = RetryingStream(client8, "test-bucket", "test.tar", retries=3) + +with patch("time.sleep"): + data8 = stream8.read(1024) + +assert data8 == expected_data8, f"Expected {len(expected_data8)} bytes but got {len(data8)} bytes" +assert premature_body.close.called, "Premature stream was not closed" +assert client8.get_object.call_count == 2, f"Expected 2 calls but got {client8.get_object.call_count}" +print(f"✓ Recovered from premature end of stream: {len(data8)} bytes") + +# Test 9: EndpointConnectionError during reconnection +print("\nTest 9: EndpointConnectionError during reconnection (continues retry loop)") +client9 = MagicMock() +expected_data9 = b"E" * 2048 +client9.head_object.return_value = {"ContentLength": str(len(expected_data9))} + +# First read fails +error_body_9a = MagicMock() +error_body_9a.close = MagicMock() +error_body_9a.read.side_effect = URLLib3ReadTimeoutError(None, None, "timeout") + +# First reconnection attempt fails with EndpointConnectionError +# Second read also fails +error_body_9b = MagicMock() +error_body_9b.close = MagicMock() +error_body_9b.read.side_effect = URLLib3ReadTimeoutError(None, None, "timeout") + +# Final success +success_body_9 = MagicMock() +success_body_9.read.return_value = expected_data9 + +# Simulate EndpointConnectionError on first reconnection attempt, then succeed +client9.get_object.side_effect = [ + {"Body": error_body_9a, 
"ContentLength": len(expected_data9)}, # Initial read + EndpointConnectionError(endpoint_url="https://s3.amazonaws.com"), # Reconnection fails + {"Body": error_body_9b, "ContentLength": len(expected_data9)}, # Second attempt after endpoint error + {"Body": success_body_9, "ContentLength": len(expected_data9)}, # Final success +] + +stream9 = RetryingStream(client9, "test-bucket", "test.tar", retries=5) + +with patch("time.sleep"): + data9 = stream9.read(2048) + +assert data9 == expected_data9, f"Expected {len(expected_data9)} bytes but got {len(data9)} bytes" +assert error_body_9a.close.called, "First error stream was not closed" +assert error_body_9b.close.called, "Second error stream was not closed" +# Should be called 4 times: initial + endpoint error + retry after endpoint + final success +assert client9.get_object.call_count == 4, f"Expected 4 calls but got {client9.get_object.call_count}" +print(f"✓ Recovered from EndpointConnectionError during reconnection: {len(data9)} bytes") + +# Test 10: ResponseStreamingError during reconnection (now FIXED) +print("\nTest 10: ResponseStreamingError during reconnection - should retry") +client10 = MagicMock() +expected_data10 = b"F" * 1024 +client10.head_object.return_value = {"ContentLength": str(len(expected_data10))} + +# First read fails with IncompleteRead +error_body_10a = MagicMock() +error_body_10a.close = MagicMock() +error_body_10a.read.side_effect = IncompleteRead(b"incomplete") + +# Reconnection fails with ResponseStreamingError (wrapping IncompleteRead) +reconnect_error = ResponseStreamingError( + error=IncompleteRead(b"x" * 97727), msg="Connection broken: IncompleteRead(97727 bytes read, 143937 more expected)" +) + +# Second attempt after reconnection succeeds +success_body_10 = MagicMock() +success_body_10.read.return_value = expected_data10 + +# Simulate: read fails → reconnect fails with ResponseStreamingError → retry succeeds +client10.get_object.side_effect = [ + {"Body": error_body_10a, 
"ContentLength": len(expected_data10)}, # Initial read + reconnect_error, # First reconnection fails with ResponseStreamingError (now caught!) + {"Body": success_body_10, "ContentLength": len(expected_data10)}, # Second reconnection succeeds +] + +stream10 = RetryingStream(client10, "test-bucket", "test.tar", retries=5) + +with patch("time.sleep"): + with patch("cosmos3._src.imaginaire.datasets.webdataset.utils.stream.log"): + data10 = stream10.read(1024) + +# Verify the fix worked +assert data10 == expected_data10, f"Expected {len(expected_data10)} bytes but got {len(data10)} bytes" +assert error_body_10a.close.called, "First error stream was not closed" +assert client10.get_object.call_count == 3, f"Expected 3 calls but got {client10.get_object.call_count}" +print(f"✓ Recovered from ResponseStreamingError during reconnection: {len(data10)} bytes") + +# Test 11: Failure during __init__ get_length() - now WITH retry logic +print("\nTest 11: Failure during __init__ get_length() - retries 3 times then fails") +client11 = MagicMock() + +# head_object fails with ResponseStreamingError on all attempts +client11.head_object.side_effect = ResponseStreamingError( + error=IncompleteRead(b"fail"), msg="Connection broken during head_object" +) + +test11_exception = None +with patch("time.sleep"): # Skip sleep delays in test + try: + stream11 = RetryingStream(client11, "test-bucket", "test.tar", retries=5) + print("✗ Should have raised exception after retries exhausted") + except ResponseStreamingError as e: + test11_exception = e + print(f"✓ ResponseStreamingError raised after retries exhausted") + print(f" Error: {e}") + print(f" head_object was called {client11.head_object.call_count} time(s)") + +assert test11_exception is not None, "Should have raised exception after retries exhausted" +assert client11.head_object.call_count == 5, "Should have retried 5 times (retries=5)" + + +# Test 12: Failure during __init__ get_stream() - now WITH retry logic +print("\nTest 12: 
Failure during __init__ get_stream() - makes 5 attempts (retries=5) then fails") +client12 = MagicMock() +client12.head_object.return_value = {"ContentLength": "1024"} + +# get_object fails with ResponseStreamingError during initial stream creation on all attempts +client12.get_object.side_effect = ResponseStreamingError( + error=IncompleteRead(b"fail"), msg="Connection broken during initial get_object" +) + +test12_exception = None +with patch("time.sleep"): # Skip sleep delays in test + try: + stream12 = RetryingStream(client12, "test-bucket", "test.tar", retries=5) + print("✗ Should have raised exception after retries exhausted") + except ResponseStreamingError as e: + test12_exception = e + print(f"✓ ResponseStreamingError raised after retries exhausted") + print(f" Error: {e}") + print(f" get_object was called {client12.get_object.call_count} time(s)") + +assert test12_exception is not None, "Should have raised exception after retries exhausted" +assert client12.get_object.call_count == 5, "Should have retried 5 times (retries=5)" + + +# Test 13: Transient failure during __init__ get_stream() on first attempt, success on retry +print("\nTest 13: Transient failure during __init__ - now succeeds with retry logic!") +client13 = MagicMock() +expected_data13 = b"G" * 512 +client13.head_object.return_value = {"ContentLength": str(len(expected_data13))} + +# First get_object fails, second succeeds (showing network blip during initialization) +success_body_13 = MagicMock() +success_body_13.read.return_value = expected_data13 + +get_object_call_count = [0] + + +def get_object_with_initial_failure(**kwargs): + get_object_call_count[0] += 1 + if get_object_call_count[0] == 1: + # First call during __init__ fails + raise ResponseStreamingError( + error=IncompleteRead(b"transient"), msg="Transient network error during initialization" + ) + else: + # Subsequent calls succeed + return {"Body": success_body_13, "ContentLength": len(expected_data13)} + + +client13.get_object.side_effect = 
get_object_with_initial_failure + +test13_exception = None +test13_stream = None +with patch("time.sleep"): # Skip sleep delays in test + try: + test13_stream = RetryingStream(client13, "test-bucket", "test.tar", retries=5) + print(f"✓ Object created successfully after transient failure") + print(f" get_object was called {get_object_call_count[0]} time(s)") + except ResponseStreamingError as e: + test13_exception = e + print(f"✗ Unexpected failure: {e}") + +# Verify the retry logic worked +assert test13_exception is None, "Should have succeeded after retry" +assert test13_stream is not None, "Stream should be created" +assert get_object_call_count[0] == 2, "Should have failed once, then succeeded on retry" +data13 = test13_stream.read() +assert data13 == expected_data13, "Should be able to read data successfully" +print(f"✓ Successfully read {len(data13)} bytes after recovering from transient init error") + +print("\n✅ All mock tests passed! Retry logic working correctly.") diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamStatsOverheadBenchmark.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamStatsOverheadBenchmark.py new file mode 100644 index 00000000..52055cde --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamStatsOverheadBenchmark.py @@ -0,0 +1,1185 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Benchmark script to measure the performance overhead of retry statistics tracking.""" + +import gc +import os +import statistics +import subprocess +import sys +import threading +import time +from http.client import IncompleteRead +from unittest.mock import MagicMock + +import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +from cosmos3._src.imaginaire.datasets.webdataset.utils.stream import RetryingStream + +# Configure faster logging interval for tests (10 seconds instead of 5 minutes) +stream_module.RETRY_STATS_LOG_INTERVAL = 10.0 + + +def benchmark_iteration(enable_stats: bool, num_operations: int, network_delay_ms: float = 0) -> float: + """Run a single benchmark iteration. 
+ + Args: + enable_stats: Whether to enable statistics tracking + num_operations: Number of read operations to perform + network_delay_ms: Simulated network delay in milliseconds (0 = no delay) + + Returns: + Time taken in seconds + """ + stream_module.ENABLE_RETRY_STATS = enable_stats + + # Setup mock + client = MagicMock() + test_data = b"X" * 1024 # 1KB chunks + client.head_object.return_value = {"ContentLength": str(len(test_data))} + + mock_body = MagicMock() + if network_delay_ms > 0: + # Add simulated network delay to mock read + def mock_read_with_delay(amt): + time.sleep(network_delay_ms / 1000.0) # Convert ms to seconds + return test_data + + mock_body.read = mock_read_with_delay + else: + mock_body.read.return_value = test_data + + client.get_object.return_value = {"Body": mock_body, "ContentLength": len(test_data)} + + stream = RetryingStream(client, "benchmark-bucket", "test.tar", retries=3) + + # Disable GC during timing to reduce noise + gc.collect() + gc.disable() + + try: + start_time = time.perf_counter() # Use perf_counter for higher precision + for _ in range(num_operations): + stream.read(1024) + end_time = time.perf_counter() + finally: + gc.enable() + + return end_time - start_time + + +def run_benchmark_suite(name: str, num_operations: int, num_runs: int, network_delay_ms: float = 0): + """Run a complete benchmark suite.""" + print(f"\n{'=' * 70}") + print(f"{name}") + print(f"{'=' * 70}") + print(f"Operations: {num_operations:,} per run, {num_runs} runs") + if network_delay_ms > 0: + print(f"Network delay: {network_delay_ms}ms per read (simulates S3 latency)") + else: + print(f"Network delay: None (synthetic benchmark)") + print(f"GC disabled during timing for accuracy") + print("-" * 70) + + # Interleave runs to reduce system variance + with_stats_times = [] + without_stats_times = [] + + for i in range(num_runs): + # Run both configs in same iteration to reduce variance + elapsed_with = benchmark_iteration(True, num_operations, 
network_delay_ms) + elapsed_without = benchmark_iteration(False, num_operations, network_delay_ms) + + with_stats_times.append(elapsed_with) + without_stats_times.append(elapsed_without) + + overhead_this_run = ((elapsed_with - elapsed_without) / elapsed_without) * 100 + print( + f"Run {i + 1:2d}: stats ON={elapsed_with:.4f}s ({num_operations / elapsed_with:,.0f} ops/s) | " + f"stats OFF={elapsed_without:.4f}s ({num_operations / elapsed_without:,.0f} ops/s) | " + f"overhead={overhead_this_run:+.1f}%" + ) + + # Calculate statistics (use trimmed mean to reduce outlier impact) + median_with_stats = statistics.median(with_stats_times) + median_without_stats = statistics.median(without_stats_times) + + # Also calculate trimmed mean (remove top/bottom 10%) + sorted_with = sorted(with_stats_times) + sorted_without = sorted(without_stats_times) + trim_count = max(1, len(sorted_with) // 10) + trimmed_with = sorted_with[trim_count:-trim_count] if len(sorted_with) > 2 * trim_count else sorted_with + trimmed_without = sorted_without[trim_count:-trim_count] if len(sorted_without) > 2 * trim_count else sorted_without + + mean_trimmed_with = statistics.mean(trimmed_with) + mean_trimmed_without = statistics.mean(trimmed_without) + + stddev_with_stats = statistics.stdev(with_stats_times) if len(with_stats_times) > 1 else 0 + stddev_without_stats = statistics.stdev(without_stats_times) if len(without_stats_times) > 1 else 0 + + # Calculate coefficient of variation (CV) to show relative stability + cv_with = (stddev_with_stats / median_with_stats) * 100 if median_with_stats > 0 else 0 + cv_without = (stddev_without_stats / median_without_stats) * 100 if median_without_stats > 0 else 0 + + print("-" * 70) + print(f"Stats ON: median={median_with_stats:.4f}s, stddev={stddev_with_stats:.4f}s, CV={cv_with:.1f}%") + print(f"Stats OFF: median={median_without_stats:.4f}s, stddev={stddev_without_stats:.4f}s, CV={cv_without:.1f}%") + + if max(cv_with, cv_without) > 15: + print(f"⚠ High 
variance detected (CV > 15%) - results may be unreliable due to system noise") + + # Use both median and trimmed mean for overhead calculation + overhead_median = ((median_with_stats - median_without_stats) / median_without_stats) * 100 + overhead_trimmed = ((mean_trimmed_with - mean_trimmed_without) / mean_trimmed_without) * 100 + + print(f"\nMedian overhead: {overhead_median:+.2f}%") + print(f"Trimmed mean overhead: {overhead_trimmed:+.2f}% (outliers removed)") + + # Show per-operation overhead (using trimmed mean for robustness) + per_op_overhead_ns = ((mean_trimmed_with - mean_trimmed_without) / num_operations) * 1e9 + per_op_overhead_us = per_op_overhead_ns / 1000.0 + print(f"Per-operation overhead: {per_op_overhead_ns:.1f} nanoseconds ({per_op_overhead_us:.3f} microseconds)") + + if network_delay_ms > 0: + network_delay_us = network_delay_ms * 1000.0 + overhead_vs_network = (per_op_overhead_us / network_delay_us) * 100 + print(f"Overhead vs network delay: {overhead_vs_network:.4f}% of {network_delay_ms}ms") + + # Use trimmed mean for final assessment (more robust) + if abs(overhead_trimmed) < 1.0: + print("✓ Negligible overhead (< 1%)") + elif abs(overhead_trimmed) < 5.0: + print("✓ Low overhead (< 5%)") + else: + print("⚠ Measurable overhead (>= 5%)") + + return overhead_trimmed + + +def test_multithreaded_stats_correctness(): + """Test that global statistics correctly aggregate across multiple threads and instances.""" + print("\n" + "=" * 70) + print("CORRECTNESS TEST: Multi-threaded Statistics Aggregation") + print("=" * 70) + + # Enable stats for testing + stream_module.ENABLE_RETRY_STATS = True + + # Reset global stats + with stream_module._global_retry_stats.lock: + stream_module._global_retry_stats.registered_threads.clear() + stream_module._global_retry_stats.active_instances.clear() + stream_module._global_retry_stats.cumulative_operations_started = 0 + stream_module._global_retry_stats.cumulative_failed_operations = 0 + 
stream_module._global_retry_stats.cumulative_attempts = 0 + + # Test configuration + num_threads = 4 + operations_per_thread = 50 + retry_probability = 0.3 # Nominal rate; int(1 / 0.3) == 3, so every 3rd operation (about 33%) requires a retry + + # Calculate exact expected retries (deterministic based on modulo) + total_read_ops = num_threads * operations_per_thread + retry_every_n = int(1 / retry_probability) # Every 3rd operation retries + expected_retries = sum(1 for i in range(total_read_ops) if i % retry_every_n == 0) + + print(f"Configuration: {num_threads} threads, {operations_per_thread} operations per thread") + print(f"Expected: {total_read_ops} read operations (+ init operations)") + print(f"Expected retries: {expected_retries} operations with retries (every {retry_every_n}th operation)") + print("-" * 70) + + # Counter to track which operation should fail + operation_counter = {"count": 0, "lock": threading.Lock()} + + def thread_worker(thread_id: int): + """Worker function that creates streams and performs operations.""" + client = MagicMock() + test_data = b"X" * 1024 + client.head_object.return_value = {"ContentLength": str(len(test_data))} + + # Track operations in this thread + local_ops = 0 + local_retries = 0 + + # Keep all streams alive until thread completes to prevent premature destructor calls + streams = [] + + for i in range(operations_per_thread): + # Determine if this operation should require a retry + with operation_counter["lock"]: + op_num = operation_counter["count"] + operation_counter["count"] += 1 + should_retry = (op_num % int(1 / retry_probability)) == 0 + + # Create mock body that may fail once then succeed + mock_body = MagicMock() + if should_retry: + # First read fails with IncompleteRead, second succeeds + mock_body.read.side_effect = [ + IncompleteRead(b"partial"), + test_data, + ] + local_retries += 1 + else: + # Always succeeds + mock_body.read.return_value = test_data + + client.get_object.return_value = {"Body": mock_body, "ContentLength": len(test_data)} + + # 
Create stream and perform read + stream = RetryingStream(client, f"bucket-{thread_id}", f"file-{i}.tar", retries=5) + streams.append(stream) # Keep alive to prevent premature destructor calls + try: + data = stream.read(1024) + local_ops += 1 + except Exception as e: + print(f"Thread {thread_id}: Unexpected error: {e}") + + print(f"Thread {thread_id}: Completed {local_ops} operations, {local_retries} with retries") + return local_ops, local_retries + + # Run threads + print("Starting threads...") + threads = [] + results = [] + + def thread_wrapper(thread_id): + result = thread_worker(thread_id) + results.append(result) + + start_time = time.time() + for i in range(num_threads): + t = threading.Thread(target=thread_wrapper, args=(i,)) + threads.append(t) + t.start() + + # Wait for all threads to complete + for t in threads: + t.join() + + elapsed = time.time() - start_time + print(f"All threads completed in {elapsed:.2f}s") + print("-" * 70) + + # Give threads a moment to finish cleanup + time.sleep(0.1) + + # Aggregate local values from thread results (for sanity check) + local_total_ops = sum(r[0] for r in results) + local_ops_with_retries = sum(r[1] for r in results) + + # Get actual stats from global tracker (aggregate from per-thread stats) + with stream_module._global_retry_stats.lock: + actual_ops_started = 0 + actual_failed_ops = 0 + actual_total_attempts = 0 + + for thread_stats in stream_module._global_retry_stats.registered_threads.values(): + actual_ops_started += thread_stats["operations_started"] + actual_failed_ops += thread_stats["failed_operations"] + actual_total_attempts += thread_stats["total_attempts"] + + # Note: actual_ops_started includes init operations (get_length, get_stream) too + # Each RetryingStream.__init__ calls _retry_operation twice + expected_init_ops = num_threads * operations_per_thread * 2 # get_length + get_stream + expected_read_ops = local_total_ops # Should equal num_threads * operations_per_thread + 
expected_total_with_init = expected_init_ops + expected_read_ops + + print("RESULTS:") + print(f"Local tracking (from threads):") + print(f" Read operations: {local_total_ops}") + print(f" Operations with retries: {local_ops_with_retries}") + print(f"Global tracking (from stats aggregator):") + print(f" Operations started (including init): {actual_ops_started}") + print(f" Expected: {expected_total_with_init} ({expected_init_ops} init + {expected_read_ops} read)") + print(f" Failed operations: {actual_failed_ops}") + print(f" Expected: {expected_retries} (deterministic)") + print(f" Total attempts: {actual_total_attempts}") + print(f" Expected: {expected_total_with_init + expected_retries} (base + retry attempts)") + print("-" * 70) + + # Verify correctness + success = True + + # Sanity check: local tracking should match expected + if local_ops_with_retries != expected_retries: + print(f"⚠ WARNING: Local thread tracking mismatch (bug in test itself!)") + print(f" Expected retries: {expected_retries}, Local tracked: {local_ops_with_retries}") + + # Check that we tracked the right number of operations + if actual_ops_started != expected_total_with_init: + print(f"❌ FAIL: Operations started mismatch!") + print(f" Expected: {expected_total_with_init}, Got: {actual_ops_started}") + success = False + else: + print(f"✓ Operations started tracked correctly") + + # Check that failed operations matches exactly (deterministic) + if actual_failed_ops != expected_retries: + print(f"❌ FAIL: Failed operations mismatch!") + print(f" Expected: {expected_retries}, Got: {actual_failed_ops}") + success = False + else: + print(f"✓ Failed operations tracked correctly") + + # Check that total attempts equals base operations + retry attempts + expected_total_attempts = expected_total_with_init + expected_retries + if actual_total_attempts != expected_total_attempts: + print(f"❌ FAIL: Total attempts mismatch!") + print(f" Expected: {expected_total_attempts}, Got: {actual_total_attempts}") 
+ success = False + else: + print(f"✓ Total attempts tracked correctly") + + # Check that we created thread-local stats for each thread + num_registered_threads = len(stream_module._global_retry_stats.registered_threads) + if num_registered_threads != num_threads: + print(f"❌ FAIL: Incorrect number of threads registered!") + print(f" Expected: {num_threads}, Got: {num_registered_threads}") + success = False + else: + print(f"✓ All {num_threads} threads registered correctly") + + print("-" * 70) + if success: + print("✅ PASS: Multi-threaded statistics aggregation is CORRECT!") + else: + print("❌ FAIL: Multi-threaded statistics aggregation has ERRORS!") + + print("\nNote: Final stats will be logged via atexit handler when the program exits.") + + return success + + +def test_weakref_robustness(): + """Test that WeakSet-based tracking handles failures gracefully. + + Note: Final stats are logged via atexit handler at program exit, + not via destructors, so we won't see "Final" logs during this test. 
+ """ + print("\n" + "=" * 70) + print("ROBUSTNESS TEST: WeakSet Handles Thread Death & Init Failures") + print("=" * 70) + + # Enable stats for testing + stream_module.ENABLE_RETRY_STATS = True + + # Reset global stats + with stream_module._global_retry_stats.lock: + stream_module._global_retry_stats.registered_threads.clear() + stream_module._global_retry_stats.active_instances.clear() + stream_module._global_retry_stats.cumulative_operations_started = 0 + stream_module._global_retry_stats.cumulative_failed_operations = 0 + stream_module._global_retry_stats.cumulative_attempts = 0 + + client = MagicMock() + test_data = b"X" * 1024 + client.head_object.return_value = {"ContentLength": str(len(test_data))} + + # Test 1: Normal construction and destruction + print("Test 1: Normal construction and destruction") + mock_body = MagicMock() + mock_body.read.return_value = test_data + client.get_object.return_value = {"Body": mock_body, "ContentLength": len(test_data)} + + stream1 = RetryingStream(client, "bucket", "file1.tar", retries=5) + with stream_module._global_retry_stats.lock: + count = len(stream_module._global_retry_stats.active_instances) + print(f" After creating stream1: {count} active instance(s)") + assert count == 1, f"Expected 1 instance, got {count}" + + del stream1 + gc.collect() # Force garbage collection + + with stream_module._global_retry_stats.lock: + count = len(stream_module._global_retry_stats.active_instances) + print(f" After deleting stream1: {count} active instance(s)") + assert count == 0, f"Expected 0 instances, got {count}" + print(" ✓ Pass: Normal lifecycle works correctly") + + # Test 2: Exception during init (simulated by creating then raising) + print("\nTest 2: WeakSet cleans up even if instance only partially constructed") + stream2 = RetryingStream(client, "bucket", "file2.tar", retries=5) + with stream_module._global_retry_stats.lock: + count_before = len(stream_module._global_retry_stats.active_instances) + print(f" Created 
stream2: {count_before} active instance(s)") + + # Simulate early destruction (exception path, thread death, etc.) + del stream2 + gc.collect() # Force garbage collection + + with stream_module._global_retry_stats.lock: + count_after = len(stream_module._global_retry_stats.active_instances) + print(f" After destruction: {count_after} active instance(s)") + assert count_after == 0, f"Expected 0 instances after cleanup, got {count_after}" + print(" ✓ Pass: WeakSet automatically cleaned up") + + # Test 3: Multiple instances, destroy in random order + print("\nTest 3: Multiple instances with out-of-order destruction") + # Create streams with explicit references so we can delete specific ones + s0 = RetryingStream(client, "bucket", "file0.tar", retries=5) + s1 = RetryingStream(client, "bucket", "file1.tar", retries=5) + s2 = RetryingStream(client, "bucket", "file2.tar", retries=5) + s3 = RetryingStream(client, "bucket", "file3.tar", retries=5) + s4 = RetryingStream(client, "bucket", "file4.tar", retries=5) + + with stream_module._global_retry_stats.lock: + count = len(stream_module._global_retry_stats.active_instances) + print(f" Created 5 streams: {count} active instance(s)") + assert count == 5, f"Expected 5 instances, got {count}" + + # Delete specific streams (keep s1 and s3 alive) + del s0 + del s2 + del s4 + gc.collect() # Force garbage collection to ensure destructors run + + with stream_module._global_retry_stats.lock: + count = len(stream_module._global_retry_stats.active_instances) + print(f" After deleting 3 streams: {count} active instance(s)") + assert count == 2, f"Expected 2 instances (s1, s3), got {count}" + + # Clean up remaining (s1 and s3) + del s1 + del s3 + gc.collect() # Force garbage collection + + with stream_module._global_retry_stats.lock: + count = len(stream_module._global_retry_stats.active_instances) + print(f" After deleting all: {count} active instance(s)") + assert count == 0, f"Expected 0 instances, got {count}" + print(" ✓ Pass: 
Out-of-order destruction handled correctly") + + print("-" * 70) + print("✅ PASS: WeakSet-based tracking is ROBUST!") + print(" - Handles normal lifecycle") + print(" - Automatically cleans up dead references") + print(" - Works with arbitrary destruction order") + print(" - No risk of stuck counters or deadlocks") + + return True + + +def test_multi_rank_stats_logging(): + """Test stats logging with multiple ranks AND multiple workers per rank (simulating DataLoader). + + This test uses actual distributed launchers: + 1. Tests with torchrun (PyTorch's distributed launcher) if available + 2. Tests with mpirun (OpenMPI/MPICH) if available (requires mpi4py: `uv pip install mpi4py`) + 3. Skips test if neither available + + The worker script (mpi_rank_worker.py) is launched by real launchers. + Each rank spawns multiple worker processes (via multiprocessing) to simulate + DataLoader workers with num_workers > 0. + + Multi-level testing: + - Multiple ranks (distributed training) + - Multiple workers per rank (DataLoader processes) + - Different workload per worker (simulates real workload imbalance) + - Different failure patterns per worker + + This ensures: + - Each worker process has independent statistics (separate PID, separate _global_retry_stats) + - Each worker logs its own statistics with correct PID + - Worker processes can have the same thread ID but different PIDs + - Statistics are correctly isolated between workers and between ranks + + Note: mpi4py is an optional dependency only needed for mpirun testing. 
+ """ + + print("\n" + "=" * 70) + print("MULTI-RANK (REAL MPI/TORCHRUN) TEST") + print("=" * 70) + + world_size = 10 + + # Get path to worker script + worker_script = os.path.join(os.path.dirname(__file__), "mpi_rank_worker.py") + + # Check which launchers are available + available_launchers = {} + + print("Checking available launchers...") + + # Check torchrun + try: + result = subprocess.run(["torchrun", "--help"], capture_output=True, timeout=5) + if result.returncode == 0: + available_launchers["torchrun"] = [ + "torchrun", + "--standalone", + "--nnodes=1", + f"--nproc_per_node={world_size}", + worker_script, + ] + print(" ✓ torchrun available") + except (FileNotFoundError, subprocess.TimeoutExpired): + print(" ✗ torchrun not found") + + # Check mpirun + try: + result = subprocess.run(["mpirun", "--version"], capture_output=True, timeout=5) + if result.returncode == 0: + available_launchers["mpirun"] = [ + "mpirun", + "--oversubscribe", # Allow more processes than physical cores + "--tag-output", # Prefix output with [rank,node] + "-np", + str(world_size), + sys.executable, + "-u", # Unbuffered Python output + worker_script, + ] + print(" ✓ mpirun available") + except (FileNotFoundError, subprocess.TimeoutExpired): + print(" ✗ mpirun not found") + + if not available_launchers: + print("\n⚠ SKIP: Neither torchrun nor mpirun available") + print(" Install PyTorch (for torchrun) or OpenMPI (for mpirun)") + print("=" * 70) + return True # Not a failure, just skip + + print(f"\nTesting with {len(available_launchers)} launcher(s): {', '.join(available_launchers.keys())}") + print(f" World size: {world_size} ranks per launcher") + print(f" Worker script: {worker_script}") + print("=" * 70) + + # Test with each available launcher + all_passed = True + results = {} + + for launcher_name, launcher_cmd in available_launchers.items(): + print(f"\n{'=' * 70}") + print(f"TESTING WITH: {launcher_name}") + print(f"{'=' * 70}") + print(f"Command: {' '.join(launcher_cmd)}") + 
print("-" * 70) + + try: + result = subprocess.run( + launcher_cmd, + capture_output=True, # Capture output for verification + text=True, + timeout=120, # 2 minute timeout + ) + returncode = result.returncode + output = result.stdout + result.stderr + + # Print output to terminal + print(output) + + # Note: If a rank fails, torchrun/mpirun will show a stack trace indicating + # which rank failed. This is NORMAL and EXPECTED behavior - it's how + # distributed launchers report failures, not a crash or bug. + + except subprocess.TimeoutExpired: + print(f"\n❌ TIMEOUT: {launcher_name} did not complete within 2 minutes") + results[launcher_name] = "TIMEOUT" + all_passed = False + continue + except Exception as e: + print(f"\n❌ ERROR launching {launcher_name}: {e}") + results[launcher_name] = f"ERROR: {e}" + all_passed = False + continue + + if returncode != 0: + print(f"\n❌ {launcher_name} FAILED: exit code {returncode}") + results[launcher_name] = f"FAILED (exit {returncode})" + all_passed = False + else: + # Verify that each rank reported success + # Each rank now has multiple workers, so we check for rank-level success messages + # Look for the pattern "✅ Rank X: All N workers verified" + # Use regex to count occurrences (not lines) because stdout buffering can concatenate outputs + import re + + rank_summaries = len(re.findall(r"✅ Rank \d+: All \d+ workers verified", output)) + + if rank_summaries == world_size: + print(f"\n✅ {launcher_name} PASSED - All {world_size} ranks with their workers verified") + results[launcher_name] = "PASSED" + else: + print(f"\n❌ {launcher_name} PARTIAL SUCCESS - Only {rank_summaries}/{world_size} ranks verified") + results[launcher_name] = f"PARTIAL ({rank_summaries}/{world_size} ranks OK)" + all_passed = False + + # Print summary + print("\n" + "=" * 70) + print("MULTI-RANK TEST SUMMARY") + print("=" * 70) + + for launcher_name, result in results.items(): + status_icon = "✅" if result == "PASSED" else "❌" + print(f" {status_icon} 
{launcher_name}: {result}") + + print("-" * 70) + + if all_passed: + print(f"✅ OVERALL: PASS") + print(f"\nAll {world_size} ranks completed successfully with all launchers!") + print(f"\nVerified:") + print(f" ✓ Each rank ran as an independent process") + print(f" ✓ Each rank had separate Python interpreter and statistics") + print(f" ✓ Each rank used different failure patterns (rank-specific)") + print(f" ✓ Each rank's statistics matched expected values exactly") + print(f" ✓ Each rank logged its own statistics independently") + print(f" ✓ No shared memory or stat contamination between ranks") + print(f" ✓ Works with: {', '.join(results)}") + print(f"\nThis confirms statistics work correctly in distributed training!") + else: + print(f"❌ OVERALL: FAIL") + print(f"\nSome launchers failed - check output above for details") + + print("=" * 70) + return all_passed + + +def test_rank_failure_robustness(): + """Test that statistics logging doesn't cause deadlocks when ranks fail unexpectedly. + + This is a fault tolerance test that simulates real-world distributed training failures. 
+ + What we test: + - Random ranks are killed mid-execution (os._exit() simulates crash) + - Tests complete within timeout (NO DEADLOCK) + - Statistics logging doesn't hold locks that cause hangs + - Atexit handlers don't deadlock when ranks die + + What we expect: + - Test completes within 60s (key requirement - no hang/deadlock) + - torchrun: Kills all ranks (elastic fail-fast behavior - CORRECT) + - mpirun: May kill all ranks or allow survivors (both OK) + - Non-zero exit code (some ranks died - EXPECTED) + + What we DON'T test: + - Whether survivors complete their work (launcher-dependent) + - Recovery or re-launching (not our responsibility) + + This test is critical because in production, ranks can fail due to: + - Hardware failures (GPU crashes, node failures) + - OOM errors + - Network issues + - Data corruption + + The statistics logging must NOT make the system more brittle by adding deadlock risks. + """ + print("\n" + "=" * 70) + print("RANK FAILURE ROBUSTNESS TEST") + print("=" * 70) + print("Testing that statistics logging doesn't cause deadlocks when ranks fail") + print("=" * 70) + + world_size = 10 + num_failed_ranks = 3 # Kill 30% of ranks + + # Get path to worker script + worker_script = os.path.join(os.path.dirname(__file__), "mpi_rank_worker.py") + + # Randomly select ranks to kill + import random + + random.seed(42) # Deterministic for reproducibility + failed_ranks = random.sample(range(world_size), num_failed_ranks) + failed_ranks_str = ",".join(map(str, failed_ranks)) + + print(f"\n World size: {world_size} ranks") + print(f" Simulating failures: {num_failed_ranks} ranks will be killed mid-execution") + print(f" Failed ranks: {failed_ranks}") + print(f"\n SUCCESS CRITERIA:") + print(f" ✓ Test completes within 60s timeout (NO DEADLOCK)") + print(f" ✓ Job exits with non-zero code (ranks died as expected)") + print(f"\n NOTE: Launchers use fail-fast behavior (kill all ranks when one fails)") + print(f" This is CORRECT and EXPECTED behavior!") 
+ print("=" * 70) + + # Check which launchers are available + available_launchers = {} + + # Check torchrun + try: + result = subprocess.run(["torchrun", "--help"], capture_output=True, timeout=5) + if result.returncode == 0: + available_launchers["torchrun"] = [ + "torchrun", + f"--nproc_per_node={world_size}", + worker_script, + ] + except (FileNotFoundError, subprocess.TimeoutExpired): + pass + + # Check mpirun + try: + result = subprocess.run(["mpirun", "--version"], capture_output=True, timeout=5) + if result.returncode == 0: + available_launchers["mpirun"] = [ + "mpirun", + "-np", + str(world_size), + "--oversubscribe", + "--tag-output", + sys.executable, # Same interpreter as the test runner; a bare "python" may be missing from PATH + worker_script, + ] + except (FileNotFoundError, subprocess.TimeoutExpired): + pass + + if not available_launchers: + print("\n⚠ SKIPPING: Neither torchrun nor mpirun available") + print("This test requires a distributed launcher.") + return True # Skip test, don't fail + + print(f"\nAvailable launchers: {', '.join(available_launchers.keys())}") + + # Test with each available launcher + all_passed = True + results = {} + + for launcher_name, launcher_cmd in available_launchers.items(): + print(f"\n{'=' * 70}") + print(f"TESTING WITH: {launcher_name}") + print(f"{'=' * 70}") + + # Set environment variables for failure simulation + env = os.environ.copy() + env["SIMULATE_FAILURE_RANKS"] = failed_ranks_str + env["SKIP_BARRIER"] = "1" # Skip barrier to avoid deadlock + + try: + result = subprocess.run( + launcher_cmd, + capture_output=True, + text=True, + timeout=60, # Shorter timeout - should fail fast + env=env, + ) + returncode = result.returncode + output = result.stdout + result.stderr + + # Print output + print(output) + + except subprocess.TimeoutExpired: + print(f"\n❌ TIMEOUT: Test hung for 60 seconds - likely a DEADLOCK!") + print(f"This means the statistics logging caused a deadlock when ranks failed.") + results[launcher_name] = "DEADLOCK" + all_passed = False + continue + except Exception as e: +
print(f"\n❌ ERROR launching {launcher_name}: {e}") + results[launcher_name] = f"ERROR: {e}" + all_passed = False + continue + + # Expected behavior: Job should fail (some ranks died) but NOT hang/deadlock + # Key metric: Did it complete within timeout? If yes, NO DEADLOCK! + + # Count ranks that completed their work + success_count = output.count("All statistics verified correctly!") + killed_ranks_confirmed = output.count("SIMULATING RANK FAILURE") + expected_survivors = world_size - num_failed_ranks + + if returncode == 0: + print(f"\n⚠️ WARNING: {launcher_name} succeeded even though ranks were killed") + print(f"This is unexpected - check if failure simulation worked") + results[launcher_name] = "UNEXPECTED_SUCCESS" + all_passed = False + else: + # Job failed as expected (some ranks died) + # Check for deadlock: If we got here without timeout, NO DEADLOCK! + print(f"\n✅ {launcher_name} PASSED - No deadlock detected!") + print(f" Job completed within timeout (no hang/deadlock)") + print(f" Simulated failures: {killed_ranks_confirmed}/{num_failed_ranks} ranks") + print(f" Completed successfully: {success_count} ranks") + + # torchrun kills ALL ranks when ANY rank fails (elastic behavior) + # mpirun may allow survivors to continue + if launcher_name == "torchrun": + if success_count == 0: + print(f" ℹ️ torchrun killed all ranks (expected elastic fail-fast behavior)") + results[launcher_name] = "PASSED" + else: + print(f" ⚠️ Some ranks survived despite torchrun elastic mode") + results[launcher_name] = "PASSED" + else: # mpirun or other + if success_count >= expected_survivors - 1: # Allow 1 off due to timing + print(f" ℹ️ {success_count}/{expected_survivors} expected survivors completed") + results[launcher_name] = "PASSED" + elif success_count > 0: + print(f" ⚠️ Only {success_count}/{expected_survivors} survivors completed") + print(f" Some ranks may have been killed by launcher") + results[launcher_name] = "PASSED" # Still no deadlock + else: + print(f" ℹ️ 
Launcher killed all ranks (fail-fast behavior)") + results[launcher_name] = "PASSED" + + # Print summary + print("\n" + "=" * 70) + print("RANK FAILURE ROBUSTNESS TEST SUMMARY") + print("=" * 70) + + for launcher_name, result in results.items(): + status_icon = "✅" if result == "PASSED" else "⚠️" if "UNEXPECTED" in result else "❌" + print(f" {status_icon} {launcher_name}: {result}") + + print("-" * 70) + + if all_passed and results: + print(f"✅ OVERALL: PASS - No deadlocks detected!") + print(f"\nAll launchers handled rank failures without deadlock!") + print(f"\nVerified:") + print(f" ✓ {num_failed_ranks} ranks were killed mid-execution (simulated failures)") + print(f" ✓ No deadlocks or hangs detected (completed within timeout)") + print(f" ✓ Statistics logging didn't hold locks that cause deadlocks") + print(f" ✓ Launchers handled failures with fail-fast behavior") + print(f"\nKey finding:") + print(f" • torchrun: Uses elastic fail-fast (kills all ranks when one fails)") + print(f" • mpirun: May allow survivors or use fail-fast depending on config") + print(f" • Both behaviors are CORRECT - no deadlock is the critical requirement") + print(f"\nThis confirms the statistics infrastructure is fault-tolerant!") + elif not results: + print(f"⚠️ OVERALL: SKIP - No launchers available for testing") + else: + print(f"❌ OVERALL: FAIL") + print(f"\nSome tests failed - check for DEADLOCK or TIMEOUT issues above") + print(f"\nIf tests TIMEOUT, that indicates a DEADLOCK problem.") + print(f"If tests complete quickly with 0 survivors, that's fail-fast (OK).") + + print("=" * 70) + return all_passed + + +def test_dataloader_worker_process_isolation(): + """Test to prove DataLoader workers are separate processes with independent statistics. + + This test demonstrates that: + 1. Each DataLoader worker is a separate process (different PID) + 2. Each worker process's main thread can have the same thread ID + 3. 
Each worker has its own _global_retry_stats instance with independent counters + 4. This explains why production logs show the same thread ID with different operation counts + """ + import multiprocessing + import queue + + from torch.utils.data import DataLoader, Dataset + + print("\n" + "=" * 70) + print("TEST: DataLoader Worker Process Isolation") + print("=" * 70) + + # Queue to collect results from worker processes + result_queue = multiprocessing.Manager().Queue() + + class DataLoaderTestDataset(Dataset): + """Dataset that reports process/thread info from workers.""" + + def __init__(self, num_items: int, result_queue: queue.Queue): + self.num_items = num_items + self.result_queue = result_queue + + def __len__(self) -> int: + return self.num_items + + def __getitem__(self, idx: int) -> dict: + """Each worker performs S3 operations and reports its PID/thread ID/stats.""" + import os + import threading + from unittest.mock import MagicMock + + import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module + from cosmos3._src.imaginaire.datasets.webdataset.utils.stream import RetryingStream + from cosmos3._src.imaginaire.utils import log + + # Initialize logging in worker process (each worker subprocess needs this) + # Only initialize once per worker + if not hasattr(self, "_log_initialized"): + log.init_loguru_stdout() + self._log_initialized = True + + # Get process and thread info + pid = os.getpid() + thread_id = threading.get_ident() + + # Create mock S3 client + client = MagicMock() + test_data = b"X" * 1024 + client.head_object.return_value = {"ContentLength": str(len(test_data))} + + # Vary number of operations per item so each worker gets unique totals + # This makes it clear in logs that each worker has independent statistics + num_ops = 5 + (idx % 4) * 5 # 5, 10, 15, 20 ops depending on idx + + # Perform S3 operations to increment stats + for i in range(num_ops): + mock_body = MagicMock() + mock_body.read.return_value = test_data + 
client.get_object.return_value = {"Body": mock_body, "ContentLength": len(test_data)} + + stream = RetryingStream(client, f"bucket-worker{pid}", f"file-{idx}-{i}.tar", retries=5) + _ = stream.read(1024) + + # Force a log every few items to demonstrate the logging behavior + # This simulates what would happen in production when 60 seconds elapse + if idx % 3 == 0: # Log every 3rd item + # Force logging by resetting the timer + with stream_module._global_retry_stats.lock: + stream_module._global_retry_stats.last_log_time = 0 + + # Trigger periodic log (will show cumulative stats for this worker) + stream_module._maybe_log_retry_stats() + + # Get cumulative stats from this worker's _global_retry_stats + with stream_module._global_retry_stats.lock: + ops_started = stream_module._global_retry_stats.cumulative_operations_started + failed_ops = stream_module._global_retry_stats.cumulative_failed_operations + total_attempts = stream_module._global_retry_stats.cumulative_attempts + + # Report to main process + result = { + "idx": idx, + "pid": pid, + "thread_id": thread_id, + "ops_started": ops_started, + "failed_ops": failed_ops, + "total_attempts": total_attempts, + } + + self.result_queue.put(result) + return result + + # Create DataLoader with multiple workers + num_workers = 4 + items_per_worker = 5 + dataset = DataLoaderTestDataset(num_items=num_workers * items_per_worker, result_queue=result_queue) + + print(f"\nCreating DataLoader with {num_workers} workers...") + print(f"Each worker will process ~{items_per_worker} items") + print(f"Number of S3 operations varies per item: 5, 10, 15, or 20 read ops") + print(f"Each item: (2 * num_read_ops) init operations + num_read_ops read operations") + print(f"This creates DIFFERENT operation counts per worker (proves isolation)") + print(f"\nWorkers will log statistics every 3 items (simulating periodic logs)") + print(f"Watch for logs showing the SAME thread ID but DIFFERENT operation counts!\n") + + dataloader = DataLoader( 
+ dataset, + batch_size=1, + num_workers=num_workers, + shuffle=False, + ) + + # Consume the dataloader (this triggers worker processes) + for batch in dataloader: + pass # Workers send results via queue + + # Collect all results from workers + results = [] + while not result_queue.empty(): + try: + results.append(result_queue.get_nowait()) + except queue.Empty: + break + + # Analyze results + print(f"Collected {len(results)} results from workers\n") + + # Group by PID + by_pid = {} + for result in results: + pid = result["pid"] + if pid not in by_pid: + by_pid[pid] = [] + by_pid[pid].append(result) + + print(f"Found {len(by_pid)} unique worker PIDs:") + + thread_ids_by_pid = {} + for pid, items in sorted(by_pid.items()): + thread_ids = set(item["thread_id"] for item in items) + thread_ids_by_pid[pid] = thread_ids + + # Get final stats for this worker (last item has cumulative total) + final_stats = max(items, key=lambda x: x["ops_started"]) + + print(f" PID {pid}:") + print(f" Thread ID(s): {thread_ids}") + print(f" Processed {len(items)} items") + print( + f" Cumulative stats: {final_stats['ops_started']} ops, " + f"{final_stats['failed_ops']} failed, {final_stats['total_attempts']} attempts" + ) + + # Check for thread ID collisions across processes + all_thread_ids = [] + for thread_ids in thread_ids_by_pid.values(): + all_thread_ids.extend(thread_ids) + + thread_id_counts = {} + for tid in all_thread_ids: + thread_id_counts[tid] = thread_id_counts.get(tid, 0) + 1 + + duplicated_thread_ids = {tid: count for tid, count in thread_id_counts.items() if count > 1} + + print("\n" + "-" * 70) + print("ANALYSIS:") + print("-" * 70) + + if duplicated_thread_ids: + print(f"✅ Found thread ID collision(s): {len(duplicated_thread_ids)} thread IDs shared across processes") + for tid, count in duplicated_thread_ids.items(): + print(f" Thread ID {tid} appears in {count} different worker processes") + print("\nThis proves that the same thread ID can exist in multiple 
processes!") + else: + print("⚠️ No thread ID collisions found (less common, but still valid)") + print(" Each worker happened to get a unique thread ID") + + print(f"\nEach worker process has INDEPENDENT _global_retry_stats (different counts):") + operation_counts = [] + for pid, items in sorted(by_pid.items()): + final_stats = max(items, key=lambda x: x["ops_started"]) + operation_counts.append(final_stats["ops_started"]) + print(f" PID {pid}: {final_stats['ops_started']} operations (independent counter)") + + # Verify that operation counts are different (proving independence) + if len(set(operation_counts)) > 1: + print(f"\n✅ Workers have DIFFERENT operation counts: {operation_counts}") + print(" This proves each process has its own independent statistics!") + else: + print(f"\n⚠️ Workers have same counts (less typical, but still independent processes)") + + print("\n" + "=" * 70) + print("CONCLUSION:") + print("=" * 70) + print("During the test, you should have seen WARNING logs like:") + print(" [RetryingStream Stats] RANK-LOCAL: X ops, Y failed...") + print(" Thread NNNN: X ops...") + print("\nThese logs likely showed:") + print(" - The SAME thread ID appearing multiple times") + print(" - DIFFERENT operation counts with that same thread ID") + print(" - This EXACTLY matches what you see in production!") + print("\nWhy production logs show:") + print(" - Same thread ID (e.g., 23456244278592) across ALL log entries") + print(" - DIFFERENT/DECREASING operation counts with the same thread ID") + print("\nThe explanation:") + print(" - Each DataLoader worker is a SEPARATE PROCESS with its own memory") + print(" - Each process has its own independent _global_retry_stats instance") + print(" - Thread IDs are process-local (NOT globally unique)") + print(" - The main thread in each process often gets the same thread ID") + print(" - When different workers log at different times → interleaved stats") + print(" - Different workers have different workloads → different 
operation counts") + print("\nThe solution:") + print(" 1. Use cumulative_* counters (already maintained, protected by lock)") + print(" 2. Add PID to log messages to distinguish which worker is logging") + print(" 3. Understand that 'decreasing' counts are actually different processes") + print("=" * 70) + + +if __name__ == "__main__": + # First, run robustness test + test_passed = test_weakref_robustness() + + if not test_passed: + print("\n⚠ WARNING: Robustness test failed!") + time.sleep(2) + + # Second, run correctness test + test_passed = test_multithreaded_stats_correctness() + + if not test_passed: + print("\n⚠ WARNING: Correctness test failed! Proceeding with benchmarks anyway...") + time.sleep(2) + + # Third, run multi-rank test + test_passed = test_multi_rank_stats_logging() + + if not test_passed: + print("\n⚠ WARNING: Multi-rank test failed! Check errors above.") + time.sleep(2) + + # Fourth, run rank failure robustness test + print("\n\nWaiting 3 seconds before fault tolerance test...") + time.sleep(3) + + test_passed = test_rank_failure_robustness() + + if not test_passed: + print("\n⚠ WARNING: Rank failure robustness test failed!") + print("This means the statistics logging may cause deadlocks when ranks fail.") + time.sleep(2) + + # Fifth, run DataLoader worker isolation test + print("\n\nWaiting 3 seconds before DataLoader worker isolation test...") + time.sleep(3) + + test_dataloader_worker_process_isolation() + + # Warmup (longer to stabilize JIT and caches) + print("\n" + "=" * 70) + print("PERFORMANCE BENCHMARKS") + print("=" * 70) + print("\nWarming up...") + for _ in range(10): + benchmark_iteration(True, 1000, 0) + benchmark_iteration(False, 1000, 0) + + # Benchmark 1: Synthetic (no network delay) - shows maximum overhead + run_benchmark_suite( + name="BENCHMARK 1: Synthetic (No Network Delay)", num_operations=100000, num_runs=10, network_delay_ms=0 + ) + + # Benchmark 2: Aggressive in-region latency (1ms per read) + # This represents 
same-region VM-to-S3/GCS with high-bandwidth network + print("\n\nWaiting 2 seconds before realistic benchmark...") + time.sleep(2) + + run_benchmark_suite( + name="BENCHMARK 2: In-Region VM to S3/GCS (1ms latency)", + num_operations=1000, # 1 second total with 1ms each + num_runs=10, + network_delay_ms=1.0, + ) + + # Benchmark 3: Typical cross-region or slower network (10ms per read) + print("\n\nWaiting 2 seconds before final benchmark...") + time.sleep(2) + + run_benchmark_suite( + name="BENCHMARK 3: Cross-Region or Slower Network (10ms latency)", + num_operations=200, # 2 seconds total with 10ms each + num_runs=10, + network_delay_ms=10.0, + ) + + print("\n" + "=" * 70) + print("SUMMARY") + print("=" * 70) + print("The synthetic benchmark (no delay) shows maximum theoretical overhead.") + print("The realistic benchmarks show actual production impact:") + print(" - 1ms: Aggressive same-region VM to S3/GCS") + print(" - 10ms: Typical cross-region or shared network") + print("\nIn real-world S3/GCS usage, the overhead is completely negligible") + print("because network I/O dominates (1-100ms vs ~200ns overhead).") + + print("\n" + "-" * 70) + print("NOTE: Synthetic benchmark variance (5-20%) is normal and caused by:") + print(" - CPU frequency scaling (1.2GHz → 4.5GHz dynamically)") + print(" - Thermal throttling as CPU heats up") + print(" - Background OS processes") + print(" - Cache effects (cold vs warm)") + print("\nThis variance does NOT exist in production with real network I/O!") + print("The ~200ns overhead is consistent; only the baseline varies.") + + # Re-enable stats for normal operation + stream_module.ENABLE_RETRY_STATS = True diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamTarIteratorTest.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamTarIteratorTest.py new file mode 100644 index 00000000..65ad87c5 --- /dev/null +++ 
b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/RetryingStreamTarIteratorTest.py @@ -0,0 +1,975 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Integration tests for RetryingStream compatibility with tar_file_iterator. +These tests ensure that RetryingStream works correctly when used as a stream +for webdataset's tar_file_iterator function. 
+ +These tests simulate various failure scenarios to verify retry behavior: +- Early failures during tar header reading +- Multiple consecutive failures requiring multiple retries +- Failures during file data block reading +- Exhausted retries leading to error propagation +- Different types of network exceptions (URLLib3, IncompleteRead) +- botocore ResponseStreamingError (wraps IncompleteRead from production logs) +- botocore ConnectionClosedError (connection closed unexpectedly) +- botocore ReadTimeoutError (read timeout on boto3 layer, distinct from urllib3) +""" + +import io +import tarfile +from http.client import IncompleteRead +from unittest.mock import MagicMock, patch + +from botocore.exceptions import ConnectionClosedError, ResponseStreamingError +from botocore.exceptions import ReadTimeoutError as BotocoreReadTimeoutError +from urllib3.exceptions import ProtocolError as URLLib3ProtocolError +from urllib3.exceptions import ReadTimeoutError as URLLib3ReadTimeoutError +from webdataset.tariterators import tar_file_iterator + +import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +from cosmos3._src.imaginaire.datasets.webdataset.utils.stream import RetryingStream + +# Configure faster logging interval for tests (10 seconds instead of 5 minutes) +stream_module.RETRY_STATS_LOG_INTERVAL = 10.0 + +# Test 1: Basic compatibility - RetryingStream works with tar_file_iterator +print("Test 1: RetryingStream basic compatibility with tar_file_iterator") + +# Create a real tar file in memory +tar_buffer = io.BytesIO() +with tarfile.open(fileobj=tar_buffer, mode="w") as tar: + # Add sample files to the tar + sample1_data = b"This is sample 1 content" + sample1_info = tarfile.TarInfo(name="sample001.txt") + sample1_info.size = len(sample1_data) + tar.addfile(sample1_info, io.BytesIO(sample1_data)) + + sample2_data = b"This is sample 2 content with more data" + sample2_info = tarfile.TarInfo(name="sample002.txt") + sample2_info.size = 
len(sample2_data) + tar.addfile(sample2_info, io.BytesIO(sample2_data)) + +# Get the tar file bytes +tar_bytes = tar_buffer.getvalue() +tar_size = len(tar_bytes) + +# Create a mock S3 client that returns this tar file +client = MagicMock() +client.head_object.return_value = {"ContentLength": str(tar_size)} + +# Create a mock body that simulates boto3's StreamingBody +mock_body = MagicMock() +mock_body._raw_stream = io.BytesIO(tar_bytes) + + +def mock_read(amt=None): + """Simulate StreamingBody.read() behavior""" + return mock_body._raw_stream.read(amt) + + +mock_body.read = mock_read + +client.get_object.return_value = {"Body": mock_body, "ContentLength": tar_size} + +# Create a RetryingStream +retrying_stream = RetryingStream(client, "test-bucket", "test.tar", retries=3) + +# Pass the RetryingStream to tar_file_iterator +samples = [] +try: + for sample in tar_file_iterator(retrying_stream): + samples.append(sample) + print(f" ✓ Extracted: {sample['fname']}, size: {len(sample['data'])} bytes") +except Exception as e: + print(f" ✗ Error during tar iteration: {e}") + raise + +# Verify results +assert len(samples) == 2, f"Expected 2 samples but got {len(samples)}" +assert samples[0]["fname"] == "sample001.txt", f"Expected 'sample001.txt' but got {samples[0]['fname']}" +assert samples[0]["data"] == b"This is sample 1 content", "Sample 1 data mismatch" +assert samples[1]["fname"] == "sample002.txt", f"Expected 'sample002.txt' but got {samples[1]['fname']}" +assert samples[1]["data"] == b"This is sample 2 content with more data", "Sample 2 data mismatch" + +print(f"✓ Successfully extracted {len(samples)} samples from tar via RetryingStream") +print("✓ All samples have correct filenames and content") + + +# Test 2: Multiple files in tar +print("\nTest 2: RetryingStream with multiple files in tar") + +tar_buffer2 = io.BytesIO() +with tarfile.open(fileobj=tar_buffer2, mode="w") as tar: + # Add multiple test files + for i in range(5): + data = f"Sample {i} data content with 
index {i}".encode() + info = tarfile.TarInfo(name=f"sample{i:03d}.json") + info.size = len(data) + tar.addfile(info, io.BytesIO(data)) + +tar_bytes2 = tar_buffer2.getvalue() +tar_size2 = len(tar_bytes2) + +client2 = MagicMock() +client2.head_object.return_value = {"ContentLength": str(tar_size2)} + +mock_body2 = MagicMock() +mock_body2._raw_stream = io.BytesIO(tar_bytes2) +mock_body2.read = lambda amt=None: mock_body2._raw_stream.read(amt) + +client2.get_object.return_value = {"Body": mock_body2, "ContentLength": tar_size2} + +retrying_stream2 = RetryingStream(client2, "test-bucket", "test2.tar", retries=3) + +samples2 = [] +for sample in tar_file_iterator(retrying_stream2): + samples2.append(sample) + +assert len(samples2) == 5, f"Expected 5 samples but got {len(samples2)}" +for i, sample in enumerate(samples2): + expected_name = f"sample{i:03d}.json" + expected_data = f"Sample {i} data content with index {i}".encode() + assert sample["fname"] == expected_name, f"Sample {i}: Expected '{expected_name}' but got {sample['fname']}" + assert sample["data"] == expected_data, f"Sample {i}: Data mismatch" + print(f" ✓ Sample {i}: {sample['fname']} ({len(sample['data'])} bytes)") + +print(f"✓ Successfully extracted all {len(samples2)} samples") + + +# Test 3: Retry during tar_file_iterator reading - Early failure in tar header +print("\nTest 3: RetryingStream retries during early tar header reading") + +# Create a tar file +tar_buffer3 = io.BytesIO() +with tarfile.open(fileobj=tar_buffer3, mode="w") as tar: + for i in range(3): + data = f"Sample {i} data content".encode() + info = tarfile.TarInfo(name=f"file{i}.txt") + info.size = len(data) + tar.addfile(info, io.BytesIO(data)) + +tar_bytes3 = tar_buffer3.getvalue() +tar_size3 = len(tar_bytes3) +print(f" Test tar size: {tar_size3} bytes") + +# Mock client with simulated failure during first tar header read +client3 = MagicMock() +client3.head_object.return_value = {"ContentLength": str(tar_size3)} + +# Track state across 
stream instances +state = {"bytes_read": 0, "read_count": 0, "fail_triggered": False} + +# First body fails after partial read of first tar header (512 bytes) +mock_body_fail = MagicMock() +mock_body_fail.close = MagicMock() + + +def failing_read(amt=None): + """Simulate a failure during tar header reading""" + state["read_count"] += 1 + # Read some bytes successfully first + if state["bytes_read"] < 256: # Read half of first tar header + chunk_size = min(256 - state["bytes_read"], amt if amt else 1024) + chunk = tar_bytes3[state["bytes_read"] : state["bytes_read"] + chunk_size] + state["bytes_read"] += len(chunk) + print(f" Read attempt {state['read_count']}: read {len(chunk)} bytes (total: {state['bytes_read']})") + return chunk + else: + # Fail on next read + print(f" Read attempt {state['read_count']}: SIMULATING FAILURE at byte {state['bytes_read']}") + state["fail_triggered"] = True + raise IncompleteRead(b"partial") + + +mock_body_fail.read = failing_read + +# Second body succeeds from the retry point +mock_body_success = MagicMock() + + +def success_read(amt=None): + """Successful read from retry point""" + if amt is None or amt < 0: + chunk = tar_bytes3[state["bytes_read"] :] + state["bytes_read"] = len(tar_bytes3) + else: + chunk = tar_bytes3[state["bytes_read"] : state["bytes_read"] + amt] + state["bytes_read"] += len(chunk) + if len(chunk) > 0: + print(f" Retry read: {len(chunk)} bytes (total: {state['bytes_read']})") + return chunk + + +mock_body_success.read = success_read + +client3.get_object.side_effect = [ + {"Body": mock_body_fail, "ContentLength": tar_size3}, + {"Body": mock_body_success, "ContentLength": tar_size3 - state["bytes_read"]}, +] + +retrying_stream3 = RetryingStream(client3, "test-bucket", "test3.tar", retries=5) + +# Try to read from tar_file_iterator with retry +samples3 = [] +with patch("time.sleep"): # Skip sleep delays + with patch("cosmos3._src.imaginaire.datasets.webdataset.utils.stream.log"): # Suppress retry logs + try: + 
for sample in tar_file_iterator(retrying_stream3): + samples3.append(sample) + print(f" ✓ Extracted: {sample['fname']}") + print(f"✓ Successfully recovered and extracted {len(samples3)} samples after network error") + print(f"✓ Failure was triggered: {state['fail_triggered']}") + print(f"✓ Stream reconnected and continued from byte {256}") + assert len(samples3) == 3, f"Expected 3 samples but got {len(samples3)}" + except Exception as e: + print(f" ℹ Partial recovery scenario: {type(e).__name__}: {e}") + print(f" ℹ Bytes read before failure: {256}") + print(f" ℹ RetryingStream retries at byte-level, but tar may need full restart") + assert state["fail_triggered"], "Failure should have been triggered" + print(" ✓ RetryingStream attempted retry as expected") + + +# Test 3b: Multiple failures before success +print("\nTest 3b: Multiple retry attempts before success") + +tar_buffer3b = io.BytesIO() +with tarfile.open(fileobj=tar_buffer3b, mode="w") as tar: + data = b"Test data for multiple retries" + info = tarfile.TarInfo(name="retry_test.txt") + info.size = len(data) + tar.addfile(info, io.BytesIO(data)) + +tar_bytes3b = tar_buffer3b.getvalue() +tar_size3b = len(tar_bytes3b) + +client3b = MagicMock() +client3b.head_object.return_value = {"ContentLength": str(tar_size3b)} + +# State for multiple retries +state3b = {"bytes_read": 0, "failure_count": 0, "get_stream_calls": 0} + + +# Create multiple failing bodies +def create_failing_body(fail_after_bytes: int) -> MagicMock: + """Create a body that fails after reading specific number of bytes""" + body = MagicMock() + body.close = MagicMock() + body_state = {"local_read": 0} + + def read_then_fail(amt=None): + if body_state["local_read"] >= fail_after_bytes: + state3b["failure_count"] += 1 + print(f" Failure #{state3b['failure_count']} at byte {state3b['bytes_read']}") + raise IncompleteRead(b"fail") + chunk_size = min(fail_after_bytes - body_state["local_read"], amt if amt else 1024) + chunk = 
tar_bytes3b[state3b["bytes_read"] : state3b["bytes_read"] + chunk_size] + body_state["local_read"] += len(chunk) + state3b["bytes_read"] += len(chunk) + return chunk + + body.read = read_then_fail + return body + + +# First body fails after 100 bytes, second after 150, third succeeds +def get_object_multi_fail(**kwargs): + state3b["get_stream_calls"] += 1 + if state3b["get_stream_calls"] == 1: + return {"Body": create_failing_body(100), "ContentLength": tar_size3b} + elif state3b["get_stream_calls"] == 2: + return {"Body": create_failing_body(150), "ContentLength": tar_size3b - state3b["bytes_read"]} + else: + # Final success body + body = MagicMock() + + def success_read_final(amt=None): + chunk = tar_bytes3b[state3b["bytes_read"] :] + state3b["bytes_read"] = len(tar_bytes3b) + return chunk + + body.read = success_read_final + return {"Body": body, "ContentLength": tar_size3b - state3b["bytes_read"]} + + +client3b.get_object.side_effect = get_object_multi_fail + +retrying_stream3b = RetryingStream(client3b, "test-bucket", "multi_retry.tar", retries=5) + +samples3b = [] +with patch("time.sleep"): + with patch("cosmos3._src.imaginaire.datasets.webdataset.utils.stream.log"): + try: + for sample in tar_file_iterator(retrying_stream3b): + samples3b.append(sample) + print(f" ✓ Extracted after {state3b['failure_count']} failures: {sample['fname']}") + assert len(samples3b) == 1, f"Expected 1 sample but got {len(samples3b)}" + print(f"✓ Successfully handled {state3b['failure_count']} failures with automatic retry") + print(f"✓ Total stream reconnections: {state3b['get_stream_calls'] - 1}") + except Exception as e: + print(f" ✗ Multiple retry scenario failed: {type(e).__name__}") + print(f" ✗ Failures encountered: {state3b['failure_count']}") + print(f" ✗ Retry attempts made: {state3b['get_stream_calls'] - 1}") + raise AssertionError(f"Test 3b failed: Multiple retries did not recover - {type(e).__name__}: {e}") from e + + +# Test 3c: Failure during file data reading (not 
header) +print("\nTest 3c: Retry during file data block reading") + +tar_buffer3c = io.BytesIO() +with tarfile.open(fileobj=tar_buffer3c, mode="w") as tar: + # Create file with larger data to ensure failure happens during data read + large_data = b"X" * 2048 # 2KB of data + info = tarfile.TarInfo(name="largefile.bin") + info.size = len(large_data) + tar.addfile(info, io.BytesIO(large_data)) + +tar_bytes3c = tar_buffer3c.getvalue() +tar_size3c = len(tar_bytes3c) +print(f" Test tar size: {tar_size3c} bytes (512 header + 2048 data)") + +client3c = MagicMock() +client3c.head_object.return_value = {"ContentLength": str(tar_size3c)} + +state3c = {"bytes_read": 0, "failed": False} + +# First body: read header successfully, fail during data +mock_body_fail_data = MagicMock() +mock_body_fail_data.close = MagicMock() + + +def fail_during_data(amt=None): + """Read tar header OK, fail during data block""" + # Let header pass (512 bytes) + if state3c["bytes_read"] < 512: + chunk = tar_bytes3c[state3c["bytes_read"] : 512] + state3c["bytes_read"] = 512 + print(f" Read tar header: 512 bytes") + return chunk + # Fail during data block + if state3c["bytes_read"] < 1024: + chunk = tar_bytes3c[state3c["bytes_read"] : 1024] + state3c["bytes_read"] = 1024 + print(f" Read partial data: {len(chunk)} bytes") + return chunk + # Now fail + print(f" FAILURE during data block at byte {state3c['bytes_read']}") + state3c["failed"] = True + raise IncompleteRead(b"data_fail") + + +mock_body_fail_data.read = fail_during_data + +# Success body continues from retry point +mock_body_success_data = MagicMock() + + +def success_read_data(amt=None): + chunk_size = min(amt if amt else 4096, tar_size3c - state3c["bytes_read"]) + chunk = tar_bytes3c[state3c["bytes_read"] : state3c["bytes_read"] + chunk_size] + state3c["bytes_read"] += len(chunk) + if len(chunk) > 0: + print(f" Retry continuing: read {len(chunk)} bytes (total: {state3c['bytes_read']})") + return chunk + + +mock_body_success_data.read = 
success_read_data + +client3c.get_object.side_effect = [ + {"Body": mock_body_fail_data, "ContentLength": tar_size3c}, + {"Body": mock_body_success_data, "ContentLength": tar_size3c - state3c["bytes_read"]}, +] + +retrying_stream3c = RetryingStream(client3c, "test-bucket", "datablock.tar", retries=5) + +samples3c = [] +with patch("time.sleep"): + with patch("cosmos3._src.imaginaire.datasets.webdataset.utils.stream.log"): + try: + for sample in tar_file_iterator(retrying_stream3c): + samples3c.append(sample) + print(f" ✓ Extracted: {sample['fname']} ({len(sample['data'])} bytes)") + assert len(samples3c) == 1, f"Expected 1 sample but got {len(samples3c)}" + assert samples3c[0]["data"] == b"X" * 2048, "Data integrity check failed" + print(f"✓ Successfully recovered from failure during data block reading") + print(f"✓ Data integrity maintained: {len(samples3c[0]['data'])} bytes verified") + assert state3c["failed"], "Failure should have been triggered during data read" + except Exception as e: + print(f" ✗ Data block failure scenario failed: {type(e).__name__}: {str(e)[:100]}") + print(f" ✗ Failure occurred at: {'during data read' if state3c['failed'] else 'unexpected location'}") + assert state3c["failed"], "Should have triggered failure during data read" + raise AssertionError(f"Test 3c failed: Data block retry did not recover - {type(e).__name__}: {e}") from e + + +# Test 3d: Exhausted retries - all attempts fail +print("\nTest 3d: Exhausted retries - failure propagates after max attempts") + +tar_buffer3d = io.BytesIO() +with tarfile.open(fileobj=tar_buffer3d, mode="w") as tar: + data = b"Test data that will never be read" + info = tarfile.TarInfo(name="unreachable.txt") + info.size = len(data) + tar.addfile(info, io.BytesIO(data)) + +tar_bytes3d = tar_buffer3d.getvalue() +tar_size3d = len(tar_bytes3d) + +client3d = MagicMock() +client3d.head_object.return_value = {"ContentLength": str(tar_size3d)} + +state3d = {"bytes_read": 0, "attempt_count": 0} + + +def 
always_fail_read(amt=None): + """Always fail after reading a bit""" + state3d["attempt_count"] += 1 + if state3d["bytes_read"] < 100: + chunk = tar_bytes3d[state3d["bytes_read"] : 100] + state3d["bytes_read"] = 100 + return chunk + print(f" Attempt {state3d['attempt_count']}: FAILING") + raise IncompleteRead(b"always_fails") + + +# All stream attempts will fail +def always_fail_get_object(**kwargs): + body = MagicMock() + body.close = MagicMock() + body.read = always_fail_read + return {"Body": body, "ContentLength": tar_size3d - state3d["bytes_read"]} + + +client3d.get_object.side_effect = always_fail_get_object + +retrying_stream3d = RetryingStream(client3d, "test-bucket", "fail.tar", retries=3) + +exception_caught = False +exception_type = None +with patch("time.sleep"): + with patch("cosmos3._src.imaginaire.datasets.webdataset.utils.stream.log"): + try: + samples3d = list(tar_file_iterator(retrying_stream3d)) + except Exception as e: + exception_caught = True + exception_type = type(e).__name__ + print(f" ✓ Exception propagated after exhausting retries: {exception_type}") + print(f" ✓ Total retry attempts: {state3d['attempt_count'] - 1}") + +assert exception_caught, "Should have raised exception after exhausting retries" +assert state3d["attempt_count"] > 1, f"Should have retried multiple times, got {state3d['attempt_count']}" +print(f"✓ Correctly exhausted retries and propagated error") + + +# Test 3e: Different exception types (URLLib3 errors) +print("\nTest 3e: Retry on different network exception types") + +tar_buffer3e = io.BytesIO() +with tarfile.open(fileobj=tar_buffer3e, mode="w") as tar: + data = b"Data for exception type testing" + info = tarfile.TarInfo(name="exception_test.txt") + info.size = len(data) + tar.addfile(info, io.BytesIO(data)) + +tar_bytes3e = tar_buffer3e.getvalue() +tar_size3e = len(tar_bytes3e) + +# Test with ReadTimeoutError then ProtocolError +client3e = MagicMock() +client3e.head_object.return_value = {"ContentLength": 
str(tar_size3e)} + +state3e = {"bytes_read": 0, "exceptions_raised": [], "get_object_calls": 0} + +# First body: raises ReadTimeoutError immediately, before any data is read +mock_body_fail1 = MagicMock() +mock_body_fail1.close = MagicMock() + + +def first_error_read(amt=None): + """Raise ReadTimeoutError immediately without reading""" + state3e["exceptions_raised"].append("ReadTimeoutError") + print(f" Raising ReadTimeoutError at byte {state3e['bytes_read']}") + raise URLLib3ReadTimeoutError(None, None, "Timeout") + + +mock_body_fail1.read = first_error_read + +# Second body: raises ProtocolError immediately, before any data is read +mock_body_fail2 = MagicMock() +mock_body_fail2.close = MagicMock() + + +def second_error_read(amt=None): + """Raise ProtocolError immediately without reading""" + state3e["exceptions_raised"].append("ProtocolError") + print(f" Raising ProtocolError at byte {state3e['bytes_read']}") + raise URLLib3ProtocolError("Connection broken") + + +mock_body_fail2.read = second_error_read + +# Final success body +mock_body_success = MagicMock() + + +def final_success(amt=None): + """Successful read from current position""" + if amt is None or amt < 0: + chunk = tar_bytes3e[state3e["bytes_read"] :] + state3e["bytes_read"] = len(tar_bytes3e) + else: + chunk = tar_bytes3e[state3e["bytes_read"] : state3e["bytes_read"] + amt] + state3e["bytes_read"] += len(chunk) + if len(chunk) > 0: + print(f" Read {len(chunk)} bytes (total: {state3e['bytes_read']})") + return chunk + + +mock_body_success.read = final_success + + +def get_with_different_errors(**kwargs): + """Return different bodies that raise different errors""" + state3e["get_object_calls"] += 1 + if state3e["get_object_calls"] == 1: + return {"Body": mock_body_fail1, "ContentLength": tar_size3e} + elif state3e["get_object_calls"] == 2: + return {"Body": mock_body_fail2, "ContentLength": tar_size3e - state3e["bytes_read"]} + else: + return {"Body": mock_body_success, "ContentLength": tar_size3e - state3e["bytes_read"]} + +
+client3e.get_object.side_effect = get_with_different_errors + +retrying_stream3e = RetryingStream(client3e, "test-bucket", "exceptions.tar", retries=5) + +samples3e = [] +with patch("time.sleep"): + with patch("cosmos3._src.imaginaire.datasets.webdataset.utils.stream.log"): + try: + for sample in tar_file_iterator(retrying_stream3e): + samples3e.append(sample) + print(f" ✓ Extracted: {sample['fname']}") + print(f"✓ Successfully handled multiple exception types:") + for exc in state3e["exceptions_raised"]: + print(f" - {exc}") + assert len(samples3e) == 1, f"Expected 1 sample but got {len(samples3e)}" + assert len(state3e["exceptions_raised"]) == 2, "Should have raised 2 different exceptions" + except Exception as e: + # Byte-level retries can cause tar corruption - verify exceptions were at least caught + print(f" ⚠ Exception during tar parsing: {type(e).__name__}: {str(e)[:80]}") + print(f" ℹ Exception types that triggered retries: {state3e['exceptions_raised']}") + print(f" ℹ S3 reconnection attempts: {state3e['get_object_calls']}") + # Verify that we at least caught and retried the exceptions + if len(state3e["exceptions_raised"]) >= 2: + print(f" ✓ Successfully caught and retried multiple exception types") + print(f" ℹ Note: Tar parsing may fail due to byte-level retry limitations") + else: + raise AssertionError( + f"Test 3e failed: Expected to catch 2 exceptions but only caught {len(state3e['exceptions_raised'])}" + ) from e + + +# Test 3f: ResponseStreamingError from botocore (EXPECTED TO FAIL until fixed) +print("\nTest 3f: botocore.exceptions.ResponseStreamingError handling") + +tar_buffer3f = io.BytesIO() +with tarfile.open(fileobj=tar_buffer3f, mode="w") as tar: + data = b"Data that will trigger ResponseStreamingError" + info = tarfile.TarInfo(name="streaming_error_test.txt") + info.size = len(data) + tar.addfile(info, io.BytesIO(data)) + +tar_bytes3f = tar_buffer3f.getvalue() +tar_size3f = len(tar_bytes3f) + +client3f = MagicMock() 
+client3f.head_object.return_value = {"ContentLength": str(tar_size3f)} + +state3f = {"bytes_read": 0, "error_raised": False, "retry_attempted": False} + +mock_body_streaming_error = MagicMock() +mock_body_streaming_error.close = MagicMock() + + +def raise_response_streaming_error(amt=None): + """Raise ResponseStreamingError as seen in production""" + if state3f["bytes_read"] == 0: + # Read some data first + chunk = tar_bytes3f[: min(512, len(tar_bytes3f))] + state3f["bytes_read"] = len(chunk) + print(f" First read: {len(chunk)} bytes") + return chunk + else: + # Now raise the ResponseStreamingError wrapping IncompleteRead + state3f["error_raised"] = True + print(f" Raising ResponseStreamingError at byte {state3f['bytes_read']}") + # This simulates the actual error from the logs: + # botocore.exceptions.ResponseStreamingError: ('Connection broken: IncompleteRead(81920 bytes read, 563200 more expected)' + inner_error = IncompleteRead(b"x" * 81920) + error_msg = ( + f"Connection broken: IncompleteRead(81920 bytes read, {tar_size3f - state3f['bytes_read']} more expected)" + ) + raise ResponseStreamingError(error=inner_error, msg=error_msg) + + +mock_body_streaming_error.read = raise_response_streaming_error + +# Success body for retry +mock_body_success_3f = MagicMock() + + +def success_read_3f(amt=None): + """Successful read after retry""" + state3f["retry_attempted"] = True + chunk = tar_bytes3f[state3f["bytes_read"] :] + state3f["bytes_read"] = len(tar_bytes3f) + print(f" Retry successful: read {len(chunk)} bytes") + return chunk + + +mock_body_success_3f.read = success_read_3f + +client3f.get_object.side_effect = [ + {"Body": mock_body_streaming_error, "ContentLength": tar_size3f}, + {"Body": mock_body_success_3f, "ContentLength": tar_size3f - state3f["bytes_read"]}, +] + +retrying_stream3f = RetryingStream(client3f, "test-bucket", "streaming_error.tar", retries=5) + +samples3f = [] +exception_caught_3f = None +with patch("time.sleep"): + with 
patch("cosmos3._src.imaginaire.datasets.webdataset.utils.stream.log"): + try: + for sample in tar_file_iterator(retrying_stream3f): + samples3f.append(sample) + print(f" ✓ Extracted: {sample['fname']}") + print(f"✓ SUCCESS: ResponseStreamingError was caught and retried!") + print(f" - Error was raised: {state3f['error_raised']}") + print(f" - Retry was attempted: {state3f['retry_attempted']}") + print(f" - Samples extracted: {len(samples3f)}") + assert len(samples3f) == 1, f"Expected 1 sample but got {len(samples3f)}" + assert state3f["error_raised"], "ResponseStreamingError should have been raised" + assert state3f["retry_attempted"], "Retry should have been attempted" + except ResponseStreamingError as e: + exception_caught_3f = e + print(f" ✗ EXPECTED FAILURE: ResponseStreamingError was NOT caught by RetryingStream") + print(f" ✗ Error message: {e}") + print(f" ✗ Error was raised: {state3f['error_raised']}") + print(f" ✗ Retry was attempted: {state3f['retry_attempted']}") + print(f" ℹ This error needs to be added to the exception handler in stream.py") + assert state3f["error_raised"], "Should have raised ResponseStreamingError" + assert not state3f["retry_attempted"], "Retry should NOT have happened (error not caught)" + except Exception as e: + exception_caught_3f = e + print(f" ⚠ Unexpected exception type: {type(e).__name__}: {e}") + +if exception_caught_3f is not None: + print(f"\n⚠ Test 3f demonstrates the bug: ResponseStreamingError is not handled") + print(f" Fix required: Add ResponseStreamingError to exception handler in stream.py") +else: + print(f"\n✓ Test 3f passed: ResponseStreamingError is properly handled") + + +# Test 3g: ConnectionClosedError from botocore (EXPECTED TO FAIL until fixed) +print("\nTest 3g: botocore.exceptions.ConnectionClosedError handling") + +tar_buffer3g = io.BytesIO() +with tarfile.open(fileobj=tar_buffer3g, mode="w") as tar: + data = b"Data that will trigger ConnectionClosedError" + info = 
tarfile.TarInfo(name="connection_closed_test.txt") + info.size = len(data) + tar.addfile(info, io.BytesIO(data)) + +tar_bytes3g = tar_buffer3g.getvalue() +tar_size3g = len(tar_bytes3g) + +client3g = MagicMock() +client3g.head_object.return_value = {"ContentLength": str(tar_size3g)} + +state3g = {"bytes_read": 0, "error_raised": False, "retry_attempted": False} + +mock_body_conn_closed = MagicMock() +mock_body_conn_closed.close = MagicMock() + + +def raise_connection_closed_error(amt=None): + """Raise ConnectionClosedError as seen in production""" + if state3g["bytes_read"] == 0: + # Read some data first + chunk = tar_bytes3g[: min(512, len(tar_bytes3g))] + state3g["bytes_read"] = len(chunk) + print(f" First read: {len(chunk)} bytes") + return chunk + else: + # Now raise the ConnectionClosedError + state3g["error_raised"] = True + print(f" Raising ConnectionClosedError at byte {state3g['bytes_read']}") + # This simulates: Connection was closed before we received a valid response from endpoint + raise ConnectionClosedError(endpoint_url="https://s3.amazonaws.com/bucket/key") + + +mock_body_conn_closed.read = raise_connection_closed_error + +# Success body for retry +mock_body_success_3g = MagicMock() + + +def success_read_3g(amt=None): + """Successful read after retry""" + state3g["retry_attempted"] = True + chunk = tar_bytes3g[state3g["bytes_read"] :] + state3g["bytes_read"] = len(tar_bytes3g) + print(f" Retry successful: read {len(chunk)} bytes") + return chunk + + +mock_body_success_3g.read = success_read_3g + +client3g.get_object.side_effect = [ + {"Body": mock_body_conn_closed, "ContentLength": tar_size3g}, + {"Body": mock_body_success_3g, "ContentLength": tar_size3g - state3g["bytes_read"]}, +] + +retrying_stream3g = RetryingStream(client3g, "test-bucket", "conn_closed.tar", retries=5) + +samples3g = [] +exception_caught_3g = None +with patch("time.sleep"): + with patch("cosmos3._src.imaginaire.datasets.webdataset.utils.stream.log"): + try: + for sample in 
tar_file_iterator(retrying_stream3g): + samples3g.append(sample) + print(f" ✓ Extracted: {sample['fname']}") + print(f"✓ SUCCESS: ConnectionClosedError was caught and retried!") + print(f" - Error was raised: {state3g['error_raised']}") + print(f" - Retry was attempted: {state3g['retry_attempted']}") + print(f" - Samples extracted: {len(samples3g)}") + assert len(samples3g) == 1, f"Expected 1 sample but got {len(samples3g)}" + assert state3g["error_raised"], "ConnectionClosedError should have been raised" + assert state3g["retry_attempted"], "Retry should have been attempted" + except ConnectionClosedError as e: + exception_caught_3g = e + print(f" ✗ EXPECTED FAILURE: ConnectionClosedError was NOT caught by RetryingStream") + print(f" ✗ Error message: {e}") + print(f" ✗ Error was raised: {state3g['error_raised']}") + print(f" ✗ Retry was attempted: {state3g['retry_attempted']}") + print(f" ℹ This error needs to be added to the exception handler in stream.py") + assert state3g["error_raised"], "Should have raised ConnectionClosedError" + assert not state3g["retry_attempted"], "Retry should NOT have happened (error not caught)" + except Exception as e: + exception_caught_3g = e + print(f" ⚠ Unexpected exception type: {type(e).__name__}: {e}") + +if exception_caught_3g is not None: + print(f"\n⚠ Test 3g demonstrates the bug: ConnectionClosedError is not handled") + print(f" Fix required: Add ConnectionClosedError to exception handler in stream.py") +else: + print(f"\n✓ Test 3g passed: ConnectionClosedError is properly handled") + + +# Test 3h: ReadTimeoutError from botocore (EXPECTED TO FAIL until fixed) +print("\nTest 3h: botocore.exceptions.ReadTimeoutError handling") + +tar_buffer3h = io.BytesIO() +with tarfile.open(fileobj=tar_buffer3h, mode="w") as tar: + data = b"Data that will trigger botocore ReadTimeoutError" + info = tarfile.TarInfo(name="read_timeout_test.txt") + info.size = len(data) + tar.addfile(info, io.BytesIO(data)) + +tar_bytes3h = 
tar_buffer3h.getvalue() +tar_size3h = len(tar_bytes3h) + +client3h = MagicMock() +client3h.head_object.return_value = {"ContentLength": str(tar_size3h)} + +state3h = {"bytes_read": 0, "error_raised": False, "retry_attempted": False} + +mock_body_read_timeout = MagicMock() +mock_body_read_timeout.close = MagicMock() + + +def raise_botocore_read_timeout_error(amt=None): + """Raise botocore ReadTimeoutError (different from urllib3 version)""" + if state3h["bytes_read"] == 0: + # Read some data first + chunk = tar_bytes3h[: min(400, len(tar_bytes3h))] + state3h["bytes_read"] = len(chunk) + print(f" First read: {len(chunk)} bytes") + return chunk + else: + # Now raise botocore's ReadTimeoutError + state3h["error_raised"] = True + print(f" Raising botocore.exceptions.ReadTimeoutError at byte {state3h['bytes_read']}") + # This simulates: Read timeout on endpoint URL + raise BotocoreReadTimeoutError(endpoint_url="https://s3.amazonaws.com/bucket/key") + + +mock_body_read_timeout.read = raise_botocore_read_timeout_error + +# Success body for retry +mock_body_success_3h = MagicMock() + + +def success_read_3h(amt=None): + """Successful read after retry""" + state3h["retry_attempted"] = True + chunk = tar_bytes3h[state3h["bytes_read"] :] + state3h["bytes_read"] = len(tar_bytes3h) + print(f" Retry successful: read {len(chunk)} bytes") + return chunk + + +mock_body_success_3h.read = success_read_3h + +client3h.get_object.side_effect = [ + {"Body": mock_body_read_timeout, "ContentLength": tar_size3h}, + {"Body": mock_body_success_3h, "ContentLength": tar_size3h - state3h["bytes_read"]}, +] + +retrying_stream3h = RetryingStream(client3h, "test-bucket", "read_timeout.tar", retries=5) + +samples3h = [] +exception_caught_3h = None +with patch("time.sleep"): + with patch("cosmos3._src.imaginaire.datasets.webdataset.utils.stream.log"): + try: + for sample in tar_file_iterator(retrying_stream3h): + samples3h.append(sample) + print(f" ✓ Extracted: {sample['fname']}") + print(f"✓ SUCCESS: 
botocore ReadTimeoutError was caught and retried!") + print(f" - Error was raised: {state3h['error_raised']}") + print(f" - Retry was attempted: {state3h['retry_attempted']}") + print(f" - Samples extracted: {len(samples3h)}") + assert len(samples3h) == 1, f"Expected 1 sample but got {len(samples3h)}" + assert state3h["error_raised"], "ReadTimeoutError should have been raised" + assert state3h["retry_attempted"], "Retry should have been attempted" + except BotocoreReadTimeoutError as e: + exception_caught_3h = e + print(f" ✗ EXPECTED FAILURE: botocore ReadTimeoutError was NOT caught by RetryingStream") + print(f" ✗ Error message: {e}") + print(f" ✗ Error was raised: {state3h['error_raised']}") + print(f" ✗ Retry was attempted: {state3h['retry_attempted']}") + print(f" ℹ This error needs to be added to the exception handler in stream.py") + assert state3h["error_raised"], "Should have raised ReadTimeoutError" + assert not state3h["retry_attempted"], "Retry should NOT have happened (error not caught)" + except Exception as e: + exception_caught_3h = e + print(f" ⚠ Unexpected exception type: {type(e).__name__}: {e}") + +if exception_caught_3h is not None: + print(f"\n⚠ Test 3h demonstrates the bug: botocore ReadTimeoutError is not handled") + print(f" Fix required: Add ReadTimeoutError to exception handler in stream.py") +else: + print(f"\n✓ Test 3h passed: botocore ReadTimeoutError is properly handled") + + +# Test 4: Large tar file with chunked reads +print("\nTest 4: RetryingStream with large tar file (chunked reads)") + +tar_buffer4 = io.BytesIO() +with tarfile.open(fileobj=tar_buffer4, mode="w") as tar: + # Add files with larger content to force multiple read() calls + for i in range(3): + # Each file is 10KB + data = (f"Large sample {i} content " * 500).encode()[:10240] + info = tarfile.TarInfo(name=f"large{i:03d}.bin") + info.size = len(data) + tar.addfile(info, io.BytesIO(data)) + +tar_bytes4 = tar_buffer4.getvalue() +tar_size4 = len(tar_bytes4) + +client4 = 
MagicMock() +client4.head_object.return_value = {"ContentLength": str(tar_size4)} + +mock_body4 = MagicMock() +mock_body4._raw_stream = io.BytesIO(tar_bytes4) +mock_body4.read = lambda amt=None: mock_body4._raw_stream.read(amt) + +client4.get_object.return_value = {"Body": mock_body4, "ContentLength": tar_size4} + +retrying_stream4 = RetryingStream(client4, "test-bucket", "large.tar", retries=3) + +samples4 = [] +for sample in tar_file_iterator(retrying_stream4): + samples4.append(sample) + +assert len(samples4) == 3, f"Expected 3 samples but got {len(samples4)}" +for i, sample in enumerate(samples4): + expected_name = f"large{i:03d}.bin" + assert sample["fname"] == expected_name, f"Sample {i}: Expected '{expected_name}' but got {sample['fname']}" + assert len(sample["data"]) == 10240, f"Sample {i}: Expected 10240 bytes but got {len(sample['data'])}" + print(f" ✓ Large file {i}: {sample['fname']} ({len(sample['data'])} bytes)") + +print(f"✓ Successfully handled large tar file with {len(samples4)} files") + + +print("\n" + "=" * 70) +print("Test Summary:") +print(" ✓ Test 1: Basic compatibility - RetryingStream works with tar_file_iterator") +print(" ✓ Test 2: Multiple files extraction - Handles tar files with multiple entries") +print(" ✓ Test 3: Early header failure - Retries during tar header reading") +print(" ✓ Test 3b: Multiple retries - Handles consecutive failures before success") +print(" ✓ Test 3c: Data block failure - Retries during file data reading") +print(" ✓ Test 3d: Exhausted retries - Properly propagates errors after max attempts") +print(" ✓ Test 3e: Multiple exception types - Handles various network errors") + +# Check which botocore exception tests failed +failed_tests = [] +if exception_caught_3f is None: + print(" ✓ Test 3f: ResponseStreamingError - Properly caught and retried") +else: + print(" ✗ Test 3f: ResponseStreamingError - NOT caught (needs fix in stream.py)") + failed_tests.append("ResponseStreamingError") + +if exception_caught_3g is 
None: + print(" ✓ Test 3g: ConnectionClosedError - Properly caught and retried") +else: + print(" ✗ Test 3g: ConnectionClosedError - NOT caught (needs fix in stream.py)") + failed_tests.append("ConnectionClosedError") + +if exception_caught_3h is None: + print(" ✓ Test 3h: botocore ReadTimeoutError - Properly caught and retried") +else: + print(" ✗ Test 3h: botocore ReadTimeoutError - NOT caught (needs fix in stream.py)") + failed_tests.append("botocore.ReadTimeoutError") + +if failed_tests: + print("\n" + "=" * 70) + print(f"FAILURE: {len(failed_tests)} botocore exception(s) not properly handled") + print("=" * 70) + print("Missing exception handlers:") + for exc in failed_tests: + print(f" - {exc}") + print("\nFix required in stream.py:") + print(" Add these exceptions to the exception handler in RetryingStream.read()") + print("=" * 70) + raise AssertionError( + f"Tests failed: {', '.join(failed_tests)} not caught. " + "Fix required in stream.py: Add these exceptions to exception handler." + ) + +print(" ✓ Test 4: Large files - Correctly handles chunked reads for large tar files") +print("\n" + "=" * 70) +print("✓ ALL TESTS PASSED!") +print("=" * 70) +print("\nConclusion:") +print(" RetryingStream successfully implements byte-level retry logic that works") +print(" seamlessly with tar_file_iterator, recovering from transient network errors") +print(" during tar file streaming and decompression.") +print("=" * 70) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/mpi_rank_worker.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/mpi_rank_worker.py new file mode 100644 index 00000000..1b6fc7a7 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/utils/unit_test/mpi_rank_worker.py @@ -0,0 +1,369 @@ +#!/usr/bin/env python +# ----------------------------------------------------------------------------- +# Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. 
+# All rights reserved. +# ----------------------------------------------------------------------------- + +"""Worker script for MPI/torchrun multi-rank statistics testing. + +This script is launched by torchrun or mpirun with multiple ranks. +Each rank spawns multiple worker processes to simulate DataLoader workers. +Each worker independently tests RetryingStream statistics tracking. + +Dependencies: + - torch.distributed (required) + - mpi4py (optional, only needed for mpirun testing) + +Usage with torchrun (no extra dependencies): + torchrun --nproc_per_node=10 mpi_rank_worker.py + +Usage with mpirun (requires mpi4py): + uv pip install mpi4py # Only if using mpirun + mpirun -np 10 --oversubscribe python mpi_rank_worker.py + +Environment variables: + SIMULATE_FAILURE_RANKS: Comma-separated list of ranks to kill (e.g., "2,5,7") + SKIP_BARRIER: If set to "1", skips the final barrier (for failure tests) + NUM_WORKERS_PER_RANK: Number of worker processes per rank (default: 3, simulates DataLoader workers) +""" + +import multiprocessing +import os +import random +import sys +from http.client import IncompleteRead +from unittest.mock import MagicMock + +import cosmos3._src.imaginaire.datasets.webdataset.utils.stream as stream_module +from cosmos3._src.imaginaire.datasets.webdataset.utils.stream import RetryingStream +from cosmos3._src.imaginaire.utils import log + +# Configure faster logging interval for tests (10 seconds instead of 5 minutes) +stream_module.RETRY_STATS_LOG_INTERVAL = 10.0 + + +def init_distributed(): + """Initialize distributed environment (torchrun or mpirun). 
+ + Returns: + tuple: (rank, world_size, launcher_type) + launcher_type: 'torchrun' or 'mpirun' + + Raises: + RuntimeError: If neither torchrun nor mpirun is detected, or if mpirun is detected but mpi4py is not installed + """ + import torch.distributed as dist + + # Try torchrun first + if "TORCHELASTIC_RUN_ID" in os.environ or ("RANK" in os.environ and "WORLD_SIZE" in os.environ): + # Initialize PyTorch distributed + if not dist.is_initialized(): + dist.init_process_group(backend="gloo") # Use gloo for CPU testing + + rank = dist.get_rank() + world_size = dist.get_world_size() + return rank, world_size, "torchrun" + + # Try mpirun + if "OMPI_COMM_WORLD_RANK" in os.environ or "PMI_RANK" in os.environ: + try: + from mpi4py import MPI # type: ignore # Optional dependency for MPI testing + except ImportError as e: + raise RuntimeError( + "mpirun detected but mpi4py is not installed!\n" + "Install with: uv pip install mpi4py\n" + "Or use torchrun instead:\n" + " torchrun --nproc_per_node=N mpi_rank_worker.py" + ) from e + + comm = MPI.COMM_WORLD + rank = comm.Get_rank() + world_size = comm.Get_size() + + # Initialize torch.distributed from the MPI rank and world size so the log module can detect the rank + if not dist.is_initialized(): + # Set environment variables for torch.distributed + os.environ["RANK"] = str(rank) + os.environ["WORLD_SIZE"] = str(world_size) + os.environ["MASTER_ADDR"] = "127.0.0.1" + os.environ["MASTER_PORT"] = "29500" + + # Initialize with gloo backend (MPI backend requires special build) + dist.init_process_group(backend="gloo", rank=rank, world_size=world_size) + + return rank, world_size, "mpirun" + + # Neither launcher detected + raise RuntimeError( + "Neither torchrun nor mpirun detected!\n" + "This script must be launched with:\n" + " torchrun --nproc_per_node=N mpi_rank_worker.py\n" + " OR\n" + " mpirun -np N python mpi_rank_worker.py" + ) + + +def worker_process( + worker_id: int, + rank: int, + world_size: int, + num_operations: int, + failure_interval: int, + should_fail: bool, + result_queue:
multiprocessing.Queue, +) -> None: + """Worker process function that simulates a DataLoader worker. + + Each worker process: + 1. Has its own PID and independent _global_retry_stats instance + 2. Performs a different number of operations (worker-specific workload) + 3. Uses worker-specific failure patterns + 4. Logs its own statistics independently + + Args: + worker_id: Unique worker ID within this rank (0, 1, 2, ...) + rank: Distributed rank this worker belongs to + world_size: Total number of distributed ranks + num_operations: Base number of operations (will be varied per worker) + failure_interval: How often operations fail + should_fail: Whether this worker should simulate a crash + result_queue: Queue to report results back to parent + """ + try: + # Initialize logging in worker process + # Note: Worker processes spawned via multiprocessing.Process don't have torch.distributed + # initialized, which is the same behavior as real PyTorch DataLoader workers. + # They inherit the RANK environment variable from their parent, which RetryingStream's + # get_rank() will use as a fallback. This ensures statistics log with the correct rank. 
+ log.init_loguru_stdout() + + # Enable statistics + stream_module.ENABLE_RETRY_STATS = True + + # Note: atexit handler registration is now automatic (handled lazily in _get_thread_stats) + # Each worker process automatically registers its own handler on first RetryingStream use + + # Each worker does a different amount of work to simulate real workload imbalance + # Worker 0: 100% of base ops, Worker 1: 80%, Worker 2: 120%, Worker 3: 60% + workload_multipliers = [1.0, 0.8, 1.2, 0.6, 1.1, 0.9] + multiplier = workload_multipliers[worker_id % len(workload_multipliers)] + worker_num_operations = int(num_operations * multiplier) + + # Each worker has a slightly different failure pattern + worker_failure_interval = failure_interval + worker_id # Offset by worker_id + + print( + f"Rank {rank} Worker {worker_id} (PID={os.getpid()}): " + f"{worker_num_operations} ops, fail every {worker_failure_interval}th", + flush=True, + ) + + # Calculate expected stats for this worker + expected_init_ops = worker_num_operations * 2 + expected_read_ops = worker_num_operations + expected_total_ops = expected_init_ops + expected_read_ops + + # Count failures + if worker_num_operations > 0: + expected_failed_reads = (worker_num_operations - 1) // worker_failure_interval + 1 + else: + expected_failed_reads = 0 + expected_total_failed = expected_failed_reads + expected_total_attempts = expected_total_ops + expected_failed_reads + + # Setup mock S3 client + client = MagicMock() + test_data = b"X" * 1024 + client.head_object.return_value = {"ContentLength": str(len(test_data))} + + # Simulate crash at a random point if requested. Clamp to the last operation index so the + # crash always triggers, even when this worker runs fewer than 70 operations. + failure_point = min(random.randint(30, 70), max(worker_num_operations - 1, 0)) if should_fail else None + if should_fail: + print( + f"⚠️ Rank {rank} Worker {worker_id}: WILL CRASH at operation {failure_point}/{worker_num_operations}", + flush=True, + ) + + # Perform operations + for i in range(worker_num_operations): + if should_fail and i == failure_point: + print(f"💥 Rank {rank} Worker {worker_id}: 
SIMULATING CRASH (killed at operation {i})", flush=True) + sys.stdout.flush() + sys.stderr.flush() + os._exit(1) + + mock_body = MagicMock() + + # Fail based on worker-specific interval + if i % worker_failure_interval == 0: + mock_body.read.side_effect = [IncompleteRead(b"partial"), test_data] + else: + mock_body.read.return_value = test_data + + client.get_object.return_value = {"Body": mock_body, "ContentLength": len(test_data)} + + stream = RetryingStream(client, f"bucket-rank{rank}-w{worker_id}", f"file-{i}.tar", retries=5) + try: + _ = stream.read(1024) + except Exception as e: + print(f"Rank {rank} Worker {worker_id}: ERROR: {e}", flush=True) + result_queue.put({"worker_id": worker_id, "success": False, "error": str(e)}) + sys.stdout.flush() + sys.stderr.flush() + sys.exit(1) + + # Force stats logging for this worker process + with stream_module._global_retry_stats.lock: + stream_module._global_retry_stats.last_log_time = 0 + stream_module._log_retry_stats_internal(force=False) + + # Get and verify cumulative stats + with stream_module._global_retry_stats.lock: + actual_ops_started = stream_module._global_retry_stats.cumulative_operations_started + actual_failed_ops = stream_module._global_retry_stats.cumulative_failed_operations + actual_total_attempts = stream_module._global_retry_stats.cumulative_attempts + + # Verify silently unless there's a mismatch + + # Verify stats + success = ( + actual_ops_started == expected_total_ops + and actual_failed_ops == expected_total_failed + and actual_total_attempts == expected_total_attempts + ) + + if not success: + print(f"❌ Rank {rank} Worker {worker_id}: Statistics mismatch!", flush=True) + result_queue.put({"worker_id": worker_id, "success": False, "error": "stats_mismatch"}) + # Log final statistics even on failure + stream_module._log_retry_stats_internal(force=True) + sys.stdout.flush() + sys.stderr.flush() + sys.exit(1) + + print(f"✅ Rank {rank} Worker {worker_id}: Verified", flush=True) + 
result_queue.put({"worker_id": worker_id, "success": True}) + + # Explicitly log final statistics before exit + # Note: atexit handlers are unreliable in multiprocessing.Process (even with sys.exit(0)) + # This is a known Python limitation, so we explicitly call the final log + stream_module._log_retry_stats_internal(force=True) + + # Explicitly flush all output streams before exiting + sys.stdout.flush() + sys.stderr.flush() + + # Exit cleanly + sys.exit(0) + + except Exception as e: + print(f"❌ Rank {rank} Worker {worker_id}: Unexpected error: {e}", flush=True) + result_queue.put({"worker_id": worker_id, "success": False, "error": str(e)}) + # Log final statistics even on error + try: + stream_module._log_retry_stats_internal(force=True) + except Exception: + pass # Don't let logging errors mask the original error + sys.stdout.flush() + sys.stderr.flush() + sys.exit(1) + + +def main(): + """Main function: spawns multiple worker processes per rank to simulate DataLoader workers.""" + # Initialize distributed environment (torchrun or mpirun) + rank, world_size, launcher = init_distributed() + + # Get configuration from environment + simulate_failure_ranks = os.environ.get("SIMULATE_FAILURE_RANKS", "") + should_fail = str(rank) in simulate_failure_ranks.split(",") if simulate_failure_ranks else False + skip_barrier = os.environ.get("SKIP_BARRIER", "0") == "1" + num_workers = int(os.environ.get("NUM_WORKERS_PER_RANK", "3")) # Default: 3 workers per rank + + try: + # Initialize logging in main process + log.init_loguru_stdout() + + # Base operations per worker + num_operations = 50 + + # Rank-specific failure pattern + failure_intervals = [10, 7, 5, 4, 3, 3, 2, 2, 2, 2] + failure_interval = failure_intervals[rank % len(failure_intervals)] + + print( + f"Rank {rank}/{world_size} ({launcher}): Starting {num_workers} workers", + flush=True, + ) + + # Create queue for worker results + result_queue = multiprocessing.Queue() + + # Spawn worker processes (simulating 
DataLoader workers) + workers = [] + for worker_id in range(num_workers): + # Only simulate failure in the first worker of a failing rank + worker_should_fail = should_fail and worker_id == 0 + + p = multiprocessing.Process( + target=worker_process, + args=( + worker_id, + rank, + world_size, + num_operations, + failure_interval, + worker_should_fail, + result_queue, + ), + ) + p.start() + workers.append(p) + + # Wait for all workers to complete + all_success = True + for p in workers: + p.join() + if p.exitcode != 0: + print(f"❌ Rank {rank}: Worker with PID {p.pid} failed with exit code {p.exitcode}", flush=True) + all_success = False + + # Collect results from queue + worker_results = [] + while not result_queue.empty(): + worker_results.append(result_queue.get()) + + # Verify all workers succeeded + success_count = sum(1 for r in worker_results if r.get("success", False)) + + if not all_success or success_count != num_workers: + print(f"❌ Rank {rank}: {success_count}/{num_workers} workers succeeded", flush=True) + sys.exit(1) + + print(f"✅ Rank {rank}: All {num_workers} workers verified", flush=True) + + # Synchronize all ranks before exit (silent) + if not skip_barrier: + try: + import torch.distributed as dist + + if dist.is_initialized(): + dist.barrier() + except Exception: + pass # Ignore barrier failures + + finally: + # Cleanup distributed environment + if launcher == "torchrun": + try: + import torch.distributed as dist + + if dist.is_initialized(): + dist.destroy_process_group() + except (ImportError, Exception): + pass + # mpi4py calls MPI.Finalize() automatically at exit, no cleanup needed + + +if __name__ == "__main__": + main() diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/webdataset.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/webdataset.py new file mode 100644 index 00000000..cb39bb52 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/webdataset.py @@ -0,0 +1,402 @@ +# 
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import json +import os +import threading +import time +import traceback +import warnings +from collections.abc import Iterable +from concurrent.futures import ThreadPoolExecutor, as_completed +from functools import partial +from typing import Callable + +import omegaconf +import torch.distributed as dist +import webdataset as wds +from webdataset.handlers import reraise_exception + +from cosmos3._src.imaginaire.datasets.webdataset.config.schema import AugmentorConfig, DatasetConfig, DatasetInfo, TarSample, Wdinfo +from cosmos3._src.imaginaire.datasets.webdataset.utils.iterators import WebDataset +from cosmos3._src.imaginaire.datasets.webdataset.utils.misc import remove_extensions_from_keys, skip_keys, update_url +from cosmos3._src.imaginaire.lazy_config import instantiate +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.distributed import get_rank, get_world_size +from cosmos3._src.imaginaire.utils.object_store import ObjectStore + + +def wrap_augmentor_func_as_generator(func: Callable, data: Iterable): + for data_dict in data: + data_dict_out = func(data_dict) + if data_dict_out is None: + # Skip "unhealthy" samples + continue + yield data_dict_out + + +def _sample_timer(data: Iterable) -> Iterable: + """Pipeline stage that measures total per-sample 
production time. + + Must be the LAST stage appended to the dataset pipeline. When the + DataLoader worker calls ``next()`` on this iterator the call propagates + through the entire upstream chain (I/O -> decode -> augment -> ...), + so the elapsed time captures the full cost of producing one sample. + """ + it = iter(data) + while True: + t_start = time.monotonic() + try: + sample = next(it) + except StopIteration: + return + sample["_sample_time"] = time.monotonic() - t_start + yield sample + + +class Dataset: + def __init__( + self, + config: DatasetConfig, + handler: Callable = reraise_exception, + ): + r"""Webdataloader class + + Args: + config: Dataset config + handler (Callable): Error handler for the webdataset pipeline. Default: reraise_exception. + """ + super().__init__() + + self.config = config + + self.world_size = get_world_size() + + dataset_info = config.dataset_info + self.streaming_download = config.streaming_download + + self.s3_client = dict() + self.bucket = dict() + self.use_object_store = False + self.data_keys = config.keys + + # Parse the metadata + self.wdinfo = Wdinfo([], 0, 0) + self.parse_dataset_info(dataset_info=dataset_info, use_multithread=True) + self.handler = handler + self.augmentors = dict() + + def parse_dataset_info(self, dataset_info: list[DatasetInfo], use_multithread: bool = True): + r"""Parse metadata about the list of tar files. + + When ``torch.distributed`` is initialized, only rank 0 fetches the + wdinfo JSONs (in parallel via a thread pool) and broadcasts the parsed + metadata to every other rank. + + Args: + dataset_info (list[DatasetInfo]): Dataset entries containing paths to wdinfo metadata files. + use_multithread (bool): Whether to use multi-threaded parsing across datasets. Default: True. 
+ """ + rank = get_rank() + world_size = get_world_size() + use_broadcast = world_size > 1 and dist.is_available() and dist.is_initialized() + log.info(f"Start parsing dataset info with {len(dataset_info)} entries, use multithread = {use_multithread}") + tic = time.time() + + # Thread-local ObjectStore cache for per-thread ObjectStore construction. + thread_local_stores = threading.local() + + def get_thread_local_store(dset_info: DatasetInfo) -> ObjectStore: + """Get or create a thread-local ObjectStore for a dataset.""" + cache = getattr(thread_local_stores, "cache", None) + if cache is None: + cache = thread_local_stores.cache = {} + key = (dset_info.object_store_config.credentials, dset_info.object_store_config.bucket) + if key not in cache: + cache[key] = ObjectStore(config_object_storage=dset_info.object_store_config) + return cache[key] + + def process_single_dataset(dset_num: int, dset_info: DatasetInfo): + # For each dataset, we parse the file paths and store them as a list of TarSample. + # TarSample will then be used by each worker to load the data. 
+ use_object_store = dset_info.object_store_config.enabled + dset_id = "dset: {}".format(dset_num) + if use_object_store: + object_store_reader = get_thread_local_store(dset_info) + # Create PBSS config if data is loaded from PBSS + bucket_dset = dset_info.object_store_config.bucket + else: + object_store_reader = None + bucket_dset = None + + tar_samples = [] + total_key_count = 0 + chunk_sizes = [] + + # Read all wdinfo files and obtain the DataSample list + for wdinfo_path in dset_info.wdinfo: + if use_object_store: + if not object_store_reader.object_exists(wdinfo_path): + raise FileNotFoundError(f"{wdinfo_path} not found") + cur_dset_info = object_store_reader.load_object(key=wdinfo_path, type="json") # type: ignore + else: + with open(wdinfo_path, "r") as fp: + cur_dset_info = json.load(fp) + + data_root = cur_dset_info["root"] + # Strip s3://bucket/ prefix from root if present, as the bucket is specified separately + if data_root.startswith("s3://"): + # Remove s3://bucket/ prefix (e.g., "s3://debug/path/" -> "path/") + parts = data_root[5:].split("/", 1) # Split after "s3://" + if len(parts) > 1: + data_root = parts[1] # Take everything after bucket name + else: + data_root = "" + tar_files_list = cur_dset_info["data_list"] + # Use per-tar actual sample counts from data_list_key_count when available; + # fall back to evenly distributing total_key_count across tars. + # chunk_size is only the nominal tar capacity and is not reliable. 
+ per_tar_key_counts = cur_dset_info.get( + "data_list_key_count", + [cur_dset_info["total_key_count"] // max(len(tar_files_list), 1)] * len(tar_files_list), + ) + local_tar_samples = [ + TarSample( + path=tar_file, + root=data_root, + keys=( + dset_info.per_dataset_keys if dset_info.per_dataset_keys else self.data_keys + ), # use per dataset keys if available + meta=dset_info, + dset_id=dset_id, + num_samples=n_samples, + sample_keys_full_list=None, + ) + for tar_file, n_samples in zip(tar_files_list, per_tar_key_counts) + ] + tar_samples.extend(local_tar_samples) + total_key_count += cur_dset_info["total_key_count"] + # Fall back to average samples-per-tar when chunk_size is absent (e.g. SILA wdinfos). + default_chunk_size = cur_dset_info["total_key_count"] // max(len(tar_files_list), 1) + chunk_sizes.append(cur_dset_info.get("chunk_size", default_chunk_size)) + + # boto3 clients are not picklable, so they can't ride along in the + # broadcast payload; we rebuild them locally on every rank below. + return { + "dset_num": dset_num, + "dset_id": dset_id, + "tar_samples": tar_samples, + "total_key_count": total_key_count, + "chunk_sizes": chunk_sizes, + "has_object_store": use_object_store, + "bucket": bucket_dset, + } + + # Step 1: rank 0 (or single-process runs) fetches every wdinfo JSON. 
+ fetch_elapsed = 0.0 + broadcast_elapsed = 0.0 + if rank == 0 or not use_broadcast: + fetch_tic = time.time() + try: + dataset_results = [] + tasks: list[tuple[int, DatasetInfo]] = [] + for i, dset_info in enumerate(dataset_info): + if len(dset_info.wdinfo) == 0: + log.warning(f"No wdinfo found for dataset {i}, skipping...") + continue + tasks.append((i, dset_info)) + if use_multithread and len(tasks) > 1: + # Only rank 0 runs this in distributed mode, so we can + # over-subscribe the pool: wdinfo fetches are I/O-bound, + # so ~2x CPU count keeps the (per-thread) connection pools busy. + num_workers = min(2 * (os.cpu_count() or 16), len(tasks)) + log.info(f"Fetching {len(tasks)} datasets with {num_workers} threads") + with ThreadPoolExecutor(max_workers=num_workers) as executor: + futures = [executor.submit(process_single_dataset, *task) for task in tasks] + for future in as_completed(futures): + dataset_results.append(future.result()) + else: + for task in tasks: + dataset_results.append(process_single_dataset(*task)) + payload = {"ok": True, "dataset_results": dataset_results} + except Exception as exc: + payload = { + "ok": False, + "error_type": type(exc).__name__, + "error_message": str(exc), + "traceback": traceback.format_exc(), + } + fetch_elapsed = time.time() - fetch_tic + else: + payload = None + + # Step 2: broadcast the parsed metadata (or error sentinel) to all ranks. 
+ if use_broadcast: + obj_list = [payload] + broadcast_tic = time.time() + dist.broadcast_object_list(obj_list, src=0) + broadcast_elapsed = time.time() - broadcast_tic + payload = obj_list[0] + + assert payload is not None # for type checkers + if not payload["ok"]: + raise RuntimeError( + f"Rank 0 failed while fetching wdinfo metadata: " + f"{payload['error_type']}: {payload['error_message']}\n" + f"{payload['traceback']}" + ) + dataset_results = payload["dataset_results"] + + # Step 3: every rank merges results and rebuilds ObjectStore instances + # locally (boto3 clients aren't picklable, so they can't ride along in + # the broadcast payload). Each cache entry holds a full ObjectStore; + # we key by (credentials, bucket) so configs with thousands of + # DatasetInfo entries sharing the same auth + bucket reuse a single + # ObjectStore per rank instead of building one per DatasetInfo. + self.use_object_store = any(result["has_object_store"] for result in dataset_results) + local_object_stores: dict[tuple[str, str], ObjectStore] = {} + for result in dataset_results: + dset_id = result["dset_id"] + self.wdinfo.tar_files.extend(result["tar_samples"]) + self.wdinfo.total_key_count += result["total_key_count"] + if len(set(result["chunk_sizes"])) > 1: + warnings.warn( + f"Multiple chunk_size values found in {dset_id}: {result['chunk_sizes']}. Using the first one." 
+ ) + self.wdinfo.chunk_size = result["chunk_sizes"][0] + if result["has_object_store"]: + dset_info = dataset_info[result["dset_num"]] + cache_key = (dset_info.object_store_config.credentials, dset_info.object_store_config.bucket) + if cache_key not in local_object_stores: + local_object_stores[cache_key] = ObjectStore(config_object_storage=dset_info.object_store_config) + self.s3_client[dset_id] = local_object_stores[cache_key].client + if result["bucket"]: + self.bucket[dset_id] = result["bucket"] + + toc = time.time() + log.info( + f"Parsed dataset info with {len(dataset_info)} wdinfos " + f"(num_keys = {self.wdinfo.total_key_count}, num_tars = {len(self.wdinfo.tar_files)}) " + f"and multithread = {use_multithread}, took {(toc - tic):.2f} seconds " + f"(fetch = {fetch_elapsed:.2f}s [rank 0 only], broadcast = {broadcast_elapsed:.2f}s, " + f"world_size = {world_size})" + ) + + @staticmethod + # This is the function that calls each augmentor in sequence. + def augmentor_fn(data, augmentations): + def _stamp_pre_aug(upstream): + for sample in upstream: + sample["_pre_aug_time"] = time.monotonic() + sample["_aug_step_last"] = sample["_pre_aug_time"] + yield sample + + def _checkpoint(upstream, step_name): + for sample in upstream: + now = time.monotonic() + last = sample.get("_aug_step_last", now) + sample.setdefault("_aug_step_times", {})[step_name] = now - last + sample["_aug_step_last"] = now + yield sample + + # Build augmentor chain + data = _stamp_pre_aug(data) + for aug_fn in augmentations: + # Use generator function as augmentor + # (recommended, allows skipping or replicating samples inside the augmentor) + name = getattr(aug_fn, "__name__", None) or type(aug_fn).__name__ + if getattr(aug_fn, "is_generator", False): + data = aug_fn(data) + else: # Use regular function as augmentor (backward compatibility) + data = wrap_augmentor_func_as_generator(aug_fn, data) + data = _checkpoint(data, name) + for sample in data: + sample.pop("_aug_step_last", None) + pre 
= sample.pop("_pre_aug_time", None) + if pre is not None: + sample["_aug_time"] = time.monotonic() - pre + yield sample + + def build_data_augmentor(self, augmentor_cfg: dict[str, AugmentorConfig]) -> Callable: + r"""Function for building data augmentors from augmentor config.""" + augmentations = [] + for aug in augmentor_cfg.keys(): + augmentations.append(instantiate(augmentor_cfg[aug])) + + # This is the function that calls each augmentor in sequence. + return partial(Dataset.augmentor_fn, augmentations=augmentations) + + def build_dataset(self, **kwargs) -> WebDataset: + tar_list = self.wdinfo.tar_files + num_tars = len(tar_list) + assert num_tars > 0, "Did not find any data." + + shuffle_buffer_size = getattr(self.config, "buffer_size", self.wdinfo.chunk_size) + + # update distributor urls and chunk size + distributor_fn = self.config.distributor + + distributor_fn.set_urls(tar_list) + distributor_fn.set_chunk_size(self.wdinfo.chunk_size) + + dataset = WebDataset( + distributor_fn, + load_from_object_store=self.use_object_store, + s3_client=self.s3_client, + s3_bucket_name=self.bucket, + streaming_download=self.streaming_download, + handler=self.handler, + ) + + # Creating a shuffle buffer + if shuffle_buffer_size > 0: + dataset.append(wds.shuffle(shuffle_buffer_size)) + + # Adding decoders + # Decoders are functions that decode the input IO stream + decoder_list = getattr(self.config, "decoders", []) + decoder_functions = [] + for decoder in decoder_list: + # If the specified decoder is a string, use the webdataset decoder + # If its a callable function, use the defined function to decode data + assert isinstance(decoder, str) or callable(decoder), "Decoder should either be callable or a str" + decoder_functions.append(decoder) + dataset.append(wds.decode(*decoder_functions)) + + # After the decoders are added, remove extension from the keys + # Extensions in the data keys are needed for auto-detection of decoders in webdataset. 
+ if self.config.remove_extension_from_keys: + dataset.append(remove_extensions_from_keys) + + # Function to skip keys + dataset.append(skip_keys) + # Building augmentors + augmentor_cfg = getattr(self.config, "augmentation", None) + assert isinstance(augmentor_cfg, (dict, omegaconf.dictconfig.DictConfig)), ( + f"getting type: {type(augmentor_cfg)}" + ) + augmentation_fn = self.build_data_augmentor(augmentor_cfg) + dataset.append(augmentation_fn) + + # Updates URL names so that the collate function can handle + dataset.append(update_url) + + dataset.append(_sample_timer) + + dataset.total_images = self.wdinfo.total_key_count # type: ignore + log.info("Total number of training shards: %d" % num_tars) + log.info("Total training key count: %d" % dataset.total_images) # type: ignore + + return dataset diff --git a/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/webdataset_ext.py b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/webdataset_ext.py new file mode 100644 index 00000000..d8120793 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/datasets/webdataset/webdataset_ext.py @@ -0,0 +1,117 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Callable, Optional + +import omegaconf +import webdataset as wds +from webdataset import filters +from webdataset.handlers import reraise_exception + +from cosmos3._src.imaginaire.datasets.webdataset.config.schema import DatasetConfig +from cosmos3._src.imaginaire.datasets.webdataset.utils.iterators import WebDataset +from cosmos3._src.imaginaire.datasets.webdataset.utils.misc import remove_extensions_from_keys, skip_keys, update_url +from cosmos3._src.imaginaire.datasets.webdataset.webdataset import Dataset as BaseDataset +from cosmos3._src.imaginaire.datasets.webdataset.webdataset import _sample_timer +from cosmos3._src.imaginaire.utils import log + + +class Dataset(BaseDataset): + def __init__( + self, + config: DatasetConfig, + handler: Callable = reraise_exception, + decoder_handler: Optional[Callable] = None, + detshuffle: bool = False, + ): + r"""Webdataloader class + + Args: + config: Dataset config + handler (Callable): Error handler for webdataset class + decoder_handler (Callable): Error handler during decoding + detshuffle (bool): If True, shuffle with filters.detshuffle instead of wds.shuffle. Default: False. + """ + super().__init__(config=config, handler=handler) + self.decoder_handler = decoder_handler + self.detshuffle = detshuffle + + def build_dataset(self, **kwargs) -> WebDataset: + r""" + Build the dataset object. + This differs from BaseDataset.build_dataset in two ways: decoder_handler is passed to wds.decode as its + error handler, and a shuffle stage is always appended, using filters.detshuffle when detshuffle is True. + """ + tar_list = self.wdinfo.tar_files + num_tars = len(tar_list) + assert num_tars > 0, "Did not find any data." 
+ + shuffle_buffer_size = getattr(self.config, "buffer_size", self.wdinfo.chunk_size) + + # update distributor urls and chunk size + distributor_fn = self.config.distributor + + distributor_fn.set_urls(tar_list) + distributor_fn.set_chunk_size(self.wdinfo.chunk_size) + + dataset = WebDataset( + distributor_fn, + load_from_object_store=self.use_object_store, + s3_client=self.s3_client, + s3_bucket_name=self.bucket, + streaming_download=self.streaming_download, + handler=self.handler, + ) + + # Creating a shuffle buffer + if self.detshuffle: + dataset.append(filters.detshuffle(shuffle_buffer_size)) + else: + dataset.append(wds.shuffle(shuffle_buffer_size)) + + # Adding decoders + # Decoders are functions that decode the input IO stream + decoder_list = getattr(self.config, "decoders", []) + decoder_functions = [] + for decoder in decoder_list: + # If the specified decoder is a string, use the webdataset decoder + # If its a callable function, use the defined function to decode data + assert isinstance(decoder, str) or callable(decoder), "Decoder should either be callable or a str" + decoder_functions.append(decoder) + dataset.append(wds.decode(*decoder_functions, handler=self.decoder_handler)) + + # After the decoders are added, remove extension from the keys + # Extensions in the data keys are needed for auto-detection of decoders in webdataset. 
+ if self.config.remove_extension_from_keys: + dataset.append(remove_extensions_from_keys) + + # Function to skip keys + dataset.append(skip_keys) + # Building augmentors + augmentor_cfg = getattr(self.config, "augmentation", None) + assert isinstance(augmentor_cfg, (dict, omegaconf.dictconfig.DictConfig)), ( + f"getting type: {type(augmentor_cfg)}" + ) + augmentation_fn = self.build_data_augmentor(augmentor_cfg) + dataset.append(augmentation_fn) + + # Updates URL names so that the collate function can handle + dataset.append(update_url) + + dataset.append(_sample_timer) + + dataset.total_images = self.wdinfo.total_key_count # type: ignore + log.info("Total number of training shards: %d" % num_tars) + log.info("Total training key count: %d" % dataset.total_images) # type: ignore + + return dataset diff --git a/cosmos-inference/cosmos3/_src/imaginaire/flags.py b/cosmos-inference/cosmos3/_src/imaginaire/flags.py new file mode 100644 index 00000000..52aa127a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/flags.py @@ -0,0 +1,98 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +"""Feature flags.""" + +import os +from dataclasses import dataclass +from enum import Enum +from typing import Final + + +class StrEnum(str, Enum): + """Backport of StrEnum from Python 3.11.""" + + def __str__(self) -> str: + return self.value + + @staticmethod + def _generate_next_value_(name: str, start: int, count: int, last_values: list[str]) -> str: + return name.lower() + + +def _parse_bool(value: str) -> bool: + """Parse string to a boolean.""" + return value.lower() in ["true", "1", "yes", "y"] + + +def _get_bool(name: str, default: bool) -> bool: + """Get a boolean flag from the environment.""" + value = os.environ.get(name, "") + if not value: + return default + return _parse_bool(value) + + +TRAINING: Final[bool] = _get_bool("COSMOS_TRAINING", False) +"""Whether to enable training features. + +This is used to make training dependencies optional. +""" + +INTERNAL: Final[bool] = _get_bool("COSMOS_INTERNAL", False) +"""Whether to use internal (nvidia-only) resources (e.g. S3).""" + +SMOKE: Final[bool] = _get_bool("COSMOS_SMOKE", False) +"""Whether to enable smoke test. + +Sets parameters to minimum values (e.g. num_steps=1, num_layers=2). +""" + + +class Device(StrEnum): + CUDA = "cuda" + CPU = "cpu" + META = "meta" + + +DEVICE: Final[Device] = Device(os.environ.get("COSMOS_DEVICE", "cuda").lower()) +"""Torch device to use. + +Used for checkpoint conversion and smoke tests. 
+""" + +VERBOSE: Final[bool] = _get_bool("COSMOS_VERBOSE", INTERNAL) +"""Whether to enable verbose console output.""" + +EXPERIMENTAL_CHECKPOINTS: Final[bool] = _get_bool("COSMOS_EXPERIMENTAL_CHECKPOINTS", INTERNAL) +"""Whether to enable experimental checkpoints.""" + + +if INTERNAL: + TRAINING = True + + +@dataclass +class Flags: + internal: bool = INTERNAL + training: bool = TRAINING + smoke: bool = SMOKE + device: Device = DEVICE + verbose: bool = VERBOSE + experimental_checkpoints: bool = EXPERIMENTAL_CHECKPOINTS + + +FLAGS = Flags() +"""Convenience object for accessing flags.""" diff --git a/cosmos-inference/cosmos3/_src/imaginaire/flops/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/flops/__init__.py new file mode 100644 index 00000000..c3b1cab5 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/flops/__init__.py @@ -0,0 +1,30 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +"""Reusable FLOPs estimation utilities for model architectures.""" + +from cosmos3._src.imaginaire.flops.omni_mot import ( + OmniMoTModelDescriptor, + compute_omni_mot_flops_per_batch, + get_omni_mot_model_descriptor, +) +from cosmos3._src.imaginaire.flops.wan_vae import compute_wan_vae_encoder_flops + +__all__ = [ + "OmniMoTModelDescriptor", + "compute_omni_mot_flops_per_batch", + "compute_wan_vae_encoder_flops", + "get_omni_mot_model_descriptor", +] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/flops/omni_mot.py b/cosmos-inference/cosmos3/_src/imaginaire/flops/omni_mot.py new file mode 100644 index 00000000..999022e1 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/flops/omni_mot.py @@ -0,0 +1,498 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""FLOPs estimation for the OmniMoT (Mixture-of-Tokens) dual-pathway transformer.""" + +from decimal import Decimal +from typing import NamedTuple + +from cosmos3._src.imaginaire.utils import log + + +class OmniMoTModelDescriptor(NamedTuple): + """ + Holds information about the OmniMoT model architecture needed for the custom flops formula. + + This captures the dual-pathway (MoT) transformer with support for vision, action, and sound + modalities, and optional Mixture of Experts (MoE) layers. 
+ """ + + # LLM / Transformer core + hidden_size: int # D: hidden dimension of the transformer (e.g. 2048, 3584) + num_hidden_layers: int # number of transformer decoder layers + num_attention_heads: int # number of Q heads + num_key_value_heads: int # number of K/V heads (GQA when < num_attention_heads) + head_dim: int # dimension per head + intermediate_size: int # dense MLP intermediate size (gate_proj / up_proj output dim) + vocab_size: int # vocabulary size for embed_tokens and lm_head + + # MoE parameters + use_moe: bool # whether MoE layers are used + num_experts: int # total number of experts per MoE layer + num_experts_per_tok: int # top-k experts activated per token + moe_intermediate_size: int # intermediate size inside each expert + decoder_sparse_step: int # every `decoder_sparse_step`-th layer is MoE + mlp_only_layers: list[int] # layers forced to use dense MLP even in MoE config + + # Vision modality + latent_patch_size: int # spatial patch size for latent patchification (default 2) + latent_channel_size: int # number of channels in the VAE latent (default 16) + + # Action modality + action_dim: int # action token dimension (default 32) + + # Sound modality + sound_dim: int # sound token dimension + + # TimestepEmbedder + frequency_embedding_size: int # sinusoidal frequency embedding dim (default 256) + + # Text prediction + predict_text_tokens: bool # whether lm_head is applied for text CE loss + + +def get_omni_mot_model_descriptor( + hidden_size: int = 2048, + num_hidden_layers: int = 24, + num_attention_heads: int = 16, + num_key_value_heads: int = 16, + head_dim: int | None = None, + intermediate_size: int = 5632, + vocab_size: int = 151936, + use_moe: bool = True, + num_experts: int = 60, + num_experts_per_tok: int = 4, + moe_intermediate_size: int = 1408, + decoder_sparse_step: int = 1, + mlp_only_layers: list[int] | None = None, + latent_patch_size: int = 2, + latent_channel_size: int = 16, + action_dim: int = 32, + sound_dim: int = 64, + 
frequency_embedding_size: int = 256, + predict_text_tokens: bool = False, +) -> OmniMoTModelDescriptor: + if head_dim is None: + head_dim = hidden_size // num_attention_heads + if mlp_only_layers is None: + mlp_only_layers = [] + return OmniMoTModelDescriptor( + hidden_size=hidden_size, + num_hidden_layers=num_hidden_layers, + num_attention_heads=num_attention_heads, + num_key_value_heads=num_key_value_heads, + head_dim=head_dim, + intermediate_size=intermediate_size, + vocab_size=vocab_size, + use_moe=use_moe, + num_experts=num_experts, + num_experts_per_tok=num_experts_per_tok, + moe_intermediate_size=moe_intermediate_size, + decoder_sparse_step=decoder_sparse_step, + mlp_only_layers=mlp_only_layers, + latent_patch_size=latent_patch_size, + latent_channel_size=latent_channel_size, + action_dim=action_dim, + sound_dim=sound_dim, + frequency_embedding_size=frequency_embedding_size, + predict_text_tokens=predict_text_tokens, + ) + + +def _pct(part: Decimal, whole: Decimal) -> str: + """Return percentage string, guarding against division by zero.""" + if whole == 0: + return "0" + return str(round(part / whole * 100, 1)) + + +def _extract_padding_tokens( + split_lens: list[int], + attn_modes: list[str], +) -> int: + """Return the total number of padding tokens in a packed sequence. + + Padding splits are lone ``"causal"`` entries that do not form a + ``(causal, full)`` pair with the next split. In practice, finalize() + appends at most one such split at the end. 
+ """ + padding = 0 + i = 0 + while i < len(split_lens): + if i + 1 < len(split_lens) and attn_modes[i] == "causal" and attn_modes[i + 1] == "full": + i += 2 + else: + if attn_modes[i] == "causal": + padding += split_lens[i] + i += 1 + return padding + + +def _compute_per_sample_attn_flops( + n_heads: int, + d_head: int, + B: int | Decimal, + S_und: int | Decimal, + S_gen: int | Decimal, + split_lens: list[int] | None = None, + attn_modes: list[str] | None = None, + include_padding: bool = False, +) -> tuple[Decimal, Decimal]: + """Compute per-layer attention dot-product FLOPs (QK^T + Attn*V). + + The MoT attention pattern is: + - Und tokens (causal): each sample's text tokens self-attend causally. + - Gen tokens (full): each sample's gen tokens attend to ALL tokens in + that sample (und + gen) with full (non-causal) attention. + + When ``split_lens``/``attn_modes`` are provided (packed-sequence mode), + per-sample lengths are extracted from the alternating (causal, full) pairs. + Otherwise, ``B`` uniform samples each with ``S_und`` and ``S_gen`` tokens + are assumed. + + Args: + include_padding: If True, lone ``"causal"`` splits (padding tokens + appended by finalize()) are counted as additional causal + self-attention windows. + + Returns: + (und_attn_flops, gen_attn_flops) for a single layer (QK^T + Attn*V). 
+ """ + if split_lens is not None and attn_modes is not None: + und_attn = Decimal(0) + gen_attn = Decimal(0) + i = 0 + while i < len(split_lens): + if i + 1 < len(split_lens) and attn_modes[i] == "causal" and attn_modes[i + 1] == "full": + s_und_i = split_lens[i] + s_gen_i = split_lens[i + 1] + und_attn += 4 * n_heads * d_head * s_und_i * s_und_i + gen_attn += 4 * n_heads * d_head * s_gen_i * (s_und_i + s_gen_i) + i += 2 + else: + if include_padding and attn_modes[i] == "causal": + s_pad = split_lens[i] + und_attn += 4 * n_heads * d_head * s_pad * s_pad + i += 1 + return und_attn, gen_attn + + und_attn = Decimal(4 * B * n_heads * d_head * S_und * S_und) + gen_attn = Decimal(4 * B * n_heads * d_head * S_gen * (S_und + S_gen)) + return und_attn, gen_attn + + +def compute_omni_mot_flops_per_batch( + cfg: OmniMoTModelDescriptor, + B: int | Decimal, + text_tokens: int = 512, + vision_tokens: int = 0, + action_tokens: int = 0, + sound_tokens: int = 0, + freeze_und: bool = False, + vision_gen: bool = True, + action_gen: bool = False, + sound_gen: bool = False, + backwardpass_ratio: float = 2.0, + split_lens: list[int] | None = None, + attn_modes: list[str] | None = None, + include_padding: bool = False, + use_activation_checkpointing: bool = False, +) -> Decimal: + """Compute training FLOPs for a single batch of the OmniMoT model. + + This is a standalone function that can be called from calculators or callbacks. + It accounts for all parts of the dual-pathway (MoT) transformer, including: + - Modality-specific embedding/projection layers (vae2llm, llm2vae, action2llm, + llm2action, sound2llm, llm2sound). + - TimestepEmbedder MLPs. + - lm_head for text prediction. + - Transformer blocks with dual-pathway attention (separate Q/K/V/O projections + for und and gen pathways). + - Per-sample attention: und tokens self-attend causally, gen tokens attend to + all tokens in their sample with full attention. 
+ - Attention softmax FLOPs (~5 ops per element of the attention matrix). + - Dual-pathway MLPs (dense SwiGLU or MoE per layer). + - RMSNorm at all positions (4 per layer + Q/K norms + 2 final norms). + - Backward pass with special handling for freeze_und. + - Activation checkpointing forward recomputation during backward. + + Args: + cfg: Model architecture descriptor. + B: Batch size. For the packed-sequence path (``split_lens`` provided), + set ``B=1`` and let ``text_tokens``/``vision_tokens`` be the totals + across all packed samples. + text_tokens: Total number of text (understanding) tokens across all samples. + vision_tokens: Total number of vision generation tokens (after patchification) + across all samples. + action_tokens: Total number of action tokens across all samples. + sound_tokens: Total number of sound tokens across all samples. + freeze_und: If True, understanding pathway is frozen (no backward FLOPs for und). + vision_gen: Whether vision generation is active. + action_gen: Whether action generation is active. + sound_gen: Whether sound generation is active. + backwardpass_ratio: Multiplier for backward pass FLOPs relative to forward + (default 2.0). + split_lens: Per-split token lengths from the packed sequence. Alternating + ``[und_0, gen_0, und_1, gen_1, ...]`` with matching ``attn_modes``. + When provided, per-sample attention FLOPs are computed correctly + instead of assuming one big attention window. + attn_modes: Attention mode for each split (``"causal"`` or ``"full"``). + Must have the same length as ``split_lens``. + include_padding: If True, padding tokens (lone ``"causal"`` splits at + the end of ``split_lens``) are included in FLOPs for attention, + projections, MLP, and norms. Useful for measuring total GPU FLOPs + including wasted work on padding. + use_activation_checkpointing: If True, add FLOPs for the forward + recomputation of each transformer layer during the backward pass. 
+ Activation checkpointing discards intermediate activations and + recomputes them on-the-fly, adding ~1x layer forward FLOPs. + + Returns: + Total training FLOPs (forward + backward) as a Decimal. + """ + bp_ratio = Decimal(backwardpass_ratio) + D = cfg.hidden_size + n_heads = cfg.num_attention_heads + n_kv_heads = cfg.num_key_value_heads + d_head = cfg.head_dim + n_layers = cfg.num_hidden_layers + + # =================================================================== + # Token counts + # =================================================================== + L_vision = vision_tokens if vision_gen else 0 + + S_und = text_tokens + S_gen = L_vision + (action_tokens if action_gen else 0) + (sound_tokens if sound_gen else 0) + + # Padding tokens follow the causal (und) path. When include_padding is + # set, add them to S_und so projections, MLP, and norms account for the + # extra work the GPU performs on padding. + S_pad = 0 + if include_padding and split_lens is not None and attn_modes is not None: + S_pad = _extract_padding_tokens(split_lens, attn_modes) + S_und = S_und + S_pad + + # =================================================================== + # 1. 
Embedding / Projection Layers (outside transformer blocks) + # =================================================================== + embedding_flops = Decimal(0) + + if vision_gen and L_vision > 0: + patch_latent_dim = cfg.latent_patch_size**2 * cfg.latent_channel_size + embedding_flops += 2 * B * L_vision * patch_latent_dim * D + + if vision_gen and L_vision > 0: + embedding_flops += 2 * B * L_vision * D * patch_latent_dim + + if action_gen and action_tokens > 0: + embedding_flops += 2 * B * action_tokens * cfg.action_dim * D + + if action_gen and action_tokens > 0: + embedding_flops += 2 * B * action_tokens * D * cfg.action_dim + + if sound_gen and sound_tokens > 0 and cfg.sound_dim is not None: + embedding_flops += 2 * B * sound_tokens * cfg.sound_dim * D + + if sound_gen and sound_tokens > 0 and cfg.sound_dim is not None: + embedding_flops += 2 * B * sound_tokens * D * cfg.sound_dim + + # TimestepEmbedder MLP: Linear(freq_dim, D) -> SiLU -> Linear(D, D) + freq_dim = cfg.frequency_embedding_size + timestep_mlp_flops_per_call = 2 * freq_dim * D + 2 * D * D + n_timestep_calls = 0 + if vision_gen and L_vision > 0: + n_timestep_calls += 1 + if action_gen and action_tokens > 0: + n_timestep_calls += 1 + if sound_gen and sound_tokens > 0: + n_timestep_calls += 1 + embedding_flops += n_timestep_calls * B * timestep_mlp_flops_per_call + + if cfg.predict_text_tokens: + embedding_flops += 2 * B * text_tokens * D * cfg.vocab_size + + log.debug(f"embedding_flops: {embedding_flops}") + + # =================================================================== + # Pre-compute per-sample attention dot-product FLOPs (shared by + # forward and backward). Und tokens self-attend causally, + # gen tokens attend to all tokens in their sample. 
+ # =================================================================== + und_attn_dot, gen_attn_dot = _compute_per_sample_attn_flops( + n_heads, + d_head, + B, + S_und, + S_gen, + split_lens, + attn_modes, + include_padding=include_padding, + ) + + # Softmax FLOPs: ~5 ops per element of the S_q x S_k attention matrix + # (subtract max, exp, sum, divide, plus the mask/scale). + # Same sequence-length dependency as dot product but with coefficient + # 5 * n_heads instead of 4 * n_heads * d_head. + softmax_ratio = Decimal(5) / Decimal(4 * d_head) + und_softmax = und_attn_dot * softmax_ratio + gen_softmax = gen_attn_dot * softmax_ratio + + # =================================================================== + # 2. Transformer Blocks + # =================================================================== + total_block_flops = Decimal(0) + total_attn_dot_fwd = Decimal(0) + total_softmax_fwd = Decimal(0) + q_dim = n_heads * d_head + kv_dim = n_kv_heads * d_head + + def _dense_mlp_flops(seq_len: int | Decimal) -> Decimal: + return Decimal(6 * B * seq_len * D * cfg.intermediate_size) + + def _moe_mlp_flops(seq_len: int | Decimal) -> Decimal: + gate_flops = 2 * B * seq_len * D * cfg.num_experts + expert_flops = cfg.num_experts_per_tok * 6 * B * seq_len * D * cfg.moe_intermediate_size + return Decimal(gate_flops + expert_flops) + + for layer_idx in range(n_layers): + is_moe_layer = ( + cfg.use_moe + and cfg.num_experts > 0 + and layer_idx not in cfg.mlp_only_layers + and (layer_idx + 1) % cfg.decoder_sparse_step == 0 + ) + + # 2a. 
Attention (PackedAttentionMoT) + attn_und_proj = 2 * B * S_und * D * q_dim + 2 * B * S_und * D * kv_dim + 2 * B * S_und * D * kv_dim + attn_gen_proj = 2 * B * S_gen * D * q_dim + 2 * B * S_gen * D * kv_dim + 2 * B * S_gen * D * kv_dim + attn_dot = und_attn_dot + gen_attn_dot + attn_o_proj = 2 * B * S_und * q_dim * D + 2 * B * S_gen * q_dim * D + attn_qk_norm = ( + 5 * B * S_und * n_heads * d_head + + 5 * B * S_und * n_kv_heads * d_head + + 5 * B * S_gen * n_heads * d_head + + 5 * B * S_gen * n_kv_heads * d_head + ) + layer_attn_flops = attn_und_proj + attn_gen_proj + attn_qk_norm + attn_dot + attn_o_proj + + # 2b. MLP (separate for und and gen pathways) + mlp_und_flops = _moe_mlp_flops(S_und) if is_moe_layer else _dense_mlp_flops(S_und) + mlp_gen_flops = _moe_mlp_flops(S_gen) if is_moe_layer else _dense_mlp_flops(S_gen) + layer_mlp_flops = mlp_und_flops + mlp_gen_flops + + # 2c. RMSNorm (4 layer norms per decoder layer, dimension D) + layer_norm_flops = 5 * B * S_und * D + 5 * B * S_gen * D + 5 * B * S_und * D + 5 * B * S_gen * D + + # 2d. Attention softmax + layer_softmax_flops = und_softmax + gen_softmax + + layer_flops = layer_attn_flops + layer_mlp_flops + layer_norm_flops + layer_softmax_flops + total_block_flops += layer_flops + total_attn_dot_fwd += attn_dot + total_softmax_fwd += layer_softmax_flops + + if layer_idx == 0: + log.debug(f"Layer 0 breakdown (MoE={is_moe_layer}):") + log.debug(f" attn_und_proj: {attn_und_proj}") + log.debug(f" attn_gen_proj: {attn_gen_proj}") + log.debug(f" attn_qk_norm: {attn_qk_norm}") + log.debug(f" attn_dot: {attn_dot}") + log.debug(f" attn_softmax: {layer_softmax_flops}") + log.debug(f" attn_o_proj: {attn_o_proj}") + log.debug(f" mlp_und: {mlp_und_flops}") + log.debug(f" mlp_gen: {mlp_gen_flops}") + log.debug(f" layer_norms: {layer_norm_flops}") + log.debug(f" total layer: {layer_flops}") + + # =================================================================== + # 3. 
Final norms (applied to und and gen separately after all layers) + # =================================================================== + final_norm_flops = Decimal(5 * B * S_und * D + 5 * B * S_gen * D) + + log.debug(f"final_norm_flops: {final_norm_flops}") + + # =================================================================== + # 4. Forward pass total + # =================================================================== + fp = embedding_flops + total_block_flops + final_norm_flops + + log.debug(f"Forward pass FLOPs: {fp}") + log.debug(f" embedding_flops: {embedding_flops} ({_pct(embedding_flops, fp)}%)") + log.debug(f" transformer_blocks: {total_block_flops} ({_pct(total_block_flops, fp)}%)") + log.debug(f" final_norms: {final_norm_flops} ({_pct(final_norm_flops, fp)}%)") + + # =================================================================== + # 5. Backward pass + # =================================================================== + + if freeze_und: + # When freeze_und is True, the understanding pathway gradients are detached. + # Backward cost: gen-pathway projections/MLPs, gen-side attention (gen Q + # attends to the full sample), gen norms, and gen embedding layers. + # Causal (und) attention has zero backward cost. 
+ gen_proj_mlp_flops = Decimal(0) + gen_norm_flops = Decimal(0) + for layer_idx in range(n_layers): + is_moe_layer = ( + cfg.use_moe + and cfg.num_experts > 0 + and layer_idx not in cfg.mlp_only_layers + and (layer_idx + 1) % cfg.decoder_sparse_step == 0 + ) + gen_proj_mlp_flops += ( + 2 * B * S_gen * D * q_dim + + 2 * B * S_gen * D * kv_dim + + 2 * B * S_gen * D * kv_dim + + 2 * B * S_gen * q_dim * D + ) + gen_proj_mlp_flops += _moe_mlp_flops(S_gen) if is_moe_layer else _dense_mlp_flops(S_gen) + + gen_norm_flops += 5 * B * S_gen * D * 2 + gen_norm_flops += 5 * B * S_gen * n_heads * d_head + 5 * B * S_gen * n_kv_heads * d_head + + gen_norm_flops += 5 * B * S_gen * D + + gen_embedding_flops = embedding_flops # conservative: count all embedding flops + + backward_attn_flops = gen_attn_dot * n_layers + backward_softmax_flops = gen_softmax * n_layers + + bp = ( + gen_proj_mlp_flops + backward_attn_flops + backward_softmax_flops + gen_norm_flops + gen_embedding_flops + ) * bp_ratio + + else: + bp = fp * bp_ratio + + # =================================================================== + # 6. Activation checkpointing recomputation + # =================================================================== + # When activation checkpointing is enabled, each transformer layer's + # forward pass is fully recomputed during the backward pass. This adds + # ~1x of the transformer-block forward FLOPs (projections, attention + # dot products, softmax, MLP, and norms — everything inside the layer). 
+ ac_recomp = Decimal(0) + if use_activation_checkpointing: + ac_recomp = total_block_flops + + total = fp + bp + ac_recomp + + log.debug(f"Backward pass FLOPs: {bp}") + if use_activation_checkpointing: + log.debug(f"Activation checkpointing recomp FLOPs: {ac_recomp}") + log.debug(f"Total FLOPs: {total}") + + return total diff --git a/cosmos-inference/cosmos3/_src/imaginaire/flops/wan_vae.py b/cosmos-inference/cosmos3/_src/imaginaire/flops/wan_vae.py new file mode 100644 index 00000000..11aff78c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/flops/wan_vae.py @@ -0,0 +1,134 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""FLOPs estimation for the Wan 2.2 VAE encoder (Encoder3d).""" + +from decimal import Decimal + + +def compute_wan_vae_encoder_flops( + B: int | Decimal, + T: int, + H: int, + W: int, + *, + dim: int = 160, + z_dim: int = 48, + dim_mult: list[int] | None = None, + num_res_blocks: int = 2, + temperal_downsample: list[bool] | None = None, +) -> Decimal: + """Compute forward-pass FLOPs for the Wan 2.2 VAE encoder (Encoder3d). + + The encoder converts a pixel-space video [B, 3, T, H, W] into a latent + [B, z_dim, T//4, H//16, W//16]. It is frozen during training so only + forward-pass FLOPs are counted (no backward). 
+ + The architecture: patchify(2) -> conv1 -> 4 downsample stages (each with + ``num_res_blocks`` residual blocks + optional spatial/temporal downsample) + -> middle block (ResBlock + single-head spatial attention + ResBlock) + -> head (RMSNorm + SiLU + conv) -> pointwise 1x1 conv. + + Args: + B: Batch size. + T: Number of pixel-space temporal frames. + H: Pixel-space height (must be divisible by 16). + W: Pixel-space width (must be divisible by 16). + dim: Base channel dimension of the encoder (default 160). + z_dim: Latent channel dimension (default 48, encoder outputs 2*z_dim). + dim_mult: Channel multiplier per stage (default [1, 2, 4, 4]). + num_res_blocks: Residual blocks per downsample stage (default 2). + temperal_downsample: Per-stage temporal downsampling flags (default + [False, True, True]). + + Returns: + Total forward-pass FLOPs as a Decimal. + """ + if dim_mult is None: + dim_mult = [1, 2, 4, 4] + if temperal_downsample is None: + temperal_downsample = [False, True, True] + + B = int(B) + flops = Decimal(0) + + def _causalconv3d_flops(c_in: int, c_out: int, kt: int, kh: int, kw: int, bt: int, bh: int, bw: int) -> int: + return 2 * c_out * c_in * kt * kh * kw * B * bt * bh * bw + + def _resblock_flops(in_dim: int, out_dim: int, bt: int, bh: int, bw: int) -> int: + vol = B * bt * bh * bw + f = 0 + f += 5 * in_dim * vol # RMS_norm(in_dim) + f += 2 * out_dim * in_dim * 27 * vol # CausalConv3d(in_dim, out_dim, 3) + f += 5 * out_dim * vol # RMS_norm(out_dim) + f += 2 * out_dim * out_dim * 27 * vol # CausalConv3d(out_dim, out_dim, 3) + if in_dim != out_dim: + f += 2 * out_dim * in_dim * vol # shortcut CausalConv3d(in_dim, out_dim, 1) + return f + + def _attnblock_flops(d: int, bt: int, bh: int, bw: int) -> int: + vol = B * bt * bh * bw + seq = bh * bw + f = 0 + f += 5 * d * vol # RMS_norm + f += 2 * (d * 3) * d * vol # to_qkv Conv2d(d, 3d, 1) + f += 4 * B * bt * seq * seq * d # QK^T + Attn*V + f += 2 * d * d * vol # proj Conv2d(d, d, 1) + return f + + # 
After patchify(patch_size=2): [B, 12, T, H/2, W/2] + t, h, w = T, H // 2, W // 2 + + # conv1: CausalConv3d(12, dims[0], 3) + dims = [dim * u for u in [1] + dim_mult] # [160, 160, 320, 640, 640] + flops += _causalconv3d_flops(12, dims[0], 3, 3, 3, t, h, w) + + # Downsample stages + for i, (in_d, out_d) in enumerate(zip(dims[:-1], dims[1:])): + t_down = temperal_downsample[i] if i < len(temperal_downsample) else False + down_flag = i != len(dim_mult) - 1 + + cur_in = in_d + for _ in range(num_res_blocks): + flops += _resblock_flops(cur_in, out_d, t, h, w) + cur_in = out_d + + if down_flag: + if t_down: + h_new, w_new = h // 2, w // 2 + flops += 2 * out_d * out_d * 9 * B * t * h_new * w_new # spatial conv2d + t_new = t // 2 + flops += 2 * out_d * out_d * 3 * B * t_new * h_new * w_new # temporal conv3d(3,1,1) + t, h, w = t_new, h_new, w_new + else: + h_new, w_new = h // 2, w // 2 + flops += 2 * out_d * out_d * 9 * B * t * h_new * w_new + h, w = h_new, w_new + + # Middle block: ResBlock + AttentionBlock + ResBlock + mid_dim = dims[-1] + flops += _resblock_flops(mid_dim, mid_dim, t, h, w) + flops += _attnblock_flops(mid_dim, t, h, w) + flops += _resblock_flops(mid_dim, mid_dim, t, h, w) + + # Head: RMS_norm + SiLU + CausalConv3d(mid_dim, z_dim*2, 3) + enc_out_dim = z_dim * 2 + flops += 5 * mid_dim * B * t * h * w # RMS_norm + flops += _causalconv3d_flops(mid_dim, enc_out_dim, 3, 3, 3, t, h, w) + + # WanVAE_.conv1: CausalConv3d(z_dim*2, z_dim*2, 1) — pointwise 1x1 + flops += _causalconv3d_flops(enc_out_dim, enc_out_dim, 1, 1, 1, t, h, w) + + return Decimal(flops) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/functional/batch_ops.py b/cosmos-inference/cosmos3/_src/imaginaire/functional/batch_ops.py new file mode 100644 index 00000000..e60fce3a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/functional/batch_ops.py @@ -0,0 +1,61 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Functions for performing operations with broadcasting to the right axis +# +# Example +# input1: tensor of size (N1, N2) +# input2: tensor of size (N1, N2, N3, N4) +# batch_mul(input1, input2) = input1[:, :, None, None] * input2 +# +# If the common dimensions don't match, we raise an assertion error. + +from torch import Tensor + + +def common_broadcast(x: Tensor, y: Tensor) -> tuple[Tensor, Tensor]: + ndims1 = x.ndim + ndims2 = y.ndim + + common_ndims = min(ndims1, ndims2) + for axis in range(common_ndims): + assert x.shape[axis] == y.shape[axis], "Dimensions not equal at axis {}".format(axis) + + if ndims1 < ndims2: + x = x.reshape(x.shape + (1,) * (ndims2 - ndims1)) # x broadcast-padded to ndims2: [*x.shape,1,...] + elif ndims2 < ndims1: + y = y.reshape(y.shape + (1,) * (ndims1 - ndims2)) # y broadcast-padded to ndims1: [*y.shape,1,...] 
+ + return x, y + + +def batch_add(x: Tensor, y: Tensor) -> Tensor: + x, y = common_broadcast(x, y) + return x + y # broadcast result shape + + +def batch_mul(x: Tensor, y: Tensor) -> Tensor: + x, y = common_broadcast(x, y) + return x * y # broadcast result shape + + +def batch_sub(x: Tensor, y: Tensor) -> Tensor: + x, y = common_broadcast(x, y) + return x - y # broadcast result shape + + +def batch_div(x: Tensor, y: Tensor) -> Tensor: + x, y = common_broadcast(x, y) + return x / y # broadcast result shape diff --git a/cosmos-inference/cosmos3/_src/imaginaire/functional/lr_scheduler.py b/cosmos-inference/cosmos3/_src/imaginaire/functional/lr_scheduler.py new file mode 100644 index 00000000..ef9cfb5e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/functional/lr_scheduler.py @@ -0,0 +1,178 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Optional + +import numpy as np + +from cosmos3._src.imaginaire.utils import distributed, log + + +class TeroPolyScheduler: + def __init__( + self, + total_Mimg: int, + batch_size: int, + ref_Mimg: Optional[int] = None, + ref_batches: float = 70e3 / 1024, + max_lr_ratio: Optional[float] = 1.0, + min_lr_ratio: Optional[float] = None, + rampup_Mimg: float = 0, + rampdown_Mimg: int = 0, + verbosity_interval: int = 0, + formula: str = "poly", + poly_exp: float = 0.5, + ): + self.total_Mimg = total_Mimg + self.batch_size = batch_size * distributed.get_world_size() + self.ref_Mimg = ref_Mimg or ref_batches * batch_size / 1e6 + self.ref_batches = ref_batches + self.max_lr_ratio = max_lr_ratio + self.min_lr_ratio = min_lr_ratio + self.rampup_Mimg = rampup_Mimg + self.rampdown_Mimg = rampdown_Mimg + self.verbosity_interval = verbosity_interval + self.formula = formula + self.poly_exp = poly_exp + + self._model = None + + @property + def model(self): + return self._model + + @model.setter + def model(self, model): + self._model = model + + def schedule(self, n, **kwargs): + cur_Mimg = getattr(self.model, "sample_counter", 0) / 1e6 + + if self.formula == "constant": + lr = 1.0 + elif self.formula == "poly": + lr = max(cur_Mimg / self.ref_Mimg, 1e-8) ** -self.poly_exp + else: + raise ValueError(f'Invalid learning rate formula "{self.formula}"') + + if self.max_lr_ratio is not None: + lr = min(lr, self.max_lr_ratio) + if self.min_lr_ratio is not None: + lr = max(lr, self.min_lr_ratio) + + if self.rampup_Mimg > 0 and cur_Mimg < self.rampup_Mimg: + lr *= cur_Mimg / self.rampup_Mimg + if self.rampdown_Mimg > 0 and cur_Mimg > self.total_Mimg - self.rampdown_Mimg: + lr *= (self.total_Mimg - cur_Mimg) / self.rampdown_Mimg + + return lr + + def __call__(self, n, **kwargs): + return self.schedule(n, **kwargs) + + +class LambdaWarmUpCosineScheduler: + """ + A learning rate scheduler that combines warm-up with a cosine decay schedule for multiple cycles. 
+ It supports different configurations for each cycle, including the number of warm-up steps, minimum + and maximum scaling factors for the learning rate. + + The scheduler is intended to be used with a base learning rate of 1.0, where the actual learning + rate at any step is the base learning rate multiplied by the scaling factor computed by the scheduler. + + Parameters: + warm_up_steps (list[int]): List of integers where each element represents the number of warm-up + steps for the corresponding cycle. + f_min (list[float]): List of the minimum scaling factors for each cycle after warm-up. + f_max (list[float]): List of the maximum scaling factors at the start and end of each cosine cycle. + f_start (list[float]): List of starting scaling factors for each warm-up phase. + cycle_lengths (list[int]): List of the total lengths of each cycle, including warm-up steps. + verbosity_interval (int, optional): Interval of training steps at which to print current step and + scaling factor information. Set to 0 by default to disable verbosity. 
+ + Examples: + >>> scheduler = LambdaWarmUpCosineScheduler( + warm_up_steps=[10, 10], + f_min=[0.1, 0.1], + f_max=[1.0, 1.0], + f_start=[0.01, 0.01], + cycle_lengths=[50, 50], + verbosity_interval=10) + >>> for step in range(100): + >>> lr_multiplier = scheduler(step) + >>> print(f"Step {step}: LR Multiplier = {lr_multiplier}") + """ + + def __init__(self, warm_up_steps, f_min, f_max, f_start, cycle_lengths, verbosity_interval=0): + assert len(warm_up_steps) == len(f_min) == len(f_max) == len(f_start) == len(cycle_lengths) + self.lr_warm_up_steps = warm_up_steps + self.f_start = f_start + self.f_min = f_min + self.f_max = f_max + self.cycle_lengths = cycle_lengths + self.cum_cycles = np.cumsum([0] + list(self.cycle_lengths)) + self.last_f = 0.0 + self.verbosity_interval = verbosity_interval + + def find_in_interval(self, n): + interval = 0 + for cl in self.cum_cycles[1:]: + if n <= cl: + return interval + interval += 1 + + def schedule(self, n, **kwargs): + cycle = self.find_in_interval(n) + n = n - self.cum_cycles[cycle] + if self.verbosity_interval > 0: + if n % self.verbosity_interval == 0: + log.info(f"current step: {n}, recent lr-multiplier: {self.last_f}, current cycle {cycle}") + if n < self.lr_warm_up_steps[cycle]: + f = (self.f_max[cycle] - self.f_start[cycle]) / self.lr_warm_up_steps[cycle] * n + self.f_start[cycle] + self.last_f = f + return f + else: + t = (n - self.lr_warm_up_steps[cycle]) / (self.cycle_lengths[cycle] - self.lr_warm_up_steps[cycle]) + t = min(t, 1.0) + f = self.f_min[cycle] + 0.5 * (self.f_max[cycle] - self.f_min[cycle]) * (1 + np.cos(t * np.pi)) + self.last_f = f + return f + + def __call__(self, n, **kwargs): + return self.schedule(n, **kwargs) + + +class LambdaLinearScheduler(LambdaWarmUpCosineScheduler): + """ + Linear instead of cosine decay for the main part of the cycle.
+ """ + + def schedule(self, n, **kwargs): + cycle = self.find_in_interval(n) + n = n - self.cum_cycles[cycle] + if self.verbosity_interval > 0: + if n % self.verbosity_interval == 0: + log.info(f"current step: {n}, recent lr-multiplier: {self.last_f}, current cycle {cycle}") + + if n < self.lr_warm_up_steps[cycle]: + f = (self.f_max[cycle] - self.f_start[cycle]) / self.lr_warm_up_steps[cycle] * n + self.f_start[cycle] + self.last_f = f + return f + else: + f = self.f_min[cycle] + (self.f_max[cycle] - self.f_min[cycle]) * (self.cycle_lengths[cycle] - n) / ( + self.cycle_lengths[cycle] - self.lr_warm_up_steps[cycle] + ) + self.last_f = f + return f diff --git a/cosmos-inference/cosmos3/_src/imaginaire/functional/multi_step.py b/cosmos-inference/cosmos3/_src/imaginaire/functional/multi_step.py new file mode 100644 index 00000000..48f96ff0 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/functional/multi_step.py @@ -0,0 +1,60 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Impl of multistep methods to solve the ODE in the diffusion model. 
+""" + +from typing import Callable, List, Tuple + +import torch + +from cosmos3._src.imaginaire.functional.runge_kutta import reg_x0_euler_step, res_x0_rk2_step + + +def order2_fn( + x_s: torch.Tensor, s: torch.Tensor, t: torch.Tensor, x0_s: torch.Tensor, x0_preds: List[Tuple[torch.Tensor, torch.Tensor]] +) -> Tuple[torch.Tensor, List[Tuple[torch.Tensor, torch.Tensor]]]: + """ + Second-order multistep method from https://arxiv.org/pdf/2308.02157 (Adams-Bashforth approach). + x0_preds holds earlier (x0 prediction, time) pairs; when it is empty, fall back to an Euler step. + """ + if x0_preds: + x0_s1, s1 = x0_preds[0] + x_t = res_x0_rk2_step(x_s, t, s, x0_s, s1, x0_s1) + else: + x_t = reg_x0_euler_step(x_s, s, t, x0_s)[0] + return x_t, [(x0_s, s)] + + +# key: method name, value: method function +# key: order + algorithm name +MULTISTEP_FNs = { + "2ab": order2_fn, +} + + +def get_multi_step_fn(name: str) -> Callable: + if name in MULTISTEP_FNs: + return MULTISTEP_FNs[name] + methods = "\n\t".join(MULTISTEP_FNs.keys()) + raise RuntimeError(f"Only support the following multistep methods:\n\t{methods}") + + +def is_multi_step_fn_supported(name: str) -> bool: + """ + Check if the multistep method is supported. + """ + return name in MULTISTEP_FNs diff --git a/cosmos-inference/cosmos3/_src/imaginaire/functional/runge_kutta.py b/cosmos-inference/cosmos3/_src/imaginaire/functional/runge_kutta.py new file mode 100644 index 00000000..fcb667c3 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/functional/runge_kutta.py @@ -0,0 +1,333 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Callable, Tuple + +import torch + +from cosmos3._src.imaginaire.functional.batch_ops import batch_mul + + +def phi1(t: torch.Tensor) -> torch.Tensor: + """ + Compute the first order phi function: (exp(t) - 1) / t. + + Args: + t: Input tensor. + + Returns: + Tensor: Result of phi1 function. + """ + input_dtype = t.dtype + t = t.to(dtype=torch.float64) + return (torch.expm1(t) / t).to(dtype=input_dtype) + + +def phi2(t: torch.Tensor) -> torch.Tensor: + """ + Compute the second order phi function: (phi1(t) - 1) / t. + + Args: + t: Input tensor. + + Returns: + Tensor: Result of phi2 function. + """ + input_dtype = t.dtype + t = t.to(dtype=torch.float64) + return ((phi1(t) - 1.0) / t).to(dtype=input_dtype) + + +def res_x0_rk2_step( + x_s: torch.Tensor, + t: torch.Tensor, + s: torch.Tensor, + x0_s: torch.Tensor, + s1: torch.Tensor, + x0_s1: torch.Tensor, +) -> torch.Tensor: + """ + Perform a residual-based 2nd order Runge-Kutta step. + + Args: + x_s: Current state tensor. + t: Target time tensor. + s: Current time tensor. + x0_s: Prediction at current time. + s1: Intermediate time tensor. + x0_s1: Prediction at intermediate time. + + Returns: + Tensor: Updated state tensor. + + Raises: + AssertionError: If step size is too small. 
+ """ + s = -torch.log(s) # scalar or [B] + t = -torch.log(t) # scalar or [B] + m = -torch.log(s1) # scalar or [B] + + dt = t - s # scalar or [B] + assert not torch.any(torch.isclose(dt, torch.zeros_like(dt), atol=1e-6)), "Step size is too small" + assert not torch.any(torch.isclose(m - s, torch.zeros_like(dt), atol=1e-6)), "Step size is too small" + + c2 = (m - s) / dt # scalar or [B] + phi1_val, phi2_val = phi1(-dt), phi2(-dt) # scalar or [B] each + + # Handle edge case where t = s = m + b1 = torch.nan_to_num(phi1_val - 1.0 / c2 * phi2_val, nan=0.0) # scalar or [B] + b2 = torch.nan_to_num(1.0 / c2 * phi2_val, nan=0.0) # scalar or [B] + + return batch_mul(torch.exp(-dt), x_s) + batch_mul(dt, batch_mul(b1, x0_s) + batch_mul(b2, x0_s1)) # [B,...] + + +def reg_x0_euler_step( + x_s: torch.Tensor, + s: torch.Tensor, + t: torch.Tensor, + x0_s: torch.Tensor, +) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Perform a regularized Euler step based on x0 prediction. + + Args: + x_s: Current state tensor. + s: Current time tensor. + t: Target time tensor. + x0_s: Prediction at current time. + + Returns: + Tuple[Tensor, Tensor]: Updated state tensor and current prediction. + """ + coef_x0 = (s - t) / s # scalar or [B] + coef_xs = t / s # scalar or [B] + return batch_mul(coef_x0, x0_s) + batch_mul(coef_xs, x_s), x0_s # [B,...], [B,...] + + +def reg_eps_euler_step( + x_s: torch.Tensor, s: torch.Tensor, t: torch.Tensor, eps_s: torch.Tensor +) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Perform a regularized Euler step based on epsilon prediction. + + Args: + x_s: Current state tensor. + s: Current time tensor. + t: Target time tensor. + eps_s: Epsilon prediction at current time. + + Returns: + Tuple[Tensor, Tensor]: Updated state tensor and current x0 prediction. + """ + return x_s + batch_mul(eps_s, t - s), x_s + batch_mul(eps_s, 0 - s) # [B,...], [B,...] 
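Reviewer sketch (not part of the change): the two Euler helpers above appear to describe the same update written in two parameterizations. The scalar stand-ins below are hypothetical re-implementations that avoid `torch` and `batch_mul`, and they assume the noising convention `x_s = x0 + s * eps`, which the diff does not state explicitly.

```python
import math


def x0_euler_step(x_s: float, s: float, t: float, x0_s: float) -> float:
    # Scalar stand-in for reg_x0_euler_step: interpolate between x0 and x_s.
    return (s - t) / s * x0_s + t / s * x_s


def eps_euler_step(x_s: float, s: float, t: float, eps_s: float) -> float:
    # Scalar stand-in for reg_eps_euler_step: move along the noise direction.
    return x_s + eps_s * (t - s)


# Assumed convention: x_s = x0 + s * eps (illustrative values only).
x0, eps = 0.3, -1.2
s, t = 2.0, 0.5
x_s = x0 + s * eps

# Convert the x0 prediction into an epsilon prediction.
eps_s = (x_s - x0) / s

via_x0 = x0_euler_step(x_s, s, t, x0)
via_eps = eps_euler_step(x_s, s, t, eps_s)

# With an exact x0 prediction, both forms land on x0 + t * eps.
print(via_x0, via_eps, x0 + t * eps)
```

Under that assumption the two forms agree algebraically, since `x_s + (x_s - x0) / s * (t - s)` simplifies to `(s - t) / s * x0 + t / s * x_s`.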
+ + +def rk1_euler( + x_s: torch.Tensor, s: torch.Tensor, t: torch.Tensor, x0_fn: Callable +) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Perform a first-order Runge-Kutta (Euler) step. + + Recommended for diffusion models with guidance or model undertrained + Usually more stable at the cost of a bit slower convergence. + + Args: + x_s: Current state tensor. + s: Current time tensor. + t: Target time tensor. + x0_fn: Function to compute x0 prediction. + + Returns: + Tuple[Tensor, Tensor]: Updated state tensor and x0 prediction. + """ + x0_s = x0_fn(x_s, s) # [B,...] + return reg_x0_euler_step(x_s, s, t, x0_s) # [B,...], [B,...] + + +def rk2_mid_stable( + x_s: torch.Tensor, s: torch.Tensor, t: torch.Tensor, x0_fn: Callable +) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Perform a stable second-order Runge-Kutta (midpoint) step. + + Args: + x_s: Current state tensor. + s: Current time tensor. + t: Target time tensor. + x0_fn: Function to compute x0 prediction. + + Returns: + Tuple[Tensor, Tensor]: Updated state tensor and x0 prediction. + """ + s1 = torch.sqrt(s * t) # scalar or [B] + x_s1, _ = rk1_euler(x_s, s, s1, x0_fn) # [B,...] + + x0_s1 = x0_fn(x_s1, s1) # [B,...] + return reg_x0_euler_step(x_s, s, t, x0_s1) # [B,...], [B,...] + + +def rk2_mid(x_s: torch.Tensor, s: torch.Tensor, t: torch.Tensor, x0_fn: Callable) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Perform a second-order Runge-Kutta (midpoint) step. + + Args: + x_s: Current state tensor. + s: Current time tensor. + t: Target time tensor. + x0_fn: Function to compute x0 prediction. + + Returns: + Tuple[Tensor, Tensor]: Updated state tensor and x0 prediction. + """ + s1 = torch.sqrt(s * t) # scalar or [B] + x_s1, x0_s = rk1_euler(x_s, s, s1, x0_fn) # [B,...], [B,...] + + x0_s1 = x0_fn(x_s1, s1) # [B,...] + + return res_x0_rk2_step(x_s, t, s, x0_s, s1, x0_s1), x0_s1 # [B,...], [B,...] 
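Reviewer sketch (not part of the change): `rk2_mid` picks the geometric midpoint `sqrt(s * t)`, which is the arithmetic midpoint in the log-time variable `-log(sigma)` used by `res_x0_rk2_step`, so `c2` comes out as 0.5. The scalar code below is a hypothetical re-derivation with plain `math` rather than the tensor code in the diff. It also checks that when both x0 predictions coincide, the second-order update reduces to the x0-based Euler step.

```python
import math


def phi1(h: float) -> float:
    # First-order phi function: (exp(h) - 1) / h.
    return math.expm1(h) / h


def phi2(h: float) -> float:
    # Second-order phi function: (phi1(h) - 1) / h.
    return (phi1(h) - 1.0) / h


s, t = 2.0, 0.5
s1 = math.sqrt(s * t)  # geometric midpoint used by rk2_mid

# Log-time variables, mirroring the -log(...) transform in res_x0_rk2_step.
lam_s, lam_t, lam_m = -math.log(s), -math.log(t), -math.log(s1)
dt = lam_t - lam_s
c2 = (lam_m - lam_s) / dt

b2 = phi2(-dt) / c2
b1 = phi1(-dt) - b2

# Illustrative values: use the same x0 prediction at both stages.
x_s, x0 = -2.1, 0.3
rk2_update = math.exp(-dt) * x_s + dt * (b1 * x0 + b2 * x0)
euler_update = (s - t) / s * x0 + t / s * x_s

print(c2, rk2_update, euler_update)
```

The reduction follows from `exp(-dt) = t / s` and `dt * phi1(-dt) = 1 - t / s`, so the weights `b1 + b2` only differ from the Euler step when the two predictions differ.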
+ + +def rk_2heun_naive( + x_s: torch.Tensor, s: torch.Tensor, t: torch.Tensor, x0_fn: Callable +) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Perform a naive second-order Runge-Kutta (Heun's method) step. + Impl based on rho-rk-deis solvers, https://github.com/qsh-zh/deis + Recommended for diffusion models without guidance and relatively large NFE + + Args: + x_s: Current state tensor. + s: Current time tensor. + t: Target time tensor. + x0_fn: Function to compute x0 prediction. + + Returns: + Tuple[Tensor, Tensor]: Updated state tensor and x0 prediction. + """ + x_t, x0_s = rk1_euler(x_s, s, t, x0_fn) # [B,...], [B,...] + eps_s = batch_mul(1.0 / s, x_t - x0_s) # [B,...] + x0_t = x0_fn(x_t, t) # [B,...] + eps_t = batch_mul(1.0 / t, x_t - x0_t) # [B,...] + + avg_eps = (eps_s + eps_t) / 2 # [B,...] + + return reg_eps_euler_step(x_s, s, t, avg_eps) # [B,...], [B,...] + + +def rk_2heun_edm( + x_s: torch.Tensor, s: torch.Tensor, t: torch.Tensor, x0_fn: Callable +) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Perform a naive second-order Runge-Kutta (Heun's method) step. + Impl based on EDM second order Heun method + + Args: + x_s: Current state tensor. + s: Current time tensor. + t: Target time tensor. + x0_fn: Function to compute x0 prediction. + + Returns: + Tuple[Tensor, Tensor]: Updated state tensor and averaged x0 prediction. + """ + x_t, x0_s = rk1_euler(x_s, s, t, x0_fn) # [B,...], [B,...] + x0_t = x0_fn(x_t, t) # [B,...] + + avg_x0 = (x0_s + x0_t) / 2 # [B,...] + + return reg_x0_euler_step(x_s, s, t, avg_x0) # [B,...], [B,...] + + +def rk_3kutta_naive( + x_s: torch.Tensor, s: torch.Tensor, t: torch.Tensor, x0_fn: Callable +) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Perform a naive third-order Runge-Kutta step. + Impl based on rho-rk-deis solvers, https://github.com/qsh-zh/deis + Recommended for diffusion models without guidance and relatively large NFE + + Args: + x_s: Current state tensor. + s: Current time tensor. + t: Target time tensor.
+ x0_fn: Function to compute x0 prediction. + + Returns: + Tuple[Tensor, Tensor]: Updated state tensor and current state. + """ + c2, c3 = 0.5, 1.0 # stage time fractions + a31, a32 = -1.0, 2.0 # Butcher tableau coefficients + b1, b2, b3 = 1.0 / 6, 4.0 / 6, 1.0 / 6 # quadrature weights + + delta = t - s # scalar or [B] + + s1 = c2 * delta + s # scalar or [B] + s2 = c3 * delta + s # scalar or [B] + x_s1, x0_s = rk1_euler(x_s, s, s1, x0_fn) # [B,...], [B,...] + eps_s = batch_mul(1.0 / s, x_s - x0_s) # [B,...] + x0_s1 = x0_fn(x_s1, s1) # [B,...] + eps_s1 = batch_mul(1.0 / s1, x_s1 - x0_s1) # [B,...] + + _eps = a31 * eps_s + a32 * eps_s1 # [B,...] + x_s2, _ = reg_eps_euler_step(x_s, s, s2, _eps) # [B,...] + + x0_s2 = x0_fn(x_s2, s2) # [B,...] + eps_s2 = batch_mul(1.0 / s2, x_s2 - x0_s2) # [B,...] + + avg_eps = b1 * eps_s + b2 * eps_s1 + b3 * eps_s2 # [B,...] + return reg_eps_euler_step(x_s, s, t, avg_eps) # [B,...], [B,...] + + +# key : order + name +RK_FNs = { + "1euler": rk1_euler, + "2mid": rk2_mid, + "2mid_stable": rk2_mid_stable, + "2heun_edm": rk_2heun_edm, + "2heun_naive": rk_2heun_naive, + "3kutta_naive": rk_3kutta_naive, +} + + +def get_runge_kutta_fn(name: str) -> Callable: + """ + Get the specified Runge-Kutta function. + + Args: + name: Name of the Runge-Kutta method. + + Returns: + Callable: The specified Runge-Kutta function. + + Raises: + RuntimeError: If the specified method is not supported. + """ + if name in RK_FNs: + return RK_FNs[name] + methods = "\n\t".join(RK_FNs.keys()) + raise RuntimeError(f"Only support the following Runge-Kutta methods:\n\t{methods}") + + +def is_runge_kutta_fn_supported(name: str) -> bool: + """ + Check if the specified Runge-Kutta function is supported. + + Args: + name: Name of the Runge-Kutta method. + + Returns: + bool: True if the method is supported, False otherwise. 
+ """ + return name in RK_FNs diff --git a/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/__init__.py new file mode 100644 index 00000000..43f67a1d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/__init__.py @@ -0,0 +1,70 @@ +# Copyright (c) Facebook, Inc. and its affiliates. +import os + +from omegaconf import DictConfig, OmegaConf + +from cosmos3._src.imaginaire.flags import TRAINING +from cosmos3._src.imaginaire.lazy_config.instantiate import instantiate +from cosmos3._src.imaginaire.lazy_config.lazy_call import LazyCall +from cosmos3._src.imaginaire.lazy_config.omegaconf_patch import to_object + +OmegaConf.to_object = to_object + +PLACEHOLDER = None + + +class LazyDict(DictConfig): + def __init__(self, *args, **kwargs): + super().__init__(*args, **kwargs) + + +__all__ = ["instantiate", "LazyCall", "PLACEHOLDER", "LazyDict"] +if TRAINING: + from cosmos3._src.imaginaire.lazy_config.lazy import LazyConfig + + __all__ += ["LazyConfig"] + + +DOC_BUILDING = os.getenv("_DOC_BUILDING", False) # set in docs/conf.py + + +def fixup_module_metadata(module_name, namespace, keys=None): + """ + Fix the __qualname__ of module members to be their exported api name, so + when they are referenced in docs, sphinx can find them. Reference: + https://github.com/python-trio/trio/blob/6754c74eacfad9cc5c92d5c24727a2f3b620624e/trio/_util.py#L216-L241 + """ + if not DOC_BUILDING: + return + seen_ids = set() + + def fix_one(qualname, name, obj): + # avoid infinite recursion (relevant when using + # typing.Generic, for example) + if id(obj) in seen_ids: + return + seen_ids.add(id(obj)) + + mod = getattr(obj, "__module__", None) + if mod is not None and (mod.startswith(module_name) or mod.startswith("fvcore.")): + obj.__module__ = module_name + # Modules, unlike everything else in Python, put fully-qualified + # names into their __name__ attribute. We check for "." to avoid + # rewriting these.
+ if hasattr(obj, "__name__") and "." not in obj.__name__: + obj.__name__ = name + obj.__qualname__ = qualname + if isinstance(obj, type): + for attr_name, attr_value in obj.__dict__.items(): + fix_one(objname + "." + attr_name, attr_name, attr_value) + + if keys is None: + keys = namespace.keys() + for objname in keys: + if not objname.startswith("_"): + obj = namespace[objname] + fix_one(objname, objname, obj) + + +fixup_module_metadata(__name__, globals(), __all__) +del fixup_module_metadata diff --git a/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/file_io.py b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/file_io.py new file mode 100644 index 00000000..0c6693f4 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/file_io.py @@ -0,0 +1,25 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +from iopath.common.file_io import HTTPURLHandler, OneDrivePathHandler, PathHandler +from iopath.common.file_io import PathManager as PathManagerBase + +__all__ = ["PathManager", "PathHandler"] + + +PathManager = PathManagerBase() +PathManager.register_handler(HTTPURLHandler()) +PathManager.register_handler(OneDrivePathHandler()) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/instantiate.py b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/instantiate.py new file mode 100644 index 00000000..66d81a8c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/instantiate.py @@ -0,0 +1,120 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import collections.abc as abc +import dataclasses +import logging +from typing import Any + +import attrs + +from cosmos3._src.imaginaire.lazy_config.registry import _convert_target_to_string, locate + +__all__ = ["dump_dataclass", "instantiate"] + + +def is_dataclass_or_attrs(target): + return dataclasses.is_dataclass(target) or attrs.has(target) + + +def dump_dataclass(obj: Any): + """ + Dump a dataclass recursively into a dict that can be later instantiated. + + Args: + obj: a dataclass object + + Returns: + dict + """ + assert dataclasses.is_dataclass(obj) and not isinstance(obj, type), ( + "dump_dataclass() requires an instance of a dataclass." 
+ ) + ret = {"_target_": _convert_target_to_string(type(obj))} + for f in dataclasses.fields(obj): + v = getattr(obj, f.name) + if dataclasses.is_dataclass(v): + v = dump_dataclass(v) + if isinstance(v, (list, tuple)): + v = [dump_dataclass(x) if dataclasses.is_dataclass(x) else x for x in v] + ret[f.name] = v + return ret + + +def instantiate(cfg, *args, **kwargs): + """ + Recursively instantiate objects defined in dictionaries by + "_target_" and arguments. + + Args: + cfg: a dict-like object with "_target_" that defines the caller, and + other keys that define the arguments + args: Optional positional parameters pass-through. + kwargs: Optional named parameters pass-through. + + Returns: + object instantiated by cfg + """ + from omegaconf import DictConfig, ListConfig, OmegaConf + + if isinstance(cfg, ListConfig): + lst = [instantiate(x) for x in cfg] + return ListConfig(lst, flags={"allow_objects": True}) + if isinstance(cfg, list): + # Specialize for list, because many classes take + # list[objects] as arguments, such as ResNet, DatasetMapper + return [instantiate(x) for x in cfg] + + # If input is a DictConfig backed by dataclasses (i.e. omegaconf's structured config), + # instantiate it to the actual dataclass. 
+ if isinstance(cfg, DictConfig) and is_dataclass_or_attrs(cfg._metadata.object_type): + return OmegaConf.to_object(cfg) + + if isinstance(cfg, abc.Mapping) and "_target_" in cfg: + # conceptually equivalent to hydra.utils.instantiate(cfg) with _convert_=all, + # but faster: https://github.com/facebookresearch/hydra/issues/1200 + is_recursive = getattr(cfg, "_recursive_", True) + if is_recursive: + cfg = {k: instantiate(v) for k, v in cfg.items()} + else: + cfg = {k: v for k, v in cfg.items()} + # pop the _recursive_ key to avoid passing it as a parameter + if "_recursive_" in cfg: + cfg.pop("_recursive_") + cls = cfg.pop("_target_") + cls = instantiate(cls) + + if isinstance(cls, str): + cls_name = cls + cls = locate(cls_name) + assert cls is not None, cls_name + else: + try: + cls_name = cls.__module__ + "." + cls.__qualname__ + except Exception: + # target could be anything, so the above could fail + cls_name = str(cls) + assert callable(cls), f"_target_ {cls} does not define a callable object" + try: + # override config with kwargs + instantiate_kwargs = {} + instantiate_kwargs.update(cfg) + instantiate_kwargs.update(kwargs) + return cls(*args, **instantiate_kwargs) + except TypeError: + logger = logging.getLogger(__name__) + logger.error(f"Error when instantiating {cls_name}!") + raise + return cfg # return as-is if don't know what to do diff --git a/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/lazy.py b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/lazy.py new file mode 100644 index 00000000..846c0117 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/lazy.py @@ -0,0 +1,377 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import ast +import builtins +import importlib +import inspect +import logging +import os +import pickle +import uuid +from collections import OrderedDict +from contextlib import contextmanager +from copy import deepcopy +from typing import Any, Dict, List, Tuple, Union + +import attrs +import yaml +from omegaconf import DictConfig, ListConfig, OmegaConf + +try: + import dill as dill_pickle +except ImportError: + dill_pickle = None + +try: + import cloudpickle +except ImportError: + cloudpickle = None + +from cosmos3._src.imaginaire.lazy_config.file_io import PathManager +from cosmos3._src.imaginaire.lazy_config.lazy_call import LazyCall, get_default_params + +__all__ = ["LazyCall", "LazyConfig"] + + +def sort_dict(d: Dict[str, Any]) -> OrderedDict[str, Any]: + return OrderedDict(sorted(d.items(), key=lambda x: x[0])) + + +def dict_representer(dumper: yaml.Dumper, data: OrderedDict[str, Any]) -> yaml.nodes.MappingNode: + return dumper.represent_mapping("tag:yaml.org,2002:map", data.items()) + + +def sort_recursive(obj: Union[Dict[str, Any], List[Any], Any]) -> Union[OrderedDict[str, Any], List[Any], Any]: + if isinstance(obj, dict): + return sort_dict({k: sort_recursive(v) for k, v in obj.items()}) + elif isinstance(obj, list): + return [sort_recursive(item) for item in obj] + return obj + + +yaml.add_representer(OrderedDict, dict_representer) + +OmegaConf.register_new_resolver("add", lambda *vals: sum(vals)) +OmegaConf.register_new_resolver("subtract", lambda *vals: vals[0] - sum(vals[1:])) + + +def _visit_dict_config(cfg, func): + """ + Apply func 
recursively to all DictConfig in cfg. + """ + if isinstance(cfg, DictConfig): + func(cfg) + for v in cfg.values(): + _visit_dict_config(v, func) + elif isinstance(cfg, ListConfig): + for v in cfg: + _visit_dict_config(v, func) + + +def _validate_py_syntax(filename): + # see also https://github.com/open-mmlab/mmcv/blob/master/mmcv/utils/config.py + with PathManager.open(filename, "r") as f: + content = f.read() + try: + ast.parse(content) + except SyntaxError as e: + raise SyntaxError(f"Config file {filename} has syntax error!") from e + + +def _cast_to_config(obj): + # if given a dict, return DictConfig instead + if isinstance(obj, dict): + return DictConfig(obj, flags={"allow_objects": True}) + return obj + + +_CFG_PACKAGE_NAME = "detectron2._cfg_loader" +""" +A namespace to put all imported config into. +""" + + +def _random_package_name(filename): + # generate a random package name when loading config files + return _CFG_PACKAGE_NAME + str(uuid.uuid4())[:4] + "." + os.path.basename(filename) + + +@contextmanager +def _patch_import(): + """ + Enhance relative import statements in config files, so that they: + 1. locate files purely based on relative location, regardless of packages. + e.g. you can import file without having __init__ + 2. do not cache modules globally; modifications of module states has no side effect + 3. support other storage system through PathManager, so config files can be in the cloud + 4. imported dict are turned into omegaconf.DictConfig automatically + """ + old_import = builtins.__import__ + + def find_relative_file(original_file, relative_import_path, level): + + # if such import should produce `x` as a python module or DictConfig. + # This can be discussed further if needed. + relative_import_err = """ +Relative import of directories is not allowed within config files. +Within a config file, relative import can only import other config files. 
+""".replace("\n", " ") + if not len(relative_import_path): + raise ImportError(relative_import_err) + + cur_file = os.path.dirname(original_file) + for _ in range(level - 1): + cur_file = os.path.dirname(cur_file) + cur_name = relative_import_path.lstrip(".") + for part in cur_name.split("."): + cur_file = os.path.join(cur_file, part) + if not cur_file.endswith(".py"): + cur_file += ".py" + if not PathManager.isfile(cur_file): + cur_file_no_suffix = cur_file[: -len(".py")] + if PathManager.isdir(cur_file_no_suffix): + raise ImportError(f"Cannot import from {cur_file_no_suffix}." + relative_import_err) + else: + raise ImportError( + f"Cannot import name {relative_import_path} from {original_file}: {cur_file} does not exist." + ) + return cur_file + + def new_import(name, globals=None, locals=None, fromlist=(), level=0): + if ( + # Only deal with relative imports inside config files + level != 0 and globals is not None and (globals.get("__package__", "") or "").startswith(_CFG_PACKAGE_NAME) + ): + cur_file = find_relative_file(globals["__file__"], name, level) + _validate_py_syntax(cur_file) + spec = importlib.machinery.ModuleSpec(_random_package_name(cur_file), None, origin=cur_file) + module = importlib.util.module_from_spec(spec) + module.__file__ = cur_file + with PathManager.open(cur_file) as f: + content = f.read() + exec(compile(content, cur_file, "exec"), module.__dict__) + for name in fromlist: # turn imported dict into DictConfig automatically + val = _cast_to_config(module.__dict__[name]) + module.__dict__[name] = val + return module + return old_import(name, globals, locals, fromlist=fromlist, level=level) + + builtins.__import__ = new_import + yield new_import + builtins.__import__ = old_import + + +class LazyConfig: + """ + Provide methods to save, load, and overrides an omegaconf config object + which may contain definition of lazily-constructed objects. 
+ """ + + @staticmethod + def load_rel(filename: str, keys: Union[None, str, Tuple[str, ...]] = None): + """ + Similar to :meth:`load()`, but load path relative to the caller's + source file. + + This has the same functionality as a relative import, except that this method + accepts filename as a string, so more characters are allowed in the filename. + """ + caller_frame = inspect.stack()[1] + caller_fname = caller_frame[0].f_code.co_filename + assert caller_fname != "", "load_rel Unable to find caller" + caller_dir = os.path.dirname(caller_fname) + filename = os.path.join(caller_dir, filename) + return LazyConfig.load(filename, keys) + + @staticmethod + def load(filename: str, keys: Union[None, str, Tuple[str, ...]] = None): + """ + Load a config file. + + Args: + filename: absolute path or relative path w.r.t. the current working directory + keys: keys to load and return. If not given, return all keys + (whose values are config objects) in a dict. + """ + has_keys = keys is not None + filename = filename.replace("/./", "/") # redundant + if os.path.splitext(filename)[1] not in [".py", ".yaml", ".yml"]: + raise ValueError(f"Config file {filename} has to be a python or yaml file.") + if filename.endswith(".py"): + _validate_py_syntax(filename) + + with _patch_import(): + # Record the filename + module_namespace = { + "__file__": filename, + "__package__": _random_package_name(filename), + } + with PathManager.open(filename) as f: + content = f.read() + # Compile first with filename to: + # 1. make filename appears in stacktrace + # 2. 
make load_rel able to find its parent's (possibly remote) location + exec(compile(content, filename, "exec"), module_namespace) + + ret = module_namespace + else: + with PathManager.open(filename) as f: + obj = yaml.unsafe_load(f) + ret = OmegaConf.create(obj, flags={"allow_objects": True}) + + if has_keys: + if isinstance(keys, str): + return _cast_to_config(ret[keys]) + else: + return tuple(_cast_to_config(ret[a]) for a in keys) + else: + if filename.endswith(".py"): + # when not specified, only load those that are config objects + ret = DictConfig( + { + name: _cast_to_config(value) + for name, value in ret.items() + if isinstance(value, (DictConfig, ListConfig, dict)) and not name.startswith("_") + }, + flags={"allow_objects": True}, + ) + return ret + + @staticmethod + def save_pkl(cfg, filename: str) -> str: + """ + Saves a Config object to a file using pickle serialization. This method is typically used + when the configuration object contains complex objects, such as lambdas, that are not supported by + simpler serialization methods like YAML. The function attempts to create a deep copy of the configuration + object before serialization to ensure that the original object remains unmodified. + + Args: + cfg: A Config object to be serialized and saved. + filename: The path and name of the file where the configuration should be saved. The function + assumes the file extension indicates a pickle format (e.g., .pkl). + + Returns: + str: The filename to which the configuration was saved. This can be used to verify the file location + or log the outcome. + + Notes: + - The function logs a warning if the configuration is successfully saved using pickle. + - If saving fails, an error is logged with the exception details. 
+ """ + logger = logging.getLogger(__name__) + try: + cfg = deepcopy(cfg) + except Exception: + pass + + try: + with PathManager.open(filename, "wb") as f: + pickle.dump(cfg, f) + logger.warning(f"Config is saved using pickle at {filename}.") + except Exception as e: + logger.error(f"Failed to save config to {filename}: {e}. Trying dill or cloudpickle instead") + if dill_pickle: + try: + with PathManager.open(filename, "wb") as f: + pickle.dump(dill_pickle.dumps(cfg, recurse=True), f) + logger.warning(f"Config is saved using dill at {filename}.") + except Exception as e: + logger.error(f"Failed to save config to {filename}: {e}.") + if cloudpickle: + try: + with PathManager.open(filename, "wb") as f: + pickle.dump(cloudpickle.dumps(cfg), f) + logger.warning(f"Config is saved using cloudpickle at {filename}.") + except Exception as e: + logger.error(f"Failed to save config to {filename}: {e}.") + else: + logger.error("cloudpickle is not available. Cannot save the config.") + raise e + + return filename + + @staticmethod + def save_yaml(cfg, filename: str) -> str: + """ + Saves a Config object to a file using YAML serialization. This method is beneficial when the configuration object's content needs to be human-readable and easily editable. YAML is suitable for configurations that do not contain complex types like lambdas, which must be handled differently. The function converts unserializable items to strings before saving to ensure compatibility with YAML serialization. + + Args: + cfg: A Config object to be serialized and saved. It handles both DictConfig and ListConfig types. + filename: The path and name of the file where the configuration should be saved. The function does not require a specific file extension but typically uses '.yaml'. + + Returns: + str: The filename to which the configuration was saved. This can be used to verify the file location or log the outcome. 
+ + Notes: + - The function logs a warning if the configuration is successfully saved using YAML. + - If saving fails, an error is logged with the exception details. + """ + logger = logging.getLogger(__name__) + try: + cfg = deepcopy(cfg) + except Exception: + pass + + # Define a function to check if an item is serializable to YAML + def is_serializable(item): + try: + OmegaConf.to_yaml(item) + return True + except Exception as e: + return False + + # Function to convert unserializable items to strings + def serialize_config(config): + if isinstance(config, DictConfig): + for key, value in config.items(): + if isinstance(value, (DictConfig, ListConfig)): + try: + if "_target_" in value: + default_params = get_default_params(value["_target_"]) + for default_key, default_v in default_params.items(): + if default_key not in value: + value[default_key] = default_v + except Exception as e: + logger.error(f"Failed to add default argument values: {e}") + + serialize_config(value) + else: + if not is_serializable(value) and value is not None: + config[key] = str(value) + elif isinstance(config, ListConfig): + for i, item in enumerate(config): + if isinstance(item, (DictConfig, ListConfig)): + serialize_config(item) + else: + if not is_serializable(item) and item is not None: + config[i] = str(item) + else: + raise NotImplementedError("Input config must be a DictConfig or ListConfig.") + return config + + # Convert Config object to a DictConfig object. + config_dict = attrs.asdict(cfg) + config_omegaconf = DictConfig(content=config_dict, flags={"allow_objects": True}) + + # Serialize the DictConfig object by converting non-serializable objects to strings. 
+ config_omegaconf = serialize_config(config_omegaconf) + + config_dict: Dict[str, Any] = OmegaConf.to_container(config_omegaconf, resolve=True) + sorted_config: OrderedDict[str, Any] = sort_recursive(config_dict) + with open(filename, "w") as f: + yaml.dump(sorted_config, f, default_flow_style=False) + logger.warning(f"Config is saved using omegaconf at {filename}.") + return filename diff --git a/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/lazy_call.py b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/lazy_call.py new file mode 100644 index 00000000..48824cf1 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/lazy_call.py @@ -0,0 +1,81 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import collections.abc as abc +import inspect +from dataclasses import is_dataclass +from typing import ClassVar + +import attrs +from omegaconf import DictConfig + +from cosmos3._src.imaginaire.lazy_config.registry import convert_target_to_string + +__all__ = ["LazyCall"] + + +def get_default_params(cls_or_func): + if callable(cls_or_func): + # inspect signature for function + signature = inspect.signature(cls_or_func) + else: + # inspect signature for class + signature = inspect.signature(cls_or_func.__init__) + params = signature.parameters + default_params = { + name: param.default for name, param in params.items() if param.default is not inspect.Parameter.empty + } + return default_params + + +_CONVERT_TARGET_TO_STRING: ClassVar[bool] = False +"""Used by tests to enforce conversion of target to string.""" + + +class LazyCall: + """ + Wrap a callable so that when it's called, the call will not be executed, + but returns a dict that describes the call. + + LazyCall object has to be called with only keyword arguments. Positional + arguments are not yet supported. + + Examples: + :: + from detectron2.config import instantiate, LazyCall + + layer_cfg = LazyCall(nn.Conv2d)(in_channels=32, out_channels=32) + layer_cfg.out_channels = 64 # can edit it afterwards + layer = instantiate(layer_cfg) + """ + + def __init__(self, target): + if not (callable(target) or isinstance(target, (str, abc.Mapping))): + raise TypeError(f"target of LazyCall must be a callable or defines a callable! 
Got {target}") + self._target = target + + def __call__(self, **kwargs): + if _CONVERT_TARGET_TO_STRING or is_dataclass(self._target) or attrs.has(self._target): + # omegaconf object cannot hold dataclass type + # https://github.com/omry/omegaconf/issues/784 + target = convert_target_to_string(self._target) + else: + target = self._target + kwargs["_target_"] = target + + _final_params = get_default_params(self._target) + _final_params.update(kwargs) + + return DictConfig(content=_final_params, flags={"allow_objects": True}) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/omegaconf_patch.py b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/omegaconf_patch.py new file mode 100644 index 00000000..39dca42a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/omegaconf_patch.py @@ -0,0 +1,65 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Any, Dict, List, Union + +from omegaconf import OmegaConf +from omegaconf.base import DictKeyType, SCMode +from omegaconf.dictconfig import DictConfig # pragma: no cover + + +def to_object(cfg: Any) -> Union[Dict[DictKeyType, Any], List[Any], None, str, Any]: + """ + Converts an OmegaConf configuration object to a native Python container (dict or list), unless + the configuration is specifically created by LazyCall, in which case the original configuration + is returned directly. + + This function serves as a modification of the original `to_object` method from OmegaConf, + preventing DictConfig objects created by LazyCall from being automatically converted to Python + dictionaries. This ensures that configurations meant to be lazily evaluated retain their intended + structure and behavior. + + Differences from OmegaConf's original `to_object`: + - Adds a check at the beginning to return the configuration unchanged if it is created by LazyCall. + + Reference: + - Original OmegaConf `to_object` method: https://github.com/omry/omegaconf/blob/master/omegaconf/omegaconf.py#L595 + + Args: + cfg (Any): The OmegaConf configuration object to convert. + + Returns: + Union[Dict[DictKeyType, Any], List[Any], None, str, Any]: The converted Python container if + `cfg` is not a LazyCall created configuration, otherwise the unchanged `cfg`. 
+ + Examples: + >>> cfg = DictConfig({"key": "value", "_target_": "Model"}) + >>> to_object(cfg) + DictConfig({"key": "value", "_target_": "Model"}) + + >>> cfg = DictConfig({"list": [1, 2, 3]}) + >>> to_object(cfg) + {'list': [1, 2, 3]} + """ + if isinstance(cfg, DictConfig) and "_target_" in cfg.keys(): + return cfg + + return OmegaConf.to_container( + cfg=cfg, + resolve=True, + throw_on_missing=True, + enum_to_str=False, + structured_config_mode=SCMode.INSTANTIATE, + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/registry.py b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/registry.py new file mode 100644 index 00000000..4621fa82 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/lazy_config/registry.py @@ -0,0 +1,91 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import inspect +import pydoc +from typing import Any + +""" +``Registry`` and `locate` provide ways to map a string (typically found +in config files) to callable objects. +""" + +__all__ = ["locate", "convert_target_to_string"] + +try: + from fvcore.common.registry import Registry # for backward compatibility. + + __all__ += ["Registry"] +except Exception: + pass + + +def convert_target_to_string(t: Any) -> str: + """ + Inverse of ``locate()``. 
+ + Args: + t: any object with ``__module__`` and ``__qualname__`` + """ + if hasattr(t, "__self__") and inspect.isclass(t.__self__): + # classmethod + cls = t.__self__ + module = cls.__module__ + qualname = f"{cls.__name__}.{t.__name__}" + else: + module = t.__module__ + qualname = t.__qualname__ + + # Compress the path to this object, e.g. ``module.submodule._impl.class`` + # may become ``module.submodule.class``, if the latter also resolves to the same + # object. This simplifies the string, and also is less affected by moving the + # class implementation. + module_parts = module.split(".") + for k in range(1, len(module_parts)): + prefix = ".".join(module_parts[:k]) + candidate = f"{prefix}.{qualname}" + try: + if locate(candidate) is t: + return candidate + except ImportError: + pass + return f"{module}.{qualname}" + + +_convert_target_to_string = convert_target_to_string # for backward compatibility. + + +def locate(name: str) -> Any: + """ + Locate and return an object ``x`` using an input string ``{x.__module__}.{x.__qualname__}``, + such as "module.submodule.class_name". + + Raise Exception if it cannot be found. + """ + obj = pydoc.locate(name) + + # Some cases (e.g. torch.optim.sgd.SGD) are not handled correctly + # by pydoc.locate. Try a private function from hydra. + if obj is None: + try: + # from hydra.utils import get_method - will print many errors + + from hydra.utils import _locate + except ImportError as e: + raise ImportError(f"Cannot dynamically locate object {name}!") from e + else: + obj = _locate(name) # it raises if fails + + return obj diff --git a/cosmos-inference/cosmos3/_src/imaginaire/model.py b/cosmos-inference/cosmos3/_src/imaginaire/model.py new file mode 100644 index 00000000..b595ba1c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/model.py @@ -0,0 +1,130 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any + +import torch + +from cosmos3._src.imaginaire.lazy_config import LazyDict, instantiate + + +class ImaginaireModel(torch.nn.Module): + """The base model class of Imaginaire. It inherits from torch.nn.Module. + + All models in Imaginaire should inherit from ImaginaireModel. It should include the implementations for all the + computation graphs. All inheriting child classes should implement the following methods: + - training_step(): The training step of the model, including the loss computation. + - validation_step(): The validation step of the model, including the loss computation. + - forward(): The computation graph for model inference. + The following methods have default implementations in ImaginaireModel: + - init_optimizer_scheduler(): Creates the optimizer and scheduler for the model. + """ + + def __init__(self) -> None: + super().__init__() + self.parallel_dims = None + + def init_optimizer_scheduler( + self, optimizer_config: LazyDict, scheduler_config: LazyDict + ) -> tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LRScheduler]: + """Creates the optimizer and scheduler for the model. + + Args: + optimizer_config (LazyDict), scheduler_config (LazyDict): Lazy configs for the optimizer and the scheduler. + + Returns: + optimizer (torch.optim.Optimizer): The model optimizer. + scheduler (torch.optim.lr_scheduler.LRScheduler): The optimization scheduler.
+ """ + optimizer_config.params = self.parameters() + optimizer = instantiate(optimizer_config) + scheduler_config.optimizer = optimizer + scheduler = instantiate(scheduler_config) + return optimizer, scheduler + + def training_step( + self, data_batch: dict[str, torch.Tensor], iteration: int + ) -> tuple[dict[str, torch.Tensor], torch.Tensor]: + """The training step of the model, including the loss computation. + + Args: + data_batch (dict[str, torch.Tensor]): Data batch (dictionary of tensors). + iteration (int): Current iteration number. + + Returns: + output_batch (dict[str, torch.Tensor]): Auxiliary model output from the training batch. + loss (torch.Tensor): The total loss for backprop (weighted sum of various losses). + """ + raise NotImplementedError + + @torch.no_grad() + def validation_step( + self, data_batch: dict[str, torch.Tensor], iteration: int + ) -> tuple[dict[str, torch.Tensor], torch.Tensor]: + """The validation step of the model, including the loss computation. + + Args: + data_batch (dict[str, torch.Tensor]): Data batch (dictionary of tensors). + iteration (int): Current iteration number. + + Returns: + output_batch (dict[str, torch.Tensor]): Auxiliary model output from the validation batch. + loss (torch.Tensor): The total loss (weighted sum of various losses). + """ + raise NotImplementedError + + @torch.inference_mode() + def forward(self, *args: Any, **kwargs: Any) -> Any: + """The computation graph for model inference. + + Args: + *args: Whatever you decide to pass into the forward method. + **kwargs: Keyword arguments are also possible. + + Returns: + Your model's output. + """ + raise NotImplementedError + + def on_train_start(self, memory_format: torch.memory_format = torch.preserve_format) -> None: + """The model preparation before the training is launched. + + Args: + memory_format (torch.memory_format): Memory format of the model.
+ """ + pass + + def on_before_zero_grad( + self, optimizer: torch.optim.Optimizer, scheduler: torch.optim.lr_scheduler.LRScheduler, iteration: int + ) -> None: + """Hook before zero_grad() is called. + + Args: + optimizer (torch.optim.Optimizer): The model optimizer. + scheduler (torch.optim.lr_scheduler.LRScheduler): The optimization scheduler. + iteration (int): Current iteration number. + """ + pass + + def on_after_backward(self, iteration: int = 0) -> None: + """Hook after loss.backward() is called. + + This method is called immediately after the backward pass, allowing for custom operations + or modifications to be performed on the gradients before the optimizer step. + + Args: + iteration (int): Current iteration number. + """ + pass diff --git a/cosmos-inference/cosmos3/_src/imaginaire/models/abstract_emb_model.py b/cosmos-inference/cosmos3/_src/imaginaire/models/abstract_emb_model.py new file mode 100644 index 00000000..d4ae38b9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/models/abstract_emb_model.py @@ -0,0 +1,104 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +from typing import Optional, Union + +import torch +import torch.nn as nn + +from cosmos3._src.imaginaire.functional.batch_ops import batch_mul +from cosmos3._src.imaginaire.utils.count_params import count_params + + +class AbstractEmbModel(nn.Module): + def __init__(self) -> None: + super().__init__() + + self._is_trainable = None + self._dropout_rate = None + self._input_key = None + self._return_dict = False + + @property + def is_trainable(self) -> bool: + return self._is_trainable + + @property + def dropout_rate(self) -> Union[float, torch.Tensor]: + return self._dropout_rate + + @property + def input_key(self) -> str: + return self._input_key + + @property + def is_return_dict(self) -> bool: + return self._return_dict + + @is_trainable.setter + def is_trainable(self, value: bool) -> None: + self._is_trainable = value + + @dropout_rate.setter + def dropout_rate(self, value: Union[float, torch.Tensor]) -> None: + self._dropout_rate = value + + @input_key.setter + def input_key(self, value: str) -> None: + self._input_key = value + + @is_return_dict.setter + def is_return_dict(self, value: bool) -> None: + self._return_dict = value + + @is_trainable.deleter + def is_trainable(self) -> None: + del self._is_trainable + + @dropout_rate.deleter + def dropout_rate(self) -> None: + del self._dropout_rate + + @input_key.deleter + def input_key(self) -> None: + del self._input_key + + @is_return_dict.deleter + def is_return_dict(self) -> None: + del self._return_dict + + def random_dropout_input( + self, in_tensor: torch.Tensor, dropout_rate: Optional[float] = None, key: Optional[str] = None + ) -> torch.Tensor: + del key + dropout_rate = dropout_rate if dropout_rate is not None else self.dropout_rate + return batch_mul( + torch.bernoulli((1.0 - dropout_rate) * torch.ones(in_tensor.shape[0])).type_as(in_tensor), + in_tensor, + ) + + def details(self) -> str: + return "" + + def summary(self) -> str: + input_key = self.input_key 
if self.input_key is not None else getattr(self, "input_keys", None) + return ( + f"{self.__class__.__name__} \n\tinput key: {input_key}" + f"\n\tParam count: {count_params(self, False)} \n\tTrainable: {self.is_trainable}" + f"\n\tDropout rate: {self.dropout_rate}" + f"\n\t{self.details()}" + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/modules/camera.py b/cosmos-inference/cosmos3/_src/imaginaire/modules/camera.py new file mode 100644 index 00000000..b8bf556d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/modules/camera.py @@ -0,0 +1,663 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import numpy as np +import torch + + +def _recursive_to_numpy(x): + if isinstance(x, torch.Tensor): + return x.detach().cpu().numpy() + if isinstance(x, (list, tuple)): + return type(x)(_recursive_to_numpy(v) for v in x) + if isinstance(x, dict): + return {k: _recursive_to_numpy(v) for k, v in x.items()} + return x + + +def supports_numpy(arg_names, use_no_grad: bool = True): + """Decorator to transparently support numpy inputs. 
+ + - Converts the specified named args from numpy arrays to torch tensors on entry + - Runs the wrapped function (optionally under no_grad) + - Converts returns back to numpy iff the FIRST targeted arg was a numpy array + """ + import functools + import inspect + + def decorator(fn): + sig = inspect.signature(fn) + + @functools.wraps(fn) + def wrapper(*args, **kwargs): + bound = sig.bind(*args, **kwargs) + bound.apply_defaults() + first_is_numpy = False + for idx, name in enumerate(arg_names): + if name in bound.arguments: + val = bound.arguments[name] + # Handle direct ndarray + if isinstance(val, np.ndarray): + if idx == 0: + first_is_numpy = True + bound.arguments[name] = torch.from_numpy(val) + continue + # Handle list/tuple of ndarrays -> list/tuple of tensors + if isinstance(val, (list, tuple)): + saw_numpy_first_elem = False + converted = [] + for i, el in enumerate(val): + if isinstance(el, np.ndarray): + if i == 0: + saw_numpy_first_elem = True + converted.append(torch.from_numpy(el)) + else: + converted.append(el) + if idx == 0 and saw_numpy_first_elem: + first_is_numpy = True + bound.arguments[name] = type(val)(converted) + + ctx = torch.no_grad() if use_no_grad else torch.enable_grad() + with ctx: + out = fn(*bound.args, **bound.kwargs) + return _recursive_to_numpy(out) if first_is_numpy else out + + return wrapper + + return decorator + + +class Camera: + """A class with a collection of common ops for camera transformations (Pytorch tensors). + + All poses are expected to have shape [...,3,4], where (...) indicates batch sizes of various ranks. + The last two dimensions (of size (3,4)) correspond to the extrinsic matrix [R|t] in OpenCV format. + + Convention: cam_pose is always a world-to-camera transform (world2cam): x_cam = R @ x_world + t. + This module operates on row-vector points with homogeneous coordinates on the right, so we apply + transformations as: points_hom @ cam_pose^T. 
+ """ + + @staticmethod + @supports_numpy(["cam_pose"], use_no_grad=True) + def _check_valid_pose(cam_pose: torch.Tensor | np.ndarray) -> None: + """Checks whether the input tensor is a valid camera pose. + + Args: + cam_pose (torch.Tensor [...,3,4]): Input camera pose in world2cam [R|t] (OpenCV) format. + """ + assert cam_pose.shape[-2:] == (3, 4), "Camera pose is not of shape (3,4)." + R = cam_pose[..., :3] + # Compute determinant in float32 for numerical stability and allow dtype-dependent tolerance. + det_R = torch.linalg.det(R.to(torch.float32)) + one = torch.tensor(1.0, dtype=torch.float32, device=cam_pose.device) + if cam_pose.dtype in (torch.bfloat16, torch.float16): + rtol, atol = 1e-2, 1e-2 + else: + rtol, atol = 1e-4, 1e-6 + finite = bool(torch.isfinite(det_R).all()) + close = torch.allclose(det_R, one, rtol=rtol, atol=atol) + assert finite and close, ( + f"Rotation component in camera pose is invalid (det != 1 within tol). " + f"dtype={cam_pose.dtype}, rtol={rtol}, atol={atol}, det_mean={det_R.mean().item():.6f}" + ) + + @staticmethod + @supports_numpy(["cam_pose"], use_no_grad=True) + def invert_pose(cam_pose: torch.Tensor | np.ndarray) -> torch.Tensor | np.ndarray: + """Invert a camera pose. + + Args: + cam_pose (torch.Tensor/np.ndarray [...,3,4]): Input camera pose (world2cam [R|t]). + + Returns: + cam_pose_inv (torch.Tensor/np.ndarray [...,3,4]): The inverted camera pose (cam2world [R|t]). 
+ """ + Camera._check_valid_pose(cam_pose) + in_dtype = cam_pose.dtype if isinstance(cam_pose, torch.Tensor) else torch.float32 + R, t = cam_pose[..., :3], cam_pose[..., 3:] # [...,3,3], [...,3,1] + # Compute in float32 for numerical stability, cast back at the end + R32 = R.to(torch.float32) # [...,3,3] + t32 = t.to(torch.float32) # [...,3,1] + # For rotation matrices, inverse equals transpose; prefer transpose for stability and speed + R_inv32 = R32.transpose(-1, -2) # [...,3,3] + t_inv32 = -R_inv32 @ t32 # [...,3,1] + cam_pose_inv32 = torch.cat([R_inv32, t_inv32], dim=-1) # [...,3,4] + return cam_pose_inv32.to(in_dtype) + + @staticmethod + @supports_numpy(["cam_poses"], use_no_grad=True) + def compose_poses(cam_poses: list[torch.Tensor | np.ndarray]) -> torch.Tensor | np.ndarray: + """Compose a sequence of camera transformations together. + + pose_new = compose_poses([pose_1, pose_2, ... pose_N]) + pose_new(x) = pose_N o ... o pose_2 o pose_1(x) + + Args: + cam_poses (list[torch.Tensor/np.ndarray [...,3,4]]): Sequence of rigid transforms [R|t]. + When used as camera extrinsics in this module, each pose is assumed to be world2cam. + The composition follows the same row-vector convention: points_hom @ pose^T. + List items may be numpy arrays; they will be converted to torch internally. + + Returns: + cam_pose_new (torch.Tensor/np.ndarray [...,3,4]): The composed transformation [R|t]. 
+ """ + cam_pose_new = cam_poses[0] + Camera._check_valid_pose(cam_pose_new) + out_dtype = cam_pose_new.dtype if isinstance(cam_pose_new, torch.Tensor) else torch.float32 + R_new, t_new = ( + cam_pose_new[..., :3].to(torch.float32), + cam_pose_new[..., 3:].to(torch.float32), + ) # [...,3,3], [...,3,1] + for cam_pose in cam_poses[1:]: + Camera._check_valid_pose(cam_pose) + # pose_new(x) = pose o pose_new(x) + R, t = cam_pose[..., :3].to(torch.float32), cam_pose[..., 3:].to(torch.float32) # [...,3,3], [...,3,1] + R_new = R @ R_new # [...,3,3] + t_new = R @ t_new + t # [...,3,1] + cam_pose_new32 = torch.cat([R_new, t_new], dim=-1) # [...,3,4] + return cam_pose_new32.to(out_dtype) + + @staticmethod + @supports_numpy(["cam_pose", "cam_intr"], use_no_grad=True) + def get_camera_rays( + cam_pose: torch.Tensor | np.ndarray, + cam_intr: torch.Tensor | np.ndarray, + image_size: tuple[int, int], + ) -> torch.Tensor | np.ndarray: + """Get unit-norm camera rays in world coordinates for each pixel center. + + Args: + cam_pose (torch.Tensor/np.ndarray [...,3,4]): Camera pose (world2cam [R|t]). + cam_intr (torch.Tensor/np.ndarray [...,3,3]): Camera intrinsics. + image_size (Tuple[int, int]): Image size (height, width). + + Returns: + rays_world (torch.Tensor/np.ndarray [...,HW,3]): Unit direction rays from camera center through pixel centers, flattened over pixels. + """ + H, W = image_size + with torch.no_grad(): + # Compute image coordinate grid (in float32 for stability). 
+ y_range = torch.arange(H, dtype=torch.float32, device=cam_pose.device).add_(0.5) # [H] + x_range = torch.arange(W, dtype=torch.float32, device=cam_pose.device).add_(0.5) # [W] + y_grid, x_grid = torch.meshgrid(y_range, x_range, indexing="ij") # [H,W] each + xy_grid = torch.stack([x_grid, y_grid], dim=-1).view(-1, 2) # [HW,2] + xy_grid = xy_grid.repeat(*cam_pose.shape[:-2], 1, 1) # [...,HW,2] + # Pixel centers in camera coordinates at depth 1 (flattened HW) + grid_camera = Camera.image2camera(Camera.to_homogeneous(xy_grid), cam_intr) # [...,HW,3] + # Transform sample points and center to world + grid_world = Camera.camera2world(grid_camera, cam_pose) # [...,HW,3] + center_world = Camera.get_camera_center(cam_pose).unsqueeze(-2).expand_as(grid_world) # [...,HW,3] + rays_world = grid_world - center_world # [...,HW,3] + # Normalize to unit vectors + eps = 1e-8 + if cam_pose.dtype in (torch.bfloat16, torch.float16): + eps = 1e-2 + norms32 = rays_world.to(torch.float32).norm(dim=-1, keepdim=True).clamp_min(eps) # [...,HW,1] + rays_world = rays_world / norms32.to(rays_world.dtype) # [...,HW,3] + # Cast back to input dtype for consistency + rays_world = rays_world.to(cam_pose.dtype) # [...,HW,3] + # Keep flattened shape [...,HW,3] + return rays_world + + @staticmethod + @supports_numpy(["cam_pose", "cam_intr"], use_no_grad=True) + def get_plucker_rays( + cam_pose: torch.Tensor | np.ndarray, + cam_intr: torch.Tensor | np.ndarray, + image_size: tuple[int, int], + ) -> torch.Tensor | np.ndarray: + """Get Plücker coordinates (moment, direction) for each pixel center. + + Args: + cam_pose (torch.Tensor/np.ndarray [...,3,4]): Camera pose (world2cam [R|t]). + cam_intr (torch.Tensor/np.ndarray [...,3,3]): Camera intrinsics. + image_size (Tuple[int, int]): Image size (height, width). + + Returns: + plucker (torch.Tensor/np.ndarray [...,HW,6]): Plücker coordinates [m | d], where + d is a unit direction vector and m = o × d with o the camera center in world. 
+ """ + H, W = image_size + rays_world = Camera.get_camera_rays(cam_pose, cam_intr, image_size) # [...,HW,3] + # Expand center to [...,HW,3] + center_hw = Camera.get_camera_center(cam_pose).unsqueeze(-2).expand_as(rays_world) # [...,HW,3] + moment = torch.linalg.cross(center_hw, rays_world) # [...,HW,3] + plucker = torch.cat([moment, rays_world], dim=-1) # [...,HW,6] + return plucker + + @staticmethod + @supports_numpy(["cam_pose"], use_no_grad=True) + def get_relative_poses_wrt_frame0( + cam_pose: torch.Tensor | np.ndarray, + ) -> torch.Tensor | np.ndarray: + """Compute poses relative to the first frame (index 0). + + All poses are world-to-camera [R|t] with shape [...,3,4]. The returned poses are expressed + in the coordinate system of the first camera, so the first pose is identity [I|0]. For the + i-th pose: pose_rel_i = pose_i ∘ inverse(pose_ref), i.e. compose_poses([inverse(pose_ref), pose_i]). + + Args: + cam_pose (torch.Tensor/np.ndarray [...,V,3,4]): World-to-camera extrinsics per view. + + Returns: + cam_pose_rel (torch.Tensor/np.ndarray [...,V,3,4]): Relative world-to-camera extrinsics in the first frame. + """ + # supports_numpy handles numpy + assert cam_pose.shape[-2:] == (3, 4), "cam_pose must have shape [..., V, 3, 4]." + # Reference pose and its inverse (keep the view dim so it broadcasts over V for batched inputs) + pose_ref = cam_pose[..., :1, :, :] # [...,1,3,4] + pose_ref_inv = Camera.invert_pose(pose_ref) # [...,1,3,4] + # Compose with broadcasting: pose_rel = pose ∘ pose_ref_inv + cam_pose_rel = Camera.compose_poses([pose_ref_inv, cam_pose]) # [...,V,3,4] + return cam_pose_rel + + @staticmethod + @supports_numpy(["cam_pose"], use_no_grad=True) + def get_camera_center(cam_pose: torch.Tensor | np.ndarray) -> torch.Tensor | np.ndarray: + """Get the camera center in world coordinates for a given world2cam pose. + + Args: + cam_pose (torch.Tensor/np.ndarray [...,3,4]): Camera pose (world2cam [R|t]). + + Returns: + center_world (torch.Tensor/np.ndarray [...,3]): Camera center in world coordinates.
+ """ + Camera._check_valid_pose(cam_pose) + R, t = cam_pose[..., :3], cam_pose[..., 3:] # [...,3,3], [...,3,1] + center_world32 = (-R.to(torch.float32).transpose(-1, -2) @ t.to(torch.float32)).squeeze(-1) # [...,3] + return center_world32.to(R.dtype) + + @staticmethod + @supports_numpy(["points"], use_no_grad=True) + def to_homogeneous(points: torch.Tensor | np.ndarray) -> torch.Tensor | np.ndarray: + """Get homogeneous coordinates of the input points. + + Args: + points (torch.Tensor/np.ndarray [...,K]): Input coordinates. + + Returns: + points_hom (torch.Tensor/np.ndarray [...,K+1]): Homogeneous coordinates. + """ + # Compute homogeneous coordinate in float32 for stability, then cast back + one32 = torch.ones_like( + points[..., :1], dtype=torch.float32, device=(points.device if isinstance(points, torch.Tensor) else None) + ) # [...,1] + points_hom = torch.cat([points, one32.to(points.dtype)], dim=-1) # [...,K+1] + return points_hom + + @staticmethod + @supports_numpy(["points", "cam_pose"], use_no_grad=True) + def world2camera( + points: torch.Tensor | np.ndarray, cam_pose: torch.Tensor | np.ndarray + ) -> torch.Tensor | np.ndarray: + """Given the camera pose, transform input 3D points from world coordinates to camera coordinates. + + Args: + points (torch.Tensor/np.ndarray [...,N,3]): Input 3D points. + cam_pose (torch.Tensor/np.ndarray [...,3,4]/[3,4]): (Batched) camera pose (world2cam [R|t]). + + Returns: + points_new (torch.Tensor/np.ndarray [...,N,3]): Transformed 3D points. 
+ """ + points_hom = Camera.to_homogeneous(points).to(torch.float32) # [...,N,4] + points_new32 = points_hom @ cam_pose.to(torch.float32).transpose(-1, -2) # [...,N,3] + return points_new32.to(points.dtype) # [...,N,3] + + @staticmethod + @supports_numpy(["points", "cam_pose"], use_no_grad=True) + def camera2world( + points: torch.Tensor | np.ndarray, cam_pose: torch.Tensor | np.ndarray + ) -> torch.Tensor | np.ndarray: + """Given the camera pose, transform input 3D points from camera coordinates to world coordinates. + + Args: + points (torch.Tensor/np.ndarray [...,N,3]): Input 3D points. + cam_pose (torch.Tensor/np.ndarray [...,3,4]/[3,4]): (Batched) camera pose (world2cam [R|t]). + + Returns: + points_new (torch.Tensor/np.ndarray [...,N,3]): Transformed 3D points. + """ + points_hom = Camera.to_homogeneous(points).to(torch.float32) # [...,N,4] + pose_inv = Camera.invert_pose(cam_pose) # [...,3,4] + points_new32 = points_hom @ pose_inv.to(torch.float32).transpose(-1, -2) # [...,N,3] + # To reduce double-quantization error on low-precision dtypes (e.g., bf16 on CPU), + # keep high precision on output for transform back to world space. + if isinstance(points, torch.Tensor) and points.dtype in (torch.bfloat16, torch.float16): + return points_new32 # [...,N,3] + return points_new32.to(points.dtype) # [...,N,3] + + @staticmethod + @supports_numpy(["points", "cam_intr"], use_no_grad=True) + def camera2image( + points: torch.Tensor | np.ndarray, cam_intr: torch.Tensor | np.ndarray + ) -> torch.Tensor | np.ndarray: + """Given the camera intrinsics, calibrate input 3D points from camera frame to image (pixel) frame. + + Args: + points (torch.Tensor/np.ndarray [...,N,3]): Input 3D points. + cam_intr (torch.Tensor/np.ndarray [...,3,3]/[3,3]): (Batched) camera intrinsic matrix. + + Returns: + points_new (torch.Tensor/np.ndarray [...,N,3]): Transformed 3D points. 
+ """ + points32 = points.to(torch.float32) # [...,N,3] + points_new32 = points32 @ cam_intr.to(torch.float32).transpose(-1, -2) # [...,N,3] + return points_new32.to(points.dtype) # [...,N,3] + + @staticmethod + @supports_numpy(["points", "cam_intr"], use_no_grad=True) + def image2camera( + points: torch.Tensor | np.ndarray, cam_intr: torch.Tensor | np.ndarray + ) -> torch.Tensor | np.ndarray: + """Given the camera intrinsics, calibrate input 3D points from image (pixel) frame to camera frame. + + Args: + points (torch.Tensor/np.ndarray [...,N,3]): Input 3D points. + cam_intr (torch.Tensor/np.ndarray [...,3,3]/[3,3]): (Batched) camera intrinsic matrix. + + Returns: + points_new (torch.Tensor/np.ndarray [...,N,3]): Transformed 3D points. + """ + K_inv32 = torch.linalg.inv(cam_intr.to(torch.float32)) # [...,3,3] + points32 = points.to(torch.float32) # [...,N,3] + points_new32 = points32 @ K_inv32.transpose(-1, -2) # [...,N,3] + return points_new32.to(points.dtype) # [...,N,3] + + @staticmethod + @supports_numpy(["params"], use_no_grad=True) + def intrinsic_params_to_matrices(params: torch.Tensor | np.ndarray) -> torch.Tensor | np.ndarray: + """Convert (fx, fy, cx, cy) parameters to camera intrinsic matrix/matrices. + + Args: + params (torch.Tensor/np.ndarray [...,4]): Intrinsic parameters (fx, fy, cx, cy). + + Returns: + K (torch.Tensor/np.ndarray [...,3,3]): Camera intrinsic matrices. + """ + assert params.shape[-1] == 4, "Intrinsic params must have shape (..., 4) for (fx, fy, cx, cy)." + fx, fy, cx, cy = params.unbind(dim=-1) # [...] each + one = torch.ones_like(fx) # [...] + zero = torch.zeros_like(fx) # [...] 
+ row0 = torch.stack([fx, zero, cx], dim=-1) # [...,3] + row1 = torch.stack([zero, fy, cy], dim=-1) # [...,3] + row2 = torch.stack([zero, zero, one], dim=-1) # [...,3] + K = torch.stack([row0, row1, row2], dim=-2) # [...,3,3] + return K + + @staticmethod + @supports_numpy(["cam_intr"], use_no_grad=True) + def intrinsic_matrices_to_params( + cam_intr: torch.Tensor | np.ndarray, atol: float = 1e-6 + ) -> torch.Tensor | np.ndarray: + """Extract (fx, fy, cx, cy) from camera intrinsic matrix/matrices. + + Args: + cam_intr (torch.Tensor/np.ndarray [...,3,3]): Camera intrinsic matrices. + atol (float): Tolerance when checking the bottom row against [0,0,1]. + + Returns: + params (torch.Tensor/np.ndarray [...,4]): Intrinsic parameters (fx, fy, cx, cy). + """ + assert cam_intr.shape[-2:] == (3, 3), "Intrinsic matrix must have shape (..., 3, 3)." + row32 = cam_intr[..., 2, :].to(torch.float32) + target32 = torch.tensor([0.0, 0.0, 1.0], dtype=torch.float32, device=cam_intr.device) + rtol = 1e-5 + atol_eff = atol + if cam_intr.dtype in (torch.bfloat16, torch.float16): + rtol = 1e-2 + atol_eff = max(atol, 1e-2) + if not torch.allclose(row32, target32, rtol=rtol, atol=atol_eff): + # Advisory check only: a bottom row that deviates from [0, 0, 1] is tolerated and extraction proceeds. + pass + fx = cam_intr[..., 0, 0] # [...] + fy = cam_intr[..., 1, 1] # [...] + cx = cam_intr[..., 0, 2] # [...] + cy = cam_intr[..., 1, 2] # [...] + params = torch.stack([fx, fy, cx, cy], dim=-1) # [...,4] + return params + + @staticmethod + @supports_numpy(["qxyzw_t"], use_no_grad=True) + def extrinsic_params_to_matrices(qxyzw_t: torch.Tensor | np.ndarray) -> torch.Tensor | np.ndarray: + """Convert (x,y,z,w, tx,ty,tz) to world2cam extrinsic matrix/matrices [R|t]. + + Args: + qxyzw_t (torch.Tensor/np.ndarray [...,7]): Quaternion (xyzw) and translation stacked. + + Returns: + cam_pose (torch.Tensor/np.ndarray [...,3,4]): World-to-camera extrinsic [R|t]. 
+ """ + assert qxyzw_t.shape[-1] == 7, "Input must have shape (..., 7) for (qx,qy,qz,qw,tx,ty,tz)." + q = qxyzw_t[..., :4] # [...,4] + t = qxyzw_t[..., 4:7] # [...,3] + # Enforce unit quaternion + Quaternion._check_valid_quaternion(q, require_normalized=True) + R = Quaternion.to_rotation_matrix(q) # [...,3,3] + cam_pose = torch.cat([R, t.unsqueeze(-1)], dim=-1) # [...,3,4] + return cam_pose + + @staticmethod + @supports_numpy(["cam_pose"], use_no_grad=True) + def extrinsic_matrices_to_params(cam_pose: torch.Tensor | np.ndarray) -> torch.Tensor | np.ndarray: + """Convert world2cam extrinsic matrix/matrices [R|t] to (x,y,z,w, tx,ty,tz). + + Args: + cam_pose (torch.Tensor/np.ndarray [...,3,4]): World-to-camera extrinsic [R|t]. + + Returns: + qxyzw_t (torch.Tensor/np.ndarray [...,7]): Quaternion (xyzw) and translation stacked. + """ + Camera._check_valid_pose(cam_pose) + R = cam_pose[..., :3] # [...,3,3] + t = cam_pose[..., 3:].squeeze(-1) # [...,3] + q = Quaternion.from_rotation_matrix(R) # [...,4] + qxyzw_t = torch.cat([q, t], dim=-1) # [...,7] + return qxyzw_t + + +class Quaternion: + """A collection of common quaternion operations (Pytorch tensors). + + Convention (STRICT): Quaternions are represented in (x, y, z, w) order (xyzw) and are unit-norm. + The last dimension must be size 4. + """ + + @staticmethod + @supports_numpy(["q"], use_no_grad=True) + def _check_valid_quaternion( + q: torch.Tensor | np.ndarray, require_normalized: bool = True, atol: float = 1e-5 + ) -> torch.Tensor | np.ndarray: + """Checks whether the input tensor is a valid quaternion. + + Args: + q (torch.Tensor [...,4]): Input quaternion(s) in (x, y, z, w) order. + require_normalized (bool): If True, assert unit-norm within atol. Defaults to True. + atol (float): Absolute tolerance for the unit-norm check. + """ + assert q.shape[-1] == 4, "Quaternion is not of shape (..., 4)." + if require_normalized: + norms32 = q.to(torch.float32).norm(dim=-1) # [...] 
+ ones32 = torch.ones_like(norms32) # [...] + tol = max(atol, 1e-2) if q.dtype in (torch.bfloat16, torch.float16) else atol + assert torch.allclose(norms32, ones32, atol=tol), "Quaternion must be unit length." + return q + + @staticmethod + @supports_numpy(["q"], use_no_grad=True) + def normalize(q: torch.Tensor | np.ndarray, eps: float = 1e-8) -> torch.Tensor | np.ndarray: + """Normalize quaternion(s) to unit length. + + Args: + q (torch.Tensor [...,4]): Input quaternion(s). + eps (float): Small epsilon to avoid division by zero. + + Returns: + q_norm (torch.Tensor [...,4]): Unit quaternions. + """ + # Allow non-normalized input here, since this function normalizes + Quaternion._check_valid_quaternion(q, require_normalized=False) + eps_eff = eps + if q.dtype in (torch.bfloat16, torch.float16): + eps_eff = max(eps, 1e-2) + norm32 = q.to(torch.float32).norm(dim=-1, keepdim=True).clamp_min(eps_eff) # [...,1] + out32 = q.to(torch.float32) / norm32 # [...,4] + out = out32.to(q.dtype) # [...,4] + return out + + @staticmethod + @supports_numpy(["q"], use_no_grad=True) + def to_rotation_matrix(q: torch.Tensor | np.ndarray) -> torch.Tensor | np.ndarray: + """Convert quaternion(s) to rotation matrix/matrices. + + Args: + q (torch.Tensor [...,4]): Quaternion(s) (x, y, z, w). + + Returns: + R (torch.Tensor [...,3,3]): Rotation matrix/matrices. + """ + # Enforce unit quaternions for rotations + Quaternion._check_valid_quaternion(q, require_normalized=True) + q32 = q.to(torch.float32) # [...,4] + qx, qy, qz, qw = q32.unbind(dim=-1) # [...] each + two = torch.tensor(2.0, dtype=torch.float32, device=q32.device) + + r00 = 1 - two * (qy * qy + qz * qz) # [...] + r01 = two * (qx * qy - qz * qw) # [...] + r02 = two * (qx * qz + qy * qw) # [...] + r10 = two * (qx * qy + qz * qw) # [...] + r11 = 1 - two * (qx * qx + qz * qz) # [...] + r12 = two * (qy * qz - qx * qw) # [...] + r20 = two * (qx * qz - qy * qw) # [...] + r21 = two * (qx * qw + qy * qz) # [...] 
+ r22 = 1 - two * (qx * qx + qy * qy) # [...] + + R32 = torch.stack( + [ + torch.stack([r00, r01, r02], dim=-1), # [...,3] + torch.stack([r10, r11, r12], dim=-1), # [...,3] + torch.stack([r20, r21, r22], dim=-1), # [...,3] + ], + dim=-2, + ) # [...,3,3] + return R32.to(q.dtype) + + @staticmethod + @supports_numpy(["R"], use_no_grad=True) + def from_rotation_matrix(R: torch.Tensor | np.ndarray, eps: float = 1e-8) -> torch.Tensor | np.ndarray: + """Convert rotation matrix/matrices to quaternion(s). + + Args: + R (torch.Tensor [...,3,3]): Rotation matrix/matrices. + eps (float): Numerical stability epsilon. + + Returns: + q (torch.Tensor [...,4]): Quaternion(s) in (x, y, z, w) order. + """ + assert R.shape[-2:] == (3, 3), "Rotation matrix is not of shape (..., 3, 3)." + R32 = R.to(torch.float32) # [...,3,3] + m00 = R32[..., 0, 0] # [...] + m11 = R32[..., 1, 1] # [...] + m22 = R32[..., 2, 2] # [...] + trace = m00 + m11 + m22 # [...] + + q32 = torch.empty(*R32.shape[:-2], 4, dtype=torch.float32, device=R32.device) # [...,4] + + cond0 = trace > 0 # [...] + eps_eff = max(eps, 1e-6) + s0 = torch.sqrt(trace + 1.0 + eps_eff) * 2.0 # [...] + qw0 = 0.25 * s0 # [...] + qx0 = (R[..., 2, 1] - R[..., 1, 2]) / s0 # [...] + qy0 = (R[..., 0, 2] - R[..., 2, 0]) / s0 # [...] + qz0 = (R[..., 1, 0] - R[..., 0, 1]) / s0 # [...] + + cond1 = (~cond0) & (m00 > m11) & (m00 > m22) # [...] + s1 = torch.sqrt(1.0 + m00 - m11 - m22 + eps_eff) * 2.0 # [...] + qw1 = (R[..., 2, 1] - R[..., 1, 2]) / s1 # [...] + qx1 = 0.25 * s1 # [...] + qy1 = (R[..., 0, 1] + R[..., 1, 0]) / s1 # [...] + qz1 = (R[..., 0, 2] + R[..., 2, 0]) / s1 # [...] + + cond2 = (~cond0) & (~cond1) & (m11 > m22) # [...] + s2 = torch.sqrt(1.0 + m11 - m00 - m22 + eps_eff) * 2.0 # [...] + qw2 = (R[..., 0, 2] - R[..., 2, 0]) / s2 # [...] + qx2 = (R[..., 0, 1] + R[..., 1, 0]) / s2 # [...] + qy2 = 0.25 * s2 # [...] + qz2 = (R[..., 1, 2] + R[..., 2, 1]) / s2 # [...] + + cond3 = (~cond0) & (~cond1) & (~cond2) # [...] 
+ s3 = torch.sqrt(1.0 + m22 - m00 - m11 + eps_eff) * 2.0 # [...] + qw3 = (R[..., 1, 0] - R[..., 0, 1]) / s3 # [...] + qx3 = (R[..., 0, 2] + R[..., 2, 0]) / s3 # [...] + qy3 = (R[..., 1, 2] + R[..., 2, 1]) / s3 # [...] + qz3 = 0.25 * s3 # [...] + + qw = torch.where(cond0, qw0, torch.where(cond1, qw1, torch.where(cond2, qw2, qw3))) # [...] + qx = torch.where(cond0, qx0, torch.where(cond1, qx1, torch.where(cond2, qx2, qx3))) # [...] + qy = torch.where(cond0, qy0, torch.where(cond1, qy1, torch.where(cond2, qy2, qy3))) # [...] + qz = torch.where(cond0, qz0, torch.where(cond1, qz1, torch.where(cond2, qz2, qz3))) # [...] + + q32[..., 0] = qx + q32[..., 1] = qy + q32[..., 2] = qz + q32[..., 3] = qw + + q_out = Quaternion.normalize(q32).to(R.dtype) # [...,4] + return q_out + + @staticmethod + @supports_numpy(["q"], use_no_grad=True) + def invert(q: torch.Tensor | np.ndarray, eps: float = 1e-8) -> torch.Tensor | np.ndarray: + """Inverse of unit quaternion(s), computed as the conjugate. + + For unit quaternions, the inverse equals the conjugate. Non-unit input, for which + q^{-1} = conjugate(q) / ||q||^2, is rejected by the unit-norm check. + + Args: + q (torch.Tensor [...,4]): Input unit quaternion(s) in (x, y, z, w) order. + eps (float): Currently unused. + + Returns: + q_inv (torch.Tensor [...,4]): Inverted quaternion(s). + """ + # Enforce unit quaternions; inverse equals conjugate + Quaternion._check_valid_quaternion(q, require_normalized=True) + qx, qy, qz, qw = q.unbind(dim=-1) # [...] each + return torch.stack([-qx, -qy, -qz, qw], dim=-1) # [...,4] + + @staticmethod + @supports_numpy(["q1", "q2"], use_no_grad=True) + def multiply(q1: torch.Tensor | np.ndarray, q2: torch.Tensor | np.ndarray) -> torch.Tensor | np.ndarray: + """Hamilton product of two quaternion sets. + + Args: + q1 (torch.Tensor [...,4]): Left quaternion(s) in (x, y, z, w) order. + q2 (torch.Tensor [...,4]): Right quaternion(s) in (x, y, z, w) order. 
+ + Returns: + q (torch.Tensor [...,4]): Product quaternion(s) q = q1 ⊗ q2 in (x, y, z, w) order. + """ + # Enforce unit inputs + Quaternion._check_valid_quaternion(q1, require_normalized=True) + Quaternion._check_valid_quaternion(q2, require_normalized=True) + q1x, q1y, q1z, q1w = q1.unbind(dim=-1) # [...] each + q2x, q2y, q2z, q2w = q2.unbind(dim=-1) # [...] each + qx = q1w * q2x + q2w * q1x + q1y * q2z - q1z * q2y # [...] + qy = q1w * q2y + q2w * q1y + q1z * q2x - q1x * q2z # [...] + qz = q1w * q2z + q2w * q1z + q1x * q2y - q1y * q2x # [...] + qw = q1w * q2w - (q1x * q2x + q1y * q2y + q1z * q2z) # [...] + q = torch.stack([qx, qy, qz, qw], dim=-1) # [...,4] + # Re-normalize to counteract numerical drift and enforce constraint + return Quaternion.normalize(q) # [...,4] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/modules/denoiser_scaling.py b/cosmos-inference/cosmos3/_src/imaginaire/modules/denoiser_scaling.py new file mode 100644 index 00000000..6d843ef4 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/modules/denoiser_scaling.py @@ -0,0 +1,64 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Tuple + +import torch + + +class EDMScaling: + def __init__(self, sigma_data: float = 0.5): + self.sigma_data = sigma_data + + def __call__(self, sigma: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]: + c_skip = self.sigma_data**2 / (sigma**2 + self.sigma_data**2) + c_out = sigma * self.sigma_data / (sigma**2 + self.sigma_data**2) ** 0.5 + c_in = 1 / (sigma**2 + self.sigma_data**2) ** 0.5 + c_noise = 0.25 * sigma.log() + return c_skip, c_out, c_in, c_noise + + +class RectifiedFlowScaling: + def __init__(self, sigma_data: float = 1.0, t_scaling_factor: float = 1.0, loss_weight_uniform: bool = True): + assert abs(sigma_data - 1.0) < 1e-6, "sigma_data must be 1.0 for RectifiedFlowScaling" + self.t_scaling_factor = t_scaling_factor + self.loss_weight_uniform = loss_weight_uniform + if loss_weight_uniform is False: + # Use the weighting suggested by huan lin, which puts more weight on the middle of the timesteps. + self.num_steps = 1000 + t = torch.linspace(0, 1, self.num_steps) + y = torch.exp(-2 * (t - 0.5) ** 2) + shift = y - y.min() + weights = shift * (self.num_steps / shift.sum()) # normalize so the average weight is 1.0 + self.weights = weights + + def sigma_loss_weights(self, sigma: torch.Tensor) -> torch.Tensor: + if self.loss_weight_uniform: + return (1.0 + sigma) ** 2 / sigma**2 + else: + t = sigma / (sigma + 1) + index = (t * self.num_steps).round().long() + # Clamp index to valid range [0, num_steps-1] to avoid out of bounds + index = torch.clamp(index, 0, self.num_steps - 1) + weights_on_device = self.weights.to(sigma.device) + return weights_on_device[index].type_as(sigma) + + def __call__(self, sigma: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]: + t = sigma / (sigma + 1) + c_skip = 1.0 - t + c_out = -t + c_in = 1.0 - t + c_noise = t * self.t_scaling_factor + return c_skip, c_out, c_in, c_noise diff --git a/cosmos-inference/cosmos3/_src/imaginaire/modules/edm_sde.py
b/cosmos-inference/cosmos3/_src/imaginaire/modules/edm_sde.py new file mode 100644 index 00000000..3d08a822 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/modules/edm_sde.py @@ -0,0 +1,43 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from statistics import NormalDist + +import numpy as np +import torch + + +class EDMSDE: + def __init__( + self, + p_mean: float = -1.2, + p_std: float = 1.2, + sigma_max: float = 80.0, + sigma_min: float = 0.002, + ): + self.gaussian_dist = NormalDist(mu=p_mean, sigma=p_std) + self.sigma_max = sigma_max + self.sigma_min = sigma_min + + def sample_t(self, batch_size: int) -> torch.Tensor: + cdf_vals = np.random.uniform(size=(batch_size)) + samples_interval_gaussian = [self.gaussian_dist.inv_cdf(cdf_val) for cdf_val in cdf_vals] + + log_sigma = torch.tensor(samples_interval_gaussian, device="cuda") + return torch.exp(log_sigma) + + def marginal_prob(self, x0: torch.Tensor, sigma: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]: + """This is trivial in the base class, but may be used by derived classes in a more interesting way""" + return x0, sigma diff --git a/cosmos-inference/cosmos3/_src/imaginaire/modules/image_embeddings.py b/cosmos-inference/cosmos3/_src/imaginaire/modules/image_embeddings.py new file mode 100644 index 00000000..2d3d0524 --- /dev/null +++ 
b/cosmos-inference/cosmos3/_src/imaginaire/modules/image_embeddings.py @@ -0,0 +1,763 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Forked from https://github.com/openai/CLIP/blob/main/clip/model.py +This file differs in that it exposes the prepooled patch tokens alongside the global image tokens +when calling the CLIP model +""" + +import hashlib +import os +import urllib +import warnings +from collections import OrderedDict +from typing import Any, List, Tuple, Union # noqa: F401 + +import numpy as np +import torch +import torch.nn.functional as F +from PIL import Image +from pkg_resources import packaging +from torch import nn +from torchvision.transforms import CenterCrop, Compose, Normalize, Resize, ToTensor +from tqdm import tqdm + +try: + from torchvision.transforms import InterpolationMode + + BICUBIC = InterpolationMode.BICUBIC +except ImportError: + BICUBIC = Image.BICUBIC + + +if packaging.version.parse(torch.__version__) < packaging.version.parse("1.7.1"): + warnings.warn("PyTorch version 1.7.1 or higher is recommended") + +__all__ = ["available_models", "load"] + +_MODELS = { + "RN50": "https://openaipublic.azureedge.net/clip/models/afeb0e10f9e5a86da6080e35cf09123aca3b358a0c3e3b6c78a7b63bc04b6762/RN50.pt", # noqa: E501 + "RN101": 
"https://openaipublic.azureedge.net/clip/models/8fa8567bab74a42d41c5915025a8e4538c3bdbe8804a470a72f30b0d94fab599/RN101.pt", # noqa: E501 + "RN50x4": "https://openaipublic.azureedge.net/clip/models/7e526bd135e493cef0776de27d5f42653e6b4c8bf9e0f653bb11773263205fdd/RN50x4.pt", # noqa: E501 + "RN50x16": "https://openaipublic.azureedge.net/clip/models/52378b407f34354e150460fe41077663dd5b39c54cd0bfd2b27167a4a06ec9aa/RN50x16.pt", # noqa: E501 + "RN50x64": "https://openaipublic.azureedge.net/clip/models/be1cfb55d75a9666199fb2206c106743da0f6468c9d327f3e0d0a543a9919d9c/RN50x64.pt", # noqa: E501 + "ViT-B/32": "https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt", # noqa: E501 + "ViT-B/16": "https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt", # noqa: E501 + "ViT-L/14": "https://openaipublic.azureedge.net/clip/models/b8cca3fd41ae0c99ba7e8951adf17d267cdb84cd88be6f7c2e0eca1737a03836/ViT-L-14.pt", # noqa: E501 + "ViT-L/14@336px": "https://openaipublic.azureedge.net/clip/models/3035c92b350959924f9f00213499208652fc7ea050643e8b385c2dac08641f02/ViT-L-14-336px.pt", # noqa: E501 +} + + +class Bottleneck(nn.Module): + """Bottleneck residual block for ResNet.""" + + expansion = 4 + + def __init__(self, inplanes, planes, stride=1): + super().__init__() + + # all conv layers have stride 1. 
an avgpool is performed after the second convolution when stride > 1 + self.conv1 = nn.Conv2d(inplanes, planes, 1, bias=False) + self.bn1 = nn.BatchNorm2d(planes) + self.relu1 = nn.ReLU(inplace=True) + + self.conv2 = nn.Conv2d(planes, planes, 3, padding=1, bias=False) + self.bn2 = nn.BatchNorm2d(planes) + self.relu2 = nn.ReLU(inplace=True) + + self.avgpool = nn.AvgPool2d(stride) if stride > 1 else nn.Identity() + + self.conv3 = nn.Conv2d(planes, planes * self.expansion, 1, bias=False) + self.bn3 = nn.BatchNorm2d(planes * self.expansion) + self.relu3 = nn.ReLU(inplace=True) + + self.downsample = None + self.stride = stride + + if stride > 1 or inplanes != planes * Bottleneck.expansion: + # downsampling layer is prepended with an avgpool, and the subsequent convolution has stride 1 + self.downsample = nn.Sequential( + OrderedDict( + [ + ("-1", nn.AvgPool2d(stride)), + ("0", nn.Conv2d(inplanes, planes * self.expansion, 1, stride=1, bias=False)), + ("1", nn.BatchNorm2d(planes * self.expansion)), + ] + ) + ) + + def forward(self, x: torch.Tensor): + identity = x + + out = self.relu1(self.bn1(self.conv1(x))) + out = self.relu2(self.bn2(self.conv2(out))) + out = self.avgpool(out) + out = self.bn3(self.conv3(out)) + + if self.downsample is not None: + identity = self.downsample(x) + + out += identity + out = self.relu3(out) + return out + + +class AttentionPool2d(nn.Module): + """Attention pooling layer for 2D feature maps.""" + + def __init__(self, spacial_dim: int, embed_dim: int, num_heads: int, output_dim: int = None): + super().__init__() + self.positional_embedding = nn.Parameter(torch.randn(spacial_dim**2 + 1, embed_dim) / embed_dim**0.5) + self.k_proj = nn.Linear(embed_dim, embed_dim) + self.q_proj = nn.Linear(embed_dim, embed_dim) + self.v_proj = nn.Linear(embed_dim, embed_dim) + self.c_proj = nn.Linear(embed_dim, output_dim or embed_dim) + self.num_heads = num_heads + + def forward(self, x): + x = x.reshape(x.shape[0], x.shape[1], x.shape[2] * 
x.shape[3]).permute(2, 0, 1) # [HW,B,C] + x = torch.cat([x.mean(dim=0, keepdim=True), x], dim=0) # [HW+1,B,C] + x = x + self.positional_embedding[:, None, :].to(x.dtype) # [HW+1,B,C] + x, _ = F.multi_head_attention_forward( # x: [HW+1,B,output_dim] + query=x, + key=x, + value=x, + embed_dim_to_check=x.shape[-1], + num_heads=self.num_heads, + q_proj_weight=self.q_proj.weight, + k_proj_weight=self.k_proj.weight, + v_proj_weight=self.v_proj.weight, + in_proj_weight=None, + in_proj_bias=torch.cat([self.q_proj.bias, self.k_proj.bias, self.v_proj.bias]), # [3*embed_dim] + bias_k=None, + bias_v=None, + add_zero_attn=False, + dropout_p=0, + out_proj_weight=self.c_proj.weight, + out_proj_bias=self.c_proj.bias, + use_separate_proj_weight=True, + training=self.training, + need_weights=False, + ) + + return x[0] # [B,output_dim] + + +class ModifiedResNet(nn.Module): + """ + A ResNet class that is similar to torchvision's but contains the following changes: + - There are now 3 "stem" convolutions as opposed to 1, with an average pool instead of a max pool. 
+ - Performs anti-aliasing strided convolutions, where an avgpool is prepended to convolutions with stride > 1 + - The final pooling layer is a QKV attention instead of an average pool + """ + + def __init__(self, layers, output_dim, heads, input_resolution=224, width=64): + super().__init__() + self.output_dim = output_dim + self.input_resolution = input_resolution + + # the 3-layer stem + self.conv1 = nn.Conv2d(3, width // 2, kernel_size=3, stride=2, padding=1, bias=False) + self.bn1 = nn.BatchNorm2d(width // 2) + self.relu1 = nn.ReLU(inplace=True) + self.conv2 = nn.Conv2d(width // 2, width // 2, kernel_size=3, padding=1, bias=False) + self.bn2 = nn.BatchNorm2d(width // 2) + self.relu2 = nn.ReLU(inplace=True) + self.conv3 = nn.Conv2d(width // 2, width, kernel_size=3, padding=1, bias=False) + self.bn3 = nn.BatchNorm2d(width) + self.relu3 = nn.ReLU(inplace=True) + self.avgpool = nn.AvgPool2d(2) + + # residual layers + self._inplanes = width # this is a *mutable* variable used during construction + self.layer1 = self._make_layer(width, layers[0]) + self.layer2 = self._make_layer(width * 2, layers[1], stride=2) + self.layer3 = self._make_layer(width * 4, layers[2], stride=2) + self.layer4 = self._make_layer(width * 8, layers[3], stride=2) + + embed_dim = width * 32 # the ResNet feature dimension + self.attnpool = AttentionPool2d(input_resolution // 32, embed_dim, heads, output_dim) + + def _make_layer(self, planes, blocks, stride=1): + """ + Create a layer of residual blocks. 
+ - planes: the number of output channels for this layer + - blocks: the number of residual blocks in this layer + - stride: the stride to use for the first convolution of the layer + """ + layers = [Bottleneck(self._inplanes, planes, stride)] + + self._inplanes = planes * Bottleneck.expansion + for _ in range(1, blocks): + layers.append(Bottleneck(self._inplanes, planes)) + + return nn.Sequential(*layers) + + def forward(self, x): + def stem(x): + """ + The stem convolutions at the beginning of the network. + Performs 3 convolutions and an average pool. + """ + x = self.relu1(self.bn1(self.conv1(x))) + x = self.relu2(self.bn2(self.conv2(x))) + x = self.relu3(self.bn3(self.conv3(x))) + x = self.avgpool(x) + return x + + x = x.type(self.conv1.weight.dtype) # [B,3,H,W] + x = stem(x) # [B,width,H/4,W/4] + x = self.layer1(x) # [B,width*4,H/4,W/4] + x = self.layer2(x) # [B,width*8,H/8,W/8] + x = self.layer3(x) # [B,width*16,H/16,W/16] + x = self.layer4(x) # [B,width*32,H/32,W/32] + x = self.attnpool(x) # [B,output_dim] + + return x + + +class LayerNorm(nn.LayerNorm): + """Subclass torch's LayerNorm to handle fp16.""" + + def forward(self, x: torch.Tensor): + orig_type = x.dtype + ret = super().forward(x.type(torch.float32)) + return ret.type(orig_type) + + +class QuickGELU(nn.Module): + """GELU activation function approximation""" + + def forward(self, x: torch.Tensor): + return x * torch.sigmoid(1.702 * x) + + +class ResidualAttentionBlock(nn.Module): + def __init__(self, d_model: int, n_head: int, attn_mask: torch.Tensor = None): + super().__init__() + + self.attn = nn.MultiheadAttention(d_model, n_head) + self.ln_1 = LayerNorm(d_model) + self.mlp = nn.Sequential( + OrderedDict( + [ + ("c_fc", nn.Linear(d_model, d_model * 4)), + ("gelu", QuickGELU()), + ("c_proj", nn.Linear(d_model * 4, d_model)), + ] + ) + ) + self.ln_2 = LayerNorm(d_model) + self.attn_mask = attn_mask + + def attention(self, x: torch.Tensor): + """Perform multi-head attention on the input tensor.""" 
+ self.attn_mask = self.attn_mask.to(dtype=x.dtype, device=x.device) if self.attn_mask is not None else None + return self.attn(x, x, x, need_weights=False, attn_mask=self.attn_mask)[0] + + def forward(self, x: torch.Tensor): + x = x + self.attention(self.ln_1(x)) + x = x + self.mlp(self.ln_2(x)) + return x + + +class Transformer(nn.Module): + def __init__(self, width: int, layers: int, heads: int, attn_mask: torch.Tensor = None): + super().__init__() + self.width = width + self.layers = layers + self.resblocks = nn.Sequential(*[ResidualAttentionBlock(width, heads, attn_mask) for _ in range(layers)]) + + def forward(self, x: torch.Tensor): + return self.resblocks(x) + + +class VisionTransformer(nn.Module): + def __init__(self, input_resolution: int, patch_size: int, width: int, layers: int, heads: int, output_dim: int): + super().__init__() + self.input_resolution = input_resolution + self.output_dim = output_dim + self.conv1 = nn.Conv2d(in_channels=3, out_channels=width, kernel_size=patch_size, stride=patch_size, bias=False) + + scale = width**-0.5 + self.class_embedding = nn.Parameter(scale * torch.randn(width)) + self.positional_embedding = nn.Parameter(scale * torch.randn((input_resolution // patch_size) ** 2 + 1, width)) + self.ln_pre = LayerNorm(width) + + self.transformer = Transformer(width, layers, heads) + + self.ln_post = LayerNorm(width) + self.proj = nn.Parameter(scale * torch.randn(width, output_dim)) + + def forward(self, x: torch.Tensor): + x = self.conv1(x) # [B,width,grid,grid] + x = x.reshape(x.shape[0], x.shape[1], -1) # [B,width,grid**2] + x = x.permute(0, 2, 1) # [B,grid**2,width] + x = torch.cat( + [ + self.class_embedding.to(x.dtype) + + torch.zeros(x.shape[0], 1, x.shape[-1], dtype=x.dtype, device=x.device), + x, + ], + dim=1, + ) # [B,grid**2+1,width] + x = x + self.positional_embedding.to(x.dtype) # [B,grid**2+1,width] + x = self.ln_pre(x) # [B,grid**2+1,width] + + x = x.permute(1, 0, 2) # [grid**2+1,B,width] + x = self.transformer(x) # 
[grid**2+1,B,width] + x = x.permute(1, 0, 2) # [B,grid**2+1,width] + + x_pre_pooling = x # [B,grid**2+1,width] + x = self.ln_post(x[:, 0, :]) # [B,width] + + if self.proj is not None: + x = x @ self.proj # [B,output_dim] + + return x, x_pre_pooling + + +class CLIP(nn.Module): + """ + This CLIP module combines a visual encoder and a text encoder. It initializes the parameters, builds the attention mask, + and provides methods to encode images and text separately. The forward method computes the cosine similarity between + the encoded image and text features. + """ + + def __init__( + self, + embed_dim: int, + # vision + image_resolution: int, + vision_layers: Union[Tuple[int, int, int, int], int], + vision_width: int, + vision_patch_size: int, + # text + context_length: int, + vocab_size: int, + transformer_width: int, + transformer_heads: int, + transformer_layers: int, + ): + super().__init__() + + self.context_length = context_length + + if isinstance(vision_layers, (tuple, list)): + vision_heads = vision_width * 32 // 64 + self.visual = ModifiedResNet( + layers=vision_layers, + output_dim=embed_dim, + heads=vision_heads, + input_resolution=image_resolution, + width=vision_width, + ) + else: + vision_heads = vision_width // 64 + self.visual = VisionTransformer( + input_resolution=image_resolution, + patch_size=vision_patch_size, + width=vision_width, + layers=vision_layers, + heads=vision_heads, + output_dim=embed_dim, + ) + + self.transformer = Transformer( + width=transformer_width, + layers=transformer_layers, + heads=transformer_heads, + attn_mask=self.build_attention_mask(), + ) + + self.vocab_size = vocab_size + self.token_embedding = nn.Embedding(vocab_size, transformer_width) + self.positional_embedding = nn.Parameter(torch.empty(self.context_length, transformer_width)) + self.ln_final = LayerNorm(transformer_width) + + self.text_projection = nn.Parameter(torch.empty(transformer_width, embed_dim)) + self.logit_scale = nn.Parameter(torch.ones([]) * 
np.log(1 / 0.07)) + + self.initialize_parameters() + + def initialize_parameters(self): + """Initialize the parameters of the CLIP module.""" + nn.init.normal_(self.token_embedding.weight, std=0.02) + nn.init.normal_(self.positional_embedding, std=0.01) + + if isinstance(self.visual, ModifiedResNet): + if self.visual.attnpool is not None: + std = self.visual.attnpool.c_proj.in_features**-0.5 + nn.init.normal_(self.visual.attnpool.q_proj.weight, std=std) + nn.init.normal_(self.visual.attnpool.k_proj.weight, std=std) + nn.init.normal_(self.visual.attnpool.v_proj.weight, std=std) + nn.init.normal_(self.visual.attnpool.c_proj.weight, std=std) + + for resnet_block in [self.visual.layer1, self.visual.layer2, self.visual.layer3, self.visual.layer4]: + for name, param in resnet_block.named_parameters(): + if name.endswith("bn3.weight"): + nn.init.zeros_(param) + + proj_std = (self.transformer.width**-0.5) * ((2 * self.transformer.layers) ** -0.5) + attn_std = self.transformer.width**-0.5 + fc_std = (2 * self.transformer.width) ** -0.5 + for block in self.transformer.resblocks: + nn.init.normal_(block.attn.in_proj_weight, std=attn_std) + nn.init.normal_(block.attn.out_proj.weight, std=proj_std) + nn.init.normal_(block.mlp.c_fc.weight, std=fc_std) + nn.init.normal_(block.mlp.c_proj.weight, std=proj_std) + + if self.text_projection is not None: + nn.init.normal_(self.text_projection, std=self.transformer.width**-0.5) + + def build_attention_mask(self): + # lazily create causal attention mask, with full attention between the vision tokens + # pytorch uses additive attention mask; fill with -inf + mask = torch.empty(self.context_length, self.context_length) # [context_length,context_length] + mask.fill_(float("-inf")) + mask.triu_(1) # zero out the lower diagonal + return mask + + @property + def dtype(self): + return self.visual.conv1.weight.dtype + + def encode_image(self, image): + """ + Encode an image using the visual encoder. 
+ + Args: + image (torch.Tensor): The input image. + + Returns: + torch.Tensor: The encoded image features. + """ + return self.visual(image.type(self.dtype)) + + def encode_text(self, text): + """ + Encode text using the text encoder. + + Args: + text (torch.Tensor): The input text. + + Returns: + torch.Tensor: The encoded text features. + """ + x = self.token_embedding(text).type(self.dtype) # [B,n_ctx,transformer_width] + + x = x + self.positional_embedding.type(self.dtype) # [B,n_ctx,transformer_width] + x = x.permute(1, 0, 2) # [n_ctx,B,transformer_width] + x = self.transformer(x) # [n_ctx,B,transformer_width] + x = x.permute(1, 0, 2) # [B,n_ctx,transformer_width] + x = self.ln_final(x).type(self.dtype) # [B,n_ctx,transformer_width] + + # take features from the eot embedding (eot_token is the highest number in each sequence) + x = x[torch.arange(x.shape[0]), text.argmax(dim=-1)] @ self.text_projection # [B,embed_dim] + + return x + + def forward(self, image, text): + """ + Forward pass of the CLIP module. + + Args: + image (torch.Tensor): The input image. + text (torch.Tensor): The input text. + + Returns: + tuple: A tuple containing the logits per image and logits per text. 
+ """ + image_features, _ = self.encode_image(image) # [B,embed_dim] + text_features = self.encode_text(text) # [B,embed_dim] + + # normalized features + image_features = image_features / image_features.norm(dim=1, keepdim=True) # [B,embed_dim] + text_features = text_features / text_features.norm(dim=1, keepdim=True) # [B,embed_dim] + + # cosine similarity as logits + logit_scale = self.logit_scale.exp() # scalar + logits_per_image = logit_scale * image_features @ text_features.t() # [B,B] + logits_per_text = logits_per_image.t() # [B,B] + + return logits_per_image, logits_per_text + + +def convert_weights(model: nn.Module): + """Convert applicable model parameters to fp16""" + + def _convert_weights_to_fp16(ll): + if isinstance(ll, (nn.Conv1d, nn.Conv2d, nn.Linear)): + ll.weight.data = ll.weight.data.half() + if ll.bias is not None: + ll.bias.data = ll.bias.data.half() + + if isinstance(ll, nn.MultiheadAttention): + for attr in [*[f"{s}_proj_weight" for s in ["in", "q", "k", "v"]], "in_proj_bias", "bias_k", "bias_v"]: + tensor = getattr(ll, attr) + if tensor is not None: + tensor.data = tensor.data.half() + + for name in ["text_projection", "proj"]: + if hasattr(ll, name): + attr = getattr(ll, name) + if attr is not None: + attr.data = attr.data.half() + + model.apply(_convert_weights_to_fp16) + + +def build_model(state_dict: dict): + """Build the CLIP model from a state dictionary.""" + vit = state_dict.get("visual.proj") is not None + + if vit: + vision_width = state_dict["visual.conv1.weight"].shape[0] + vision_layers = len( + [k for k in state_dict.keys() if k.startswith("visual.") and k.endswith(".attn.in_proj_weight")] + ) + vision_patch_size = state_dict["visual.conv1.weight"].shape[-1] + grid_size = round((state_dict["visual.positional_embedding"].shape[0] - 1) ** 0.5) + image_resolution = vision_patch_size * grid_size + else: + counts: list = [ + len(set(k.split(".")[2] for k in state_dict if k.startswith(f"visual.layer{b}"))) for b in [1, 2, 3, 4] + ] + 
vision_layers = tuple(counts) + vision_width = state_dict["visual.layer1.0.conv1.weight"].shape[0] + output_width = round((state_dict["visual.attnpool.positional_embedding"].shape[0] - 1) ** 0.5) + vision_patch_size = None + assert output_width**2 + 1 == state_dict["visual.attnpool.positional_embedding"].shape[0] + image_resolution = output_width * 32 + + embed_dim = state_dict["text_projection"].shape[1] + context_length = state_dict["positional_embedding"].shape[0] + vocab_size = state_dict["token_embedding.weight"].shape[0] + transformer_width = state_dict["ln_final.weight"].shape[0] + transformer_heads = transformer_width // 64 + transformer_layers = len(set(k.split(".")[2] for k in state_dict if k.startswith("transformer.resblocks"))) + + model = CLIP( + embed_dim, + image_resolution, + vision_layers, + vision_width, + vision_patch_size, + context_length, + vocab_size, + transformer_width, + transformer_heads, + transformer_layers, + ) + + for key in ["input_resolution", "context_length", "vocab_size"]: + if key in state_dict: + del state_dict[key] + + convert_weights(model) + model.load_state_dict(state_dict) + return model.eval() + + +def _download(url: str, root: str): + """ + Download a file from a URL and place it in root. + + Args: + url (str): URL to download file from. + root (str): Directory to place the downloaded file. + + Returns: + str: Path to the downloaded file. 
+ """ + os.makedirs(root, exist_ok=True) + filename = os.path.basename(url) + + expected_sha256 = url.split("/")[-2] + download_target = os.path.join(root, filename) + + if os.path.exists(download_target) and not os.path.isfile(download_target): + raise RuntimeError(f"{download_target} exists and is not a regular file") + + if os.path.isfile(download_target): + if hashlib.sha256(open(download_target, "rb").read()).hexdigest() == expected_sha256: + return download_target + else: + warnings.warn(f"{download_target} exists, but the SHA256 checksum does not match; re-downloading the file") + + with urllib.request.urlopen(url) as source, open(download_target, "wb") as output: + with tqdm( + total=int(source.info().get("Content-Length")), ncols=80, unit="iB", unit_scale=True, unit_divisor=1024 + ) as loop: + while True: + buffer = source.read(8192) + if not buffer: + break + + output.write(buffer) + loop.update(len(buffer)) + + if hashlib.sha256(open(download_target, "rb").read()).hexdigest() != expected_sha256: + raise RuntimeError("Model has been downloaded but the SHA256 checksum does not match") + + return download_target + + +def _convert_image_to_rgb(image): + """ + Convert an image to RGB format. + + Args: + image (PIL.Image): The image to convert. + + Returns: + PIL.Image: The converted RGB image. + """ + return image.convert("RGB") + + +def _transform(n_px): + """ + Create a transformation pipeline for preprocessing images. + + Args: + n_px (int): The desired size of the transformed image. + + Returns: + Compose: A torchvision transform pipeline. 
+ """ + return Compose( + [ + Resize(n_px, interpolation=BICUBIC), + CenterCrop(n_px), + _convert_image_to_rgb, + ToTensor(), + Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711)), + ] + ) + + +def available_models() -> List[str]: + """Returns the names of available CLIP models""" + return list(_MODELS.keys()) + + +def load( + name: str, + device: Union[str, torch.device] = "cuda" if torch.cuda.is_available() else "cpu", + jit: bool = False, + download_root: str = None, +): + """Load a CLIP model + + Parameters + ---------- + name : str + A model name listed by `clip.available_models()`, or the path to a model checkpoint containing the state_dict + + device : Union[str, torch.device] + The device to put the loaded model + + jit : bool + Whether to load the optimized JIT model or more hackable non-JIT model (default). + + download_root: str + path to download the model files; by default, it uses "~/.cache/clip" + + Returns + ------- + model : torch.nn.Module + The CLIP model + + preprocess : Callable[[PIL.Image], torch.Tensor] + A torchvision transform that converts a PIL image into a tensor that the returned model can take as its input + """ + if name in _MODELS: + model_path = _download(_MODELS[name], download_root or os.path.expanduser("~/.cache/clip")) + elif os.path.isfile(name): + model_path = name + else: + raise RuntimeError(f"Model {name} not found; available models = {available_models()}") + + with open(model_path, "rb") as opened_file: + try: + # loading JIT archive + model = torch.jit.load(opened_file, map_location=device if jit else "cpu").eval() + state_dict = None + except RuntimeError: + # loading saved state dict + if jit: + warnings.warn(f"File {model_path} is not a JIT archive. 
Loading as a state dict instead") + jit = False + state_dict = torch.load(opened_file, map_location="cpu") + + if not jit: + model = build_model(state_dict or model.state_dict()).to(device) + if str(device) == "cpu": + model.float() + return model, _transform(model.visual.input_resolution) + + # patch the device names + device_holder = torch.jit.trace(lambda: torch.ones([]).to(torch.device(device)), example_inputs=[]) + device_node = [n for n in device_holder.graph.findAllNodes("prim::Constant") if "Device" in repr(n)][-1] + + def patch_device(module): + try: + graphs = [module.graph] if hasattr(module, "graph") else [] + except RuntimeError: + graphs = [] + + if hasattr(module, "forward1"): + graphs.append(module.forward1.graph) + + for graph in graphs: + for node in graph.findAllNodes("prim::Constant"): + if "value" in node.attributeNames() and str(node["value"]).startswith("cuda"): + node.copyAttributes(device_node) + + model.apply(patch_device) + patch_device(model.encode_image) + patch_device(model.encode_text) + + # patch dtype to float32 on CPU + if str(device) == "cpu": + float_holder = torch.jit.trace(lambda: torch.ones([]).float(), example_inputs=[]) + float_input = list(float_holder.graph.findNode("aten::to").inputs())[1] + float_node = float_input.node() + + def patch_float(module): + try: + graphs = [module.graph] if hasattr(module, "graph") else [] + except RuntimeError: + graphs = [] + + if hasattr(module, "forward1"): + graphs.append(module.forward1.graph) + + for graph in graphs: + for node in graph.findAllNodes("aten::to"): + inputs = list(node.inputs()) + for i in [1, 2]: # dtype can be the second or third argument to aten::to() + if inputs[i].node()["value"] == 5: + inputs[i].node().copyAttributes(float_node) + + model.apply(patch_float) + patch_float(model.encode_image) + patch_float(model.encode_text) + + model.float() + + return model, _transform(model.input_resolution.item()) diff --git 
a/cosmos-inference/cosmos3/_src/imaginaire/modules/res_sampler.py b/cosmos-inference/cosmos3/_src/imaginaire/modules/res_sampler.py new file mode 100644 index 00000000..dd76d1b1 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/modules/res_sampler.py @@ -0,0 +1,287 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +A general framework for various sampling algorithms for a diffusion model. +Implementation based on +* Refined Exponential Solver (RES) in https://arxiv.org/pdf/2308.02157 +* also includes other implementations: DDIM, DEIS, DPM-Solver, EDM sampler. +Most sampling algorithms (Runge-Kutta, multi-step, etc.) can be implemented in this framework by \ + adding a new step function in get_runge_kutta_fn or get_multi_step_fn. 
+""" + +import math +from typing import Any, Callable, List, Literal, Optional, Tuple, Union + +import attrs +import torch + +from cosmos3._src.imaginaire.config import make_freezable +from cosmos3._src.imaginaire.functional.multi_step import get_multi_step_fn, is_multi_step_fn_supported +from cosmos3._src.imaginaire.functional.runge_kutta import get_runge_kutta_fn, is_runge_kutta_fn_supported +from cosmos3._src.imaginaire.utils import log + +COMMON_SOLVER_OPTIONS = Literal["2ab", "2mid", "1euler"] + + +@make_freezable +@attrs.define(slots=False) +class SolverConfig: + is_multi: bool = False + rk: str = "2mid" + multistep: str = "2ab" + # following parameters control stochasticity, see EDM paper + # BY default, we use deterministic with no stochasticity + s_churn: float = 0.0 + s_t_max: float = float("inf") + s_t_min: float = 0.05 + s_noise: float = 1.0 + + +@make_freezable +@attrs.define(slots=False) +class SolverTimestampConfig: + nfe: int = 50 + t_min: float = 0.002 + t_max: float = 80.0 + order: float = 7.0 + is_forward: bool = False # whether generate forward or backward timestamps + + +@make_freezable +@attrs.define(slots=False) +class SamplerConfig: + solver: SolverConfig = attrs.field(factory=SolverConfig) + timestamps: SolverTimestampConfig = attrs.field(factory=SolverTimestampConfig) + sample_clean: bool = True # whether run one last step to generate clean image + + +def get_rev_ts( + t_min: float, t_max: float, num_steps: int, ts_order: Union[int, float], is_forward: bool = False +) -> torch.Tensor: + """ + Generate a sequence of reverse time steps. + + Args: + t_min (float): The minimum time value. + t_max (float): The maximum time value. + num_steps (int): The number of time steps to generate. + ts_order (Union[int, float]): The order of the time step progression. + is_forward (bool, optional): If True, returns the sequence in forward order. Defaults to False. 
+ + Returns: + torch.Tensor: A tensor containing the generated time steps in reverse or forward order. + + Raises: + ValueError: If `t_min` is not less than `t_max`. + TypeError: If `ts_order` is not an integer or float. + """ + if t_min >= t_max: + raise ValueError("t_min must be less than t_max") + + if not isinstance(ts_order, (int, float)): + raise TypeError("ts_order must be an integer or float") + + step_indices = torch.arange(num_steps + 1, dtype=torch.float64) + time_steps = ( + t_max ** (1 / ts_order) + step_indices / num_steps * (t_min ** (1 / ts_order) - t_max ** (1 / ts_order)) + ) ** ts_order + + if is_forward: + return time_steps.flip(dims=(0,)) + + return time_steps + + +class Sampler(torch.nn.Module): + def __init__(self, cfg: Optional[SamplerConfig] = None): + super().__init__() + if cfg is None: + cfg = SamplerConfig() + self.cfg = cfg + + @torch.no_grad() + def forward( + self, + x0_fn: Callable, + x_sigma_max: torch.Tensor, + num_steps: int = 35, + sigma_min: float = 0.002, + sigma_max: float = 80, + rho: float = 7, + S_churn: float = 0, + S_min: float = 0, + S_max: float = float("inf"), + S_noise: float = 1, + solver_option: str = "2ab", + ) -> torch.Tensor: + in_dtype = x_sigma_max.dtype + + def float64_x0_fn(x_B_StateShape: torch.Tensor, t_B: torch.Tensor) -> torch.Tensor: + return x0_fn(x_B_StateShape.to(in_dtype), t_B.to(in_dtype)).to(torch.float64) + + is_multistep = is_multi_step_fn_supported(solver_option) + is_rk = is_runge_kutta_fn_supported(solver_option) + assert is_multistep or is_rk, f"Only support multistep or Runge-Kutta method, got {solver_option}" + + solver_cfg = SolverConfig( + s_churn=S_churn, + s_t_max=S_max, + s_t_min=S_min, + s_noise=S_noise, + is_multi=is_multistep, + rk=solver_option, + multistep=solver_option, + ) + timestamps_cfg = SolverTimestampConfig(nfe=num_steps, t_min=sigma_min, t_max=sigma_max, order=rho) + sampler_cfg = SamplerConfig(solver=solver_cfg, timestamps=timestamps_cfg, sample_clean=True) + + return 
self._forward_impl(float64_x0_fn, x_sigma_max, sampler_cfg).to(in_dtype) + + @torch.no_grad() + def _forward_impl( + self, + denoiser_fn: Callable[[torch.Tensor, torch.Tensor], torch.Tensor], + noisy_input_B_StateShape: torch.Tensor, + sampler_cfg: Optional[SamplerConfig] = None, + callback_fns: Optional[List[Callable]] = None, + ) -> torch.Tensor: + """ + Internal implementation of the forward pass. + + Args: + denoiser_fn: Function to denoise the input. + noisy_input_B_StateShape: Input tensor with noise. + sampler_cfg: Configuration for the sampler. + callback_fns: List of callback functions to be called during sampling. + + Returns: + torch.Tensor: Denoised output tensor. + """ + sampler_cfg = self.cfg if sampler_cfg is None else sampler_cfg + solver_order = 1 if sampler_cfg.solver.is_multi else int(sampler_cfg.solver.rk[0]) + num_timestamps = sampler_cfg.timestamps.nfe // solver_order + + sigmas_L = get_rev_ts( + sampler_cfg.timestamps.t_min, sampler_cfg.timestamps.t_max, num_timestamps, sampler_cfg.timestamps.order + ).to(noisy_input_B_StateShape.device) + + denoised_output = differential_equation_solver( + denoiser_fn, sigmas_L, sampler_cfg.solver, callback_fns=callback_fns + )(noisy_input_B_StateShape) + + if sampler_cfg.sample_clean: + # Override denoised_output with fully denoised version + ones = torch.ones(denoised_output.size(0), device=denoised_output.device, dtype=denoised_output.dtype) + denoised_output = denoiser_fn(denoised_output, sigmas_L[-1] * ones) + + return denoised_output + + +def fori_loop(lower: int, upper: int, body_fun: Callable[[int, Any], Any], init_val: Any) -> Any: + """ + Implements a for loop with a function. + + Args: + lower: Lower bound of the loop (inclusive). + upper: Upper bound of the loop (exclusive). + body_fun: Function to be applied in each iteration. + init_val: Initial value for the loop. + + Returns: + The final result after all iterations. 
+ """ + val = init_val + for i in range(lower, upper): + # Add log during sampling to meet APS job health requirement of one log every 2mins + if i % 10 == 0: + log.info(f"fori_loop: {i}") + val = body_fun(i, val) + return val + + +def differential_equation_solver( + x0_fn: Callable[[torch.Tensor, torch.Tensor], torch.Tensor], + sigmas_L: torch.Tensor, + solver_cfg: SolverConfig, + callback_fns: Optional[List[Callable]] = None, +) -> Callable[[torch.Tensor], torch.Tensor]: + """ + Creates a differential equation solver function. + + Args: + x0_fn: Function to compute x0 prediction. + sigmas_L: Tensor of sigma values with shape [L,]. + solver_cfg: Configuration for the solver. + callback_fns: Optional list of callback functions. + + Returns: + A function that solves the differential equation. + """ + num_step = len(sigmas_L) - 1 + + if solver_cfg.is_multi: + update_step_fn = get_multi_step_fn(solver_cfg.multistep) + else: + update_step_fn = get_runge_kutta_fn(solver_cfg.rk) + + eta = min(solver_cfg.s_churn / (num_step + 1), math.sqrt(1.2) - 1) + + def sample_fn(input_xT_B_StateShape: torch.Tensor) -> torch.Tensor: + """ + Samples from the differential equation. + + Args: + input_xT_B_StateShape: Input tensor with shape [B, StateShape]. + + Returns: + Output tensor with shape [B, StateShape]. 
+ """ + ones_B = torch.ones(input_xT_B_StateShape.size(0), device=input_xT_B_StateShape.device, dtype=torch.float64) + + def step_fn( + i_th: int, state: Tuple[torch.Tensor, Optional[List[torch.Tensor]]] + ) -> Tuple[torch.Tensor, Optional[List[torch.Tensor]]]: + input_x_B_StateShape, x0_preds = state + sigma_cur_0, sigma_next_0 = sigmas_L[i_th], sigmas_L[i_th + 1] + + # algorithm 2: line 4-6 + if solver_cfg.s_t_min < sigma_cur_0 < solver_cfg.s_t_max: + hat_sigma_cur_0 = sigma_cur_0 + eta * sigma_cur_0 + input_x_B_StateShape = input_x_B_StateShape + ( + hat_sigma_cur_0**2 - sigma_cur_0**2 + ).sqrt() * solver_cfg.s_noise * torch.randn_like(input_x_B_StateShape) + sigma_cur_0 = hat_sigma_cur_0 + + if solver_cfg.is_multi: + x0_pred_B_StateShape = x0_fn(input_x_B_StateShape, sigma_cur_0 * ones_B) + output_x_B_StateShape, x0_preds = update_step_fn( + input_x_B_StateShape, sigma_cur_0 * ones_B, sigma_next_0 * ones_B, x0_pred_B_StateShape, x0_preds + ) + else: + output_x_B_StateShape, x0_preds = update_step_fn( + input_x_B_StateShape, sigma_cur_0 * ones_B, sigma_next_0 * ones_B, x0_fn + ) + + if callback_fns: + for callback_fn in callback_fns: + callback_fn(**locals()) + + return output_x_B_StateShape, x0_preds + + x_at_eps, _ = fori_loop(0, num_step, step_fn, [input_xT_B_StateShape, None]) + return x_at_eps + + return sample_fn diff --git a/cosmos-inference/cosmos3/_src/imaginaire/serialization.py b/cosmos-inference/cosmos3/_src/imaginaire/serialization.py new file mode 100644 index 00000000..3a17f60f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/serialization.py @@ -0,0 +1,447 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import abc +import functools +import importlib +import json +import os +from collections.abc import Callable as Callable2 +from collections.abc import Mapping, Sequence +from dataclasses import fields, is_dataclass +from types import UnionType +from typing import Any, List, Optional, TypeVar, Union, get_args, get_origin + +import attrs +import torch +import yaml +from omegaconf import DictConfig, ListConfig, OmegaConf + +from cosmos3._src.imaginaire.lazy_config import LazyCall, LazyDict, instantiate +from cosmos3._src.imaginaire.lazy_config.lazy import get_default_params + +T = TypeVar("T") + + +def from_dict( + x: dict, clazz: str | type | None = None, force_construct_target: bool | None = None, field_name: str = "" +) -> T: ... +def to_dict(x: T, field_name: str = "", hydra_compat: bool = True) -> dict: ... 
+def from_yaml(path: str | None = None, clazz: type | None = None, file_like_or_str=None) -> T: + if path: + assert os.path.exists(path), f"{path} does not exist" + with open(path) as in_f: + return from_dict(yaml.safe_load(in_f), clazz=clazz) + elif file_like_or_str: + return from_dict(yaml.safe_load(file_like_or_str), clazz=clazz) + else: + raise ValueError("expected file_like_or_str or path to not be None") + + +def _yaml_safe(obj: Any) -> Any: + # primitives + if obj is None or isinstance(obj, (bool, int, float, str)): + return obj + + # dict-like + if isinstance(obj, Mapping): + return {str(k): _yaml_safe(v) for k, v in obj.items()} + + # list/tuple-like (but not strings/bytes) + if isinstance(obj, Sequence) and not isinstance(obj, (str, bytes, bytearray)): + return [_yaml_safe(v) for v in obj] + + # classes / functions / bound methods -> import path + if hasattr(obj, "__module__") and hasattr(obj, "__qualname__"): + return f"{obj.__module__}.{obj.__qualname__}" + + # torch dtype, Path, enums, dataclasses, etc. + return str(obj) + + +def to_yaml(config: T, out_path: str | None = None) -> str | None: + config_dict = to_dict(config) + safe_dict = _yaml_safe(config_dict) + + if out_path is not None: + with open(out_path, "w") as f: + yaml.safe_dump( + safe_dict, + f, + sort_keys=False, + default_flow_style=False, + allow_unicode=True, + ) + return None + + return yaml.safe_dump( + safe_dict, + sort_keys=False, + default_flow_style=False, + allow_unicode=True, + ) + + +def load_callable(name: str) -> Callable2 | None: + if not name: + return None + + idx = name.rfind(".") + assert idx != -1, "expected ." 
+ module_name = name[0:idx] + fn_name = name[idx + 1 :] + mod = importlib.import_module(module_name) + return getattr(mod, fn_name) + + +def maybe_load_callable(name: str | Callable2 | None) -> Callable2 | None: + if isinstance(name, str): + return load_callable(name) + + return name + + +def maybe_idx(x: Any, idx: int) -> Any: + if idx < 0 or idx >= len(x): + return None + return x[idx] + + +def is_attrs(x: Any) -> bool: + return hasattr(x, "__attrs_attrs__") + + +def to_qualitified_name(x) -> str: + # Handle functools.partial explicitly + if isinstance(x, functools.partial): + fn = x.func + fn_name = to_qualitified_name(fn) + + # args/keywords may contain non-serializable stuff; stringify safely + args = [] + if x.args: + args = [repr(a) for a in x.args] + + kwargs = {} + if x.keywords: + kwargs = {str(k): repr(v) for k, v in x.keywords.items()} + + if args or kwargs: + return f"functools.partial({fn_name}, args={args}, kwargs={kwargs})" + return f"functools.partial({fn_name})" + + # Normal callable/class/module qualified name + mod = getattr(x, "__module__", None) + qn = getattr(x, "__qualname__", None) + + if mod and qn: + return f"{mod}.{qn}" + + # Some callables only have __name__ + name = getattr(x, "__name__", None) + if mod and name: + return f"{mod}.{name}" + + # Fallback: repr + return repr(x) + + +def is_optional(x: type) -> bool: + origin = get_origin(x) + args = get_args(x) + return origin is Optional or (origin in (Union, UnionType) and len(args) == 2 and type(None) in args) + + +def _to_dict_value(x: T, field_type: type, metadata: dict, field_name: str = ""): + + t = type(x) + + # attrs specific + if x is attrs.NOTHING or x is None: + return None + # torch specifics + elif field_type in (torch.memory_format, torch.dtype): + return str(x) + # i4 specific types + elif field_type == LazyCall: + result = _to_dict_value(x, field_type._target, metadata, field_name) + return result + elif field_type in (DictConfig, LazyDict): + if "_target_" in x: + 
default_params = get_default_params(x["_target_"]) + for default_key, default_v in default_params.items(): + if default_key not in x: + x[default_key] = default_v + result = _to_dict_value(x, dict, metadata, field_name) + object_type = getattr(x._metadata, "object_type", None) + if object_type and (is_dataclass(object_type) or is_attrs(object_type)): + result.setdefault("_target_", to_qualitified_name(object_type)) + return result + elif field_type == ListConfig: + return _to_dict_value(x, list, metadata, field_name) + # general python types + dataclasses + attrs + # * meta types + elif field_type == type or field_type == abc.ABCMeta: + + return to_qualitified_name(x) + elif get_origin(field_type) is type: + return to_qualitified_name(x) + elif callable(x) or get_origin(field_type) is Callable2: + if callable(x): + return to_qualitified_name(x) + else: + assert isinstance(x, str), f"{x.__class__=}" + return x + elif is_dataclass(t) or is_attrs(t): + return to_dict(x, field_name=field_name) + # * built-in composites types + elif is_optional(field_type): + return _to_dict_value(x, get_args(field_type)[0], metadata) + elif get_origin(field_type) in (Union, UnionType): + raise AssertionError("unions are not implemented yet!") + # * primitives + elif t in (dict,) or field_type in (dict,) or get_origin(field_type) in (dict,): + return { + _to_dict_value( + k, + maybe_idx(get_args(field_type), 0) or type(k), + metadata, + field_name=f"{field_name}.{k}.key", + ): _to_dict_value( + v, + maybe_idx(get_args(field_type), 1) or type(v), + metadata, + field_name=f"{field_name}.{k}", + ) + for k, v in x.items() + } + elif ( + t + in ( + tuple, + list, + ) + or field_type + in ( + tuple, + list, + ) + or get_origin(field_type) in (tuple, list) + ): + if field_type is None or field_type not in ( + tuple, + list, + ): + field_type = list + + return field_type( + [ + _to_dict_value(xx, maybe_idx(get_args(field_type), 0) or type(xx), metadata, field_name + f"[{i}]") + for i, xx in 
enumerate(x) + ] + ) + elif field_type in (int, str, float, bool): + result = field_type(x) + return result + else: # catch all for everything else + return x + + +def to_dict(x: T, field_name: str = "", hydra_compat: bool = True) -> dict: + if is_dataclass(x): + result = {} + if hydra_compat: + result["_target_"] = to_qualitified_name(x.__class__) + for f in fields(x): + + if hydra_compat and f.name == "defaults": + continue + result[f.name] = _to_dict_value( + x.__dict__[f.name], + f.type, + f.metadata, + field_name=field_name + f".{f.name}" if field_name else f.name, + ) + return result + elif is_attrs(x): + # references: + # - https://github.com/python-attrs/attrs/blob/main/src/attr/_funcs.py + attrs.resolve_types(x.__class__) + + result = {} + if hydra_compat: + result["_target_"] = to_qualitified_name(x.__class__) + for f in attrs.fields(x.__class__): + + if hydra_compat and f.name == "defaults": + continue + result[f.name] = _to_dict_value( + getattr(x, f.name), + f.type, + f.metadata, + field_name=field_name + f".{f.name}" if field_name else f.name, + ) + return result + + +def _from_dict_value( + x: T, + field_type: type, + concrete_type: type, + field_name: str, + force_construct_target: bool | None = None, +): + + + is_dc_type = is_dataclass(field_type) + is_attrs_type = is_attrs(field_type) + origin = get_origin(field_type) or field_type + args = get_args(field_type) + + if x is None: + return None + elif field_type in (torch.memory_format, torch.dtype): + return maybe_load_callable(x) + elif field_type == LazyCall: + return _from_dict_value(x, field_type._target, concrete_type, field_name=field_name) + elif is_dc_type or is_attrs_type: + if concrete_type == str: + assert isinstance(x, str) + if x.endswith(".json"): + json_value = json.loads(x) + return from_dict( + json_value, field_type, force_construct_target=force_construct_target, field_name=field_name + ) + elif x.endswith(".yaml"): + yaml_value = yaml.safe_load(x) + return from_dict( + 
yaml_value, field_type, force_construct_target=force_construct_target, field_name=field_name + ) + else: + raise AssertionError(f"unexpected string: {x}") + else: + assert not isinstance(x, str) + return from_dict(x, field_type, field_name=field_name) + elif field_type in (DictConfig, LazyDict) or origin in (dict,): + + construct_target = x.get("_recursive_", field_type == DictConfig) + if force_construct_target is not None: + construct_target = force_construct_target + + target_value = x.get("_target_") + target_cls = maybe_load_callable(target_value) + + if target_value and construct_target and (is_dataclass(target_cls) or is_attrs(target_cls)): + result = from_dict(x, target_cls, force_construct_target=force_construct_target, field_name=field_name) + else: + result = { + _from_dict_value( + k, + maybe_idx(get_args(field_type), 0) or type(k), + type(k), + field_name=f"{field_name}.{k}.key", + force_construct_target=construct_target, + ): _from_dict_value( + v, + maybe_idx(get_args(field_type), 1) or type(v), + type(v), + field_name=f"{field_name}.{k}", + force_construct_target=construct_target, + ) + for k, v in x.items() + } + if field_type in (DictConfig, LazyDict): + result = OmegaConf.structured(result, flags={"allow_objects": True}) + if construct_target: + result = instantiate(result) + if "_target_" in result: + result["_target_"] = maybe_load_callable(result["_target_"]) + elif construct_target and target_cls: # instantiate a regular class from a dict + special_keys = { + "_target_", + "_recursive_", + "_convert_", + "_args_", + "_kwargs_", + } + constructable_items = { + k: v for k, v in result.items() if not (isinstance(k, str) and k in special_keys) + } + result = target_cls(**constructable_items) + return result + elif field_type is ListConfig or origin in ( + list, + List, + ): + return [ + _from_dict_value( + xx, maybe_idx(get_args(field_type), 0) or type(xx), type(xx), field_name=f"{field_name}[{i}]" + ) + for i, xx in enumerate(x) + ] + elif 
is_optional(field_type): + return _from_dict_value(x, args[0], type(x), field_name=field_name) + elif origin in (Union, UnionType): + raise AssertionError("unions are not implemented yet!") + elif origin is Callable2 or origin is type: + return maybe_load_callable(x) + elif field_type in (int, float, str, bool): + return x + elif field_type is type(None) or field_type == Any: # no typing + return x + else: + raise TypeError( + f"unexpected type: {field_type} (origin={origin}, concrete_type={concrete_type}, args={args}, x={x})" + ) + + +def from_dict( + x: dict, clazz: type | None = None, force_construct_target: bool | None = None, field_name: str = "" +) -> T: + if clazz is None: + assert "_target_" in x + clazz = maybe_load_callable(x["_target_"]) + + assert is_dataclass(clazz) or is_attrs(clazz), f"{clazz} is not a dataclass or attrs" + if is_dataclass(clazz): + construct_args = {} + for f in fields(clazz): + if f.name in x: + construct_args[f.name] = _from_dict_value( + x[f.name], + f.type, + type(x[f.name]), + field_name=field_name + "." + f.name if field_name else f.name, + force_construct_target=force_construct_target, + ) + elif is_optional(f.type): + construct_args[f.name] = None + return clazz(**construct_args) + elif is_attrs(clazz): + attrs.resolve_types(clazz) + + construct_args = {} + for f in attrs.fields(clazz): + if f.name in x: + construct_args[f.name] = _from_dict_value( + x[f.name], + f.type, + type(x[f.name]), + field_name=field_name + "." + f.name if field_name else f.name, + force_construct_target=force_construct_target, + ) + elif is_optional(f.type): + construct_args[f.name] = None + return clazz(**construct_args) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/trainer.py b/cosmos-inference/cosmos3/_src/imaginaire/trainer.py new file mode 100644 index 00000000..b7b82473 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/trainer.py @@ -0,0 +1,452 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. 
All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import functools +import inspect +import os +import signal + +import torch +import torch.distributed as dist +import torch.utils.data + +from cosmos3._src.imaginaire.flags import INTERNAL +from cosmos3._src.imaginaire.utils.context_managers import distributed_init +from cosmos3._src.imaginaire.utils.profiling import maybe_enable_memory_snapshot, maybe_enable_nsys_profiling, maybe_enable_profiling + +try: + from megatron.core import parallel_state + + USE_MEGATRON = True +except ImportError: + USE_MEGATRON = False + + +from cosmos3._src.imaginaire.lazy_config import LazyConfig, instantiate +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import callback, distributed, ema, log, misc +from cosmos3._src.imaginaire.utils.checkpointer import Checkpointer +from cosmos3._src.imaginaire.utils.misc import StragglerDetectorV2 + + + +class ImaginaireTrainer: + """The base trainer class of Imaginaire. + + All trainers in Imaginaire should inherit ImaginaireTrainer. It contains the basic functionality for model training + (particularly suited for large-scale training), including data parallel (DDP/FSDP), model weight average (EMA), + mixed-precision training (fp16/bf16). + + Attributes: + checkpointer (Checkpointer): checkpointer object to save/load model weights and optimizer states. 
+ training_timer (misc.Timer): Timer object to time code blocks and functions. + """ + + def __init__(self, config): + """Constructor of the trainer. + + Args: + config (Config): The config object for the Imaginaire codebase. + """ + super().__init__() + self.config = config + # Set up the distributed computing environment. + with distributed_init(): + distributed.init() + # Set up parallel states. + if hasattr(config.model, "context_parallel_size"): + if config.model_parallel.context_parallel_size > 1: + raise ValueError( + "Both config.model.context_parallel_size and config.model_parallel.context_parallel_size are set. " + "config.model.context_parallel_size is deprecated. Please only set config.model_parallel.context_parallel_size." + ) + else: + log.critical( + "Using deprecated config.model.context_parallel_size. Please use config.model_parallel.context_parallel_size instead." + ) + config.model_parallel.context_parallel_size = config.model.context_parallel_size + if USE_MEGATRON: + if ( + "create_gloo_process_groups" + in inspect.signature(parallel_state.initialize_model_parallel).parameters + ): + parallel_state.initialize_model_parallel( + pipeline_model_parallel_size=config.model_parallel.pipeline_model_parallel_size, + tensor_model_parallel_size=config.model_parallel.tensor_model_parallel_size, + context_parallel_size=config.model_parallel.context_parallel_size, + create_gloo_process_groups=False, + ) + else: + parallel_state.initialize_model_parallel( + pipeline_model_parallel_size=config.model_parallel.pipeline_model_parallel_size, + tensor_model_parallel_size=config.model_parallel.tensor_model_parallel_size, + context_parallel_size=config.model_parallel.context_parallel_size, + ) + # `config.model_parallel.sequence_parallel` is a bool that indicates whether to use sequence parallelism. + # It is not part of the original `parallel_state` API, so we need to set it manually. 
+ parallel_state.sequence_parallel = config.model_parallel.sequence_parallel + if parallel_state.sequence_parallel: + os.environ["CUDA_DEVICE_MAX_CONNECTIONS"] = "1" + + # Create the local job directory, save the config file, and pipe to a local log. + if distributed.is_rank0(): + os.makedirs(config.job.path_local, exist_ok=True) + # Save the config as .pkl for reproducibility. + LazyConfig.save_pkl(config, f"{config.job.path_local}/config.pkl") + # Save the config as .yaml for reading or parsing experiment hyperparameters. + LazyConfig.save_yaml(config, f"{config.job.path_local}/config.yaml") + dist.barrier() + if INTERNAL: + log.init_loguru_file(f"{config.job.path_local}/stdout.log") + if distributed.is_rank0(): + # Print important environment variables and the effective config. + log.info("Config:\n" + config.pretty_print(use_color=True)) + misc.print_environ_variables(["TORCH_HOME", "IMAGINAIRE_OUTPUT_ROOT", "ENABLE_ONELOGGER"]) + else: + misc.print_environ_variables(["HF_HOME", "IMAGINAIRE_OUTPUT_ROOT"]) + # Set the random seed. If multi-GPU, different ranks are set with different seeds. + misc.set_random_seed(seed=config.trainer.seed, by_rank=True) + # Initialize cuDNN. + torch.backends.cudnn.deterministic = config.trainer.cudnn.deterministic + torch.backends.cudnn.benchmark = config.trainer.cudnn.benchmark + # Initialize the callback functions. + self.callbacks = callback.CallBackGroup(config=config, trainer=self) + # Initialize the model checkpointer. + if config.checkpoint.type is None: + self.checkpointer = Checkpointer(config.checkpoint, config.job, callbacks=self.callbacks) + else: + self.checkpointer: Checkpointer = instantiate( + config.checkpoint.type, config.checkpoint, config.job, callbacks=self.callbacks + ) + # Initialize the timer for speed benchmarking. 
+ self.training_timer = misc.TrainingTimer() + # Initialize Straggler Detection + self.straggler_detector = StragglerDetectorV2( + enabled=self.config.trainer.straggler_detection.enabled, + report_freq=self.config.trainer.straggler_detection.report_freq, + profile_freq=self.config.trainer.straggler_detection.profile_freq, + max_diff=self.config.trainer.straggler_detection.max_diff, + raise_error=self.config.trainer.straggler_detection.raise_error, + save_s3=self.config.trainer.straggler_detection.save_s3, + ) + misc.set_torch_compile_options( + self.config.trainer.compile_config.recompile_limit, self.config.trainer.compile_config.use_duck_shape + ) + self.straggler_detector.initialize() + # Send a TimeoutError if a training step takes over timeout_period seconds. + signal.signal(signal.SIGALRM, functools.partial(misc.timeout_handler, config.trainer.timeout_period)) # type: ignore + + def _fetch_and_broadcast_data( + self, + model: ImaginaireModel, + dataloader_iter, + iteration: int, + ): + """ + Fetches data from the dataloader on the batch owner rank and broadcasts it to all other ranks in the Context Parallel group if CP is enabled. + When CP is disabled, data is fetched from the dataloader on the current rank and no broadcasting is needed. + + Args: + model (ImaginaireModel): The model containing parallel dimensions info. + dataloader_iter: Iterator for the dataloader. + iteration (int): Current iteration number to determine the batch owner. + + Returns: + tuple: (data_batch, stop_signal) + - data_batch: The fetched data batch (or None if stopped/not owner). + - stop_signal (bool): True if StopIteration was encountered. 
+ """ + parallel_dims = getattr(model, "parallel_dims", None) + if parallel_dims is None or not parallel_dims.cp_enabled: + try: + return next(dataloader_iter), False + except StopIteration: + return None, True + + # To prevent redundant data loading among the Context Parallel ranks, + # one of the Context Parallel ranks (round-robin) broadcasts the data to all other cp ranks. + batch_owner_rank = iteration % parallel_dims.cp_mesh.size() + stop_signal = False + data_batch = None + + if parallel_dims.cp_rank == batch_owner_rank: + try: + data_batch = next(dataloader_iter) + except StopIteration: + stop_signal = True + data_batch = None + + objs = [data_batch, stop_signal] + + # Calculate the global rank of the batch owner within the CP group + global_src_rank = dist.get_global_rank(parallel_dims.cp_mesh.get_group(), batch_owner_rank) + + dist.broadcast_object_list( + objs, + src=global_src_rank, + group=parallel_dims.cp_mesh.get_group(), + ) + + return objs[0], objs[1] + + def train( + self, + model: ImaginaireModel, + dataloader_train: torch.utils.data.DataLoader, + dataloader_val: torch.utils.data.DataLoader, + ) -> None: + """The training function. + + Args: + model (ImaginaireModel): The PyTorch model. + dataloader_train (torch.utils.data.DataLoader): The training data loader. + dataloader_val (torch.utils.data.DataLoader): The validation data loader. + """ + # Leaving this for backward compatibility for now, but we can think about moving this to model.on_train_start for all models. + model = model.to("cuda", memory_format=self.config.trainer.memory_format) # type: ignore + model.on_train_start(self.config.trainer.memory_format) + + # Initialize the optimizer, scheduler, and grad_scaler. 
+ self.callbacks.on_optimizer_init_start() + optimizer, scheduler = model.init_optimizer_scheduler(self.config.optimizer, self.config.scheduler) + grad_scaler = torch.amp.GradScaler("cuda", **self.config.trainer.grad_scaler_args) + self.callbacks.on_optimizer_init_end() + # Load the model checkpoint and get the starting iteration number. + iteration = self.checkpointer.load(model, optimizer, scheduler, grad_scaler) + if hasattr(dataloader_train, "set_start_iteration"): + dataloader_train.set_start_iteration(iteration * self.config.trainer.grad_accum_iter) + grad_accum_iter = 0 + log.critical(f"Distributed parallelism mode: {self.config.trainer.distributed_parallelism}") + if self.config.trainer.distributed_parallelism == "ddp": + # Create a DDP model wrapper. + model_ddp = distributed.parallel_model_wrapper(self.config.trainer.ddp, model) + elif self.config.trainer.distributed_parallelism == "fsdp": + model_ddp = model + else: + raise ValueError(f"Unknown distributed parallelism mode: {self.config.trainer.distributed_parallelism}") + + log.info("Starting training...") + sm_carveout = int(os.environ.get("GROUPED_MM_SM_CARVEOUT", "0")) + if sm_carveout: + torch._C._set_sm_carveout_experimental(sm_carveout) + log.info(f"Set SM carveout to {sm_carveout}") + self.callbacks.on_train_start(model, iteration=iteration) + # Initial validation. + if self.config.trainer.run_validation and iteration == 0 and self.config.trainer.run_validation_on_start: + self.validate(model, dataloader_val, iteration=iteration) + + if self.config.trainer.save_zero_checkpoint and iteration == 0: + self.checkpointer.save(model, optimizer, scheduler, grad_scaler, iteration=0) + + _end_training = False + if torch.are_deterministic_algorithms_enabled(): + # Re-seed all global RNGs after init (model load, checkpoint load, compile warmup, + # callbacks) so data-augmentation randomness starts from a deterministic state + # regardless of how much RNG state init consumed. 
+ misc.set_random_seed(seed=self.config.trainer.seed, by_rank=True) + with ( + maybe_enable_profiling(self.config, global_step=iteration) as torch_profiler, + maybe_enable_memory_snapshot(self.config, global_step=iteration) as memory_profiler, + maybe_enable_nsys_profiling(self.config, global_step=iteration) as nsys_profiler, + ): + while True: + dataloader_train_iter = iter(dataloader_train) + while True: + self.callbacks.on_before_dataloading(iteration) + try: + with ( + self.training_timer("dataloader_train"), + self.straggler_detector.profile_section( + "dataloading", + self.config.trainer.straggler_detection.analyze_dataloading, + profile_cuda=False, + ), + ): + data_batch, stop_signal = self._fetch_and_broadcast_data( + model, + dataloader_train_iter, + iteration, + ) + if stop_signal: + raise StopIteration + except StopIteration: + break + finally: + self.callbacks.on_after_dataloading(iteration) + # If max_iter is reached, exit the training loop. + if iteration >= self.config.trainer.max_iter: + _end_training = True + break + # Move all tensors in the data batch to GPU device. + data_batch = misc.to(data_batch, device="cuda") + # The actual training step. + self.callbacks.on_training_step_start(model, data_batch, iteration=iteration) + self.callbacks.on_training_step_batch_start(model, data_batch, iteration=iteration) + if not model.training: + model_ddp.train() + assert model_ddp.training, "model_ddp is not in training mode." + assert model.training, "model is not in training mode." + output_batch, loss, grad_accum_iter = self.training_step( + model_ddp, + optimizer, + scheduler, + grad_scaler, + data_batch, + iteration=iteration, + grad_accum_iter=grad_accum_iter, + ) + self.callbacks.on_training_step_batch_end( + model, data_batch, output_batch, loss, iteration=iteration + ) + # If the gradients are still being accumulated, continue to load the next training batch. 
+ if grad_accum_iter != 0: + continue + # Do the following when an actual optimizer (update) step has been made. + iteration += 1 + # Save checkpoint. + if iteration % self.config.checkpoint.save_iter == 0: + self.checkpointer.save(model, optimizer, scheduler, grad_scaler, iteration=iteration) + self.callbacks.on_training_step_end(model, data_batch, output_batch, loss, iteration=iteration) + # Validation. + if self.config.trainer.run_validation and iteration % self.config.trainer.validation_iter == 0: + self.validate(model, dataloader_val, iteration=iteration) + # This iteration is successful; reset the timeout signal. + signal.alarm(self.config.trainer.timeout_period) + self.straggler_detector.generate_report(iteration) + if torch_profiler: + torch_profiler.step() + if memory_profiler: + memory_profiler.step() + if nsys_profiler: + nsys_profiler.step() + if _end_training: + break + log.success("Done with training.") + if sm_carveout: + torch._C._set_sm_carveout_experimental(None) + if iteration % self.config.checkpoint.save_iter != 0: + self.checkpointer.save(model, optimizer, scheduler, grad_scaler, iteration=iteration) + self.callbacks.on_train_end(model, iteration=iteration) + self.checkpointer.finalize() + distributed.barrier() + self.callbacks.on_app_end() + if dist.is_available() and dist.is_initialized(): + dist.destroy_process_group() + + def training_step( + self, + model_ddp: torch.nn.Module | distributed.DistributedDataParallel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + data: dict[str, torch.Tensor], + iteration: int = 0, + grad_accum_iter: int = 0, + ) -> tuple[dict[str, torch.Tensor], torch.Tensor, int]: + """The training step. + + Args: + model_ddp (torch.nn.Module | distributed.DistributedDataParallel): The model with a DDP wrapper or, the bare + module, depending on whether distributed training is enabled or not. 
+ optimizer (torch.optim.Optimizer): The model optimizer. + scheduler (torch.optim.lr_scheduler.LRScheduler): The optimization scheduler. + grad_scaler (torch.amp.GradScaler): The gradient scaler (for mixed precision training). + data (dict[str, torch.Tensor]): Data batch (dictionary of tensors). + iteration (int): Current iteration number. + grad_accum_iter (int): Number of gradient accumulation iterations. + + Returns: + output (dict[str, torch.Tensor]): The model output from the training data batch (dictionary of tensors). + loss (torch.Tensor): The total loss of the training data batch. + """ + # Only let DDP sync gradient at the last iteration of the gradient accumulation window + with distributed.ddp_sync_grad(model_ddp, grad_accum_iter == self.config.trainer.grad_accum_iter - 1): + self.callbacks.on_before_forward(iteration=iteration) + with self.training_timer("forward"): + with self.straggler_detector.profile_section( + "fwd", self.config.trainer.straggler_detection.analyze_forward + ): + output_batch, loss = model_ddp.training_step(data, iteration) + self.callbacks.on_after_forward(iteration=iteration) + self.callbacks.on_before_backward(model_ddp, loss, iteration=iteration) + with self.training_timer("backward"): + with self.straggler_detector.profile_section( + "bwd", self.config.trainer.straggler_detection.analyze_backward + ): + loss_scaled = grad_scaler.scale(loss / self.config.trainer.grad_accum_iter) + loss_scaled.backward() + if self.config.trainer.distributed_parallelism == "ddp": + model_ddp.module.on_after_backward() + else: + model_ddp.on_after_backward() + self.callbacks.on_after_backward(model_ddp, iteration=iteration) + grad_accum_iter += 1 + if grad_accum_iter == self.config.trainer.grad_accum_iter: + with self.training_timer("optimizer_step"): + with self.straggler_detector.profile_section( + "opt", self.config.trainer.straggler_detection.analyze_optimizer + ): + self.callbacks.on_before_optimizer_step( + model_ddp, optimizer, scheduler, 
grad_scaler, iteration=iteration + ) + model = model_ddp.module if self.config.trainer.distributed_parallelism == "ddp" else model_ddp + self._optimizer_step(model, optimizer, scheduler, grad_scaler, iteration=iteration) + self.callbacks.on_before_zero_grad(model_ddp, optimizer, scheduler, iteration=iteration) + if self.config.trainer.distributed_parallelism == "ddp": + model_ddp.module.on_before_zero_grad(optimizer, scheduler, iteration=iteration) + else: + model_ddp.on_before_zero_grad(optimizer, scheduler, iteration=iteration) + self._zero_grad(model, optimizer, iteration) + grad_accum_iter = 0 + return output_batch, loss, grad_accum_iter + + def _optimizer_step( + self, + model: torch.nn.Module, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int, + ) -> None: + """Execute the optimizer step. Override to customise (e.g. PhaseOptimizer).""" + grad_scaler.step(optimizer) + grad_scaler.update() + scheduler.step() + + def _zero_grad(self, model: torch.nn.Module, optimizer: torch.optim.Optimizer, iteration: int) -> None: + """Zero gradients. Override to customise (e.g. PhaseOptimizer).""" + optimizer.zero_grad(set_to_none=True) + + @torch.no_grad() + def validate(self, model: ImaginaireModel, dataloader_val: torch.utils.data.DataLoader, iteration: int = 0) -> None: + """Validate on the full validation dataset. + + Args: + model (ImaginaireModel): The PyTorch model. + dataloader_val (torch.utils.data.DataLoader): The validation data loader. + iteration (int): Current iteration number. + """ + self.callbacks.on_validation_start(model, dataloader_val, iteration=iteration) + model.eval() + # Evaluate on the full validation set. 
+ with ema.ema_scope(model, enabled=model.config.ema.enabled): + for val_iter, data_batch in enumerate(dataloader_val): + if self.config.trainer.max_val_iter is not None and val_iter >= self.config.trainer.max_val_iter: + break + data_batch = misc.to(data_batch, device="cuda") + self.callbacks.on_validation_step_start(model, data_batch, iteration=iteration) + output_batch, loss = model.validation_step(data_batch, iteration) + self.callbacks.on_validation_step_end(model, data_batch, output_batch, loss, iteration=iteration) + self.callbacks.on_validation_end(model, iteration=iteration) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/callback.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/callback.py new file mode 100644 index 00000000..fe12f063 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/callback.py @@ -0,0 +1,618 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import sys +import time +import warnings +from typing import TYPE_CHECKING, Any, Callable, Optional + +import omegaconf +import torch +import torch.distributed as dist +import torch.utils.data +import tqdm +import wandb + +from cosmos3._src.imaginaire.lazy_config import instantiate +from cosmos3._src.imaginaire.utils import distributed, log, misc, wandb_util +from cosmos3._src.imaginaire.utils.misc import get_local_tensor_if_DTensor + + +try: + from megatron.core import parallel_state +except ImportError: + parallel_state = None + + +if TYPE_CHECKING: + from cosmos3._src.imaginaire.config import Config + from cosmos3._src.imaginaire.model import ImaginaireModel + from cosmos3._src.imaginaire.trainer import ImaginaireTrainer + + +class CallBackGroup: + """A class for hosting a collection of callback objects. + + It is used to execute callback functions of multiple callback objects with the same method name. + When callbackgroup.func(args) is executed, internally it loops through the objects in self._callbacks and runs + self._callbacks[0].func(args), self._callbacks[1].func(args), etc. The method name and arguments should match. + + Attributes: + _callbacks (list[Callback]): List of callback objects. + """ + + def __init__(self, config: Config, trainer: ImaginaireTrainer) -> None: + """Initializes the list of callback objects. 
+ + Args: + config (Config): The config object for the Imaginaire codebase. + trainer (ImaginaireTrainer): The main trainer. + """ + self._callbacks = [] + callback_configs = config.trainer.callbacks + if callback_configs: + if isinstance(callback_configs, list) or isinstance(callback_configs, omegaconf.listconfig.ListConfig): + warnings.warn( + "The 'config.trainer.callbacks' parameter should be a dict instead of a list. " + "Please update your code", + DeprecationWarning, + stacklevel=2, + ) + callback_configs = {f"callback_{i}": v for i, v in enumerate(callback_configs)} + for callback_name, current_callback_cfg in callback_configs.items(): + if "_target_" not in current_callback_cfg: + log.critical( + f"Callback {callback_name} is missing the '_target_' field. \n Skip {current_callback_cfg}" + ) + continue + log.critical(f"Instantiating callback {callback_name}: {current_callback_cfg}") + _callback = instantiate(current_callback_cfg) + assert isinstance(_callback, Callback), f"{current_callback_cfg} is not a valid callback." + _callback.config = config + _callback.trainer = trainer + self._callbacks.append(_callback) + + def __getattr__(self, method_name: str) -> Callable: + """Loops through the callback objects to call the corresponding callback function. + + Args: + method_name (str): Callback method name. + """ + + def multi_callback_wrapper(*args, **kwargs) -> None: + for callback in self._callbacks: + assert hasattr(callback, method_name) + method = getattr(callback, method_name) + assert callable(method) + _ = method(*args, **kwargs) + + return multi_callback_wrapper + + +class Callback: + """The base class for all callbacks. + + All callbacks should inherit from this class and adhere to the established method names and signatures. + """ + + def __init__(self, config: Optional["Config"] = None, trainer: Optional["ImaginaireTrainer"] = None): + """Initializes a Callback object. 
+ + Args: + config (Optional[Config]): The configuration object for the Imaginaire codebase, if available. + trainer (Optional[ImaginaireTrainer]): The main trainer handling the training loop, if available. + + Notes: + The config and trainer parameters are optional to maintain backward compatibility. + In future releases, these parameters will be removed. Upon using these parameters, a deprecation + warning will be issued. + + """ + if config is not None or trainer is not None: + warnings.warn( + "The 'config' and 'trainer' parameters are deprecated and will be removed in a future release. " + "Please update your code to create Callback instances without these parameters.", + DeprecationWarning, + stacklevel=2, + ) + del config, trainer + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + pass + + def on_training_step_start(self, model: ImaginaireModel, data: dict[str, torch.Tensor], iteration: int = 0) -> None: + """ + Called before the training step, for each batch. This is paired with on_training_step_end() but note that + when using gradient accumulation, while on_training_step_end() is only called when the optimizer is updated, + this function is called for every batch. + Use on_training_step_batch_start and on_training_step_batch_end if you need callbacks that are called + for every batch, albeit with the same iteration number. + FIXME - should this either be deprecated, or called only when a new training step is started after having updated + the optimizer? + """ + pass + + def on_training_step_batch_start( + self, model: ImaginaireModel, data: dict[str, torch.Tensor], iteration: int = 0 + ) -> None: + """ + Called before the training step, for each batch, similarly to on_training_step_start(). This function is paired with + on_training_step_batch_end(), and both functions are called for every batch even when using gradient accumulation. 
+ Note that the iteration is only updated when the optimizer is updated, and therefore it may be the same for multiple invocations. + """ + pass + + def on_before_forward(self, iteration: int = 0) -> None: + pass + + def on_after_forward(self, iteration: int = 0) -> None: + pass + + def on_before_backward( + self, model_ddp: distributed.DistributedDataParallel, loss: torch.Tensor, iteration: int = 0 + ) -> None: + pass + + def on_after_backward(self, model_ddp: distributed.DistributedDataParallel, iteration: int = 0) -> None: + pass + + def on_before_dataloading(self, iteration: int = 0) -> None: + pass + + def on_after_dataloading(self, iteration: int = 0) -> None: + pass + + def on_optimizer_init_start(self) -> None: + pass + + def on_optimizer_init_end(self) -> None: + pass + + def on_before_optimizer_step( + self, + model_ddp: distributed.DistributedDataParallel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int = 0, + ) -> None: + pass + + def on_before_zero_grad( + self, + model_ddp: distributed.DistributedDataParallel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + iteration: int = 0, + ) -> None: + pass + + def on_training_step_batch_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + """ + Called at the end of a training step for every batch even when using gradient accumulation. + This is paired with on_training_step_batch_start(). Note that the iteration is only updated when the optimizer is updated, + and therefore it may be the same for multiple batches. 
+ """ + pass + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + """ + Called at the end of a training step, but note that when using gradient accumulation, this is only called + when the optimizer is updated, and the iteration incremented, whereas on_training_step_start is called every time. + Use on_training_step_batch_start and on_training_step_batch_end if you need callbacks that are called + for every batch. + """ + pass + + def on_validation_start( + self, model: ImaginaireModel, dataloader_val: torch.utils.data.DataLoader, iteration: int = 0 + ) -> None: + pass + + def on_validation_step_start( + self, model: ImaginaireModel, data: dict[str, torch.Tensor], iteration: int = 0 + ) -> None: + pass + + def on_validation_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + pass + + def on_validation_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + pass + + def on_load_checkpoint_start(self, model: ImaginaireModel) -> None: + pass + + def on_load_checkpoint_end( + self, model: ImaginaireModel, iteration: int = 0, checkpoint_path: Optional[str] = None + ) -> None: + pass + + def on_load_checkpoint(self, model: ImaginaireModel, state_dict: dict[Any]) -> None: + """ + Called when checkpoint loading is about to start, but after on_load_checkpoint_start(). + FIXME - why do we need this callback, can't we just use on_load_checkpoint_start()? + """ + pass + + def on_save_checkpoint_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + """ + Called when checkpoint saving is about to start. 
+ """ + pass + + def on_save_checkpoint_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + """ + Called when the synchronous part of checkpointing is finished, this function can be used + along with on_save_checkpoint_start() to measure the exposed (synchronous) checkpoint time. + Note that for asynchronous checkpoint, the checkpoint may still be ongoing, so this function + does not mean the checkpoint is finished for the asynchronous case, use on_save_checkpoint_success() + for that. + """ + pass + + def on_save_checkpoint_success(self, iteration: int = 0, elapsed_time: float = 0) -> None: + """ + Called when checkpoint saving is fully finished, and succeeded. Not called if checkpoint failed. + For synchronous checkpoint, it is called at the same time as on_save_checkpoint_end(), but for asynchronous + checkpoint, it is called after the asynchronous part has also finished. For checkpointers with out-of-process + checkpointing, this function is called as soon as the notification is received from the checkpointer process, + which may not be immediately after the checkpoint has completed but later on. Therefore, if you need to measure + the full checkpoint duration for the asynchronous part, use the elapsed_time parameter, do not measure it directly + as this would be a significant overestimate. + """ + pass + + def on_save_checkpoint(self, model: ImaginaireModel, state_dict: dict[Any]) -> None: + pass + + def on_train_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + pass + + def on_app_end(self) -> None: + pass + + +class EMAModelCallback(Callback): + """The callback class for tracking EMA model weights.""" + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + # Set up the EMA model weight tracker. + if model.config.ema.enabled: + assert hasattr(model, "ema"), "EMA should be initialized from ImaginaireModel" + # EMA model must be kept in FP32 precision. 
+ model.ema = model.ema.to(dtype=torch.float32) + else: + assert not hasattr(model, "ema"), "There should be no EMA initialized." + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + # Update the EMA model with the new regular weights. + if model.config.ema.enabled: + model.ema.update_average(model, iteration) + + +class ProgressBarCallback(Callback): + """The callback class for visualizing the training/validation progress bar in the console.""" + + @distributed.rank0_only + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + self.train_pbar = tqdm.trange(self.config.trainer.max_iter, initial=iteration, desc="Training") + + @distributed.rank0_only + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + self.train_pbar.update() + + @distributed.rank0_only + def on_validation_start( + self, model: ImaginaireModel, dataloader_val: torch.utils.data.DataLoader, iteration: int = 0 + ) -> None: + if self.config.trainer.max_val_iter is not None: + num_iter = self.config.trainer.max_val_iter + else: + num_iter = len(dataloader_val) + assert num_iter is not None and num_iter > 0, f"Invalid number of validation iterations: {num_iter}" + self.val_pbar = tqdm.trange(num_iter, desc="Validating", position=1, leave=False) + + @distributed.rank0_only + def on_validation_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + self.val_pbar.update() + + @distributed.rank0_only + def on_validation_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + self.val_pbar.close() + + @distributed.rank0_only + def on_train_end(self, model: 
ImaginaireModel, iteration: int = 0) -> None: + self.trainer.checkpointer.finalize() + self.train_pbar.close() + + +class IterationLoggerCallback(Callback): + """The callback class for logging the average iteration time and the training loss to the console.""" + + @distributed.rank0_only + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + # self.train_pbar = tqdm.trange(self.config.trainer.max_iter, initial=iteration, desc="Training") + self.start_iteration_time = time.time() + self.elapsed_iteration_time = 0 + + @distributed.rank0_only + def on_training_step_start(self, model: ImaginaireModel, data: dict[str, torch.Tensor], iteration: int = 0) -> None: + self.start_iteration_time = time.time() + + @distributed.rank0_only + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + + # Note: with gradient accumulation this is only called when the optimizer is updated, so it's only the time for the last batch. + self.elapsed_iteration_time += time.time() - self.start_iteration_time + + if iteration % self.config.trainer.logging_iter == 0: + avg_time = self.elapsed_iteration_time / self.config.trainer.logging_iter + log.info(f"Iteration: {iteration}, average iter time: {avg_time:.2f}, total loss {loss.item():.4f}") + + self.elapsed_iteration_time = 0 + + +class WandBCallback(Callback): + """The callback class for logging to Weights and Biases (W&B). + + By default, WandBCallback logs the following training stats to W&B every config.trainer.logging_iter: + - iteration: The current iteration number (useful for visualizing the training progress over time). + - train/loss: The computed overall loss in the training batch. + - optim/lr: The current learning rate. + - timer/*: The averaged timing results of each code block recorded by trainer.training_timer.
+ For validation, WandBCallback logs: + - val/loss: The computed overall loss in the validation dataset. + """ + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + wandb_util.init_wandb(self.config, model=model) + + def on_before_optimizer_step( + self, + model_ddp: distributed.DistributedDataParallel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int = 0, + ) -> None: # Log the current learning rate. + if iteration % self.config.trainer.logging_iter == 0 and distributed.is_rank0(): + wandb.log({"optim/lr": scheduler.get_last_lr()[0]}, step=iteration) + wandb.log({"optim/grad_scale": grad_scaler.get_scale()}, step=iteration) + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: # Log the timing results (over a number of iterations) and the training loss. + if iteration % self.config.trainer.logging_iter == 0: + timer_results = self.trainer.training_timer.compute_average_results() + + # reduce loss + sample_size = torch.tensor(misc.get_data_batch_size(data_batch), device="cuda") + loss_sum = loss * sample_size + dist.all_reduce(loss_sum, op=dist.ReduceOp.SUM) + dist.all_reduce(sample_size, op=dist.ReduceOp.SUM) + avg_loss = loss_sum.item() / sample_size.item() + + if distributed.is_rank0(): + wandb.log({f"timer/{key}": value for key, value in timer_results.items()}, step=iteration) + wandb.log({"train/loss": avg_loss}, step=iteration) + wandb.log({"iteration": iteration}, step=iteration) + self.trainer.training_timer.reset() + + def on_validation_start( + self, model: ImaginaireModel, dataloader_val: torch.utils.data.DataLoader, iteration: int = 0 + ) -> None: + # Cache for collecting data/output batches.
+ self._val_cache: dict[str, Any] = dict( + data_batches=[], + output_batches=[], + loss=torch.tensor(0.0, device="cuda"), + sample_size=torch.tensor(0, device="cuda"), + ) + + def on_validation_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + # Collect the validation batch and aggregate the overall loss. + batch_size = misc.get_data_batch_size(data_batch) + self._val_cache["loss"] += loss * batch_size + self._val_cache["sample_size"] += batch_size + + def on_validation_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + # Compute the average validation loss across all devices. + dist.all_reduce(self._val_cache["loss"], op=dist.ReduceOp.SUM) + dist.all_reduce(self._val_cache["sample_size"], op=dist.ReduceOp.SUM) + loss = self._val_cache["loss"].item() / self._val_cache["sample_size"] + # Log data/stats of validation set to W&B. + if distributed.is_rank0(): + log.info(f"Validation loss (iteration {iteration}): {loss:.4f}") + wandb.log({"val/loss": loss}, step=iteration) + + def on_train_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + wandb.finish() + + +class LowPrecisionCallback(Callback): + """The callback class handling low precision training""" + + def __init__(self, config: Config, trainer: ImaginaireTrainer, update_iter: int): + self.update_iter = update_iter + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + if model.precision == torch.float32: + log.critical("Using fp32. We should disable master weights update.") + self.update_iter = sys.maxsize + else: + assert model.precision in [ + torch.bfloat16, + torch.float16, + torch.half, + ], "LowPrecisionCallback must use a low precision dtype."
+ self.precision_type = model.precision + + def on_training_step_start(self, model: ImaginaireModel, data: dict[str, torch.Tensor], iteration: int = 0) -> None: + for k, v in data.items(): + if isinstance(v, torch.Tensor) and torch.is_floating_point(data[k]): + data[k] = v.to(dtype=self.precision_type) + + def on_validation_step_start( + self, model: ImaginaireModel, data: dict[str, torch.Tensor], iteration: int = 0 + ) -> None: + for k, v in data.items(): + if isinstance(v, torch.Tensor) and torch.is_floating_point(data[k]): + data[k] = v.to(dtype=self.precision_type) + + def on_before_zero_grad( + self, + model_ddp: distributed.DistributedDataParallel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + iteration: int = 0, + ) -> None: + if iteration % self.update_iter == 0: + if getattr(optimizer, "master_weights", False): + params, master_params = [], [] + for group, group_master in zip(optimizer.param_groups, optimizer.param_groups_master): + for p, p_master in zip(group["params"], group_master["params"]): + params.append(get_local_tensor_if_DTensor(p).data) + master_params.append(get_local_tensor_if_DTensor(p_master).data) + torch._foreach_copy_(params, master_params) + + +class NVTXCallback(Callback): + """The callback for creating NVTX ranges""" + + def __init__( + self, + synchronize: bool = False, + config: Optional["Config"] = None, + trainer: Optional["ImaginaireTrainer"] = None, + ): + super().__init__(config, trainer) + self.synchronize = synchronize + + def on_before_forward(self, iteration: int = 0) -> None: + if self.synchronize: + torch.cuda.synchronize() + torch.cuda.nvtx.range_push("forward") + + def on_after_forward(self, iteration: int = 0) -> None: + if self.synchronize: + torch.cuda.synchronize() + torch.cuda.nvtx.range_pop() + + def on_before_backward( + self, model_ddp: distributed.DistributedDataParallel, loss: torch.Tensor, iteration: int = 0 + ) -> None: + if self.synchronize: + 
torch.cuda.synchronize() + torch.cuda.nvtx.range_push("backward") + + def on_after_backward(self, model_ddp: distributed.DistributedDataParallel, iteration: int = 0) -> None: + if self.synchronize: + torch.cuda.synchronize() + torch.cuda.nvtx.range_pop() + + def on_before_optimizer_step( + self, + model_ddp: distributed.DistributedDataParallel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int = 0, + ) -> None: + if self.synchronize: + torch.cuda.synchronize() + torch.cuda.nvtx.range_push("optimizer_step") + + def on_before_zero_grad( + self, + model_ddp: distributed.DistributedDataParallel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + iteration: int = 0, + ) -> None: + if self.synchronize: + torch.cuda.synchronize() + torch.cuda.nvtx.range_pop() + + def on_before_dataloading(self, iteration: int = 0) -> None: + torch.cuda.nvtx.range_push("dataloading") + + def on_after_dataloading(self, iteration: int = 0) -> None: + torch.cuda.nvtx.range_pop() + + diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/checkpoint_db.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/checkpoint_db.py new file mode 100644 index 00000000..48dfdd35 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/checkpoint_db.py @@ -0,0 +1,452 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +"""Database of released checkpoints. + +The database maps checkpoint internal URIs to public URIs and associates metadata (e.g. +experiment name). + +## Usage + +Register a checkpoint: + +```python +CheckpointConfig( + uuid="0e8177cc-0db5-4cfd-a8a4-b820c772f4fc", + s3=CheckpointDirS3( + uri="s3://bucket/path/to/checkpoint", + ), + hf=CheckpointDirHf( + repository="org/repo", + revision="revision", + subdirectory="path/to/checkpoint", + ), +).register() +``` + +Checkpoints can be referenced by UUID, S3 URI, or local path. Optionally, use `get_checkpoint_uri` to validate and normalize the URI. + +```python +# S3 URI +checkpoint_uri = get_checkpoint_uri("s3://bucket/path/to/checkpoint") +# UUID +checkpoint_uri = get_checkpoint_uri("0e8177cc-0db5-4cfd-a8a4-b820c772f4fc") +# Local path +checkpoint_uri = get_checkpoint_uri("/path/to/checkpoint", check_exists=True) +``` + +When the checkpoint is loaded, call 'download_checkpoint': + +```python +from cosmos3._src.imaginaire.flags import INTERNAL + +if not INTERNAL: + from cosmos3._src.imaginaire.utils.checkpoint_db import download_checkpoint + + checkpoint_uri = download_checkpoint(checkpoint_uri) + +load_checkpoint(checkpoint_uri) +``` +""" + +import functools +import json +import os +import shlex +import subprocess +import uuid +from abc import ABC, abstractmethod +from pathlib import Path +from typing import Annotated, TypeAlias + +import pydantic +from typing_extensions import Self, override + +from cosmos3._src.imaginaire.flags import EXPERIMENTAL_CHECKPOINTS, INTERNAL, StrEnum +from cosmos3._src.imaginaire.utils import log + +HF_VERSION = "1.13.0" + + +def _is_uuid(checkpoint_uri: str) -> bool: + """Return True if the URI is a UUID.""" + try: + uuid.UUID(str(checkpoint_uri)) + return True + except ValueError: + return False + + +def _is_path(checkpoint_uri: str) -> bool: + """Return True if the URI is a local 
path.""" + return not ("://" in checkpoint_uri or _is_uuid(checkpoint_uri)) + + +def normalize_uri(checkpoint_uri: str) -> str: + """Normalize checkpoint URI.""" + checkpoint_uri = checkpoint_uri.rstrip("/") + if checkpoint_uri.startswith("s3://"): + checkpoint_uri = checkpoint_uri.removesuffix("/model") + return checkpoint_uri + + +def sanitize_uri(checkpoint_uri: str) -> str: + """Sanitize checkpoint URI.""" + checkpoint_uri = normalize_uri(checkpoint_uri) + if checkpoint_uri.startswith("s3://"): + checkpoint_uri = checkpoint_uri.removeprefix("s3://").split("/", 1)[1] + checkpoint_uri = f"s3://bucket/{checkpoint_uri}" + return checkpoint_uri + + +class _CheckpointUri(pydantic.BaseModel): + """Config for checkpoint file/directory.""" + + model_config = pydantic.ConfigDict(extra="forbid", frozen=True) + + metadata: dict = pydantic.Field(default_factory=dict) + """File metadata. + + Only used for debugging. + """ + + +def _validate_s3_uri(uri: str) -> str: + """Validate and normalize S3 URI.""" + if not uri.startswith("s3://"): + raise ValueError(f"Invalid S3 URI: {uri}. Must start with 's3://'") + return normalize_uri(uri) + + +S3Uri = Annotated[str, pydantic.AfterValidator(_validate_s3_uri)] + + +class _CheckpointS3(_CheckpointUri): + """Config for checkpoint on S3.""" + + uri: S3Uri + """S3 URI.""" + + +class CheckpointFileS3(_CheckpointS3): + """Config for checkpoint file on S3.""" + + +class CheckpointDirS3(_CheckpointS3): + """Config for checkpoint directory on S3.""" + + +CheckpointS3: TypeAlias = CheckpointFileS3 | CheckpointDirS3 + + +def _hf_download(cmd_args: list[str]) -> str: + """Run Hugging Face CLI download command and return the local path. + + Uses a newer Hugging Face CLI version to download checkpoint. The dependency + version is very old and not robust. 
+ """ + is_rank0 = os.environ.get("RANK", "0") == "0" + cmd = [ + "uvx", + f"hf@{HF_VERSION}", + "download", + "--format=json", + *cmd_args, + ] + log.info(f"{shlex.join(cmd)}") + output = subprocess.run( + cmd, + stdout=subprocess.PIPE, + stderr=None if is_rank0 else subprocess.PIPE, + text=True, + check=True, + ) + return json.loads(output.stdout)["path"] + + +class RepositoryType(StrEnum): + """Repository type.""" + + MODEL = "model" + """Model repository.""" + DATASET = "dataset" + """Dataset repository.""" + + +class _CheckpointHf(_CheckpointUri, ABC): + """Config for checkpoint on Hugging Face.""" + + repository: str + """Repository id (organization/repository).""" + repository_type: RepositoryType = RepositoryType.MODEL + """Repository type.""" + revision: str + """Git revision id which can be a branch name, a tag, or a commit hash.""" + + _path: str | None = None + """Local path.""" + + @abstractmethod + def _download(self) -> str: ... + + def download(self) -> str: + """Download checkpoint and return the local path.""" + if self._path is None: + self._path = self._download() + return self._path + + +class CheckpointFileHf(_CheckpointHf): + """Config for checkpoint file on Hugging Face.""" + + filename: str + """File name.""" + + @override + def _download(self) -> str: + """Download checkpoint and return the local path.""" + cmd_args = [ + self.repository, + "--repo-type", + self.repository_type.value, + "--revision", + self.revision, + self.filename, + ] + path = _hf_download(cmd_args) + assert os.path.exists(path), path + return path + + +class CheckpointDirHf(_CheckpointHf): + """Config for checkpoint directory on Hugging Face.""" + + subdirectory: str = "" + """Repository subdirectory.""" + include: tuple[str, ...] = () + """Include patterns. + + See https://huggingface.co/docs/huggingface_hub/en/guides/download#filter-files-to-download + """ + exclude: tuple[str, ...] = () + """Exclude patterns. 
+ + See https://huggingface.co/docs/huggingface_hub/en/guides/download#filter-files-to-download + """ + + @override + def _download(self) -> str: + """Download checkpoint and return the local path.""" + include = list(self.include) or ["*"] + exclude = list(self.exclude) + if self.subdirectory: + for patterns in [include, exclude]: + for i, pattern in enumerate(patterns): + patterns[i] = os.path.join(self.subdirectory, pattern) + + cmd_args = [ + self.repository, + "--repo-type", + self.repository_type.value, + "--revision", + self.revision, + ] + for pattern in include: + cmd_args.extend(["--include", pattern]) + for pattern in exclude: + cmd_args.extend(["--exclude", pattern]) + path = _hf_download(cmd_args) + if self.subdirectory: + path = os.path.join(path, self.subdirectory) + assert os.path.exists(path), path + return path + + +CheckpointHf: TypeAlias = CheckpointFileHf | CheckpointDirHf + + +class CheckpointConfig(pydantic.BaseModel): + """Config for checkpoint.""" + + model_config = pydantic.ConfigDict(extra="forbid", frozen=True) + + uuid: str + """Checkpoint UUID.""" + name: str + """Checkpoint name. + + Only used for debugging. + """ + metadata: dict = pydantic.Field(default_factory=dict) + """Checkpoint metadata. + + Only used for debugging. 
+ """ + experiment: str | None = None + """Hydra experiment name.""" + config_file: str | None = None + """Hydra config file.""" + + s3: CheckpointS3 + """Config for checkpoint on S3.""" + hf: CheckpointHf + """Config for checkpoint on Hugging Face.""" + + @property + def full_name(self) -> str: + """Return full name for debugging.""" + return f"{self.name}({self.uuid})" + + def download(self) -> str: + """Download checkpoint and return the local path.""" + if INTERNAL: + return self.s3.uri + + log.info(f"Downloading checkpoint {self.full_name}") + return self.hf.download() + + @classmethod + def maybe_from_uri(cls, uri: str) -> Self | None: + """Return checkpoint config for URI if found, otherwise None.""" + uri = normalize_uri(uri) + return _CHECKPOINTS.get(uri, None) + + @classmethod + def from_uri(cls, uri: str) -> Self: + """Return checkpoint config for URI if found, otherwise raise an error.""" + self = cls.maybe_from_uri(uri) + if self is None: + raise ValueError( + f"Checkpoint '{uri}' not found. Set 'export COSMOS_EXPERIMENTAL_CHECKPOINTS=1' to include experimental checkpoints." + ) + return self + + def register(self): + """Register checkpoint config.""" + register_checkpoint(self) + + +_CHECKPOINTS: dict[str, CheckpointConfig] = {} +"""Mapping from checkpoint URI to checkpoint config.""" + + +def register_checkpoint(checkpoint_config: CheckpointConfig): + """Register checkpoint config. + + DEPRECATED: Use 'CheckpointConfig.register' instead. + """ + if not EXPERIMENTAL_CHECKPOINTS: + if checkpoint_config.hf.repository in ["nvidia/Cosmos-Experimental", "nvidia-cosmos-ea/Cosmos-Experimental"]: + # Don't register experimental checkpoints. An exception will be + # raised in CI if the checkpoint is used without + # EXPERIMENTAL_CHECKPOINTS. 
+ return + for uri in [checkpoint_config.uuid, checkpoint_config.s3.uri]: + if uri in _CHECKPOINTS: + raise ValueError(f"Checkpoint '{uri}' already registered.") + _CHECKPOINTS[uri] = checkpoint_config + + +def get_checkpoint_uri(checkpoint_uri: str, *, check_exists: bool = False) -> str: + """Validate and normalize checkpoint URI.""" + checkpoint_uri = normalize_uri(checkpoint_uri) + if (checkpoint := CheckpointConfig.maybe_from_uri(checkpoint_uri)) is not None: + return checkpoint.s3.uri + if checkpoint_uri.startswith("hf://"): + return checkpoint_uri + if _is_path(checkpoint_uri): + if check_exists: + checkpoint_path = Path(checkpoint_uri).expanduser().absolute() + if not checkpoint_path.exists(): + raise ValueError(f"Checkpoint '{checkpoint_path}' does not exist.") + checkpoint_uri = str(checkpoint_path) + return checkpoint_uri + if INTERNAL: + return checkpoint_uri + raise ValueError( + f"Checkpoint '{checkpoint_uri}' not found. Set 'export COSMOS_EXPERIMENTAL_CHECKPOINTS=1' to include experimental checkpoints." + ) + + +@functools.lru_cache +def _download_hf_checkpoint(checkpoint_hf: str) -> str: + # Parse hf://org/repo/path/to/file.pth + assert checkpoint_hf.startswith("hf://"), f"Not a HuggingFace URI: {checkpoint_hf}" + hf_path = checkpoint_hf.removeprefix("hf://") + # Split into repo_id (org/repo) and filename (path/to/file.pth) + parts = hf_path.split("/") + if len(parts) < 3: + raise ValueError( + f"Invalid HuggingFace URI format: {checkpoint_hf}. Expected format: hf://org/repo/path/to/file.pth" + ) + repo_id = "/".join(parts[:2]) # org/repo + filename = "/".join(parts[2:]) # path/to/file.pth + return CheckpointFileHf( + repository=repo_id, + revision="main", + filename=filename, + ).download() + + +@functools.lru_cache +def download_checkpoint(checkpoint_uri: str, *, check_exists: bool = True) -> str: + """Download a checkpoint by URI and return the local path. + + DEPRECATED: Use 'download_checkpoint_v2' instead. 
+ + This should only be used when the checkpoint is loaded. If you just need a + URI, use 'get_checkpoint_uri' instead. + + Downloaded checkpoints are cached, so calling this multiple times will + return the same path. + + Supports: + - Checkpoint UUID: 0e8177cc-0db5-4cfd-a8a4-b820c772f4fc + - S3 URI: s3://bucket/path/to/checkpoint + - HuggingFace URI: hf://org/repo/path/to/file.pth + - Local path: /path/to/checkpoint + """ + if INTERNAL: + return checkpoint_uri + if (checkpoint := CheckpointConfig.maybe_from_uri(checkpoint_uri)) is not None: + return checkpoint.download() + if checkpoint_uri.startswith("hf://"): + return _download_hf_checkpoint(checkpoint_uri) + if check_exists and not os.path.exists(checkpoint_uri): + raise ValueError(f"Checkpoint path {checkpoint_uri} does not exist.") + return checkpoint_uri + + +@functools.lru_cache +def download_checkpoint_v2(checkpoint_uri: str, *, check_exists: bool = True) -> str: + """Maybe download a checkpoint by URI and return the local path. + + Similar to 'download_checkpoint', but unknown S3 URIs are passed through. 
+ """ + if INTERNAL: + return checkpoint_uri + if (checkpoint := CheckpointConfig.maybe_from_uri(sanitize_uri(checkpoint_uri))) is not None: + return checkpoint.download() + if checkpoint_uri.startswith("s3://"): + return checkpoint_uri + if checkpoint_uri.startswith("hf://"): + return _download_hf_checkpoint(checkpoint_uri) + if check_exists and not os.path.exists(checkpoint_uri): + raise ValueError(f"Checkpoint path {checkpoint_uri} does not exist.") + return checkpoint_uri + + +get_checkpoint_path = download_checkpoint +"""DEPRECATED: Use 'download_checkpoint' instead.""" diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/checkpointer.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/checkpointer.py new file mode 100644 index 00000000..34c6dae5 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/checkpointer.py @@ -0,0 +1,504 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import os +import threading +from typing import TYPE_CHECKING, List, NamedTuple, Tuple + +import torch + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import callback, distributed, log, misc, object_store + +if TYPE_CHECKING: + from cosmos3._src.imaginaire.config import CheckpointConfig, JobConfig + +TORCH_VERSION: Tuple[int, ...] 
= tuple(int(x) for x in torch.__version__.split(".")[:2]) +if TORCH_VERSION >= (1, 11): + from torch.ao import quantization + from torch.ao.quantization import FakeQuantizeBase, ObserverBase +elif ( + TORCH_VERSION >= (1, 8) + and hasattr(torch.quantization, "FakeQuantizeBase") + and hasattr(torch.quantization, "ObserverBase") +): + from torch import quantization + from torch.quantization import FakeQuantizeBase, ObserverBase + + +class Checkpointer: + """The checkpointer class. Supports checkpoint saving/loading to either local disk or object store.""" + + def __init__(self, config_checkpoint: CheckpointConfig, config_job: JobConfig, callbacks: callback.CallBackGroup): + """Constructor of the checkpointer. + + Args: + config_checkpoint (CheckpointConfig): The config object for the checkpointer. + """ + # Set the callback functions. + self.callbacks = callbacks + + + + self.checkpoint_dir_local = f"{config_job.path_local}/checkpoints" + self.checkpoint_dir_object_store = f"{config_job.path}/checkpoints" + self.save_to_object_store = config_checkpoint.save_to_object_store.enabled + self.load_from_object_store = config_checkpoint.load_from_object_store.enabled + self.strict_resume = config_checkpoint.strict_resume + self.load_path = config_checkpoint.load_path or None + self.load_training_state = config_checkpoint.load_training_state + self.only_load_scheduler_state = config_checkpoint.only_load_scheduler_state + self.save_thread = None + # Create the object store client interface.
+ if self.save_to_object_store: + self.object_store_saver = object_store.ObjectStore(config_checkpoint.save_to_object_store) + if self.load_from_object_store: + self.object_store_loader = object_store.ObjectStore(config_checkpoint.load_from_object_store) + + def save( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int, + ) -> None: + """Save network weights, optimizer parameters, scheduler parameters to a checkpoint. + + Args: + model (ImaginaireModel): The PyTorch model. + optimizer (torch.optim.Optimizer): The model optimizer. + scheduler (torch.optim.lr_scheduler.LRScheduler): The optimization scheduler. + grad_scaler (torch.amp.GradScaler): The gradient scaler (for mixed precision training). + iteration (int): Current iteration number. + """ + self.callbacks.on_save_checkpoint_start(model, iteration) + + checkpoint_file = f"iter_{iteration:09}.pt" + + if distributed.get_rank() == 0: + state_dict = dict( + model=model.state_dict(), + optimizer=optimizer.state_dict(), + scheduler=scheduler.state_dict(), + grad_scaler=grad_scaler.state_dict(), + iteration=iteration, + ) + state_dict = misc.to(state_dict, device="cpu") + self.callbacks.on_save_checkpoint(model, state_dict=state_dict) + # Wait for previous saver thread to end. + if self.save_thread: + self.save_thread.join() + # Run the checkpoint saver in a separate thread. + self.save_thread = threading.Thread( + target=self._save_worker_object_store if self.save_to_object_store else self._save_worker_local, + daemon=False, + args=(state_dict, checkpoint_file, distributed.get_rank()), + ) + self.save_thread.start() + + # Note: Checkpoints are saved on a separate thread and this callback is not accurate. 
+ # Please check logs from on_save_checkpoint_success() for better accuracy + self.callbacks.on_save_checkpoint_end(model=None, iteration=iteration) + + @misc.timer("checkpoint saving (local)") + def _save_worker_local(self, state_dict: dict[str, torch.Tensor], checkpoint_file: str, rank: int = 0) -> None: + """Worker to save checkpoint to local disk, spawned with a child thread (runs in parallel with the training). + + Args: + state_dict (dict[str, torch.Tensor]): The state dict of the model/optimizer/scheduler. + checkpoint_file (str): The file name of the model checkpoint. + rank (int): GPU device (default: 0). + """ + checkpoint_path = os.path.join(self.checkpoint_dir_local, checkpoint_file) + os.makedirs(self.checkpoint_dir_local, exist_ok=True) + try: + torch.save(state_dict, checkpoint_path) + if rank == 0: + self._write_latest_checkpoint_file(checkpoint_file) + log.success(f"Saved checkpoint (local): {checkpoint_path}") + iteration = int(checkpoint_file.replace("iter_", "").replace(".pt", "")) + self.callbacks.on_save_checkpoint_success(iteration=iteration) + except Exception as e: # noqa: BLE001 + log.exception(f"Checkpoint failed to save (local): {e}") + + @misc.timer("checkpoint saving (object store)") + def _save_worker_object_store( + self, state_dict: dict[str, torch.Tensor], checkpoint_file: str, rank: int = 0 + ) -> None: + """Worker to upload checkpoint to object store, spawned with a child thread (in parallel with the training). + + Args: + state_dict (dict[str, torch.Tensor]): The state dict of the model/optimizer/scheduler. + checkpoint_file (str): The file name of the model checkpoint. + rank (int): GPU device (default: 0). 
+ """ + checkpoint_path = os.path.join(self.checkpoint_dir_object_store, checkpoint_file) + try: + self.object_store_saver.save_object(state_dict, key=checkpoint_path, type="torch") + if rank == 0: + self._write_latest_checkpoint_file(checkpoint_file) + log.success(f"Saved checkpoint (object store): {checkpoint_path}") + iteration = int(checkpoint_file.replace("iter_", "").replace(".pt", "")) + self.callbacks.on_save_checkpoint_success(iteration=iteration) + except Exception as e: # noqa: BLE001 + log.exception(f"Checkpoint failed to upload (object store): {e}") + + @misc.timer("checkpoint loading") + def load( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer | None = None, + scheduler: torch.optim.lr_scheduler.LRScheduler | None = None, + grad_scaler: torch.amp.GradScaler | None = None, + ) -> int: + """Load network weights and optimizer states from a checkpoint in a single process. + + The priority of the checkpoint loading logic is: + 1. Attempt to resume training if possible by looking for latest_checkpoint.txt under the same name. + 2. If no latest checkpoint was found, load the model weights specified by config_checkpoint.load_path. + - This is typically used for inference mode. + - If config_checkpoint.load_training_state is True, then also load the optimizer, scheduler and gradient scaler states. + 3. If none of the above, randomly initialize the model parameters and train from scratch. + + Args: + model (ImaginaireModel): The PyTorch model. + optimizer (torch.optim.Optimizer | None): The model optimizer (default: None). + scheduler (torch.optim.lr_scheduler.LRScheduler | None): The optimization scheduler (default: None). + grad_scaler (torch.amp.GradScaler | None): The gradient scaler (for mixed precision training). + + Returns: + iteration (int): the iteration number to start/resume from.
+ """ + self.callbacks.on_load_checkpoint_start(model) + + latest_checkpoint_file = self._read_latest_checkpoint_file() + if latest_checkpoint_file is not None: + # 1. Resume training from latest_checkpoint.txt under the same name. + checkpoint_dir = ( + self.checkpoint_dir_object_store if self.load_from_object_store else self.checkpoint_dir_local + ) + checkpoint_path = os.path.join(checkpoint_dir, latest_checkpoint_file) + resume = True + only_resume_scheduler = True + else: + if self.load_path: + # 2. Load the module weights specified by config_checkpoint.path. + checkpoint_path = self.load_path + resume = self.load_training_state + only_resume_scheduler = self.only_load_scheduler_state + else: + # 3. Randomly initialize the model parameters and train from scratch. + checkpoint_path = None + resume = False + only_resume_scheduler = False + # Load checkpoint. + if checkpoint_path is not None: + self._check_checkpoint_exists(checkpoint_path) + if self.load_from_object_store: + log.info(f"Loading checkpoint (object store): {checkpoint_path}") + state_dict = self.object_store_loader.load_object(key=checkpoint_path, type="torch") + log.success(f"Complete loading checkpoint (object store): {checkpoint_path}") + else: + log.info(f"Loading checkpoint (local): {checkpoint_path}") + state_dict = torch.load(checkpoint_path, map_location=lambda storage, loc: storage, weights_only=False) + log.success(f"Complete loading checkpoint (local): {checkpoint_path}") + self.callbacks.on_load_checkpoint(model, state_dict=state_dict) + # Load the state dicts. 
+ log.info("- Loading the model...") + model.load_state_dict(state_dict["model"], strict=self.strict_resume) + if resume or only_resume_scheduler: + iteration = state_dict["iteration"] + assert scheduler + log.info("- Loading the scheduler...") + scheduler.load_state_dict(state_dict["scheduler"]) + scheduler.last_epoch = iteration + else: + iteration = 0 + if resume: + assert optimizer + log.info("- Loading the optimizer...") + optimizer.load_state_dict(state_dict["optimizer"]) + log.info("- Loading the gradient scaler...") + grad_scaler.load_state_dict(state_dict["grad_scaler"]) + log.success(f"Done with loading the checkpoint (iteration {iteration}).") + else: + log.success("Done with loading the checkpoint.") + else: + # Checkpoint not found and not specified. We will train everything from scratch. + iteration = 0 + log.info("Training from scratch.") + torch.cuda.empty_cache() + + self.callbacks.on_load_checkpoint_end(model, iteration=iteration, checkpoint_path=checkpoint_path) + + return iteration + + def _read_latest_checkpoint_file(self) -> str | None: + """Get the file name of the latest saved checkpoint. If it doesn't exist, return None. + + Returns: + checkpoint_file (str | None): file name of the latest saved checkpoint. + """ + checkpoint_file = None + if self.load_from_object_store: + latest_path = os.path.join(self.checkpoint_dir_object_store, "latest_checkpoint.txt") + if self.object_store_loader.object_exists(key=latest_path): + checkpoint_file = self.object_store_loader.load_object(key=latest_path, type="text").strip() + else: + latest_path = os.path.join(self.checkpoint_dir_local, "latest_checkpoint.txt") + if os.path.isfile(latest_path): + checkpoint_file = open(latest_path).read().strip() + return checkpoint_file + + def _write_latest_checkpoint_file(self, checkpoint_file: str) -> None: + """Track the file name of the latest saved checkpoint. + + Args: + checkpoint_file (str): file name of the latest saved checkpoint. 
+ """ + content = f"{checkpoint_file}\n" + if self.save_to_object_store: + latest_path = os.path.join(self.checkpoint_dir_object_store, "latest_checkpoint.txt") + self.object_store_saver.save_object(content, key=latest_path, type="text") + else: + latest_path = os.path.join(self.checkpoint_dir_local, "latest_checkpoint.txt") + with open(latest_path, "w") as file: + file.write(content) + + def _check_checkpoint_exists(self, checkpoint_path: str) -> None: + """If the file checkpoint_path does not exist, raise an error. + + Args: + checkpoint_path (str): full path to the checkpoint. + """ + if self.load_from_object_store: + if not self.object_store_loader.object_exists(key=checkpoint_path): + raise FileNotFoundError(f"File not found (object store): {checkpoint_path}") + else: + if not os.path.exists(checkpoint_path): + raise FileNotFoundError(f"File not found (local): {checkpoint_path}") + + def finalize(self) -> None: + """Finalize the checkpointer.""" + if self.save_thread: + self.save_thread.join() + + +class _IncompatibleKeys( + NamedTuple( + "IncompatibleKeys", + [ + ("missing_keys", List[str]), + ("unexpected_keys", List[str]), + ("incorrect_shapes", List[Tuple[str, Tuple[int], Tuple[int]]]), + ], + ) +): + pass + + +class MultiRankCheckpointer(Checkpointer): + def save( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int, + ) -> None: + """Save network weights, optimizer parameters, scheduler parameters to a checkpoint. + + Args: + model (ImaginaireModel): The PyTorch model. + optimizer (torch.optim.Optimizer): The model optimizer. + scheduler (torch.optim.lr_scheduler.LRScheduler): The optimization scheduler. + grad_scaler (torch.amp.GradScaler): The gradient scaler (for mixed precision training). + iteration (int): Current iteration number. 
+ """ + # checkpoint_file = f"iter_{iteration:09}.pt" + postfix, _, total_ema_num = model.get_ckpt_postfix() + checkpoint_file = f"iter_{iteration:09}{postfix}.pt" + save_ranks = list(range(total_ema_num)) + for _rank in save_ranks: + if distributed.get_rank() == _rank: + state_dict = dict( + model=model.state_dict(), + optimizer=optimizer.state_dict(), + scheduler=scheduler.state_dict(), + grad_scaler=grad_scaler.state_dict(), + iteration=iteration, + ) + state_dict = misc.to(state_dict, device="cpu") + self.callbacks.on_save_checkpoint(model, state_dict=state_dict) + # Wait for previous saver thread to end. + if self.save_thread: + self.save_thread.join() + # Run the checkpoint saver in a separate thread. + self.save_thread = threading.Thread( + target=self._save_worker_object_store if self.save_to_object_store else self._save_worker_local, + daemon=False, + args=(state_dict, checkpoint_file, distributed.get_rank()), + ) + self.save_thread.start() + + @misc.timer("checkpoint loading") + def load( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer | None = None, + scheduler: torch.optim.lr_scheduler.LRScheduler | None = None, + grad_scaler: torch.amp.GradScaler | None = None, + ) -> int: + """Load network weights and optimizer states from a checkpoint in a single process. + + The priority of the checkpoint loading logic is: + 1. Attempt to resume training if possible by looking for latest_checkpoint.txt under the same name. + 2. If no latest checkpoint was found, load the model weights specified by config_checkpoint.path. + - This is typically used for inference mode. + - If config_checkpoint.load_optimizer_state is True, then also load the optimizer and scheduler states. + 3. If none of the above, randomly initialize the model parameters and train from scratch. + + Args: + model (ImaginaireModel): The PyTorch model. + optimizer (torch.optim.Optimizer | None): The model optimizer (default: None). 
+ scheduler (torch.optim.lr_scheduler.LRScheduler | None): The optimization scheduler (default: None). + grad_scaler (torch.amp.GradScaler | None): The gradient scaler (for mixed precision training). + + Returns: + iteration (int): the iteration number to start/resume from. + """ + latest_checkpoint_file = self._read_latest_checkpoint_file() + if latest_checkpoint_file is not None: + # Unlike the base checkpointer, this supports multi-EMA. + postfix, _, total_ema_num = model.get_ckpt_postfix() + latest_checkpoint_file = latest_checkpoint_file.replace(".pt", f"{postfix}.pt") + # 1. Resume training from latest_checkpoint.txt under the same name. + checkpoint_dir = ( + self.checkpoint_dir_object_store if self.load_from_object_store else self.checkpoint_dir_local + ) + checkpoint_path = os.path.join(checkpoint_dir, latest_checkpoint_file) + resume = True + else: + if self.load_path: + # 2. Load the module weights specified by config_checkpoint.path. + checkpoint_path = self.load_path + # Unlike the base checkpointer, this supports multi-EMA. + postfix, _, total_ema_num = model.get_ckpt_postfix() + checkpoint_path = checkpoint_path.replace(".pt", f"{postfix}.pt") + resume = self.load_training_state + else: + # 3. Randomly initialize the model parameters and train from scratch. + checkpoint_path = None + resume = False + # Load checkpoint. 
+ if checkpoint_path is not None: + self._check_checkpoint_exists(checkpoint_path) + if self.load_from_object_store: + log.info(f"Loading checkpoint (object store): {checkpoint_path}") + state_dict = self.object_store_loader.load_object(key=checkpoint_path, type="torch") + log.success(f"Complete loading checkpoint (object store): {checkpoint_path}") + else: + log.info(f"Loading checkpoint (local): {checkpoint_path}") + state_dict = torch.load(checkpoint_path, map_location=lambda storage, loc: storage) + log.success(f"Complete loading checkpoint (local): {checkpoint_path}") + self.callbacks.on_load_checkpoint(model, state_dict=state_dict) + # Load the state dicts. + log.info("- Loading the model...") + log.critical(model.load_state_dict(state_dict["model"], strict=self.strict_resume)) + if resume: + iteration = state_dict["iteration"] + assert optimizer and scheduler + log.info("- Loading the optimizer...") + optimizer.load_state_dict(state_dict["optimizer"]) + log.info("- Loading the scheduler...") + scheduler.load_state_dict(state_dict["scheduler"]) + scheduler.last_epoch = iteration + log.info("- Loading the gradient scaler...") + grad_scaler.load_state_dict(state_dict["grad_scaler"]) + log.success(f"Done with loading the checkpoint (iteration {iteration}).") + else: + iteration = 0 + log.success("Done with loading the checkpoint.") + else: + # Checkpoint not found and not specified. We will train everything from scratch. 
+ iteration = 0 + log.info("Training from scratch.") + torch.cuda.empty_cache() + return iteration + + +# https://github.com/facebookresearch/fvcore/blob/9d683aae73fb899dd35d6cf6720e5ef567761c57/fvcore/common/checkpoint.py +def non_strict_load_model(model: torch.nn.Module, checkpoint_state_dict: dict) -> _IncompatibleKeys: + # workaround https://github.com/pytorch/pytorch/issues/24139 + model_state_dict = model.state_dict() + incorrect_shapes = [] + for k in list(checkpoint_state_dict.keys()): + if k in model_state_dict: + if "_extra_state" in k: # Key introduced by TransformerEngine for FP8 + log.warning(f"Skipping key {k} introduced by TransformerEngine for FP8 in the checkpoint.") + continue + model_param = model_state_dict[k] + # Allow mismatch for uninitialized parameters + if TORCH_VERSION >= (1, 8) and isinstance(model_param, torch.nn.parameter.UninitializedParameter): + continue + if not isinstance(model_param, torch.Tensor): + raise ValueError( + f"Find non-tensor parameter {k} in the model. type: {type(model_param)} {type(checkpoint_state_dict[k])}, please check if this key is safe to skip or not." + ) + + shape_model = tuple(model_param.shape) + shape_checkpoint = tuple(checkpoint_state_dict[k].shape) + if shape_model != shape_checkpoint: + has_observer_base_classes = ( + TORCH_VERSION >= (1, 8) + and hasattr(quantization, "ObserverBase") + and hasattr(quantization, "FakeQuantizeBase") + ) + if has_observer_base_classes: + # Handle the special case of quantization per channel observers, + # where buffer shape mismatches are expected. 
+ def _get_module_for_key(model: torch.nn.Module, key: str) -> torch.nn.Module: + # foo.bar.param_or_buffer_name -> [foo, bar] + key_parts = key.split(".")[:-1] + cur_module = model + for key_part in key_parts: + cur_module = getattr(cur_module, key_part) + return cur_module + + cls_to_skip = ( + ObserverBase, + FakeQuantizeBase, + ) + target_module = _get_module_for_key(model, k) + if isinstance(target_module, cls_to_skip): + # Do not remove keys of modules with expected shape mismatches + # from the state_dict being loaded. These modules have special logic + # in _load_from_state_dict to handle the mismatches. + continue + + incorrect_shapes.append((k, shape_checkpoint, shape_model)) + checkpoint_state_dict.pop(k) + incompatible = model.load_state_dict(checkpoint_state_dict, strict=False) + # Remove keys with "_extra_state" suffix, which are non-parameter items introduced by TransformerEngine for FP8 handling + missing_keys = [k for k in incompatible.missing_keys if "_extra_state" not in k] + unexpected_keys = [k for k in incompatible.unexpected_keys if "_extra_state" not in k] + return _IncompatibleKeys( + missing_keys=missing_keys, + unexpected_keys=unexpected_keys, + incorrect_shapes=incorrect_shapes, + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/cluster_env.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/cluster_env.py new file mode 100644 index 00000000..55787cf5 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/cluster_env.py @@ -0,0 +1,166 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +from enum import Enum +from functools import lru_cache +from typing import Dict + + +class ClusterType(Enum): + LOCAL = "local" + NGC = "ngc" + SLURM = "slurm" + + +class ClusterEnvInfo(Enum): + BASIC = "basic" + DETAILED = "detailed" + ALL = "all" + + +NGC_ENV_BASIC_VARS = [ + "NGC_JOB_ID", + "NGC_ARRAY_SIZE", + "NGC_GPUS_PER_NODE", +] + +SLURM_ENV_BASIC_VARS = [ + "SLURM_JOB_USER", + "SLURM_JOB_PARTITION", + "SLURM_LOG_DIR", + "SLURM_JOBID", + "SLURM_NNODES", + "SLURM_JOB_NAME", + "SLURM_JOB_NODELIST", + "SLURMD_NODENAME", +] + + +@lru_cache() +def is_local() -> bool: + """ + Check if the code is running on a local machine. + """ + return not is_ngc() and not is_slurm() + + +@lru_cache() +def is_ngc() -> bool: + """ + Check if the code is running on NGC. + """ + return "NGC_ARRAY_SIZE" in os.environ + + +@lru_cache() +def is_slurm() -> bool: + """ + Check if the code is running on SLURM. + """ + return "SLURM_JOB_ID" in os.environ + + +def get_ngc_env(level: ClusterEnvInfo = ClusterEnvInfo.BASIC) -> Dict[str, str]: + """ + Retrieves NVIDIA GPU Cloud (NGC) environment variables based on the specified detail level. + The function filters environment variables to include only those relevant to NGC, + differentiated by the detail level specified. + + Parameters: + level (ClusterEnvInfo): The level of detail for the information returned. + Defaults to ClusterEnvInfo.BASIC. + + Returns: + dict: A dictionary containing the environment variables. If the level is BASIC, + it includes only predefined key variables that are considered basic. 
+ If the level is DETAILED, it includes all environment variables that start + with "NGC_". + + Raises: + ValueError: If an unknown level is specified, an exception is raised indicating that the + level is not recognized. + """ + if level == ClusterEnvInfo.BASIC: + return {k: os.environ[k] for k in NGC_ENV_BASIC_VARS if k in os.environ} + elif level == ClusterEnvInfo.DETAILED: + return {k: os.environ[k] for k in os.environ if k.startswith("NGC_")} + elif level == ClusterEnvInfo.ALL: + return {k: v for k, v in os.environ.items()} + else: + raise ValueError(f"Unknown level {level}") + + +def get_slurm_env(level: ClusterEnvInfo = ClusterEnvInfo.BASIC) -> Dict[str, str]: + """ + Retrieves SLURM environment variables based on the specified detail level. + This function filters the environment variables related to the SLURM job scheduler + environment based on the provided detail level of the cluster information. + + Parameters: + level (ClusterEnvInfo): The detail level of the environment variables to retrieve. + This can be BASIC, DETAILED, or ALL. Defaults to BASIC. + + Returns: + Dict[str, str]: A dictionary containing the SLURM environment variables. The contents of + the dictionary vary based on the level: + - BASIC: Returns the predefined key variables listed in SLURM_ENV_BASIC_VARS. + - DETAILED: Includes all variables that start with "SLURM_". + - ALL: Returns all environment variables available in the current session. + + Raises: + ValueError: If an unknown level is specified, it raises an exception indicating + that the level is not recognized. 
+ """ + if level == ClusterEnvInfo.BASIC: + return {k: os.environ[k] for k in SLURM_ENV_BASIC_VARS if k in os.environ} + elif level == ClusterEnvInfo.DETAILED: + return {k: os.environ[k] for k in os.environ if k.startswith("SLURM_")} + elif level == ClusterEnvInfo.ALL: + return {k: v for k, v in os.environ.items()} + else: + raise ValueError(f"Unknown level {level}") + + +def get_cluster_env(level: ClusterEnvInfo = ClusterEnvInfo.BASIC) -> Dict[str, str]: + """ + Retrieves a combination of environment variables from the cluster, merging information from + both NVIDIA GPU Cloud (NGC) and SLURM environments based on the specified detail level. + This function provides a unified dictionary of environment settings that are crucial for + applications running in clustered computing environments. + + Parameters: + level (ClusterEnvInfo): The level of detail for the environment variables to be retrieved. + The level can be BASIC, DETAILED, or ALL. Defaults to BASIC. + - BASIC: Gathers basic environment variables from both NGC and SLURM. + - DETAILED: Includes more detailed information from both NGC and SLURM. + - ALL: Combines all available environment variables from the system + with NGC and SLURM specific ones. + + Returns: + Dict[str, str]: A dictionary containing key-value pairs of environment variables. + Initially includes the current working directory under the key 'PWD'. + """ + env_info = { + "PWD": os.getcwd(), # Always include the present working directory. + } + if level == ClusterEnvInfo.ALL: + env_info.update(os.environ) # Adds all system environment variables. 
+ return env_info + + # For BASIC and DETAILED levels, merge environment variables from NGC and SLURM: + env_info.update(get_ngc_env(level)) + env_info.update(get_slurm_env(level)) + return env_info diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/config_helper.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/config_helper.py new file mode 100644 index 00000000..e026fca1 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/config_helper.py @@ -0,0 +1,219 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import importlib +import importlib.util +import os +import pkgutil +import sys +from dataclasses import fields as dataclass_fields +from dataclasses import is_dataclass +from typing import Any, Dict, Optional + +import attr +import attrs +from hydra import compose, initialize +from hydra.core.config_store import ConfigStore +from hydra.core.global_hydra import GlobalHydra +from omegaconf import DictConfig, OmegaConf + +from cosmos3._src.imaginaire.config import Config +from cosmos3._src.imaginaire.utils import log + + +def is_attrs_or_dataclass(obj) -> bool: + """ + Check if the object is an instance of an attrs class or a dataclass. + + Args: + obj: The object to check. + + Returns: + bool: True if the object is an instance of an attrs class or a dataclass, False otherwise. 
+ """ + return is_dataclass(obj) or attr.has(type(obj)) + + +def get_fields(obj): + """ + Get the fields of an attrs class or a dataclass. + + Args: + obj: The object to get fields from. Must be an instance of an attrs class or a dataclass. + + Returns: + list: A list of field names. + + Raises: + ValueError: If the object is neither an attrs class nor a dataclass. + """ + if is_dataclass(obj): + return [field.name for field in dataclass_fields(obj)] + elif attr.has(type(obj)): + return [field.name for field in attr.fields(type(obj))] + else: + raise ValueError("The object is neither an attrs class nor a dataclass.") + + +def override(config: Config, overrides: Optional[list[str]] = None, remove_defaults: bool = False) -> Config: + """ + :param config: the instance of class `Config` (usually from `make_config`) + :param overrides: list of overrides for config + :return: the composed instance of class `Config` + """ + # Store the class of the config for reconstruction after overriding. + # config_class = type(config) + + def remove_defaults_filter(f, _): + return f.name != "defaults" + + # Convert Config object to a DictConfig object + config_dict = attrs.asdict(config, filter=remove_defaults_filter if remove_defaults else None) + config_omegaconf = DictConfig(content=config_dict, flags={"allow_objects": True}) + # Enforce "--" separator between the script arguments and overriding configs. + if overrides: + if overrides[0] != "--": + raise ValueError( + f'Hydra config overrides must be separated with a "--" token. 
but got overrides={overrides}, and overrides[0]={overrides[0]}' + ) + overrides = overrides[1:] + # Use Hydra to handle overrides + cs = ConfigStore.instance() + cs.store(name="config", node=config_omegaconf) + if not GlobalHydra().is_initialized(): + with initialize(version_base=None): + config_omegaconf = compose(config_name="config", overrides=overrides) + OmegaConf.resolve(config_omegaconf) + else: + config_omegaconf = compose(config_name="config", overrides=overrides) + OmegaConf.resolve(config_omegaconf) + + def config_from_dict(ref_instance: Any, kwargs: Any) -> Any: + """ + Construct an instance of the same type as ref_instance using the provided dictionary or data or unstructured data + + Args: + ref_instance: The reference instance to determine the type and fields when needed + kwargs: A dictionary of keyword arguments to use for constructing the new instance or primitive data or unstructured data + + Returns: + Any: A new instance of the same type as ref_instance constructed using the provided kwargs or the primitive data or unstructured data + + Raises: + AssertionError: If the fields do not match or if extra keys are found. + Exception: If there is an error constructing the new instance. + """ + is_type = is_attrs_or_dataclass(ref_instance) + if not is_type: + return kwargs + else: + ref_fields = set(get_fields(ref_instance)) + assert isinstance(kwargs, dict) or isinstance(kwargs, DictConfig), ( + "kwargs must be a dictionary or a DictConfig" + ) + keys = set(kwargs.keys()) + + # ref_fields must equal to or include all keys + extra_keys = keys - ref_fields + assert ref_fields == keys or keys.issubset(ref_fields), ( + f"Fields mismatch: {ref_fields} != {keys}. 
Extra keys found: {extra_keys} \n \t when constructing {type(ref_instance)} with {keys}" + ) + + resolved_kwargs: Dict[str, Any] = {} + for f in keys: + resolved_kwargs[f] = config_from_dict(getattr(ref_instance, f), kwargs[f]) + try: + new_instance = type(ref_instance)(**resolved_kwargs) + except Exception as e: + log.error(f"Error when constructing {type(ref_instance)} with {resolved_kwargs}") + log.error(e) + raise e + return new_instance + + config = config_from_dict(config, config_omegaconf) + + return config + + +def get_config_module(config_file: str) -> str: + if not config_file.endswith(".py"): + log.error("Config file cannot be specified as module.") + log.error("Please provide the path to the Python config file (relative to the Imaginaire4 root).") + # Convert to importable module format. + config_module = config_file.replace("/", ".").replace(".py", "") + if importlib.util.find_spec(config_module) is None: + raise ValueError(f"Imaginaire4 config module ({config_module}) not found.") + return config_module + + +def import_module(full_module_name: str, reload: bool = False): + """ + Import a module by name. + + Args: + full_module_name: The fully qualified name of the module to import. + reload: If True, reload the module if it's already imported. + """ + if full_module_name in sys.modules and reload: + importlib.reload(sys.modules[full_module_name]) + else: + importlib.import_module(full_module_name) + + +def import_all_modules_from_package(package_path: str, reload: bool = False, skip_underscore: bool = True) -> None: + """ + Import all modules from the specified package path recursively. + + This function is typically used in conjunction with Hydra to ensure that all modules + within a specified package are imported, which is necessary for registering configurations. 
+ + Example usage: + ```python + import_all_modules_from_package("projects.cosmos.diffusion.v1.config.experiment", reload=True, skip_underscore=False) + ``` + + Args: + package_path (str): The dotted path to the package from which to import all modules. + reload (bool): Flag to determine whether to reload modules if they're already imported. + skip_underscore (bool): If True, skips importing modules that start with an underscore. + """ + log.critical(f"{'Reloading' if reload else 'Importing'} all modules from package {package_path}") + package = importlib.import_module(package_path) + package_directory = package.__path__ + + def import_modules_recursively(directory: str, prefix: str) -> None: + """ + Recursively imports or reloads all modules in the given directory. + + Args: + directory (str): The file system path to the current package directory. + prefix (str): The module prefix (e.g., 'projects.cosmos.diffusion.v1.config'). + """ + for _, module_name, is_pkg in pkgutil.iter_modules([directory]): + if skip_underscore and module_name.startswith("_"): + log.debug(f"Skipping module {module_name} as it starts with an underscore") + continue + + full_module_name = f"{prefix}.{module_name}" + log.debug(f"{'Reloading' if reload else 'Importing'} module {full_module_name}") + + import_module(full_module_name, reload=reload) + + if is_pkg: + sub_package_directory = os.path.join(directory, module_name) + import_modules_recursively(sub_package_directory, full_module_name) + + for directory in package_directory: + import_modules_recursively(directory, package_path) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/context_managers.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/context_managers.py new file mode 100644 index 00000000..63deee1d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/context_managers.py @@ -0,0 +1,78 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from contextlib import ExitStack, contextmanager +from typing import Generator + +import torch + +from cosmos3._src.imaginaire.utils.misc import timer + + + +@contextmanager +def disable_tf32() -> Generator[None, None, None]: + """Context manager to temporarily disable TF32 for CUDA matrix multiplications. + + This is useful for ensuring full FP32 precision in numerical computations, + particularly when debugging or comparing results between different implementations. + + Example: + with disable_tf32(): + result = torch.matmul(a, b) # Uses full FP32 precision + """ + old_allow_tf32_matmul = torch.backends.cuda.matmul.allow_tf32 + try: + torch.backends.cuda.matmul.allow_tf32 = False + with torch.backends.cudnn.flags(enabled=None, benchmark=None, deterministic=None, allow_tf32=False): + yield + finally: + torch.backends.cuda.matmul.allow_tf32 = old_allow_tf32_matmul + + +@contextmanager +def data_loader_init() -> Generator[None, None, None]: + """ + Wrap the data loader initialization with multiple context managers used for telemetry and one logger. 
+ """ + contexts = [ + timer("init_data_loader"), + ] + with ExitStack() as stack: + yield [stack.enter_context(cm) for cm in contexts] + + +@contextmanager +def model_init(set_barrier: bool = False) -> Generator[None, None, None]: + """ + Wrap the instantiation of the model with multiple context managers used for telemetry and one logger. + """ + contexts = [ + timer("init_model"), + ] + with ExitStack() as stack: + yield [stack.enter_context(cm) for cm in contexts] + + +@contextmanager +def distributed_init() -> Generator[None, None, None]: + """ + Wrap the distributed initialization, used for telemetry and timers + """ + contexts = [ + timer("init_distributed"), + ] + with ExitStack() as stack: + yield [stack.enter_context(cm) for cm in contexts] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/context_parallel.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/context_parallel.py new file mode 100644 index 00000000..1eb1cfc4 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/context_parallel.py @@ -0,0 +1,255 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import math +from typing import Optional + +try: + import megatron.core.parallel_state as parallel_state + + USE_MEGATRON = True +except ImportError: + USE_MEGATRON = False + +import torch +from torch import Tensor +from torch.distributed import ProcessGroup, all_gather, broadcast_object_list, get_process_group_ranks, get_world_size +from torch.distributed.utils import _verify_param_shape_across_processes + +from cosmos3._src.imaginaire.utils import distributed + + +def split_inputs_cp(x: Tensor, seq_dim: int, cp_group: ProcessGroup) -> Tensor: + """ + Split input tensor along the sequence dimension for context parallelism. + + This function divides the input tensor into equal parts along the specified + sequence dimension, based on the number of ranks in the context parallelism group. + It then selects the part corresponding to the current rank. + + Args: + x: Input tensor to be split. + seq_dim: The dimension along which to split the input (sequence dimension). + cp_group: The process group for context parallelism. + + Returns: + A slice of the input tensor corresponding to the current rank. + + Raises: + AssertionError: If the sequence dimension is not divisible by the number of ranks. + """ + cp_ranks = get_process_group_ranks(cp_group) + cp_size = len(cp_ranks) + + assert x.shape[seq_dim] % cp_size == 0, f"Sequence length {x.shape[seq_dim]} is not divisible by cp_size {cp_size}" + x = x.view(*x.shape[:seq_dim], cp_size, x.shape[seq_dim] // cp_size, *x.shape[(seq_dim + 1) :]) + seq_idx = torch.tensor([cp_group.rank()], device=x.device) + x = x.index_select(seq_dim, seq_idx) + # Note that the new sequence length is the original sequence length / cp_size + x = x.view(*x.shape[:seq_dim], -1, *x.shape[(seq_dim + 2) :]) + return x + + +@torch.compiler.disable +def cat_outputs_cp(x: Tensor, seq_dim: int, cp_group: ProcessGroup) -> Tensor: + """ + Concatenate outputs from different ranks in the context parallelism group. 
+ + This function gathers tensors from all ranks in the context parallelism group + and concatenates them along the specified sequence dimension. + + The function is decorated with @torch.compiler.disable because it contains distributed + operations and dynamic tensor creation based on runtime rank information that seem to be + incompatible with torch.compile's static graph compilation. + + Args: + x: Input tensor to be concatenated. + seq_dim: The dimension along which to concatenate the tensors (sequence dimension). + cp_group: The process group for context parallelism. + + Returns: + A tensor that is the concatenation of tensors from all ranks in the cp_group. + + Raises: + RuntimeError: If the gather operation fails. + """ + # Get the world size (number of processes in the group) + world_size = get_world_size(cp_group) + + # Create a list to store tensors from all ranks + gathered_tensors = [torch.zeros_like(x) for _ in range(world_size)] + + # Gather tensors from all ranks + try: + all_gather(gathered_tensors, x, group=cp_group) + except RuntimeError as e: + raise RuntimeError(f"Failed to gather tensors: {e}") + + # Concatenate the gathered tensors along the specified dimension + return torch.cat(gathered_tensors, dim=seq_dim) + + +def cat_outputs_cp_with_grad(x: Tensor, seq_dim: int, cp_group: ProcessGroup) -> Tensor: + """ + Concatenate outputs from different ranks in the context parallelism group. + + This function gathers tensors from all ranks in the context parallelism group + and concatenates them along the specified sequence dimension. + + It retains the computational graph locally on each rank by replacing that rank's entry in the gathered list with the original tensor `x`. + + Args: + x: Input tensor to be concatenated. + seq_dim: The dimension along which to concatenate the tensors (sequence dimension). + cp_group: The process group for context parallelism. + + Returns: + A tensor that is the concatenation of tensors from all ranks in the cp_group.
+ + Raises: + RuntimeError: If the gather operation fails. + """ + # Get the world size (number of processes in the group) + cp_size = cp_group.size() + assert cp_size > 0, "cp_size should be greater than 0" + + # Create a list to store tensors from all ranks + gathered_tensors = [torch.zeros_like(x) for _ in range(cp_size)] + + # Gather tensors from all ranks + try: + all_gather(gathered_tensors, x, group=cp_group) + except RuntimeError as e: + raise RuntimeError(f"Failed to gather tensors: {e}") + + rank = cp_group.rank() + gathered_tensors[rank] = x + # Concatenate the gathered tensors along the specified dimension + return torch.cat(gathered_tensors, dim=seq_dim) + + +@torch.compiler.disable +def robust_broadcast(tensor: torch.Tensor, src: int, pg: ProcessGroup, is_check_shape: bool = False) -> torch.Tensor: + """ + Perform a robust broadcast operation that works regardless of tensor shapes on different ranks. + + The function is decorated with @torch.compiler.disable because it contains distributed + operations and dynamic tensor creation based on runtime rank information that seem to be + incompatible with torch.compile's static graph compilation. + + Args: + tensor (torch.Tensor): The tensor to broadcast (on src rank) or receive (on other ranks). + src (int): The source rank for the broadcast. + + Returns: + torch.Tensor: The broadcasted tensor on all ranks.
+ """ + # First, broadcast the shape of the tensor + if distributed.get_rank() == src: + shape = torch.tensor(tensor.shape, dtype=torch.long).cuda() + else: + shape = torch.empty(tensor.dim(), dtype=torch.long).cuda() + if is_check_shape: + _verify_param_shape_across_processes(pg, [shape]) + torch.distributed.broadcast(shape, src, group=pg) + + # Resize the tensor on non-src ranks if necessary + if distributed.get_rank() != src: + tensor = tensor.new_empty(shape.tolist()).type_as(tensor) + + # Now broadcast the tensor data + torch.distributed.broadcast(tensor, src, group=pg) + + return tensor + + +def broadcast( + item: torch.Tensor | str | None, process_group: Optional[ProcessGroup] = None +) -> torch.Tensor | str | None: + """ + Broadcast the item from the minimum rank in the specified group(s). + """ + if process_group is None: + return item + + min_rank = min(get_process_group_ranks(process_group)) + if isinstance(item, torch.Tensor): # assume the device is cuda + item = robust_broadcast(item, min_rank, process_group) + elif item is not None: + broadcastable_list = [item] + broadcast_object_list(broadcastable_list, min_rank, group=process_group) + item = broadcastable_list[0] + return item + + +def broadcast_split_tensor( + tensor: torch.Tensor, + seq_dim: int, + process_group: Optional[ProcessGroup] = None, +) -> torch.Tensor: + """ + Broadcast the tensor from the minimum rank in the specified group(s). + """ + if tensor is None: + return tensor + min_rank = min(get_process_group_ranks(process_group)) + tensor = robust_broadcast(tensor, min_rank, process_group) + return split_inputs_cp(tensor, seq_dim, process_group) + + +def find_split( + shape_tensor: torch.Size, cp_size: int, patch_values: tuple[int, int, int] = (1, 2, 2), view_factor: int = 1 +) -> torch.Size: + """ + Find the shape of input tensor for post-CP split, taking into account both temporal and spatial split, as well as patching values. 
+ The split by width is not possible currently, due to memory stride issues, which break quality. This is checked + by an assert. + + The spatial split is achieved by flattening the input video into a single dimension before CP split is performed, + and rearranging it back into [T, H, W] format after the CP split, since the input passed to the model must still be in [T, H, W] format. + + Args: + shape_tensor (torch.Size): The shape of the Tensor that we want to split. Needs to be in [B, C, T, H, W] format. + cp_size (int): The Context Parallelism size that we want to use. + patch_values (tuple[int, int, int], optional): The patch values that are applied inside the Diffusion model. + First element of the tuple is temporal patch size. Two next elements are the spatial patch sizes. + The default value is (1, 2, 2) + view_factor (int, optional): The number of views that are present in the temporal dimension. Default value is 1. + + Returns: + The torch.Size of how the post-split tensor should look like in [T, H, W] dimensions. + + """ + if not USE_MEGATRON: + raise ImportError("No megatron.core package found, which is required for Context Parallelism usage.") + B, C, T, H, W = shape_tensor + ret = [] + assert T % view_factor == 0 + T = T // view_factor + cp_size_t = 1 + for i, size in enumerate([T, H, W]): + if i == 2 and cp_size > 1: + raise ValueError( + f"Split by width dimension is not currently supported due to quality issues. Width dimension would be split by a factor of {cp_size}. Lower the CP size to avoid splitting by width." 
+ ) + patch_size = patch_values[i] + gcd = math.gcd(size // patch_size, cp_size) + cp_size = cp_size // gcd + if i == 0: + cp_size_t = gcd + ret.append(size // gcd) + # Saving the CP size in the temporal dimension for VideoPositionEmb embeddings calculation + parallel_state.cp_size_t = cp_size_t + return torch.Size(ret) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/count_params.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/count_params.py new file mode 100644 index 00000000..79c03199 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/count_params.py @@ -0,0 +1,23 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from torch import nn + + +def count_params(model: nn.Module, verbose=False) -> int: + total_params = sum(p.numel() for p in model.parameters() if p.requires_grad) + if verbose: + print(f"{model.__class__.__name__} has {total_params * 1.0e-6:.2f} M params.") + return total_params diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/dataloader.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/dataloader.py new file mode 100644 index 00000000..b2bac676 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/dataloader.py @@ -0,0 +1,105 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Iterable, Iterator + +import torch +import torch.distributed as dist +import torch.utils.data + + +class MultiEpochsDataLoader(torch.utils.data.DataLoader): + """A dataloader that relentlessly samples from the dataset. + + This eliminates the overhead of prefetching data before each epoch. + Ref: https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/loader.py + """ + + def __init__(self, *args, **kwargs) -> None: + super().__init__(*args, **kwargs) + self._DataLoader__initialized = False + if self.batch_sampler is None: + self.sampler = _RepeatSampler(self.sampler) # type: ignore + else: + self.batch_sampler = _RepeatSampler(self.batch_sampler) # type: ignore + self._DataLoader__initialized = True + self.iterator = super().__iter__() + + def __len__(self) -> int: + return len(self.sampler) if self.batch_sampler is None else len(self.batch_sampler.sampler) # type: ignore + + def __iter__(self) -> Iterable: + for _ in range(len(self)): + yield next(self.iterator) + + +class _RepeatSampler: + """A sampler wrapper that repeats data sampling forever. + + Args: + sampler (Sampler): Data sampler object. 
+ """ + + def __init__(self, sampler: torch.utils.data.Sampler): + self.sampler = sampler + + def __iter__(self) -> Iterator: + while True: + yield from iter(self.sampler) + + +class DistributedEvalSampler(torch.utils.data.Sampler): + """Distributed data sampler for evaluation. + + Ref: https://github.com/SeungjunNah/DeepDeblur-PyTorch/blob/master/src/data/sampler.py (by snah) + DistributedEvalSampler is different from DistributedSampler in that it does not pad extra samples to make it + evenly divisible. It should not be used for training, or the distributed processes could hang forever. + DistributedEvalSampler is for evaluation purpose where synchronization does not happen every epoch. + Synchronization should be done outside the dataloader loop. + """ + + def __init__(self, dataset: torch.utils.data.Dataset, shuffle: bool = False, seed: int = 0): + """Constructor of DistributedEvalSampler, + + Args: + dataset (torch.utils.data.Dataset): Dataset used for sampling. + shuffle (bool): Whether to shuffle the indices (default: False). + seed (int): Random seed for shuffling if enabled (default: 0). + """ + self.dataset = dataset + self.num_replicas = dist.get_world_size() + self.rank = dist.get_rank() + self.dataset_size = len(self.dataset) # type: ignore + indices = list(range(self.dataset_size)) + indices = indices[self.rank : self.dataset_size : self.num_replicas] + self.num_samples = len(indices) + self.shuffle = shuffle + self.seed = seed + + def __iter__(self) -> Iterator: + if self.shuffle: + # Deterministically shuffle based on epoch and seed. + gen = torch.Generator() + gen.manual_seed(self.seed) + indices = torch.randperm(self.dataset_size, generator=gen).tolist() + else: + indices = list(range(self.dataset_size)) + # Subsample. 
+ indices = indices[self.rank : self.dataset_size : self.num_replicas] + assert len(indices) == self.num_samples + return iter(indices) + + def __len__(self) -> int: + return self.num_samples diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/dataset_utils.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/dataset_utils.py new file mode 100644 index 00000000..04e313ec --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/dataset_utils.py @@ -0,0 +1,345 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +""" +Adapted from: +https://github.com/bytedance/IRASim/blob/main/dataset/dataset_util.py +""" + +import base64 +import math +import os +from io import BytesIO + +import numpy as np +import torch +import torch.distributed as dist +import torchvision.transforms.functional as F +from PIL import Image + + +def is_dist_avail_and_initialized(): + if not dist.is_available(): + return False + if not dist.is_initialized(): + return False + return True + + +def get_rank(): + if not is_dist_avail_and_initialized(): + return 0 + return dist.get_rank() + + +def get_1d_sincos_pos_embed_from_grid(embed_dim, pos): + """ + embed_dim: output dimension for each position + pos: a list of positions to be encoded: size (M,) + out: (M, D) + """ + assert embed_dim % 2 == 0 + omega = np.arange(embed_dim // 2, dtype=np.float32) + omega /= embed_dim / 2.0 + omega = 1.0 / 10000**omega # (D/2,) + + pos = pos.reshape(-1) # (M,) + out = np.einsum("m,d->md", pos, omega) # (M, D/2), outer product + + emb_sin = np.sin(out) # (M, D/2) + emb_cos = np.cos(out) # (M, D/2) + + emb = np.concatenate([emb_sin, emb_cos], axis=1) # (M, D) + return emb + + +def get_2d_sincos_pos_embed_from_grid(embed_dim, grid): + assert embed_dim % 2 == 0 + + # use half of dimensions to encode grid_h + emb_h = get_1d_sincos_pos_embed_from_grid(embed_dim // 2, grid[0]) # (H*W, D/2) + emb_w = get_1d_sincos_pos_embed_from_grid(embed_dim // 2, grid[1]) # (H*W, D/2) + + emb = np.concatenate([emb_h, emb_w], axis=1) # (H*W, D) + return emb + + +def get_2d_sincos_pos_embed(embed_dim, grid_size, cls_token=False): + """ + grid_size: int of the grid height and width + return: + pos_embed: [grid_size*grid_size, embed_dim] or [1+grid_size*grid_size, embed_dim] (w/ or w/o cls_token) + """ + grid_h = np.arange(grid_size, dtype=np.float32) + grid_w = np.arange(grid_size, dtype=np.float32) + grid = np.meshgrid(grid_w, grid_h) # here w goes first + grid = np.stack(grid, axis=0) + + grid = grid.reshape([2, 1, grid_size, grid_size]) + 
pos_embed = get_2d_sincos_pos_embed_from_grid(embed_dim, grid) + if cls_token: + pos_embed = np.concatenate([np.zeros([1, embed_dim]), pos_embed], axis=0) + return pos_embed + + +def b64_2_img(data: str): + image_b64 = base64.b64decode(data) + img = Image.open(BytesIO(image_b64)).convert("RGB") + return img + + +def get_continuous_action(d_acts, c_act_max, c_act_min, n_bins): + c_act_max = c_act_max.to(d_acts.device) + c_act_min = c_act_min.to(d_acts.device) + c_acts = d_acts / (n_bins - 1) * (c_act_max - c_act_min) + c_act_min + return c_acts + + +def alpha2rotm(a): + """Alpha euler angle to rotation matrix.""" + rotm = np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]]) + return rotm + + +def beta2rotm(b): + """Beta euler angle to rotation matrix.""" + rotm = np.array([[np.cos(b), 0, np.sin(b)], [0, 1, 0], [-np.sin(b), 0, np.cos(b)]]) + return rotm + + +def gamma2rotm(c): + """Gamma euler angle to rotation matrix.""" + rotm = np.array([[np.cos(c), -np.sin(c), 0], [np.sin(c), np.cos(c), 0], [0, 0, 1]]) + return rotm + + +def euler2rotm(euler_angles): + """Euler angle (ZYX) to rotation matrix.""" + alpha = euler_angles[0] + beta = euler_angles[1] + gamma = euler_angles[2] + + rotm_a = alpha2rotm(alpha) + rotm_b = beta2rotm(beta) + rotm_c = gamma2rotm(gamma) + + rotm = rotm_c @ rotm_b @ rotm_a + + return rotm + + +def isRotm(R): + # Checks if a matrix is a valid rotation matrix. 
+ # Forked from Andy Zeng + Rt = np.transpose(R) + shouldBeIdentity = np.dot(Rt, R) + I = np.identity(3, dtype=R.dtype) + n = np.linalg.norm(I - shouldBeIdentity) + return n < 1e-6 + + +def rotm2euler(R): + # Forked from: https://learnopencv.com/rotation-matrix-to-euler-angles/ + # R = Rz * Ry * Rx + assert isRotm(R) + sy = math.sqrt(R[0, 0] * R[0, 0] + R[1, 0] * R[1, 0]) + singular = sy < 1e-6 + + if not singular: + x = math.atan2(R[2, 1], R[2, 2]) + y = math.atan2(-R[2, 0], sy) + z = math.atan2(R[1, 0], R[0, 0]) + else: + x = math.atan2(-R[1, 2], R[1, 1]) + y = math.atan2(-R[2, 0], sy) + z = 0 + + # (-pi , pi] + while x > np.pi: + x -= 2 * np.pi + while x <= -np.pi: + x += 2 * np.pi + while y > np.pi: + y -= 2 * np.pi + while y <= -np.pi: + y += 2 * np.pi + while z > np.pi: + z -= 2 * np.pi + while z <= -np.pi: + z += 2 * np.pi + return np.array([x, y, z]) + + +def quat2rotm(quat): + """Quaternion to rotation matrix. + + Args: + quat (4, numpy array): quaternion x, y, z, w + Returns: + rotm (3x3 numpy array): rotation matrix + """ + w = quat[3] + x = quat[0] + y = quat[1] + z = quat[2] + + s = w * w + x * x + y * y + z * z + + rotm = np.array( + [ + [1 - 2 * (y * y + z * z) / s, 2 * (x * y - z * w) / s, 2 * (x * z + y * w) / s], + [2 * (x * y + z * w) / s, 1 - 2 * (x * x + z * z) / s, 2 * (y * z - x * w) / s], + [2 * (x * z - y * w) / s, 2 * (y * z + x * w) / s, 1 - 2 * (x * x + y * y) / s], + ] + ) + + return rotm + + +def rotm2quat(R): + """Convert 3x3 rotation matrix to quaternion (w, x, y, z).""" + R = np.array(R, dtype=float) + trace = np.trace(R) + + if trace > 0: + s = 0.5 / np.sqrt(trace + 1.0) + w = 0.25 / s + x = (R[2, 1] - R[1, 2]) * s + y = (R[0, 2] - R[2, 0]) * s + z = (R[1, 0] - R[0, 1]) * s + else: + if R[0, 0] > R[1, 1] and R[0, 0] > R[2, 2]: + s = 2.0 * np.sqrt(1.0 + R[0, 0] - R[1, 1] - R[2, 2]) + w = (R[2, 1] - R[1, 2]) / s + x = 0.25 * s + y = (R[0, 1] + R[1, 0]) / s + z = (R[0, 2] + R[2, 0]) / s + elif R[1, 1] > R[2, 2]: + s = 2.0 * 
np.sqrt(1.0 + R[1, 1] - R[0, 0] - R[2, 2]) + w = (R[0, 2] - R[2, 0]) / s + x = (R[0, 1] + R[1, 0]) / s + y = 0.25 * s + z = (R[1, 2] + R[2, 1]) / s + else: + s = 2.0 * np.sqrt(1.0 + R[2, 2] - R[0, 0] - R[1, 1]) + w = (R[1, 0] - R[0, 1]) / s + x = (R[0, 2] + R[2, 0]) / s + y = (R[1, 2] + R[2, 1]) / s + z = 0.25 * s + + return np.array([w, x, y, z]) + + +def get_converted_fp32_paths(deepspeed_ckpt_path): + deepspeed_ckpt_path = deepspeed_ckpt_path.rstrip("/") + ckpt_dir = os.path.dirname(deepspeed_ckpt_path) + ckpt_name = os.path.basename(deepspeed_ckpt_path) + fp32_ckpt_name = f"{ckpt_name}.fp32.pt" + converted_path = os.path.join(ckpt_dir, fp32_ckpt_name) + return converted_path + + +class Resize_Preprocess: + def __init__(self, size): + """ + Initialize the preprocessing class with the target size. + Args: + size (tuple): The target height and width as a tuple (height, width). + """ + self.size = size + + def __call__(self, video_frames): + """ + Apply the transformation to each frame in the video. + Args: + video_frames (torch.Tensor): A tensor representing a batch of video frames. + Returns: + torch.Tensor: The transformed video frames. 
+ """ + # Resize each frame in the video + resized_frames = torch.stack([F.resize(frame, self.size, antialias=True) for frame in video_frames]) + return resized_frames + + +class Preprocess: + def __init__(self, size): + self.size = size + + def __call__(self, clip): + clip = Preprocess.resize_scale(clip, self.size[0], self.size[1], interpolation_mode="bilinear") + return clip + + def __repr__(self) -> str: + return f"{self.__class__.__name__}(size={self.size})" + + @staticmethod + def resize_scale(clip, target_height, target_width, interpolation_mode): + target_ratio = target_height / target_width + H = clip.size(-2) + W = clip.size(-1) + clip_ratio = H / W + if clip_ratio > target_ratio: + scale_ = target_width / W + else: + scale_ = target_height / H + return torch.nn.functional.interpolate(clip, scale_factor=scale_, mode=interpolation_mode, align_corners=False) + + +class ToTensorVideo: + """ + Convert tensor data type from uint8 to float, divide value by 255.0 and + permute the dimensions of clip tensor + """ + + def __init__(self): + pass + + def __call__(self, clip): + """ + Args: + clip (torch.tensor, dtype=torch.uint8): Size is (T, C, H, W) + Return: + clip (torch.tensor, dtype=torch.float): Size is (T, C, H, W) + """ + return to_tensor(clip) + + def __repr__(self) -> str: + return self.__class__.__name__ + + +def to_tensor(clip): + """ + Convert tensor data type from uint8 to float, divide value by 255.0 and + permute the dimensions of clip tensor + Args: + clip (torch.tensor, dtype=torch.uint8): Size is (T, C, H, W) + Return: + clip (torch.tensor, dtype=torch.float): Size is (T, C, H, W) + """ + _is_tensor_video_clip(clip) + if not clip.dtype == torch.uint8: + raise TypeError("clip tensor should have data type uint8. Got %s" % str(clip.dtype)) + # return clip.float().permute(3, 0, 1, 2) / 255.0 + return clip.float() / 255.0 + + +def _is_tensor_video_clip(clip): + if not torch.is_tensor(clip): + raise TypeError("clip should be Tensor. 
Got %s" % type(clip)) + + if not clip.ndimension() == 4: + raise ValueError("clip should be 4D. Got %dD" % clip.dim()) + + return True diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/denoise_prediction.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/denoise_prediction.py new file mode 100644 index 00000000..a209db0e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/denoise_prediction.py @@ -0,0 +1,28 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +from dataclasses import dataclass +from typing import Optional + +import torch + + +@dataclass +class DenoisePrediction: + x0: torch.Tensor # clean data prediction + eps: Optional[torch.Tensor] = None # noise prediction + logvar: Optional[torch.Tensor] = None # log variance of noise prediction, can be used as a confidence / uncertainty diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/device.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/device.py new file mode 100644 index 00000000..3f186f39 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/device.py @@ -0,0 +1,152 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import gc +import math +import os +from functools import wraps + +import pynvml +from loguru import logger as logging + + +def get_gpu_architecture(): + """ + Retrieves the GPU architecture of the available GPUs. + + Returns: + str: The GPU architecture: "H100" (also returned for H200), "A100", "L40S", "B200", or "Other". + """ + try: + pynvml.nvmlInit() + device_count = pynvml.nvmlDeviceGetCount() + for i in range(device_count): + handle = pynvml.nvmlDeviceGetHandleByIndex(i) + model_name = pynvml.nvmlDeviceGetName(handle) + if isinstance(model_name, bytes): + model_name = model_name.decode("utf-8") + print(f"GPU {i}: Model: {model_name}") + + # Check for specific models like H100 or A100 + if "H100" in model_name or "H200" in model_name: + return "H100" + elif "A100" in model_name: + return "A100" + elif "L40S" in model_name: + return "L40S" + elif "B200" in model_name: + return "B200" + except pynvml.NVMLError as error: + print(f"Failed to get GPU info: {error}") + finally: + pynvml.nvmlShutdown() + + # return "Other" in case of an unrecognized GPU model or an NVML error + return "Other" + + +class GPUArchitectureNotSupported(Exception): + """ + Custom exception raised when the expected GPU architecture is not supported.
+ """ + + pass + + +def print_gpu_mem(str=None): + try: + pynvml.nvmlInit() + meminfo = pynvml.nvmlDeviceGetMemoryInfo(pynvml.nvmlDeviceGetHandleByIndex(0)) + logging.info( + f"{str}: {meminfo.used / 1024 / 1024}/{meminfo.total / 1024 / 1024}MiB used ({meminfo.free / 1024 / 1024}MiB free)" + ) + except pynvml.NVMLError as error: + print(f"Failed to get GPU memory info: {error}") + + +def force_gc(): + print_gpu_mem() + print("gc()") + gc.collect() + print_gpu_mem() + print("empty cuda cache") + # print(torch.cuda.memory_summary()) + print_gpu_mem() + + +def gpu0_has_80gb_or_less(): + try: + pynvml.nvmlInit() + meminfo = pynvml.nvmlDeviceGetMemoryInfo(pynvml.nvmlDeviceGetHandleByIndex(0)) + return meminfo.total / 1024 / 1024 / 1024 <= 80 + except pynvml.NVMLError as error: + print(f"Failed to get GPU memory info: {error}") + + +class Device: + + + _nvml_affinity_elements = math.ceil(os.cpu_count() / 64) # type: ignore + + def __init__(self, device_idx: int): + + super().__init__() + self.handle = pynvml.nvmlDeviceGetHandleByIndex(device_idx) + + def get_name(self) -> str: + + return pynvml.nvmlDeviceGetName(self.handle) + + def get_cpu_affinity(self) -> list[int]: + + affinity_string = "" + for j in pynvml.nvmlDeviceGetCpuAffinity(self.handle, Device._nvml_affinity_elements): + # assume nvml returns list of 64 bit ints + affinity_string = "{:064b}".format(j) + affinity_string + affinity_list = [int(x) for x in affinity_string] + affinity_list.reverse() # so core 0 is in 0th element of list + return [i for i, e in enumerate(affinity_list) if e != 0] + + +def with_torch_device(device): + """ + Decorator factory that wraps a function to execute within a specific torch device context. + + This decorator ensures that all tensor allocations and operations within the decorated + function use the specified device by default. + + Args: + device: The torch device to use (e.g., 'cuda', 'cuda:0', 'cpu', or torch.device object). 
+ + Returns: + A decorator function that wraps the target function with the specified device context. + + Example: + @with_torch_device('cuda:0') + def create_tensors(): + x = torch.randn(10, 10) # Will be created on cuda:0 + return x + """ + import torch + + def decorator(fn): + @wraps(fn) + def wrapper(*args, **kwargs): + with torch.device(device): + return fn(*args, **kwargs) + + return wrapper + + return decorator diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/disabled_train.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/disabled_train.py new file mode 100644 index 00000000..af961d31 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/disabled_train.py @@ -0,0 +1,22 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any + + +def disabled_train(self: Any, mode: bool = True) -> Any: + """Overwrite model.train with this function to make sure train/eval mode + does not change anymore.""" + return self diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/distributed.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/distributed.py new file mode 100644 index 00000000..6a54442a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/distributed.py @@ -0,0 +1,491 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import collections +import collections.abc +import ctypes +import functools +import os +from contextlib import contextmanager +from datetime import timedelta +from typing import TYPE_CHECKING, Any, Callable, Container, Optional + +import pynvml +import torch +import torch.distributed as dist +from torch.distributed import get_process_group_ranks + +from cosmos3._src.imaginaire.flags import INTERNAL +from cosmos3._src.imaginaire.utils.device import Device + +if dist.is_available(): + from torch.distributed.distributed_c10d import _get_default_group + from torch.distributed.utils import _sync_module_states, _verify_param_shape_across_processes + +from cosmos3._src.imaginaire.utils import log + +if TYPE_CHECKING: + from cosmos3._src.imaginaire.config import DDPConfig + + +def init() -> int | None: + """Initialize distributed training.""" + if dist.is_initialized(): + return torch.cuda.current_device() + + # Set GPU affinity. + pynvml.nvmlInit() + local_rank = int(os.getenv("LOCAL_RANK", 0)) + try: + device = Device(local_rank) + os.sched_setaffinity(0, device.get_cpu_affinity()) + except pynvml.NVMLError as e: + log.warning(f"Failed to set device affinity: {e}") + # Set up NCCL communication. 
+ os.environ["TORCH_NCCL_BLOCKING_WAIT"] = "0" + os.environ["TORCH_NCCL_ASYNC_ERROR_HANDLING"] = "1" + if dist.is_available(): + torch.cuda.set_device(local_rank) + # Get the timeout value from environment variable + timeout_seconds = os.getenv("TORCH_NCCL_HEARTBEAT_TIMEOUT_SEC", 1800) + # Convert the timeout to an integer (if it isn't already) and then to a timedelta + timeout_timedelta = timedelta(seconds=int(timeout_seconds)) + dist.init_process_group(backend="nccl", init_method="env://", timeout=timeout_timedelta) + log.critical( + f"Initialized distributed training with local rank {local_rank} with timeout {timeout_seconds}", + rank0_only=False, + ) + # Increase the L2 fetch granularity for faster speed. + # For oss, we need to search for the library in site-packages. + if INTERNAL: + _libcudart = ctypes.CDLL("libcudart.so") + # Set device limit on the current device. + p_value = ctypes.cast((ctypes.c_int * 1)(), ctypes.POINTER(ctypes.c_int)) + _libcudart.cudaDeviceSetLimit(ctypes.c_int(0x05), ctypes.c_int(128)) + _libcudart.cudaDeviceGetLimit(p_value, ctypes.c_int(0x05)) + log.info(f"Training with {get_world_size()} GPUs.") + + +def get_rank(group: Optional[dist.ProcessGroup] = None) -> int: + """Get the rank (GPU device) of the worker. + + Returns: + rank (int): The rank of the worker. + """ + rank = 0 + if dist.is_available() and dist.is_initialized(): + rank = dist.get_rank(group) + return rank + + +def get_world_size(group: Optional[dist.ProcessGroup] = None) -> int: + """Get world size. How many GPUs are available in this job. + + Returns: + world_size (int): The total number of GPUs available in this job. + """ + world_size = 1 + if dist.is_available() and dist.is_initialized(): + world_size = dist.get_world_size(group) + return world_size + + +def is_rank0() -> bool: + """Check if current process is the master GPU. + + Returns: + (bool): True if this function is called from the master GPU, else False. 
+ """ + return get_rank() == 0 + + +def is_local_rank0() -> bool: + """Check if current process is the local master GPU in the current node. + + Returns: + (bool): True if this function is called from the local master GPU, else False. + """ + return torch.cuda.current_device() == 0 + + +def rank0_only(func: Callable) -> Callable: + """Apply this function only to the master GPU. + + Example usage: + @rank0_only + def func(x): + return x + 3 + + Args: + func (Callable): a function. + + Returns: + (Callable): A function wrapper executing the function only on the master GPU. + """ + + @functools.wraps(func) + def wrapper(*args, **kwargs): # noqa: ANN202 + if is_rank0(): + return func(*args, **kwargs) + else: + return None + + return wrapper + + +def barrier() -> None: + """Barrier for all GPUs.""" + if dist.is_available() and dist.is_initialized(): + dist.barrier() + + +def rank0_first(func: Callable) -> Callable: + """Run the function on rank 0 first, then on other ranks.""" + + @functools.wraps(func) + def wrapper(*args, **kwargs): # noqa: ANN202 + if is_rank0(): + result = func(*args, **kwargs) + barrier() + if not is_rank0(): + result = func(*args, **kwargs) + return result + + return wrapper + + +def parallel_model_wrapper(config_ddp: DDPConfig, model: torch.nn.Module) -> torch.nn.Module | DistributedDataParallel: + """Wraps the model to enable data parallelism for training across multiple GPU devices. + + Args: + config_ddp (DDPConfig): The data parallel config. + model (torch.nn.Module): The PyTorch module. + + Returns: + model (torch.nn.Module | DistributedDataParallel): The data parallel model wrapper + if distributed environment is available, otherwise return the original model.
+ """ + if dist.is_available() and dist.is_initialized(): + local_rank = int(os.getenv("LOCAL_RANK", 0)) + try: + from megatron.core import parallel_state + + ddp_group = parallel_state.get_data_parallel_group(with_context_parallel=True) + except Exception as e: + log.info(e) + log.info("parallel_state not initialized, treating all GPUs equally for DDP") + ddp_group = None + + model = DistributedDataParallel( + model, + device_ids=[local_rank], + output_device=local_rank, + find_unused_parameters=config_ddp.find_unused_parameters, + static_graph=config_ddp.static_graph, + broadcast_buffers=config_ddp.broadcast_buffers, + process_group=ddp_group, + ) + return model + + +class DistributedDataParallel(torch.nn.parallel.DistributedDataParallel): + """This extends torch.nn.parallel.DistributedDataParallel with .training_step(). + + This borrows the concept of `forward-redirection` from Pytorch lightning. It wraps an ImaginaireModel such that + model.training_step() would be executed when calling self.training_step(), while preserving the behavior of calling + model() for Pytorch modules. Internally, this is a double rerouting mechanism (training_step -> forward -> + training_step), allowing us to preserve the function names and signatures. + """ + + def __init__(self, model: torch.nn.Module, *args, **kwargs): + super().__init__(model, *args, **kwargs) + self.show_sync_grad_static_graph_warning = True + + def training_step(self, *args, **kwargs) -> Any: + # Cache the original model.forward() method. + original_forward = self.module.forward + + def wrapped_training_step(*_args, **_kwargs): # noqa: ANN202 + # Unpatch immediately before calling training_step() because itself may want to call the real forward. + self.module.forward = original_forward + # The actual .training_step(). + return self.module.training_step(*_args, **_kwargs) + + # Patch the original_module's forward so we can redirect the arguments back to the real method. 
+ self.module.forward = wrapped_training_step + # Call self, which implicitly calls self.forward() --> model.forward(), which is now model.training_step(). + # Without calling self.forward() or model.forward() explicitly, implicit hooks are also executed. + return self(*args, **kwargs) + + +@contextmanager +def ddp_sync_grad(model, enabled): + r""" + Context manager to enable/disable gradient synchronizations across DDP processes for a DDP model. + Modified from: + https://pytorch.org/docs/stable/_modules/torch/nn/parallel/distributed.html#DistributedDataParallel.no_sync + Note that this is incompatible with static_graph=True and will be a no-op if static_graph=True. + + Within this context, gradients will be accumulated on module + variables, which will later be synchronized in the first + forward-backward pass exiting the context. + + .. warning:: + The forward pass should be included inside the context manager, or + else gradients will still be synchronized. + """ + assert isinstance(model, torch.nn.Module) + if isinstance(model, DistributedDataParallel): + old_require_backward_grad_sync = model.require_backward_grad_sync + if model.static_graph and model.require_backward_grad_sync != enabled: + if model.show_sync_grad_static_graph_warning: + log.warning("DDP static_graph=True is incompatible with sync_grad(). Performance will be reduced.") + model.show_sync_grad_static_graph_warning = False + else: + model.require_backward_grad_sync = enabled + try: + yield + finally: + if isinstance(model, DistributedDataParallel): + model.require_backward_grad_sync = old_require_backward_grad_sync + + +def collate_batches(data_batches: list[dict[str, torch.Tensor]]) -> torch.Tensor | dict[str, torch.Tensor]: + """Aggregate the list of data batches from all devices and process the results. + + This is used for gathering validation data batches with cosmos3._src.imaginaire.utils.dataloader.DistributedEvalSampler.
+ It will return the data/output of the entire validation set in its original index order. The sizes of data_batches + in different ranks may differ by 1 (if the dataset size is not evenly divisible), in which case a dummy sample will be + created before calling dist.all_gather(). + + Args: + data_batches (list[dict[str, torch.Tensor]]): List of tensors or (hierarchical) dictionary where + leaf entries are tensors. + + Returns: + data_gather (torch.Tensor | dict[str, torch.Tensor]): tensors or (hierarchical) dictionary where + leaf entries are concatenated tensors. + """ + if isinstance(data_batches[0], torch.Tensor): + # Concatenate the local data batches. + data_concat = torch.cat(data_batches, dim=0) # type: ignore + # Get the largest number of local samples from all ranks to determine whether to dummy-pad on this rank. + max_num_local_samples = torch.tensor(len(data_concat), device="cuda") + dist.all_reduce(max_num_local_samples, op=dist.ReduceOp.MAX) + if len(data_concat) < max_num_local_samples: + assert len(data_concat) + 1 == max_num_local_samples + dummy = torch.empty_like(data_concat[:1]) + data_concat = torch.cat([data_concat, dummy], dim=0) + dummy_count = torch.tensor(1, device="cuda") + else: + dummy_count = torch.tensor(0, device="cuda") + # Get all concatenated batches from all ranks and concatenate again. + dist.all_reduce(dummy_count, op=dist.ReduceOp.SUM) + data_concat = all_gather_tensor(data_concat.contiguous()) + data_collate = torch.stack(data_concat, dim=1).flatten(start_dim=0, end_dim=1) + # Remove the dummy samples.
+ if dummy_count > 0: + data_collate = data_collate[:-dummy_count] + elif isinstance(data_batches[0], collections.abc.Mapping): + data_collate = dict() + for key in data_batches[0].keys(): + data_collate[key] = collate_batches([data[key] for data in data_batches]) # type: ignore + else: + raise TypeError + return data_collate + + +@torch.no_grad() +def all_gather_tensor(tensor: torch.Tensor) -> list[torch.Tensor]: + """Gather the corresponding tensor from all GPU devices to a list. + + Args: + tensor (torch.Tensor): Pytorch tensor. + + Returns: + tensor_list (list[torch.Tensor]): A list of Pytorch tensors gathered from all GPU devices. + """ + tensor_list = [torch.zeros_like(tensor) for _ in range(get_world_size())] + dist.all_gather(tensor_list, tensor) + return tensor_list + + +def gather_object(payload: Any) -> list[Any] | None: + """Gather the corresponding object from all GPU devices to a rank 0 hosted list. + + Args: + payload: Any pickle-able object. + + Returns: + payload_list (list[Any]) | None: + Rank 0: A list of the objects gathered from all ranks.
+ Rest : None + """ + rank, world_size = get_rank(), get_world_size() + payload_gathered = [None] * world_size if rank == 0 else None + dist.gather_object(payload, object_gather_list=payload_gathered, dst=0) + return payload_gathered + + +def broadcast(tensor, src, group=None, async_op=False): + world_size = get_world_size() + if world_size < 2: + return tensor + dist.broadcast(tensor, src=src, group=group, async_op=async_op) + + +def dist_reduce_tensor(tensor, rank=0, reduce="mean"): + r"""Reduce to rank 0""" + world_size = get_world_size() + if world_size < 2: + return tensor + with torch.no_grad(): + dist.reduce(tensor, dst=rank) + if get_rank() == rank: + if reduce == "mean": + tensor /= world_size + elif reduce == "sum": + pass + else: + raise NotImplementedError + return tensor + + +def sync_model_states( + model: torch.nn.Module, + process_group: Optional[dist.ProcessGroup] = None, + src: int = 0, + params_and_buffers_to_ignore: Optional[Container[str]] = None, + broadcast_buffers: bool = True, +): + """ + Modify based on DDP source code + Synchronizes the parameters and buffers of a model across different processes in a distributed setting. + + This function ensures that all processes in the specified process group have the same initial parameters and + buffers from the source rank, typically rank 0. It is useful when different processes start with different model + states and a synchronization is required to ensure consistency across all ranks. + + Args: + model (nn.Module): The model whose parameters and buffers are to be synchronized. + process_group (dist.ProcessGroup, optional): The process group for communication. If None, + the default group is used. Defaults to None. + src (int, optional): The source rank from which parameters and buffers will be broadcasted. + Defaults to 0. + params_and_buffers_to_ignore (Optional[Container[str]], optional): A container of parameter and buffer + names to exclude from synchronization. 
Defaults to None, which means all parameters and buffers are + included. + broadcast_buffers (bool, optional): Whether to broadcast buffers or not. Defaults to True. + + Side Effects: + This function modifies the state of the model in-place to synchronize it with the source rank's model state. + + Raises: + RuntimeError: If the shapes of parameters across processes do not match, a runtime error will be raised. + + Examples: + >>> # Download the model weights from S3 on rank 0 only instead of duplicating the download on every rank, + >>> # which saves network bandwidth and time when the model weights are huge. + >>> if dist.get_rank() == 0: + >>> model.load_state_dict(network_bound_weights_download_fn(s3_weights_path)) + >>> dist.barrier() + >>> sync_model_states(model) # sync rank0 weights to other ranks + """ + if not dist.is_available() or not dist.is_initialized(): + return + if process_group is None: + process_group = _get_default_group() + if not params_and_buffers_to_ignore: + params_and_buffers_to_ignore = set() + + log.info( + f"Synchronizing model states from rank {src} to all ranks in process group {get_process_group_ranks(process_group)}." + ) + + # Build a list of (module, parameter) pairs for all parameters that are not explicitly ignored. + modules_and_parameters = [ + (module, parameter) + for module_name, module in model.named_modules() + for parameter in [ + param + # Note that we access module.named_parameters instead of + # parameters(module). parameters(module) is only needed in the + # single-process multi device case, where it accesses replicated + # parameters through _former_parameters. + for param_name, param in module.named_parameters(recurse=False) + if f"{module_name}.{param_name}" not in params_and_buffers_to_ignore + # if param.requires_grad + # and f"{module_name}.{param_name}" not in params_and_buffers_to_ignore + ] + ] + + # Deduplicate any parameters that might be shared across child modules. + memo = set() + modules_and_parameters = [ + # "p not in memo" is the deduplication check.
+ # "not memo.add(p)" is always True, and it's only there to cause "add(p)" if needed. + (m, p) + for m, p in modules_and_parameters + if p not in memo and not memo.add(p) # type: ignore[func-returns-value] + ] + + # Build list of parameters. + parameters = [parameter for _, parameter in modules_and_parameters] + if len(parameters) == 0: + return + + _verify_param_shape_across_processes(process_group, parameters) + + _sync_module_states( + module=model, + process_group=process_group, + broadcast_bucket_size=int(250 * 1024 * 1024), + src=src, + params_and_buffers_to_ignore=params_and_buffers_to_ignore, + broadcast_buffers=broadcast_buffers, + ) + + +def all_gather_object(payload: Any) -> list[Any]: + """Gather the corresponding object from all GPU devices to all ranks.""" + world_size = get_world_size() + payload_gathered = [None] * world_size + dist.all_gather_object(payload_gathered, payload) + return payload_gathered # type: ignore[return-value] + + +def broadcast_object(object, *args, **kwargs): + """Broadcast an object to all GPUs.""" + if not dist.is_available() or not dist.is_initialized(): + return object + object_list = [object] + dist.broadcast_object_list(object_list, *args, **kwargs) + return object_list[0] + + +def broadcast_object_list(object_list, *args, **kwargs): + """Broadcast an object list to all GPUs.
(the list is inplace edited)""" + if not dist.is_available() or not dist.is_initialized(): + return None + else: + dist.broadcast_object_list(object_list, *args, **kwargs) + + +def destroy_process_group(): + if not dist.is_available() or not dist.is_initialized(): + return + dist.destroy_process_group() diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/__init__.py new file mode 100644 index 00000000..c6c15692 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/__init__.py @@ -0,0 +1,38 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from cosmos3._src.imaginaire.flags import TRAINING +from cosmos3._src.imaginaire.utils.easy_io.backends.base_backend import BaseStorageBackend +from cosmos3._src.imaginaire.utils.easy_io.backends.http_backend import HTTPBackend +from cosmos3._src.imaginaire.utils.easy_io.backends.local_backend import LocalBackend +from cosmos3._src.imaginaire.utils.easy_io.backends.registry_utils import backends, prefix_to_backends, register_backend + +__all__ = [ + "BaseStorageBackend", + "LocalBackend", + "HTTPBackend", + "register_backend", + "backends", + "prefix_to_backends", +] + +if TRAINING: + from cosmos3._src.imaginaire.utils.easy_io.backends.boto3_backend import Boto3Backend + from cosmos3._src.imaginaire.utils.easy_io.backends.msc_backend import MSCBackend + + __all__ += [ + "Boto3Backend", + "MSCBackend", + ] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/auto_auth.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/auto_auth.py new file mode 100644 index 00000000..3858402c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/auto_auth.py @@ -0,0 +1,70 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import contextlib +import json +from collections.abc import Generator +from typing import IO, Any, Optional, Union + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.env_parsers.cred_env_parser import CRED_ENVS, CRED_ENVS_DICT + +DEPLOYMENT_ENVS = ["prod", "dev", "stg"] + + +# Context manager that opens a credential file or reads the credentials from environment variables. +@contextlib.contextmanager +def open_auth(s3_credential_path: Optional[Any], mode: str) -> Generator[Union[None, dict[str, Any], IO], None, None]: + if not s3_credential_path: + log.info(f"No credential file provided {s3_credential_path}.") + yield None + return + + name = s3_credential_path.split("/")[-1].split(".")[0] + if not name: + raise ValueError(f"Could not parse into env var: {s3_credential_path}") + cred_env_name = f"PROD_{name.upper()}" + + if CRED_ENVS.APP_ENV in DEPLOYMENT_ENVS and cred_env_name in CRED_ENVS_DICT: + object_storage_config = get_creds_from_env(cred_env_name) + log.info(f"using ENV vars for {cred_env_name}") + yield object_storage_config + else: + log.info(f"using credential file: {s3_credential_path}") + with open(s3_credential_path, mode) as f: + yield f + + +def get_creds_from_env(cred_env_name: str) -> dict[str, Any]: + try: + object_storage_config = CRED_ENVS_DICT[cred_env_name] + except KeyError: + raise ValueError(f"Could not find {cred_env_name} in CRED_ENVS") + empty_args = {key.upper() for key in object_storage_config if object_storage_config[key] == ""} + if empty_args: + raise ValueError(f"Some required environment variable(s) were not provided for {cred_env_name}",
empty_args) + return object_storage_config + + +def json_load_auth(f: Union[None, dict[str, Any], IO]) -> dict[str, Any]: + # None. + if f is None: + return {} + # dict[str, Any]. + elif isinstance(f, dict): + return f + # IO. + else: + return json.load(f) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/base_backend.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/base_backend.py new file mode 100644 index 00000000..1c3853b4 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/base_backend.py @@ -0,0 +1,147 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import io +import os +import os.path as osp +from abc import ABCMeta, abstractmethod +from collections.abc import Generator, Iterator +from contextlib import contextmanager +from pathlib import Path +from typing import Optional, Union + + +def mkdir_or_exist(dir_name, mode=0o777): + if dir_name == "": + return + dir_name = osp.expanduser(dir_name) + os.makedirs(dir_name, mode=mode, exist_ok=True) + + +def has_method(obj, method): + return hasattr(obj, method) and callable(getattr(obj, method)) + + +class BaseStorageBackend(metaclass=ABCMeta): + """Abstract class of storage backends.""" + + # a flag to indicate whether the backend can create a symlink for a file + # This attribute will be deprecated in future. + _allow_symlink: bool = False + + @property + def allow_symlink(self) -> bool: + return self._allow_symlink + + @property + def name(self) -> str: + return self.__class__.__name__ + + @abstractmethod + def size(self, filepath: Union[str, Path]) -> int: + pass + + @abstractmethod + def get(self, filepath: Union[str, Path], offset: Optional[int] = None, size: Optional[int] = None) -> bytes: + pass + + @abstractmethod + def get_text(self, filepath: Union[str, Path], encoding: str = "utf-8") -> str: + pass + + @abstractmethod + def put(self, obj: Union[bytes, io.BytesIO], filepath: Union[str, Path]) -> None: + pass + + @abstractmethod + def put_text(self, obj: str, filepath: Union[str, Path], encoding: str = "utf-8") -> None: + pass + + @abstractmethod + def exists(self, filepath: Union[str, Path]) -> bool: + pass + + @abstractmethod + def isdir(self, filepath: Union[str, Path]) -> bool: + pass + + @abstractmethod + def isfile(self, filepath: Union[str, Path]) -> bool: + pass + + @abstractmethod + def join_path(self, filepath: Union[str, Path], *filepaths: Union[str, Path]) -> str: + pass + + @abstractmethod + @contextmanager + def get_local_path(self, filepath: Union[str, Path]) -> Generator[Union[str, Path], None, None]: + pass + + @abstractmethod + def 
copyfile(self, src: Union[str, Path], dst: Union[str, Path]) -> str: + pass + + @abstractmethod + def copytree(self, src: Union[str, Path], dst: Union[str, Path]) -> str: + pass + + @abstractmethod + def copyfile_from_local(self, src: Union[str, Path], dst: Union[str, Path]) -> str: + pass + + @abstractmethod + def copytree_from_local(self, src: Union[str, Path], dst: Union[str, Path]) -> str: + pass + + @abstractmethod + def copyfile_to_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + dst_type: str, # Choose from ["file", "dir"] + ) -> Union[str, Path]: + pass + + @abstractmethod + def copytree_to_local(self, src: Union[str, Path], dst: Union[str, Path]) -> Union[str, Path]: + pass + + @abstractmethod + def remove(self, filepath: Union[str, Path]) -> None: + pass + + @abstractmethod + def rmtree(self, dir_path: Union[str, Path]) -> None: + pass + + @abstractmethod + def copy_if_symlink_fails(self, src: Union[str, Path], dst: Union[str, Path]) -> bool: + pass + + @abstractmethod + def list_dir(self, dir_path: Union[str, Path]) -> Generator[str, None, None]: + pass + + @abstractmethod + def list_dir_or_file( # pylint: disable=too-many-arguments + self, + dir_path: Union[str, Path], + list_dir: bool = True, + list_file: bool = True, + suffix: Optional[Union[str, tuple[str]]] = None, + recursive: bool = False, + ) -> Iterator[str]: + pass diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/boto3_backend.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/boto3_backend.py new file mode 100644 index 00000000..b0b9957e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/boto3_backend.py @@ -0,0 +1,862 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import io +import os +import re +import tempfile +from collections.abc import Generator, Iterator +from contextlib import contextmanager +from pathlib import Path +from shutil import SameFileError +from typing import Optional, Union + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.easy_io.backends.base_backend import BaseStorageBackend, has_method, mkdir_or_exist +from cosmos3._src.imaginaire.utils.easy_io.backends.boto3_client import Boto3Client + + +class Boto3Backend(BaseStorageBackend): + """boto3 storage backend (for internal usage). + + **Deprecated**. Use the MSC backend instead. + + Boto3Backend supports reading and writing data to multiple clusters. + If the file path contains the cluster name, Boto3Backend will read data + from specified cluster or write data to it. Otherwise, Boto3Backend will + access the default cluster. + + Args: + path_mapping (dict, optional): Path mapping dict from local path to + Boto3 path. When ``path_mapping={'src': 'dst'}``, ``src`` in + ``filepath`` will be replaced by ``dst``. Defaults to None. + s3_credential_path (str, optional): Config path of Boto3 client. Default: None. + `New in version 0.3.3`. 
+ + Examples: + >>> backend = Boto3Backend() + >>> filepath1 = 's3://path/of/file' + >>> filepath2 = 'cluster-name:s3://path/of/file' + >>> backend.get(filepath1) # get data from default cluster + >>> backend.get(filepath2) # get data from 'cluster-name' cluster + """ + + def __init__( + self, + s3_credential_path: str = "", + path_mapping: Optional[dict] = None, + ): + self._client = Boto3Client(s3_credential_path=s3_credential_path) + assert isinstance(path_mapping, dict) or path_mapping is None + self.path_mapping = path_mapping + if path_mapping: + for k, v in path_mapping.items(): + log.critical(f"Path mapping: {k} -> {v}", rank0_only=False) + + def _map_path(self, filepath: Union[str, Path]) -> str: + """Map ``filepath`` to a string path whose prefix will be replaced by + :attr:`self.path_mapping`. + + Args: + filepath (str or Path): Path to be mapped. + """ + filepath = str(filepath) + if self.path_mapping is not None: + for k, v in self.path_mapping.items(): + filepath = filepath.replace(k, v, 1) + return filepath + + def _format_path(self, filepath: str) -> str: + """Convert a ``filepath`` to standard format of s3 oss. + + If the ``filepath`` is concatenated by ``os.path.join``, in a Windows + environment, the ``filepath`` will be the format of + 's3://bucket_name\\image.jpg'. By invoking :meth:`_format_path`, the + above ``filepath`` will be converted to 's3://bucket_name/image.jpg'. + + Args: + filepath (str): Path to be formatted. + """ + return re.sub(r"\\+", "/", filepath) + + def _replace_prefix(self, filepath: Union[str, Path]) -> str: + filepath = str(filepath) + return filepath + # return filepath.replace('s3://', 's3://') + + def size(self, filepath: Union[str, Path]) -> int: + """Get the file size in bytes for a given ``filepath``. + + Args: + filepath (str or Path): Path to get file size in bytes. + + Returns: + int: File size in bytes for filepath.
+ + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/file' + >>> backend.size(filepath) # file containing 'hello world' + 11 + """ + filepath = self._map_path(filepath) + filepath = self._format_path(filepath) + filepath = self._replace_prefix(filepath) + return self._client.size(filepath) + + def get(self, filepath: Union[str, Path], offset: Optional[int] = None, size: Optional[int] = None) -> bytes: + """Read bytes from a given ``filepath`` with 'rb' mode in range [offset, offset + size). + + Args: + filepath (str or Path): Path to read data. + offset (int, optional): Read offset in bytes (0-index). Defaults to 0. + size (int, optional): Read size in bytes. Defaults to the file size. + + Returns: + bytes: Return bytes read from filepath. + + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/file' + >>> backend.get(filepath) + b'hello world' + """ + filepath = self._map_path(filepath) + filepath = self._format_path(filepath) + filepath = self._replace_prefix(filepath) + value = self._client.get(filepath=filepath, offset=offset, size=size) + return value + + def get_text( + self, + filepath: Union[str, Path], + encoding: str = "utf-8", + ) -> str: + """Read text from a given ``filepath`` with 'r' mode. + + Args: + filepath (str or Path): Path to read data. + encoding (str): The encoding format used to open the ``filepath``. + Defaults to 'utf-8'. + + Returns: + str: Expected text reading from ``filepath``. + + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/file' + >>> backend.get_text(filepath) + 'hello world' + """ + return str(self.get(filepath), encoding=encoding) + + def put(self, obj: Union[bytes, io.BytesIO], filepath: Union[str, Path]) -> None: + """Write bytes to a given ``filepath``. + + Args: + obj (bytes): Data to be saved. + filepath (str or Path): Path to write data. 
+ + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/file' + >>> backend.put(b'hello world', filepath) + """ + filepath = self._map_path(filepath) + filepath = self._format_path(filepath) + filepath = self._replace_prefix(filepath) + self._client.put(obj, filepath) + + def fast_put(self, obj: Union[bytes, io.BytesIO], filepath: Union[str, Path], num_processes: int = 32) -> None: + """Write bytes to a given ``filepath`` asynchronously using multiple processes.""" + assert num_processes > 1 + filepath = self._map_path(filepath) + filepath = self._format_path(filepath) + filepath = self._replace_prefix(filepath) + self._client.fast_put(obj, filepath, num_processes=num_processes) + + def put_text( + self, + obj: str, + filepath: Union[str, Path], + encoding: str = "utf-8", + ) -> None: + """Write text to a given ``filepath``. + + Args: + obj (str): Data to be written. + filepath (str or Path): Path to write data. + encoding (str): The encoding format used to encode the ``obj``. + Defaults to 'utf-8'. + + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/file' + >>> backend.put_text('hello world', filepath) + """ + self.put(bytes(obj, encoding=encoding), filepath) + + def exists(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path exists. + + Args: + filepath (str or Path): Path whose existence is checked. + + Returns: + bool: Return ``True`` if ``filepath`` exists, ``False`` otherwise. + + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/file' + >>> backend.exists(filepath) + True + """ + if not (has_method(self._client, "contains") and has_method(self._client, "isdir")): + raise NotImplementedError( + "Current version of Boto3 Python SDK has not supported " + "the `contains` and `isdir` methods, please use a higher " + "version or dev branch instead."
+ ) + + filepath = self._map_path(filepath) + filepath = self._format_path(filepath) + filepath = self._replace_prefix(filepath) + return self._client.contains(filepath) or self._client.isdir(filepath) + + def isdir(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path is a directory. + + Args: + filepath (str or Path): Path to be checked whether it is a + directory. + + Returns: + bool: Return ``True`` if ``filepath`` points to a directory, + ``False`` otherwise. + + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/dir' + >>> backend.isdir(filepath) + True + """ + if not has_method(self._client, "isdir"): + raise NotImplementedError( + "Current version of Boto3 Python SDK has not supported " + "the `isdir` method, please use a higher version or dev" + " branch instead." + ) + + filepath = self._map_path(filepath) + filepath = self._format_path(filepath) + filepath = self._replace_prefix(filepath) + return self._client.isdir(filepath) + + def isfile(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path is a file. + + Args: + filepath (str or Path): Path to be checked whether it is a file. + + Returns: + bool: Return ``True`` if ``filepath`` points to a file, ``False`` + otherwise. + + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/file' + >>> backend.isfile(filepath) + True + """ + if not has_method(self._client, "contains"): + raise NotImplementedError( + "Current version of Boto3 Python SDK has not supported " + "the `contains` method, please use a higher version or " + "dev branch instead." + ) + + filepath = self._map_path(filepath) + filepath = self._format_path(filepath) + filepath = self._replace_prefix(filepath) + return self._client.contains(filepath) + + def join_path( + self, + filepath: Union[str, Path], + *filepaths: Union[str, Path], + ) -> str: + r"""Concatenate all file paths. + + Join one or more filepath components intelligently. 
The return value + is the concatenation of filepath and any members of \*filepaths. + + Args: + filepath (str or Path): Path to be concatenated. + + Returns: + str: The result after concatenation. + + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/file' + >>> backend.join_path(filepath, 'another/path') + 's3://path/of/file/another/path' + >>> backend.join_path(filepath, '/another/path') + 's3://path/of/file/another/path' + """ + filepath = self._format_path(self._map_path(filepath)) + if filepath.endswith("/"): + filepath = filepath[:-1] + formatted_paths = [filepath] + for path in filepaths: + formatted_path = self._format_path(self._map_path(path)) + formatted_paths.append(formatted_path.lstrip("/")) + + return "/".join(formatted_paths) + + @contextmanager + def get_local_path( + self, + filepath: Union[str, Path], + ) -> Generator[Union[str, Path], None, None]: + """Download a file from ``filepath`` to a local temporary directory, + and return the temporary path. + + ``get_local_path`` is decorated by :meth:`contextlib.contextmanager`. It + can be called with a ``with`` statement, and on exit from the + ``with`` statement, the temporary path will be released. + + Args: + filepath (str or Path): Download a file from ``filepath``. + + Yields: + Iterable[str]: Only yield one temporary path. + + Examples: + >>> backend = Boto3Backend() + >>> # After exiting from the ``with`` clause, + >>> # the path will be removed + >>> filepath = 's3://path/of/file' + >>> with backend.get_local_path(filepath) as path: + ... # do something here + """ + assert self.isfile(filepath) + try: + f = tempfile.NamedTemporaryFile(delete=False) + f.write(self.get(filepath)) + f.close() + yield f.name + finally: + os.remove(f.name) + + def copyfile( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Copy a file src to dst and return the destination file. + + src and dst should have the same prefix.
If dst specifies a directory, + the file will be copied into dst using the base filename from src. If + dst specifies a file that already exists, it will be replaced. + + Args: + src (str or Path): A file to be copied. + dst (str or Path): Copy file to dst. + + Returns: + str: The destination file. + + Raises: + SameFileError: If src and dst are the same file, a SameFileError + will be raised. + + Examples: + >>> backend = Boto3Backend() + >>> # dst is a file + >>> src = 's3://path/of/file' + >>> dst = 's3://path/of/file1' + >>> backend.copyfile(src, dst) + 's3://path/of/file1' + + >>> # dst is a directory + >>> dst = 's3://path/of/dir' + >>> backend.copyfile(src, dst) + 's3://path/of/dir/file' + """ + src = self._format_path(self._map_path(src)) + dst = self._format_path(self._map_path(dst)) + if self.isdir(dst): + dst = self.join_path(dst, src.split("/")[-1]) + + if src == dst: + raise SameFileError("src and dst should not be same") + + self.put(self.get(src), dst) + return dst + + def copytree( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Recursively copy an entire directory tree rooted at src to a + directory named dst and return the destination directory. + + src and dst should have the same prefix. + + Args: + src (str or Path): A directory to be copied. + dst (str or Path): Copy directory to dst. + backend_args (dict, optional): Arguments to instantiate the + prefix of uri corresponding backend. Defaults to None. + + Returns: + str: The destination directory. + + Raises: + FileExistsError: If dst had already existed, a FileExistsError will + be raised. 
+ + Examples: + >>> backend = Boto3Backend() + >>> src = 's3://path/of/dir' + >>> dst = 's3://path/of/dir1' + >>> backend.copytree(src, dst) + 's3://path/of/dir1' + """ + src = self._format_path(self._map_path(src)) + dst = self._format_path(self._map_path(dst)) + + if self.exists(dst): + raise FileExistsError("dst should not exist") + + for path in self.list_dir_or_file(src, list_dir=False, recursive=True): + src_path = self.join_path(src, path) + dst_path = self.join_path(dst, path) + self.put(self.get(src_path), dst_path) + + return dst + + def copyfile_from_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Upload a local file src to dst and return the destination file. + + Args: + src (str or Path): A local file to be copied. + dst (str or Path): Copy file to dst. + backend_args (dict, optional): Arguments to instantiate the + prefix of uri corresponding backend. Defaults to None. + + Returns: + str: If dst specifies a directory, the file will be copied into dst + using the base filename from src. + + Examples: + >>> backend = Boto3Backend() + >>> # dst is a file + >>> src = 'path/of/your/file' + >>> dst = 's3://path/of/file1' + >>> backend.copyfile_from_local(src, dst) + 's3://path/of/file1' + + >>> # dst is a directory + >>> dst = 's3://path/of/dir' + >>> backend.copyfile_from_local(src, dst) + 's3://path/of/dir/file' + """ + dst = self._format_path(self._map_path(dst)) + if self.isdir(dst): + dst = self.join_path(dst, os.path.basename(src)) + + with open(src, "rb") as f: + self.put(f.read(), dst) + + return dst + + def copytree_from_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Recursively copy an entire directory tree rooted at src to a + directory named dst and return the destination directory. + + Args: + src (str or Path): A local directory to be copied. + dst (str or Path): Copy directory to dst. + + Returns: + str: The destination directory. 
+ + Raises: + FileExistsError: If dst already exists, a FileExistsError will + be raised. + + Examples: + >>> backend = Boto3Backend() + >>> src = 'path/of/your/dir' + >>> dst = 's3://path/of/dir1' + >>> backend.copytree_from_local(src, dst) + 's3://path/of/dir1' + """ + dst = self._format_path(self._map_path(dst)) + if self.exists(dst): + raise FileExistsError("dst should not exist") + + src = str(src) + + for cur_dir, _, files in os.walk(src): + for f in files: + src_path = os.path.join(cur_dir, f) + dst_path = self.join_path(dst, src_path.replace(src, "")) + self.copyfile_from_local(src_path, dst_path) + + return dst + + def copyfile_to_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + dst_type: str, # Choose from ["file", "dir"] + ) -> Union[str, Path]: + """Copy the file src to local dst and return the destination file. + + If dst specifies a directory, the file will be copied into dst using + the base filename from src. If dst specifies a file that already + exists, it will be replaced. + + Args: + src (str or Path): A file to be copied. + dst (str or Path): Copy file to local dst. + + Returns: + str: If dst specifies a directory, the file will be copied into dst + using the base filename from src.
+ + Examples: + >>> backend = Boto3Backend() + >>> # dst is a file + >>> src = 's3://path/of/file' + >>> dst = 'path/of/your/file' + >>> backend.copyfile_to_local(src, dst, dst_type='file') + 'path/of/your/file' + + >>> # dst is a directory + >>> dst = 'path/of/your/dir' + >>> backend.copyfile_to_local(src, dst, dst_type='dir') + 'path/of/your/dir/file' + """ + assert dst_type in ["file", "dir"] + # There is no good way to detect whether dst is a directory or a file, so we make dst_type required + if dst_type == "dir": + basename = os.path.basename(src) + if isinstance(dst, str): + dst = os.path.join(dst, basename) + else: + assert isinstance(dst, Path) + dst = dst / basename + + # Create parent directory if it doesn't exist + parent_dir = os.path.dirname(dst) + os.makedirs(parent_dir, exist_ok=True) + + try: + with open(dst, "wb") as f: + data = self.get(src) + f.write(data) + except Exception as e: + log.error(f"Failed to write file: {e}") + raise + + return dst + + def copytree_to_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> Union[str, Path]: + """Recursively copy an entire directory tree rooted at src to a local + directory named dst and return the destination directory. + + Args: + src (str or Path): A directory to be copied. + dst (str or Path): Copy directory to local dst. + backend_args (dict, optional): Arguments to instantiate the + prefix of uri corresponding backend. Defaults to None. + + Returns: + str: The destination directory. + + Examples: + >>> backend = Boto3Backend() + >>> src = 's3://path/of/dir' + >>> dst = 'path/of/your/dir' + >>> backend.copytree_to_local(src, dst) + 'path/of/your/dir' + """ + for path in self.list_dir_or_file(src, list_dir=False, recursive=True): + dst_path = os.path.join(dst, path) + mkdir_or_exist(os.path.dirname(dst_path)) + with open(dst_path, "wb") as f: + f.write(self.get(self.join_path(src, path))) + + return dst + + def remove(self, filepath: Union[str, Path]) -> None: + """Remove a file.
+ + Args: + filepath (str or Path): Path to be removed. + + Raises: + FileNotFoundError: If filepath does not exist, a FileNotFoundError + will be raised. + IsADirectoryError: If filepath is a directory, an IsADirectoryError + will be raised. + + Examples: + >>> backend = Boto3Backend() + >>> filepath = 's3://path/of/file' + >>> backend.remove(filepath) + """ + if not has_method(self._client, "delete"): + raise NotImplementedError( + "Current version of Boto3 Python SDK has not supported " + "the `delete` method, please use a higher version or dev " + "branch instead." + ) + + if not self.exists(filepath): + raise FileNotFoundError(f"filepath {filepath} does not exist") + + if self.isdir(filepath): + raise IsADirectoryError("filepath should be a file") + + filepath = self._map_path(filepath) + filepath = self._format_path(filepath) + filepath = self._replace_prefix(filepath) + self._client.delete(filepath) + + def rmtree(self, dir_path: Union[str, Path]) -> None: + """Recursively delete a directory tree. + + Args: + dir_path (str or Path): A directory to be removed. + + Examples: + >>> backend = Boto3Backend() + >>> dir_path = 's3://path/of/dir' + >>> backend.rmtree(dir_path) + """ + for path in self.list_dir_or_file(dir_path, list_dir=False, recursive=True): + filepath = self.join_path(dir_path, path) + self.remove(filepath) + + def copy_if_symlink_fails( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> bool: + """Create a symbolic link pointing to src named dst. + + Directly copy src to dst because Boto3Backend does not support creating + a symbolic link. + + Args: + src (str or Path): A file or directory to be copied. + dst (str or Path): Copy a file or directory to dst. + backend_args (dict, optional): Arguments to instantiate the + prefix of uri corresponding backend. Defaults to None. + + Returns: + bool: Return False because Boto3Backend does not support creating + a symbolic link.
+ + Examples: + >>> backend = Boto3Backend() + >>> src = 's3://path/of/file' + >>> dst = 's3://path/of/your/file' + >>> backend.copy_if_symlink_fails(src, dst) + False + >>> src = 's3://path/of/dir' + >>> dst = 's3://path/of/your/dir' + >>> backend.copy_if_symlink_fails(src, dst) + False + """ + if self.isfile(src): + self.copyfile(src, dst) + else: + self.copytree(src, dst) + return False + + def list_dir(self, dir_path: Union[str, Path]): + """List all folders in an S3 bucket with a given prefix. + + Args: + dir_path (str | Path): Path of the directory. + + Examples: + >>> backend = Boto3Backend() + >>> dir_path = 's3://path/of/dir' + >>> backend.list_dir(dir_path) + """ + dir_path = self._map_path(dir_path) + dir_path = self._format_path(dir_path) + dir_path = self._replace_prefix(dir_path) + return self._client.ls_dir(dir_path) + + def list_dir_or_file( # pylint: disable=too-many-arguments + self, + dir_path: Union[str, Path], + list_dir: bool = True, + list_file: bool = True, + suffix: Optional[Union[str, tuple[str]]] = None, + recursive: bool = False, + ) -> Iterator[str]: + """Scan a directory to find the directories or files of interest, in + arbitrary order. + + Note: + Boto3 has no concept of directories but it simulates the directory + hierarchy in the filesystem through public prefixes. In addition, + if the returned path ends with '/', it means the path is a public + prefix which is a logical directory. + + Note: + :meth:`list_dir_or_file` returns the path relative to ``dir_path``. + In addition, a returned directory path will not contain the + suffix '/', which is consistent with other backends. + + Args: + dir_path (str | Path): Path of the directory. + list_dir (bool): List the directories. Defaults to True. + list_file (bool): List the path of files. Defaults to True. + suffix (str or tuple[str], optional): File suffix + that we are interested in. Defaults to None. + recursive (bool): If set to True, recursively scan the + directory.
Defaults to False. + + Yields: + Iterable[str]: A relative path to ``dir_path``. + + Examples: + >>> backend = Boto3Backend() + >>> dir_path = 's3://path/of/dir' + >>> # list those files and directories in current directory + >>> for file_path in backend.list_dir_or_file(dir_path): + ... print(file_path) + >>> # only list files + >>> for file_path in backend.list_dir_or_file(dir_path, list_dir=False): + ... print(file_path) + >>> # only list directories + >>> for file_path in backend.list_dir_or_file(dir_path, list_file=False): + ... print(file_path) + >>> # only list files ending with specified suffixes + >>> for file_path in backend.list_dir_or_file(dir_path, suffix='.txt'): + ... print(file_path) + >>> # list all files and directory recursively + >>> for file_path in backend.list_dir_or_file(dir_path, recursive=True): + ... print(file_path) + """ # noqa: E501 + if not has_method(self._client, "list"): + raise NotImplementedError( + "Current version of Boto3 Python SDK has not supported " + "the `list` method, please use a higher version or dev" + " branch instead." + ) + + dir_path = self._map_path(dir_path) + dir_path = self._format_path(dir_path) + dir_path = self._replace_prefix(dir_path) + if list_dir and suffix is not None: + raise TypeError("`list_dir` should be False when `suffix` is not None") + + if list_dir and not list_file and not recursive: + raise TypeError( + "Please use `list_dir` instead of `list_dir_or_file` when you only want to list the first level directories." 
+ ) + + if (suffix is not None) and not isinstance(suffix, (str, tuple)): + raise TypeError("`suffix` must be a string or tuple of strings") + + # Boto3's simulated directory hierarchy assumes that directory paths + # should end with `/` + if not dir_path.endswith("/"): + dir_path += "/" + + root = dir_path + + def _list_dir_or_file(dir_path, list_dir, list_file, suffix, recursive): + # Keep track of directories we've already yielded to avoid duplicates + yielded_dirs = set() if list_dir else None + + for path in self._client.list(dir_path): + # All paths returned by S3 list are file paths, never directory paths + absolute_path = self.join_path(dir_path, path) + rel_path = absolute_path[len(root) :] + + # If we want directories, extract directory prefixes from file paths + # boto3 client actually never return dir, it only return file paths + if list_dir and "/" in rel_path: + if not recursive: + # Non-recursive: only yield immediate child directory (first level) + first_slash_pos = rel_path.find("/") + immediate_child_dir = rel_path[:first_slash_pos] + + if immediate_child_dir not in yielded_dirs: + yielded_dirs.add(immediate_child_dir) + yield immediate_child_dir + else: + # Recursive: yield all directory levels + path_parts = rel_path.split("/")[:-1] # Exclude filename + current_dir = "" + for part in path_parts: + if current_dir: + current_dir += "/" + part + else: + current_dir = part + + if current_dir not in yielded_dirs: + yielded_dirs.add(current_dir) + yield current_dir + + # Handle file listing + if (suffix is None or rel_path.endswith(suffix)) and list_file: + yield rel_path + + return _list_dir_or_file(dir_path, list_dir, list_file, suffix, recursive) + + def generate_presigned_url(self, url: str, client_method: str = "get_object", expires_in: int = 3600) -> str: + """Generate the presigned url of video stream which can be passed to + mmcv.VideoReader. Now only work on Boto3 backend. + + Note: + Now only work on Boto3 backend. 
+ + Args: + url (str): Url of video stream. + client_method (str): Method of client, 'get_object' or + 'put_object'. Default: 'get_object'. + expires_in (int): expires, in seconds. Default: 3600. + + Returns: + str: Generated presigned url. + """ + raise NotImplementedError("generate_presigned_url is not supported in Boto3Backend") + return self._client.generate_presigned_url(url, client_method, expires_in) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/boto3_client.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/boto3_client.py new file mode 100644 index 00000000..becc1a71 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/boto3_client.py @@ -0,0 +1,640 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import asyncio +import concurrent.futures +import io +import os +import time +from collections.abc import Generator +from math import ceil +from multiprocessing import shared_memory +from typing import Any, Optional + +import boto3 +import numpy as np +from botocore.config import Config as S3Config +from botocore.exceptions import ClientError + +import cosmos3._src.imaginaire.utils.easy_io.backends.auto_auth as auto +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.env_parsers.cred_env_parser import CRED_ENVS + +try: + # pyrefly: ignore # import-error + import aioboto3 + + # pyrefly: ignore # import-error + import aioboto3.session + + # pyrefly: ignore # import-error + from aiobotocore.config import AioConfig + + # pyrefly: ignore # import-error + from aiobotocore.session import AioSession +except ImportError: + aioboto3 = None + AioSession = None + +MAX_RETRIES = 5 +RETRY_DELAY = 1 # seconds + + +async def upload_single_part_async( + s3: AioSession, bucket: str, key: str, part_number: int, data: bytes, upload_id: str +) -> dict[str, Any]: + """ + Uploads a single part of a file asynchronously to S3. + + Args: + s3 (S3): The S3 client. + bucket (str): The S3 bucket name. + key (str): The S3 key (file path). + part_number (int): The part number of the upload. + data (bytes): The data to upload. + upload_id (str): The upload ID for the multipart upload. + + Returns: + dict[str, Any]: A dictionary containing the part number and ETag. 
+ """ + for attempt in range(MAX_RETRIES): + try: + response = await s3.upload_part( + Bucket=bucket, Key=key, PartNumber=part_number, UploadId=upload_id, Body=data + ) + return {"PartNumber": part_number, "ETag": response["ETag"]} + except (ClientError, asyncio.TimeoutError, Exception) as e: + log.warning(f"Attempt {attempt + 1} failed for part {part_number}: {str(e)}", rank0_only=False) + if attempt < MAX_RETRIES - 1: + await asyncio.sleep(RETRY_DELAY * (2**attempt)) # Exponential backoff + else: + log.error(f"Failed to upload part {part_number} after {MAX_RETRIES} attempts", rank0_only=False) + raise + + +async def upload_parts_async( + part_size: int, + part_numbers: range, + upload_id: str, + data: bytes, + bucket: str, + key: str, + client_config: dict[str, Any], +) -> list[dict[str, Any]]: + """ + Uploads multiple parts of a file asynchronously to S3. + + Args: + part_size (int): The size of each part in bytes. + part_numbers (range): The range of part numbers to upload. + upload_id (str): The upload ID for the multipart upload. + data (bytes): The data to upload. + bucket (str): The S3 bucket name. + key (str): The S3 key (file path). + client_config (dict[str, Any]): The S3 client configuration. + + Returns: + list[dict[str, Any]]: A list of dictionaries containing part numbers and ETags. 
+ """ + session = aioboto3.Session() + config = AioConfig(retries={"max_attempts": 3, "mode": "adaptive"}, connect_timeout=5, read_timeout=10) + start_idx = part_numbers[0] + async with session.client("s3", config=config, **client_config) as s3: + tasks = [] + for part_number in part_numbers: + start = (part_number - start_idx) * part_size + end = min(start + part_size, len(data)) + part_data = data[start:end] + tasks.append(upload_single_part_async(s3, bucket, key, part_number + 1, part_data, upload_id)) + + results = await asyncio.gather(*tasks, return_exceptions=True) + + successful_parts = [] + failed_parts = [] + for part_number, result in enumerate(results, start=start_idx + 1): + if isinstance(result, Exception): + failed_parts.append(part_number) + else: + successful_parts.append(result) + + if failed_parts: + log.error(f"Failed to upload parts: {failed_parts}", rank0_only=False) + raise Exception(f"Failed to upload {len(failed_parts)} parts") + + successful_parts.sort(key=lambda part: part["PartNumber"]) + return successful_parts + + +def upload_parts_to_s3(args: tuple[range, str, int, bytes, str, str, dict[str, Any]]) -> list[dict[str, Any]]: + """ + Uploads parts of a file to S3 using a new event loop. + + Args: + args (tuple[range, str, int, bytes, str, str, dict[str, Any]]): The arguments for uploading parts, including: + part_numbers (range): The range of part numbers to upload. + upload_id (str): The upload ID for the multipart upload. + part_size (int): The size of each part in bytes. + data (bytes): The data to upload. + bucket (str): The S3 bucket name. + key (str): The S3 key (file path). + client_config (dict[str, Any]): The S3 client configuration. + + Returns: + list[dict[str, Any]]: A list of dictionaries containing part numbers and ETags. 
+ """ + part_numbers, upload_id, part_size, data, bucket, key, client_config = args + loop = asyncio.new_event_loop() + asyncio.set_event_loop(loop) + parts = loop.run_until_complete( + upload_parts_async(part_size, part_numbers, upload_id, data, bucket, key, client_config) + ) + loop.close() + return parts + + +async def download_single_part_async( + s3, bucket: str, key: str, part_number: int, start: int, end: int, shm_name: str, part_size: int +) -> None: + """ + Downloads a single part of a file asynchronously and writes it to shared memory. + + Args: + s3 (S3): The S3 client. + bucket (str): The S3 bucket name. + key (str): The S3 key (file path). + part_number (int): The part number. + start (int): The start byte of the part. + end (int): The end byte of the part. + shm_name (str): The name of the shared memory block. + part_size (int): The size of each part in bytes. + """ + for attempt in range(MAX_RETRIES): + try: + range_header = f"bytes={start}-{end}" + response = await s3.get_object(Bucket=bucket, Key=key, Range=range_header) + data = await response["Body"].read() + + shm = shared_memory.SharedMemory(name=shm_name) + offset = part_number * part_size + shm.buf[offset : offset + len(data)] = data + shm.close() + return + except (ClientError, asyncio.TimeoutError, Exception) as e: + log.warning(f"Attempt {attempt + 1} failed for part {part_number}: {str(e)}", rank0_only=False) + if attempt < MAX_RETRIES - 1: + await asyncio.sleep(RETRY_DELAY * (2**attempt)) # Exponential backoff + else: + log.error(f"Failed to download part {part_number} after {MAX_RETRIES} attempts", rank0_only=False) + raise + + +async def download_parts_async( + part_size: int, part_numbers: range, bucket: str, key: str, client_config: dict[str, Any], shm_name: str +) -> None: + """ + Downloads multiple parts of a file asynchronously and writes them to shared memory. + + Args: + part_size (int): The size of each part in bytes. 
+ part_numbers (range): The range of part numbers to download. + bucket (str): The S3 bucket name. + key (str): The S3 key (file path). + client_config (dict[str, Any]): The S3 client configuration. + shm_name (str): The name of the shared memory block. + """ + session = aioboto3.Session() + config = AioConfig(retries={"max_attempts": 5, "mode": "adaptive"}, connect_timeout=10, read_timeout=30) + async with session.client("s3", config=config, **client_config) as s3: + tasks = [ + download_single_part_async( + s3, + bucket, + key, + part_number, + part_number * part_size, + (part_number + 1) * part_size - 1, + shm_name, + part_size, + ) + for part_number in part_numbers + ] + results = await asyncio.gather(*tasks, return_exceptions=True) + failed_parts = [part for part, result in zip(part_numbers, results) if isinstance(result, Exception)] + + if failed_parts: + log.error(f"Failed to download parts: {failed_parts}", rank0_only=False) + raise Exception(f"Failed to download {len(failed_parts)} parts") + + +def download_parts_to_s3(args: tuple[range, int, str, str, dict[str, Any], str]) -> None: + """ + Downloads parts of a file using a new event loop. + + Args: + args (tuple[range, int, str, str, dict[str, Any], str]): The arguments for downloading parts, including: + part_numbers (range): The range of part numbers to download. + part_size (int): The size of each part in bytes. + bucket (str): The S3 bucket name. + key (str): The S3 key (file path). + client_config (dict[str, Any]): The S3 client configuration. + + Returns: + None: Each part is written into the shared memory block named by the final tuple element (``shm_name``). + """ + part_numbers, part_size, bucket, key, client_config, shm_name = args + loop = asyncio.new_event_loop() + asyncio.set_event_loop(loop) + loop.run_until_complete(download_parts_async(part_size, part_numbers, bucket, key, client_config, shm_name)) + loop.close() + + +class Boto3Client: + """ + This class: + + - Provides higher-level S3 operations.
+ - Serves as a wrapper around boto3.client in order to make boto3.client serializable. + - It's required to use spawn method of creating DataLoader workers, + which is in turn required to avoid segfaults when using Triton, + e.g. for torch.compile or custom kernels. + """ + + def __init__( + self, + s3_credential_path: str, + max_attempt: int = 3, + ): + self.max_attempt: int = max_attempt + assert s3_credential_path, "s3_credential_path is required" + assert os.path.exists(s3_credential_path) or CRED_ENVS.APP_ENV in [ + "prod", + "dev", + "stg", + ], f"Credential file not found: {s3_credential_path}" + + # Keep track of S3 client constructor parameters so it can be recreated when pickling. + with auto.open_auth(s3_credential_path, "r") as f: + self._s3_cred_info = auto.json_load_auth(f) + self._s3_config = S3Config( + signature_version="s3v4", + s3={"addressing_style": "virtual"}, + response_checksum_validation="when_required", + request_checksum_calculation="when_required", + ) + self._init_client() + self._mc_kv_store = None + + def _init_client(self): + """Initialize the S3 client.""" + self._client = boto3.client("s3", **self._s3_cred_info, config=self._s3_config) + + def __getstate__(self): + state = self.__dict__.copy() + # S3 client isn't pickleable. + del state["_client"] + return state + + def __setstate__(self, state: dict[str, Any]): + self.__dict__.update(state) + self._init_client() + + def size(self, filepath: str) -> int: + filepath = self._check_path(filepath) + + if self._mc_kv_store and self._mc_kv_store.available: + if self._mc_kv_store.has(filepath): + return len(self._mc_kv_store.get(filepath)) + + attempt: int = 0 + while attempt < self.max_attempt: + try: + return self._client.head_object( + Bucket=filepath.split("/")[0], + Key="/".join(filepath.split("/")[1:]), + )["ContentLength"] + except ClientError as e: + if e.response["Error"]["Code"] == "404": + raise # Object does not exist. 
+ else: + attempt += 1 + log.error(f"Attempt {attempt} failed for {filepath}: {e}", rank0_only=False) + if attempt >= self.max_attempt: + raise # Re-raise the exception after max attempt + time.sleep(2) # Wait for 2 seconds before retrying + except Exception as e: + attempt += 1 + log.error(f"Attempt {attempt} failed for {filepath}: due to an unexpected error: {e}", rank0_only=False) + if attempt >= self.max_attempt: + raise # Re-raise the exception after max attempt + time.sleep(2) # Wait for 2 seconds before retrying + + raise ConnectionError("Unable to head {} from. {} attempts tried.".format(filepath, attempt)) + + def get(self, filepath: str, offset: Optional[int] = None, size: Optional[int] = None) -> bytes: + raw_filepath = filepath + filepath = self._check_path(filepath) + + read_offset: Optional[int] = None + read_size: Optional[int] = None + byte_range: Optional[str] = None + if offset is not None or size is not None: + read_offset = offset or 0 + assert read_offset >= 0, "Read offset must be ≥ 0" + + # Try not to incur a remote call to get the file size. This can heavily slow down ranged reads. + # + # This means we won't always validate the read offset or read size against the file size. 
+ read_size = size or (self.size(filepath=raw_filepath) - read_offset) + assert read_size >= 1, "Read size must be ≥ 1 or read offset must be < file size" + + byte_range = f"bytes={read_offset}-{read_offset + read_size - 1}" + + if self._mc_kv_store and self._mc_kv_store.available: + if self._mc_kv_store.has(filepath): + chunk: bytes = self._mc_kv_store.get(filepath) + if read_offset is not None and read_size is not None: + return chunk[read_offset : read_offset + read_size] + else: + return chunk + + attempt = 0 + while attempt < self.max_attempt: + try: + buffer = io.BytesIO() + if byte_range is None: + self._client.download_fileobj( + Bucket=filepath.split("/")[0], + Key="/".join(filepath.split("/")[1:]), + Fileobj=buffer, + ) + else: + # The boto S3 Transfer Manager doesn't support ranged reads yet. + # + # https://github.com/boto/boto3/issues/1215 + # https://github.com/boto/s3transfer/issues/248 + resp = self._client.get_object( + Bucket=filepath.split("/")[0], + Key="/".join(filepath.split("/")[1:]), + Range=byte_range, + ) + buffer.write(resp["Body"].read()) + buffer.seek(0) + # Only cache full reads. + if byte_range is None: + if self._mc_kv_store and self._mc_kv_store.available: + self._mc_kv_store.put(filepath, buffer.read()) + buffer.seek(0) + + return buffer.read() + except Exception as e: + attempt += 1 + log.error(f"Got an exception: attempt={attempt} - {e} - {filepath}", rank0_only=False) + + raise ConnectionError("Unable to read {} from. 
{} attempts tried.".format(filepath, attempt)) + + def put(self, obj, filepath): + filepath = self._check_path(filepath) + bucket_name = filepath.split("/")[0] + key = "/".join(filepath.split("/")[1:]) + attempt = 0 + while attempt < self.max_attempt: + try: + # If obj is a string path to a local file, use upload_file instead + if isinstance(obj, str) and os.path.isfile(obj): + self._client.upload_file(Filename=obj, Bucket=bucket_name, Key=key) + return + if isinstance(obj, io.BytesIO): + obj.seek(0) + self._client.upload_fileobj(obj, Bucket=bucket_name, Key=key) + return + if isinstance(obj, bytes): + self._client.put_object(Body=obj, Bucket=bucket_name, Key=key) + return + else: + raise ValueError("Unsupported object type for upload") + except ClientError as e: + attempt += 1 + log.error(f"Got an exception: attempt={attempt} - {e} - {filepath}", rank0_only=False) + + raise ConnectionError("Unable to write {} to. {} attempts tried.".format(filepath, attempt)) + + def fast_put(self, obj, filepath, num_processes: int = 32): + assert aioboto3 is not None, "aioboto3 is required for fast_put" + original_filepath = filepath + filepath = self._check_path(filepath) + bucket = filepath.split("/")[0] + key = "/".join(filepath.split("/")[1:]) + part_size = 16 * 1024 * 1024 # 16 MB part size + + if isinstance(obj, bytes): + data = obj + elif isinstance(obj, str) and os.path.isfile(obj): + with open(obj, "rb") as f: + data = f.read() + elif isinstance(obj, io.BytesIO): + obj.seek(0) + data = obj.read() + else: + raise ValueError("Unsupported object type for upload") + + file_size = len(data) + if file_size <= part_size * num_processes: + return self.put(data, original_filepath) + num_parts = ceil(file_size / part_size) + upload_id = self._client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"] + + part_numbers = np.array_split(np.arange(num_parts), num_processes) + + with concurrent.futures.ProcessPoolExecutor(max_workers=num_processes) as executor: + args = [] + 
for i in range(num_processes): + cur_parts = part_numbers[i].tolist() + cur_data = data[cur_parts[0] * part_size : min(cur_parts[-1] * part_size + part_size, file_size)] + args.append((cur_parts, upload_id, part_size, cur_data, bucket, key, self._s3_cred_info)) + results = executor.map(upload_parts_to_s3, args) + parts = [] + for result in results: + parts.extend(result) + + parts = sorted(parts, key=lambda part: part["PartNumber"]) + self._client.complete_multipart_upload( + Bucket=bucket, Key=key, UploadId=upload_id, MultipartUpload={"Parts": parts} + ) + + def contains(self, filepath: str, max_retries=10) -> bool: + """ + Checks if the specified object exists in the S3 bucket with retry logic for errors. + + Args: + filepath (str): The s3 path of the file to check, must start with "s3://". + + Returns: + bool: True if the object exists in the S3 bucket, False otherwise. + + Raises: + ClientError: If an error response other than "404 Not Found" is returned from the S3 service. + """ + filepath = self._check_path(filepath) + bucket = filepath.split("/")[0] + key = "/".join(filepath.split("/")[1:]) + + retries = 0 + while retries < max_retries: + try: + # Try to check if the object exists + self._client.head_object(Bucket=bucket, Key=key) + return True # Object exists + except ClientError as e: + if e.response["Error"]["Code"] == "404": + return False # Object does not exist + else: + retries += 1 + print(f"Attempt {retries} failed with error: {e}") + if retries >= max_retries: + raise # Re-raise the exception if max retries are reached + time.sleep(2) # Wait for 2 seconds before retrying + except Exception as e: + retries += 1 + print(f"Attempt {retries} failed due to an unexpected error: {e}") + if retries >= max_retries: + raise # Re-raise the exception if max retries are reached + time.sleep(2) # Wait for 2 seconds before retrying + + def isdir(self, filepath: str, max_retries=10) -> bool: + """ + Determines if the specified path corresponds to a directory in 
S3 with retry logic. + + A directory in S3 is implied if there are any objects stored with the given prefix, + which means this function checks for the existence of any objects at or under the specified path. + + Args: + filepath (str): The s3 path to check, must start with "s3://". + + Returns: + bool: True if the specified path corresponds to a directory in S3, False otherwise. + Directories in S3 are not physical entities but are implied by object keys. + + Raises: + ClientError: An error from the S3 API that isn't related to the absence of the directory + (logged but not raised further). + """ + filepath = self._check_path(filepath) + if not filepath.endswith("/"): + filepath += "/" + + bucket = filepath.split("/")[0] + prefix = "/".join(filepath.split("/")[1:]) + + retries = 0 + while retries < max_retries: + try: + # Try to check if any objects exist with the given prefix (i.e., directory in S3) + resp = self._client.list_objects_v2(Bucket=bucket, Prefix=prefix, Delimiter="/", MaxKeys=1) + # Check if any content or prefixes exist under the given path + return "CommonPrefixes" in resp or "Contents" in resp + except ClientError as e: + retries += 1 + log.error(f"Attempt {retries} failed: {e}", rank0_only=False) + if retries >= max_retries: + return False # Return False if maximum retries are reached + time.sleep(2) # Wait for 2 seconds before retrying + except Exception as e: + retries += 1 + log.error(f"Attempt {retries} failed due to an unexpected error: {e}", rank0_only=False) + if retries >= max_retries: + return False # Return False if maximum retries are reached + time.sleep(2) # Wait for 2 seconds before retrying + + def delete(self, filepath): + filepath = self._check_path(filepath) + self._client.delete_object(Bucket=filepath.split("/")[0], Key="/".join(filepath.split("/")[1:])) + + def ls_dir(self, filepath: str) -> Generator[str, None, None]: + """ + List all folders in an S3 bucket with a given prefix. 
+ + Args: + filepath (str): The S3 path of the folder to list. + + Yields: + str: The keys of the folders in the S3 bucket. + """ + filepath = self._check_path(filepath) + bucket = filepath.split("/")[0] + prefix = "/".join(filepath.split("/")[1:]) + continuation_token = None + if prefix and not prefix.endswith("/"): + prefix += "/" + + while True: + if continuation_token: + resp = self._client.list_objects_v2( + Bucket=bucket, Prefix=prefix, Delimiter="/", ContinuationToken=continuation_token + ) + else: + resp = self._client.list_objects_v2(Bucket=bucket, Prefix=prefix, Delimiter="/") + + if "CommonPrefixes" in resp: + for item in resp["CommonPrefixes"]: + yield item["Prefix"][len(prefix) :] + + # Check if there are more keys to retrieve + if resp.get("IsTruncated"): # If IsTruncated is True, there are more keys + continuation_token = resp.get("NextContinuationToken") + else: + break + + def list(self, filepath: str, exclude_prefix: Optional[str] = None) -> Generator[str, None, None]: + """ + List all keys in an S3 bucket with a given prefix, excluding files that start with + the specified prefix. + + Args: + filepath (str): The S3 path prefix to list. + exclude_prefix (str, optional): Files starting with this prefix will be excluded from results. + Defaults to None (nothing is excluded). + + Yields: + str: The keys of the files in the S3 bucket that don't start with exclude_prefix. 
+ """ + filepath = self._check_path(filepath) + bucket = filepath.split("/")[0] + prefix = "/".join(filepath.split("/")[1:]) + + continuation_token = None + + while True: + if continuation_token: + resp = self._client.list_objects_v2(Bucket=bucket, Prefix=prefix, ContinuationToken=continuation_token) + else: + resp = self._client.list_objects_v2(Bucket=bucket, Prefix=prefix) + + if "Contents" in resp: + for item in resp["Contents"]: + key = item["Key"][len(prefix) :] + # Skip files that start with the excluded prefix + if exclude_prefix is None or not key.startswith(exclude_prefix): + yield key + + # Check if there are more keys to retrieve + if resp.get("IsTruncated"): # If IsTruncated is True, there are more keys + continuation_token = resp.get("NextContinuationToken") + else: + break + + def _check_path(self, filepath: str): + assert filepath.startswith("s3://") + filepath = filepath[5:] + return filepath diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/http_backend.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/http_backend.py new file mode 100644 index 00000000..7847c74d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/http_backend.py @@ -0,0 +1,198 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import io +import os +import tempfile +from collections.abc import Generator, Iterator +from contextlib import contextmanager +from pathlib import Path +from typing import Optional, Union +from urllib.request import Request, urlopen + +from cosmos3._src.imaginaire.utils.easy_io.backends.base_backend import BaseStorageBackend + + +class HTTPBackend(BaseStorageBackend): + """HTTP and HTTPS storage backend.""" + + def size(self, filepath: Union[str, Path]) -> int: + """Get the file size in bytes for a given ``filepath``. + + Args: + filepath (str or Path): Path to get file size in bytes. + + Returns: + int: File size in bytes for filepath. + + Examples: + >>> backend = HTTPBackend() + >>> filepath = 'http://path/of/file' + >>> backend.size(filepath) # file containing 'hello world' + 11 + """ + request = Request(url=str(filepath), method="HEAD") + with urlopen(request) as response: + if response.status == 200: + return int(response.headers["Content-Length"]) + else: + raise RuntimeError(f"Unexpected response: {response}") + + def get(self, filepath: Union[str, Path], offset: Optional[int] = None, size: Optional[int] = None) -> bytes: + """Read bytes from a given ``filepath`` with 'rb' mode in range [offset, offset + size). + + Args: + filepath (str): Path to read data. + offset (int, optional): Read offset in bytes (0-index). Defaults to 0. + size (int, optional): Read size in bytes. Defaults to the file size. + + Returns: + bytes: Expected bytes object. + + Examples: + >>> backend = HTTPBackend() + >>> backend.get('http://path/of/file') + b'hello world' + """ + request = Request(url=str(filepath), method="GET") + if offset is not None or size is not None: + read_offset = offset or 0 + assert read_offset >= 0, "Read offset must be ≥ 0" + + # Try not to incur a remote call to get the file size. This can heavily slow down ranged reads. + # + # This means we won't always validate the read offset or read size against the file size. 
+ read_size = size or (self.size(filepath=filepath) - read_offset) + assert read_size >= 1, "Read size must be ≥ 1 or read offset must be < file size" + + request.add_header("Range", f"bytes={read_offset}-{read_offset + read_size - 1}") + with urlopen(request) as response: + if response.status in {200, 206}: + return response.read() + else: + raise RuntimeError(f"Unexpected response: {response}") + + def get_text(self, filepath: Union[str, Path], encoding: str = "utf-8") -> str: + """Read text from a given ``filepath``. + + Args: + filepath (str): Path to read data. + encoding (str): The encoding format used to open the ``filepath``. + Defaults to 'utf-8'. + + Returns: + str: Expected text reading from ``filepath``. + + Examples: + >>> backend = HTTPBackend() + >>> backend.get_text('http://path/of/file') + 'hello world' + """ + return self.get(filepath=filepath).decode(encoding) + + def put(self, obj: Union[bytes, io.BytesIO], filepath: Union[str, Path]) -> None: + raise NotImplementedError(f"put not supported in {self.name}") + + def put_text(self, obj: str, filepath: Union[str, Path], encoding: str = "utf-8") -> None: + raise NotImplementedError(f"put_text not supported in {self.name}") + + def exists(self, filepath: Union[str, Path]) -> bool: + request = Request(url=str(filepath), method="HEAD") + with urlopen(request) as response: + if response.status == 404: + return False + elif response.status == 200: + return True + else: + raise RuntimeError(f"Unexpected response: {response}") + + def isdir(self, filepath: Union[str, Path]) -> bool: + raise NotImplementedError(f"isdir not supported in {self.name}") + + def isfile(self, filepath: Union[str, Path]) -> bool: + raise NotImplementedError(f"isfile not supported in {self.name}") + + def join_path(self, filepath: Union[str, Path], *filepaths: Union[str, Path]) -> str: + raise NotImplementedError(f"join_path not supported in {self.name}") + + @contextmanager + def get_local_path(self, filepath: Union[str, Path]) -> 
Generator[Union[str, Path], None, None]: + """Download a file from ``filepath`` to a local temporary file, + and return the temporary path. + + ``get_local_path`` is decorated by :meth:`contextlib.contextmanager`. It + can be called with a ``with`` statement, and on exiting the + ``with`` statement, the temporary path will be removed. + + Args: + filepath (str): Download a file from ``filepath``. + + Yields: + Iterable[str]: Only yield one temporary path. + + Examples: + >>> backend = HTTPBackend() + >>> # After exiting the ``with`` clause, + >>> # the path will be removed + >>> with backend.get_local_path('http://path/of/file') as path: + ... # do something here + """ + f = tempfile.NamedTemporaryFile(delete=False) + try: + f.write(self.get(filepath)) + f.close() + yield f.name + finally: + os.remove(f.name) + + def copyfile(self, src: Union[str, Path], dst: Union[str, Path]) -> str: + raise NotImplementedError(f"copyfile not supported in {self.name}") + + def copytree(self, src: Union[str, Path], dst: Union[str, Path]) -> str: + raise NotImplementedError(f"copytree not supported in {self.name}") + + def copyfile_from_local(self, src: Union[str, Path], dst: Union[str, Path]) -> str: + raise NotImplementedError(f"copyfile_from_local not supported in {self.name}") + + def copytree_from_local(self, src: Union[str, Path], dst: Union[str, Path]) -> str: + raise NotImplementedError(f"copytree_from_local not supported in {self.name}") + + def copyfile_to_local(self, src: Union[str, Path], dst: Union[str, Path], dst_type: str) -> Union[str, Path]: + raise NotImplementedError(f"copyfile_to_local not supported in {self.name}") + + def copytree_to_local(self, src: Union[str, Path], dst: Union[str, Path]) -> Union[str, Path]: + raise NotImplementedError(f"copytree_to_local not supported in {self.name}") + + def remove(self, filepath: Union[str, Path]) -> None: + raise NotImplementedError(f"remove not supported in {self.name}") + + def rmtree(self, dir_path: 
Union[str, Path]) -> None: + raise NotImplementedError(f"rmtree not supported in {self.name}") + + def copy_if_symlink_fails(self, src: Union[str, Path], dst: Union[str, Path]) -> bool: + raise NotImplementedError(f"copy_if_symlink_fails not supported in {self.name}") + + def list_dir(self, dir_path: Union[str, Path]) -> Generator[str, None, None]: + raise NotImplementedError(f"list_dir not supported in {self.name}") + + def list_dir_or_file( # pylint: disable=too-many-arguments + self, + dir_path: Union[str, Path], + list_dir: bool = True, + list_file: bool = True, + suffix: Optional[Union[str, tuple[str]]] = None, + recursive: bool = False, + ) -> Iterator[str]: + raise NotImplementedError(f"list_dir_or_file not supported in {self.name}") diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/local_backend.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/local_backend.py new file mode 100644 index 00000000..91927dcc --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/local_backend.py @@ -0,0 +1,599 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import io +import os +import os.path as osp +import shutil +from collections.abc import Generator, Iterator +from contextlib import contextmanager +from pathlib import Path +from typing import Optional, Union + +from cosmos3._src.imaginaire.utils.easy_io.backends.base_backend import BaseStorageBackend, mkdir_or_exist + + +class LocalBackend(BaseStorageBackend): + """Raw local storage backend.""" + + _allow_symlink = True + + def size(self, filepath: Union[str, Path]) -> int: + """Get the file size in bytes for a given ``filepath``. + + Args: + filepath (str or Path): Path to get file size in bytes. + + Returns: + int: File size in bytes for filepath. + + Examples: + >>> backend = LocalBackend() + >>> filepath = '/path/of/file' + >>> backend.size(filepath) # file containing 'hello world' + 11 + """ + return osp.getsize(filepath) + + def get(self, filepath: Union[str, Path], offset: Optional[int] = None, size: Optional[int] = None) -> bytes: + """Read bytes from a given ``filepath`` with 'rb' mode. + + Args: + filepath (str or Path): Path to read data. + offset (int, optional): Read offset in bytes (0-index). Defaults to 0. + size (int, optional): Read size in bytes. Defaults to the file size. + + Returns: + bytes: Expected bytes object. 
+ + Examples: + >>> backend = LocalBackend() + >>> filepath = '/path/of/file' + >>> backend.get(filepath) + b'hello world' + """ + read_offset: Optional[int] = None + read_size: Optional[int] = None + if offset is not None or size is not None: + read_offset = offset or 0 + assert read_offset >= 0, "Read offset must be ≥ 0" + + read_size = size or (self.size(filepath=filepath) - read_offset) + assert read_size >= 1, "Read size must be ≥ 1 or read offset must be < file size" + + with open(filepath, "rb") as f: + if read_offset is not None: + f.seek(read_offset) + value = f.read(read_size) + return value + + def get_text(self, filepath: Union[str, Path], encoding: str = "utf-8") -> str: + """Read text from a given ``filepath`` with 'r' mode. + + Args: + filepath (str or Path): Path to read data. + encoding (str): The encoding format used to open the ``filepath``. + Defaults to 'utf-8'. + + Returns: + str: Expected text reading from ``filepath``. + + Examples: + >>> backend = LocalBackend() + >>> filepath = '/path/of/file' + >>> backend.get_text(filepath) + 'hello world' + """ + with open(filepath, encoding=encoding) as f: + text = f.read() + return text + + def put(self, obj: Union[bytes, io.BytesIO], filepath: Union[str, Path]) -> None: + """Write bytes to a given ``filepath`` with 'wb' mode. + + Note: + ``put`` will create a directory if the directory of + ``filepath`` does not exist. + + Args: + obj (bytes): Data to be written. + filepath (str or Path): Path to write data. + + Examples: + >>> backend = LocalBackend() + >>> filepath = '/path/of/file' + >>> backend.put(b'hello world', filepath) + """ + mkdir_or_exist(osp.dirname(filepath)) + if isinstance(obj, io.BytesIO): + obj.seek(0) + obj = obj.getvalue() + with open(filepath, "wb") as f: + f.write(obj) + + def put_text(self, obj: str, filepath: Union[str, Path], encoding: str = "utf-8") -> None: + """Write text to a given ``filepath`` with 'w' mode. 
+ + Note: + ``put_text`` will create a directory if the directory of + ``filepath`` does not exist. + + Args: + obj (str): Data to be written. + filepath (str or Path): Path to write data. + encoding (str): The encoding format used to open the ``filepath``. + Defaults to 'utf-8'. + + Examples: + >>> backend = LocalBackend() + >>> filepath = '/path/of/file' + >>> backend.put_text('hello world', filepath) + """ + mkdir_or_exist(osp.dirname(filepath)) + with open(filepath, "w", encoding=encoding) as f: + f.write(obj) + + def exists(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path exists. + + Args: + filepath (str or Path): Path to be checked whether exists. + + Returns: + bool: Return ``True`` if ``filepath`` exists, ``False`` otherwise. + + Examples: + >>> backend = LocalBackend() + >>> filepath = '/path/of/file' + >>> backend.exists(filepath) + True + """ + return osp.exists(filepath) + + def isdir(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path is a directory. + + Args: + filepath (str or Path): Path to be checked whether it is a + directory. + + Returns: + bool: Return ``True`` if ``filepath`` points to a directory, + ``False`` otherwise. + + Examples: + >>> backend = LocalBackend() + >>> filepath = '/path/of/dir' + >>> backend.isdir(filepath) + True + """ + return osp.isdir(filepath) + + def isfile(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path is a file. + + Args: + filepath (str or Path): Path to be checked whether it is a file. + + Returns: + bool: Return ``True`` if ``filepath`` points to a file, ``False`` + otherwise. + + Examples: + >>> backend = LocalBackend() + >>> filepath = '/path/of/file' + >>> backend.isfile(filepath) + True + """ + return osp.isfile(filepath) + + def join_path(self, filepath: Union[str, Path], *filepaths: Union[str, Path]) -> str: + r"""Concatenate all file paths. + + Join one or more filepath components intelligently. 
The return value + is the concatenation of filepath and any members of \*filepaths. + + Args: + filepath (str or Path): Path to be concatenated. + + Returns: + str: The result of concatenation. + + Examples: + >>> backend = LocalBackend() + >>> filepath1 = '/path/of/dir1' + >>> filepath2 = 'dir2' + >>> filepath3 = 'path/of/file' + >>> backend.join_path(filepath1, filepath2, filepath3) + '/path/of/dir1/dir2/path/of/file' + """ + # TODO, if filepath or filepaths are Path, should return Path + return osp.join(filepath, *filepaths) + + @contextmanager + def get_local_path( + self, + filepath: Union[str, Path], + ) -> Generator[Union[str, Path], None, None]: + """Exists only for API uniformity; yields ``filepath`` unchanged. + + Args: + filepath (str or Path): Path to read data from. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Examples: + >>> backend = LocalBackend() + >>> with backend.get_local_path('/path/of/file') as path: + ... # do something here + """ + yield filepath + + def copyfile( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Copy a file src to dst and return the destination file. + + src and dst should have the same prefix. If dst specifies a directory, + the file will be copied into dst using the base filename from src. If + dst specifies a file that already exists, it will be replaced. + + Args: + src (str or Path): A file to be copied. + dst (str or Path): Copy file to dst. + + Returns: + str: The destination file. + + Raises: + SameFileError: If src and dst are the same file, a SameFileError + will be raised. 
+ + Examples: + >>> backend = LocalBackend() + >>> # dst is a file + >>> src = '/path/of/file' + >>> dst = '/path1/of/file1' + >>> # src will be copied to '/path1/of/file1' + >>> backend.copyfile(src, dst) + '/path1/of/file1' + + >>> # dst is a directory + >>> dst = '/path1/of/dir' + >>> # src will be copied to '/path1/of/dir/file' + >>> backend.copyfile(src, dst) + '/path1/of/dir/file' + """ + return shutil.copy(src, dst) + + def copytree( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Recursively copy an entire directory tree rooted at src to a + directory named dst and return the destination directory. + + src and dst should have the same prefix and dst must not already exist. + + TODO: Whether to support dirs_exist_ok parameter. + + Args: + src (str or Path): A directory to be copied. + dst (str or Path): Copy directory to dst. + + Returns: + str: The destination directory. + + Raises: + FileExistsError: If dst had already existed, a FileExistsError will + be raised. + + Examples: + >>> backend = LocalBackend() + >>> src = '/path/of/dir1' + >>> dst = '/path/of/dir2' + >>> backend.copytree(src, dst) + '/path/of/dir2' + """ + return shutil.copytree(src, dst) + + def copyfile_from_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Copy a local file src to dst and return the destination file. Same + as :meth:`copyfile`. + + Args: + src (str or Path): A local file to be copied. + dst (str or Path): Copy file to dst. + + Returns: + str: If dst specifies a directory, the file will be copied into dst + using the base filename from src. + + Raises: + SameFileError: If src and dst are the same file, a SameFileError + will be raised. 
+ + Examples: + >>> backend = LocalBackend() + >>> # dst is a file + >>> src = '/path/of/file' + >>> dst = '/path1/of/file1' + >>> # src will be copied to '/path1/of/file1' + >>> backend.copyfile_from_local(src, dst) + '/path1/of/file1' + + >>> # dst is a directory + >>> dst = '/path1/of/dir' + >>> # src will be copied to '/path1/of/dir/file' + >>> backend.copyfile_from_local(src, dst) + '/path1/of/dir/file' + """ + return self.copyfile(src, dst) + + def copytree_from_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Recursively copy an entire directory tree rooted at src to a + directory named dst and return the destination directory. Same as + :meth:`copytree`. + + Args: + src (str or Path): A local directory to be copied. + dst (str or Path): Copy directory to dst. + + Returns: + str: The destination directory. + + Examples: + >>> backend = LocalBackend() + >>> src = '/path/of/dir1' + >>> dst = '/path/of/dir2' + >>> backend.copytree_from_local(src, dst) + '/path/of/dir2' + """ + return self.copytree(src, dst) + + def copyfile_to_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + dst_type: Optional[str] = None, + ) -> str: + """Copy the file src to local dst and return the destination file. Same + as :meth:`copyfile`. + + If dst specifies a directory, the file will be copied into dst using + the base filename from src. If dst specifies a file that already + exists, it will be replaced. + + Args: + src (str or Path): A file to be copied. + dst (str or Path): Copy file to local dst. + + Returns: + str: If dst specifies a directory, the file will be copied into dst + using the base filename from src. 
+ + Examples: + >>> backend = LocalBackend() + >>> # dst is a file + >>> src = '/path/of/file' + >>> dst = '/path1/of/file1' + >>> # src will be copied to '/path1/of/file1' + >>> backend.copyfile_to_local(src, dst) + '/path1/of/file1' + + >>> # dst is a directory + >>> dst = '/path1/of/dir' + >>> # src will be copied to '/path1/of/dir/file' + >>> backend.copyfile_to_local(src, dst) + '/path1/of/dir/file' + """ + return self.copyfile(src, dst) + + def copytree_to_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Recursively copy an entire directory tree rooted at src to a local + directory named dst and return the destination directory. + + Args: + src (str or Path): A directory to be copied. + dst (str or Path): Copy directory to local dst. + backend_args (dict, optional): Arguments to instantiate the + prefix of uri corresponding backend. Defaults to None. + + Returns: + str: The destination directory. + + Examples: + >>> backend = LocalBackend() + >>> src = '/path/of/dir1' + >>> dst = '/path/of/dir2' + >>> backend.copytree_to_local(src, dst) + '/path/of/dir2' + """ + return self.copytree(src, dst) + + def remove(self, filepath: Union[str, Path]) -> None: + """Remove a file. + + Args: + filepath (str or Path): Path to be removed. + + Raises: + IsADirectoryError: If filepath is a directory, an IsADirectoryError + will be raised. + FileNotFoundError: If filepath does not exist, a FileNotFoundError + will be raised. + + Examples: + >>> backend = LocalBackend() + >>> filepath = '/path/of/file' + >>> backend.remove(filepath) + """ + if not self.exists(filepath): + raise FileNotFoundError(f"filepath {filepath} does not exist") + + if self.isdir(filepath): + raise IsADirectoryError("filepath should be a file") + + os.remove(filepath) + + def rmtree(self, dir_path: Union[str, Path]) -> None: + """Recursively delete a directory tree. + + Args: + dir_path (str or Path): A directory to be removed. 
+ + Examples: + >>> dir_path = '/path/of/dir' + >>> backend.rmtree(dir_path) + """ + shutil.rmtree(dir_path) + + def copy_if_symlink_fails( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> bool: + """Create a symbolic link pointing to src named dst. + + If failed to create a symbolic link pointing to src, directly copy src + to dst instead. + + Args: + src (str or Path): Create a symbolic link pointing to src. + dst (str or Path): Create a symbolic link named dst. + + Returns: + bool: Return True if successfully create a symbolic link pointing + to src. Otherwise, return False. + + Examples: + >>> backend = LocalBackend() + >>> src = '/path/of/file' + >>> dst = '/path1/of/file1' + >>> backend.copy_if_symlink_fails(src, dst) + True + >>> src = '/path/of/dir' + >>> dst = '/path1/of/dir1' + >>> backend.copy_if_symlink_fails(src, dst) + True + """ + try: + os.symlink(src, dst) + return True + except Exception: + if self.isfile(src): + self.copyfile(src, dst) + else: + self.copytree(src, dst) + return False + + def list_dir(self, dir_path: Union[str, Path]) -> Generator[str, None, None]: + """List all folders in a storage location with a given prefix. + + Args: + dir_path (str | Path): Path of the directory. + + Examples: + >>> backend = LocalBackend() + >>> dir_path = 'path/of/dir' + >>> list(backend.list_dir(dir_path)) + ['subdir1/', 'subdir2/'] + """ + for entry in os.scandir(dir_path): + if entry.is_dir(): + yield f"{entry.name}/" + + def list_dir_or_file( + self, + dir_path: Union[str, Path], + list_dir: bool = True, + list_file: bool = True, + suffix: Optional[Union[str, tuple[str]]] = None, + recursive: bool = False, + ) -> Iterator[str]: + """Scan a directory to find the interested directories or files in + arbitrary order. + + Note: + :meth:`list_dir_or_file` returns the path relative to ``dir_path``. + + Args: + dir_path (str or Path): Path of the directory. + list_dir (bool): List the directories. Defaults to True. 
+ list_file (bool): List the path of files. Defaults to True. + suffix (str or tuple[str], optional): File suffix that we are + interested in. Defaults to None. + recursive (bool): If set to True, recursively scan the directory. + Defaults to False. + + Yields: + Iterable[str]: A relative path to ``dir_path``. + + Examples: + >>> backend = LocalBackend() + >>> dir_path = '/path/of/dir' + >>> # list those files and directories in current directory + >>> for file_path in backend.list_dir_or_file(dir_path): + ... print(file_path) + >>> # only list files + >>> for file_path in backend.list_dir_or_file(dir_path, list_dir=False): + ... print(file_path) + >>> # only list directories + >>> for file_path in backend.list_dir_or_file(dir_path, list_file=False): + ... print(file_path) + >>> # only list files ending with specified suffixes + >>> for file_path in backend.list_dir_or_file(dir_path, suffix='.txt'): + ... print(file_path) + >>> # list all files and directory recursively + >>> for file_path in backend.list_dir_or_file(dir_path, recursive=True): + ... 
print(file_path) + """ # noqa: E501 + if list_dir and suffix is not None: + raise TypeError("`suffix` should be None when `list_dir` is True") + + if (suffix is not None) and not isinstance(suffix, (str, tuple)): + raise TypeError("`suffix` must be a string or tuple of strings") + + root = dir_path + + def _list_dir_or_file(dir_path, list_dir, list_file, suffix, recursive): + for entry in os.scandir(dir_path): + if not entry.name.startswith(".") and entry.is_file(): + rel_path = osp.relpath(entry.path, root) + if (suffix is None or rel_path.endswith(suffix)) and list_file: + yield rel_path + elif osp.isdir(entry.path): + if list_dir: + rel_dir = osp.relpath(entry.path, root) + yield rel_dir + if recursive: + yield from _list_dir_or_file(entry.path, list_dir, list_file, suffix, recursive) + + return _list_dir_or_file(dir_path, list_dir, list_file, suffix, recursive) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/msc_backend.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/msc_backend.py new file mode 100644 index 00000000..6d54e6b9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/msc_backend.py @@ -0,0 +1,1075 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import copy +import io +import os +import re +import tempfile +from collections.abc import Generator, Iterator +from contextlib import contextmanager +from pathlib import Path +from shutil import SameFileError +from typing import Any, Optional, Union +from urllib.parse import urlparse + +import yaml +from multistorageclient import StorageClient, StorageClientConfig +from multistorageclient.types import Range + +import cosmos3._src.imaginaire.utils.easy_io.backends.auto_auth as auto +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.easy_io.backends.base_backend import BaseStorageBackend, mkdir_or_exist + +# {scheme}:// +_URL_PREFIX_REGEX = r"[a-zA-Z0-9+.-]*:\/\/" + + +def _get_telemetry_config_from_msc_secret() -> Optional[dict[str, Any]]: + """Generate MSC telemetry configuration from credentials/msc.secret file if available. + + Reads OpenTelemetry configuration from the ``credentials/msc.secret`` YAML file. + The file should contain an ``msc.opentelemetry`` section with the required fields + (vault_endpoint, vault_namespace, approle_id, approle_secret, mount_point, secret_path). + The ``cert_key``, ``key_key``, ``ca_key``, and ``endpoint`` fields are read without defaults, + so a missing one is logged as a load failure and None is returned. + + Returns: + Optional[dict]: OpenTelemetry configuration dictionary if file exists and contains + required fields, None otherwise. 
+ """ + msc_secret_path = Path("credentials/msc.secret") + + if not msc_secret_path.exists(): + log.debug(f"MSC secret file not found at {msc_secret_path}", rank0_only=True) + return None + + try: + with open(msc_secret_path, "r") as f: + msc_config = yaml.safe_load(f) + + if not msc_config or not isinstance(msc_config, dict): + log.warning(f"Invalid MSC secret file format at {msc_secret_path}", rank0_only=True) + return None + + msc_section = msc_config.get("msc", {}) + opentelemetry_section = msc_section.get("opentelemetry", {}) + + required_fields = ( + "vault_endpoint", + "vault_namespace", + "approle_id", + "approle_secret", + "mount_point", + "secret_path", + ) + missing = [f for f in required_fields if not opentelemetry_section.get(f)] + if missing: + log.warning( + f"MSC secret file at {msc_secret_path} missing required fields: {', '.join(missing)}", + rank0_only=True, + ) + return None + + vault_endpoint = opentelemetry_section["vault_endpoint"] + vault_namespace = opentelemetry_section["vault_namespace"] + approle_id = opentelemetry_section["approle_id"] + approle_secret = opentelemetry_section["approle_secret"] + mount_point = opentelemetry_section["mount_point"] + secret_path = opentelemetry_section["secret_path"] + cert_key = opentelemetry_section["cert_key"] + key_key = opentelemetry_section["key_key"] + ca_key = opentelemetry_section["ca_key"] + endpoint = opentelemetry_section["endpoint"] + except Exception as e: + log.warning(f"Failed to load MSC secret file at {msc_secret_path}: {e}", rank0_only=True) + return None + + # Construct OpenTelemetry configuration dictionary. + opentelemetry_config = { + "opentelemetry": { + "metrics": { + "attributes": [ + # All environments. 
+ {"type": "static", "options": {"attributes": {"msc.ppp": "COSMOS", "msc.job": "unknown"}}}, + {"type": "host", "options": {"attributes": {"msc.cluster": "name", "msc.node": "name"}}}, + {"type": "process", "options": {"attributes": {"msc.process": "pid"}}}, + { + "type": "msc_config", + "options": { + "attributes": { + "msc.approle_id": { + "expression": "opentelemetry.metrics.exporter.options.auth.approle_id" + }, + "msc.approle_secret": { + "expression": ( + "hash('sha3-224', opentelemetry.metrics.exporter.options.auth.approle_secret)" + ) + }, + } + }, + }, + # Progressive enhancement for Lepton environments. + # + # https://docs.nvidia.com/dgx-cloud/lepton/features/batch-jobs/predefined-env-vars + { + "type": "environment_variables", + "options": { + "attributes": { + "msc.job": "LEPTON_JOB_NAME", + "msc.job_user": "LEPTON_USERID", + "msc.job_nodes": "LEPTON_JOB_TOTAL_WORKERS", + "msc.cluster": "LEPTON_WORKER_CLUSTER_NAME", + "msc.node": "LEPTON_WORKER_ID", + "msc.node_gpus": "LEPTON_RESOURCE_ACCELERATOR_NUM", + } + }, + }, + # Progressive enhancement for Slurm environments. + # + # https://slurm.schedmd.com/prolog_epilog.html#environment_variables + { + "type": "environment_variables", + "options": { + "attributes": { + "msc.ppp": "SLURM_JOB_ACCOUNT", + "msc.job": "SLURM_JOB_ID", + "msc.job_user": "SLURM_JOB_USER", + "msc.job_nodes": "SLURM_JOB_NUM_NODES", + "msc.job_gpus": "SLURM_GPUS", + "msc.cluster": "SLURM_CLUSTER_NAME", + "msc.node": "SLURMD_NODENAME", + "msc.node_gpus": "SLURM_GPUS_ON_NODE", + "msc.slurm_job_partition": "SLURM_JOB_PARTITION", + } + }, + }, + ], + "reader": { + "options": { + # ≤ 100 Hz collect frequency. + "collect_interval_millis": 10, + "collect_timeout_millis": 100, + # ≤ 1 Hz export frequency. 
+ "export_interval_millis": 1000, + "export_timeout_millis": 500, + } + }, + "exporter": { + "type": "_otlp_mtls_vault", + "options": { + "exporter": {"endpoint": endpoint}, + "auth": { + "vault_endpoint": vault_endpoint, + "vault_namespace": vault_namespace, + "approle_id": approle_id, + "approle_secret": approle_secret, + "mount_point": mount_point, + "secret_path": secret_path, + "cert_key": cert_key, + "key_key": key_key, + "ca_key": ca_key, + }, + }, + }, + }, + } + } + + return opentelemetry_config + + +class MSCBackend(BaseStorageBackend): + """Multi-Storage Client (MSC) backend. + + Uses MSC storage clients instead of MSC shortcuts. + + URL file paths (e.g. 's3://path/of/file') are handled transparently. Using URL file paths + as input will return URL file path outputs when appropriate to match Boto3Backend behavior. + + **If using URL file paths, the storage provider's base path option must be empty!** + + Get/put concurrency can be set for certain providers in the MSC configuration file. + + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.get(filepath) + """ + + _storage_client: StorageClient + _path_mapping: dict[str, str] + + def __init__( + self, + config_path: Optional[str] = "credentials/msc_config.yaml", + profile: Optional[str] = None, + s3_credential_path: Optional[str] = None, + path_mapping: Optional[dict[str, str]] = None, + ): + """Initialize a backend. + + Args: + config_path (str, optional): MSC config path (e.g. ``credentials/msc_config.yaml``). + profile (str, optional): MSC profile from the MSC config to use. + Mutually exclusive with ``s3_credential_path``. + s3_credential_path (str, optional): Legacy Boto3 config path (e.g. ``credentials/s3_training.secret``). + Translated into an MSC profile that's merged with the MSC config at ``config_path`` with: + + - The profile name set to ``s3_credential_path`` verbatim. 
+ - The storage and credentials provider types determined by the file contents. + + Mutually exclusive with ``profile``. + path_mapping (dict, optional): Path mapping dict from src path to dst path. + When ``path_mapping={'src': 'dst'}``, ``src`` in ``filepath`` will be replaced by ``dst``. + Doesn't apply to the local path in ``copy{file,tree}_{from,to}_local`` methods. + """ + if all(_ is None for _ in (profile, s3_credential_path)) or all( + _ is not None for _ in (profile, s3_credential_path) + ): + raise ValueError("Must specify exactly one of profile or s3_credential_path") + + msc_config_dict: dict[str, Any] = {} + + # Use an existing MSC config file as the base MSC config. + if config_path is not None: + config_dict, _ = StorageClientConfig.read_msc_config(config_file_paths=[config_path]) + if config_dict is None: + log.info(f"No MSC config at {config_path}, using empty base MSC config", rank0_only=True) + else: + msc_config_dict = config_dict + + # Create an MSC profile from the legacy Boto3 config. + if s3_credential_path is not None: + with auto.open_auth(s3_credential_path, "r") as unloaded_legacy_boto3_config: + legacy_boto3_config = auto.json_load_auth(unloaded_legacy_boto3_config) + if len(legacy_boto3_config) > 0: + profile = s3_credential_path + + # Merge with any existing profiles. + msc_config_dict["profiles"] = msc_config_dict.get("profiles", {}) + # Merge with the existing profile, replacing `storage_provider` and `credentials_provider` completely. + msc_config_dict["profiles"][profile] = msc_config_dict["profiles"].get(profile, {}) + + storage_provider_type: str = "s3" + parsed_endpoint_url = urlparse(legacy_boto3_config["endpoint_url"]) + # Handle regional SwiftStack endpoints. + if parsed_endpoint_url.hostname.endswith(".s8k.io"): + storage_provider_type = "s8k" + # Handle global and regional GCS endpoints. 
+ elif parsed_endpoint_url.hostname.startswith("storage.") and parsed_endpoint_url.hostname.endswith( + ".googleapis.com" + ): + storage_provider_type = "gcs_s3" + + msc_config_dict["profiles"][profile]["storage_provider"] = { + "type": storage_provider_type, + "options": { + "base_path": "", + "endpoint_url": legacy_boto3_config["endpoint_url"], + "region_name": legacy_boto3_config["region_name"], + }, + } + + if all(_ in legacy_boto3_config for _ in ("aws_access_key_id", "aws_secret_access_key")): + msc_config_dict["profiles"][profile]["credentials_provider"] = { + "type": "S3Credentials", + "options": { + "access_key": legacy_boto3_config["aws_access_key_id"], + "secret_key": legacy_boto3_config["aws_secret_access_key"], + }, + } + else: + raise ValueError("Cannot create profile from empty legacy Boto3 config") + + assert profile is not None, "Failed to resolve MSC profile" + + # Add OpenTelemetry configuration if credentials/msc.secret file is provided. + otel_config = _get_telemetry_config_from_msc_secret() + if otel_config: + msc_config_dict.update(otel_config) + log.debug("MSC Observability is configured from credentials/msc.secret", rank0_only=True) + else: + log.debug( + "MSC Observability is not configured (credentials/msc.secret not found or invalid)", rank0_only=True + ) + + # easy_io needs backend args to be JSON-serializable for backend instance cache keys. + # + # StorageClientConfig isn't, so we need to construct it here instead of receiving one. + self._storage_client = StorageClient( + config=StorageClientConfig.from_dict(config_dict=msc_config_dict, profile=profile) + ) + + assert isinstance(path_mapping, dict) or path_mapping is None + # Make a deep copy of the path mapping to prevent external mutation. 
+ self._path_mapping = {} if path_mapping is None else copy.deepcopy(path_mapping) + for src, dst in self._path_mapping.items(): + log.info(f"Path mapping: {src} -> {dst}", rank0_only=False) + + def _translate_filepath(self, filepath: Union[str, Path], translate_url: bool = True) -> str: + """Translate a `filepath` to a string. + + Paths are of the form 'path/to/file' (path form) or '{protocol}://path/to/file' (URL form). + + Args: + filepath (str): File path to be translated. + translate_url (bool): Strip '{scheme}://' prefixes. Needed for paths passed directly to MSC storage clients. + """ + assert isinstance(filepath, (str, Path)) + + # Change to a POSIX path string. + if isinstance(filepath, str): + # If the ``filepath`` is concatenated by ``os.path.join`` in a Windows + # environment, the ``filepath`` will be the format of 'prefix\file.txt'. + filepath = re.sub(r"\\+", "/", filepath) + elif isinstance(filepath, Path): + # These should only be filesystem paths (e.g. '/path/of/file'). + # URL paths (e.g. ``Path('s3://profile/path/of/file')``) collapse '://' to ':/'. + filepath = filepath.as_posix() + else: + raise ValueError(f"Unhandled filepath type: {type(filepath)}") + + # Remap path. + # + # If there's multiple matching srcs, use the longest src (i.e. the most specific). + longest_src: str = "" + for src in self._path_mapping.keys(): + if filepath.startswith(src) and len(src) > len(longest_src): + longest_src = src + if len(longest_src) > 0: + filepath = filepath.replace(longest_src, self._path_mapping[longest_src], 1) + + # Optionally strip URL prefix then return. + # + # Don't use urlparse in case filepath is an invalid URL. + return re.sub(rf"^{_URL_PREFIX_REGEX}", "", filepath) if translate_url else filepath + + def size(self, filepath: Union[str, Path]) -> int: + """Get the file size in bytes for a given ``filepath``. + + Args: + filepath (str or Path): Path to get file size in bytes. + + Returns: + int: File size in bytes for filepath. 
+ + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.size(filepath) # file containing "hello world" + 11 + """ + path = self._translate_filepath(filepath=filepath) + return self._storage_client.info(path=path, strict=False).content_length + + def get(self, filepath: Union[str, Path], offset: Optional[int] = None, size: Optional[int] = None) -> bytes: + """Read bytes from a given ``filepath`` with 'rb' mode in range [offset, offset + size). + + Args: + filepath (str or Path): Path to read data. + offset (int, optional): Read offset in bytes (0-index). Defaults to 0. + size (int, optional): Read size in bytes. Defaults to the file size. + + Returns: + bytes: Return bytes read from filepath. + + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.get(filepath) + b'hello world' + """ + path = self._translate_filepath(filepath=filepath) + byte_range: Optional[Range] = None + if offset is not None or size is not None: + read_offset = offset or 0 + assert read_offset >= 0, "Read offset must be ≥ 0" + + # Try not to incur a remote call to get the file size. This can heavily slow down ranged reads. + # + # This means we won't always validate the read offset or read size against the file size. + read_size = size or (self.size(filepath=filepath) - read_offset) + assert read_size >= 1, "Read size must be ≥ 1 or read offset must be < file size" + + byte_range = Range(offset=read_offset, size=read_size) + + if byte_range is None: + buffer = io.BytesIO() + # `StorageClient.read()` defers to `StorageProvider.get_object()` while + # `StorageClient.download_file()` defers to `StorageProvider.download_file()`. + # + # Currently, only `StorageProvider.download_file()` supports parallel downloads + # in some storage providers (e.g. boto S3 transfer manager for S3 storage providers) + # so it's often much faster. 
+ self._storage_client.download_file(remote_path=path, local_path=buffer) + buffer.seek(0) + return buffer.read() + else: + return self._storage_client.read(path=path, byte_range=byte_range) + + def get_text( + self, + filepath: Union[str, Path], + encoding: str = "utf-8", + ) -> str: + """Read text from a given ``filepath`` with 'r' mode. + + Args: + filepath (str or Path): Path to read data. + encoding (str): The encoding format used to open the ``filepath``. + Defaults to 'utf-8'. + + Returns: + str: Expected text reading from ``filepath``. + + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.get_text(filepath) + 'hello world' + """ + return str(self.get(filepath=filepath), encoding=encoding) + + def put(self, obj: Union[bytes, io.BytesIO], filepath: Union[str, Path]) -> None: + """Write bytes to a given ``filepath``. + + Args: + obj (bytes): Data to be saved. + filepath (str or Path): Path to write data. + + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.put(b"hello world", filepath) + """ + path = self._translate_filepath(filepath=filepath) + buffer = io.BytesIO() + if isinstance(obj, bytes): + buffer.write(obj) + buffer.seek(0) + elif isinstance(obj, io.BytesIO): + buffer = obj + else: + raise ValueError(f"Unhandled obj type: {type(obj)}") + # `StorageClient.write()` defers to `StorageProvider.put_object()` while + # `StorageClient.upload_file()` defers to `StorageProvider.upload_file()`. + # + # Currently, only `StorageProvider.upload_file()` supports parallel uploads + # in some storage providers (e.g. boto S3 transfer manager for S3 storage providers) + # so it's often much faster. + self._storage_client.upload_file(remote_path=path, local_path=buffer) + + def put_text( + self, + obj: str, + filepath: Union[str, Path], + encoding: str = "utf-8", + ) -> None: + """Write text to a given ``filepath``. 
+ + Args: + obj (str): Data to be written. + filepath (str or Path): Path to write data. + encoding (str): The encoding format used to encode the ``obj``. + Defaults to 'utf-8'. + + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.put_text("hello world", filepath) + """ + self.put(obj=bytes(obj, encoding=encoding), filepath=filepath) + + def exists(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path exists. + + Args: + filepath (str or Path): Path to be checked whether exists. + + Returns: + bool: Return ``True`` if ``filepath`` exists, ``False`` otherwise. + + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.exists(filepath) + True + """ + path = self._translate_filepath(filepath=filepath) + try: + # Include directories and files. + self._storage_client.info(path=path, strict=True) + return True + except FileNotFoundError: + return False + + def isdir(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path is a directory. + + Args: + filepath (str or Path): Path to be checked whether it is a + directory. + + Returns: + bool: Return ``True`` if ``filepath`` points to a directory, + ``False`` otherwise. + + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/dir" # or "s3://path/of/file" + >>> backend.isdir(filepath) + True + """ + path = self._translate_filepath(filepath=filepath) + try: + # Include directories and files. + metadata = self._storage_client.info(path=path, strict=True) + return metadata.type == "directory" + except FileNotFoundError: + return False + + def isfile(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path is a file. + + Args: + filepath (str or Path): Path to be checked whether it is a file. + + Returns: + bool: Return ``True`` if ``filepath`` points to a file, ``False`` + otherwise. 
+ + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.isfile(filepath) + True + """ + path = self._translate_filepath(filepath=filepath) + try: + return self._storage_client.is_file(path=path) + except FileNotFoundError: + return False + + def join_path( + self, + filepath: Union[str, Path], + *filepaths: Union[str, Path], + ) -> str: + r"""Concatenate all file paths. + + Join one or more filepath components intelligently. The return value + is the concatenation of filepath and any members of \*filepaths. + + Args: + filepath (str or Path): Path to be concatenated. + + Returns: + str: The result after concatenation. + + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.join_path(filepath, "another/path") + 'path/of/file/another/path' # or "s3://path/of/file/another/path" + >>> backend.join_path(filepath, "/another/path") + 'path/of/file/another/path' # or "s3://path/of/file/another/path" + """ + filepath = self._translate_filepath(filepath=filepath, translate_url=False) + if filepath.endswith("/") and not filepath.endswith("://"): + filepath = filepath[:-1] + formatted_paths = [filepath] + for path in filepaths: + formatted_path = self._translate_filepath(filepath=path) + formatted_paths.append(formatted_path.lstrip("/")) + + return "/".join(formatted_paths) + + @contextmanager + def get_local_path( + self, + filepath: Union[str, Path], + ) -> Generator[Union[str, Path], None, None]: + """Download a file from ``filepath`` to a local temporary directory, + and return the temporary path. + + ``get_local_path`` is decorated by :meth:`contextlib.contextmanager`. It + can be called with a ``with`` statement, and on exit from the + ``with`` statement, the temporary path is released. + + Args: + filepath (str or Path): Download a file from ``filepath``. + + Yields: + Iterable[str]: Only yield one temporary path. 
+ + Examples: + >>> backend = MSCBackend() + >>> # After exiting from the ``with`` clause, + >>> # the path will be removed + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> with backend.get_local_path(filepath) as path: + ... # do something here + """ + assert self.isfile(filepath=filepath) + try: + f = tempfile.NamedTemporaryFile(delete=False) + f.write(self.get(filepath=filepath)) + f.close() + yield f.name + finally: + os.remove(f.name) + + def copyfile( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Copy a file src to dst and return the destination file. + + If dst specifies a file that already exists, it will be replaced. + + Args: + src (str or Path): A file to be copied. + dst (str or Path): Copy file to dst. + + Returns: + str: The destination file. + + Raises: + SameFileError: If src and dst are the same file, a SameFileError + will be raised. + + Examples: + >>> backend = MSCBackend() + >>> # dst is a file + >>> src = "path/of/file" # or "s3://path/of/file" + >>> dst = "path/of/file1" # or "s3://path/of/file1" + >>> backend.copyfile(src, dst) + 'path/of/file1' # or "s3://path/of/file1" + + >>> # dst is a directory + >>> dst = "path/of/dir" # or "s3://path/of/dir" + >>> backend.copyfile(src, dst) + 'path/of/dir/file' # or "s3://path/of/dir/file" + """ + if not self.isfile(filepath=src): + raise FileNotFoundError("src does not exist or is not a file") + if self.isdir(filepath=dst): + dst = self.join_path(dst, self._translate_filepath(filepath=src).split("/")[-1]) + if self._translate_filepath(filepath=src) == self._translate_filepath(filepath=dst): + raise SameFileError("src and dst should not be same") + + self.put(obj=self.get(filepath=src), filepath=dst) + + return self._translate_filepath(filepath=dst, translate_url=False) + + def copytree( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Recursively copy an entire directory tree rooted at src to a + directory named dst and return the 
destination directory. + + Args: + src (str or Path): A directory to be copied. + dst (str or Path): Copy directory to dst. + + Returns: + str: The destination directory. + + Raises: + FileExistsError: If dst had already existed, a FileExistsError will + be raised. + + Examples: + >>> backend = MSCBackend() + >>> src = "path/of/dir" # or "s3://path/of/dir" + >>> dst = "path/of/dir1" # or "s3://path/of/dir1" + >>> backend.copytree(src, dst) + 'path/of/dir1' # or "s3://path/of/dir1" + """ + if not self.isdir(filepath=src): + raise FileNotFoundError("src does not exist or is not a directory") + if self.exists(filepath=dst): + raise FileExistsError("dst should not exist") + + for path in self.list_dir_or_file(src, list_dir=False, recursive=True): + src_path = self.join_path(src, path) + dst_path = self.join_path(dst, path) + self.put(obj=self.get(filepath=src_path), filepath=dst_path) + + return self._translate_filepath(filepath=dst, translate_url=False) + + def copyfile_from_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Upload a local file src to dst and return the destination file. + + Args: + src (str or Path): A local file to be copied. + dst (str or Path): Copy file to dst. + + Returns: + str: If dst specifies a directory, the file will be copied into dst + using the base filename from src. 
+ + Examples: + >>> backend = MSCBackend() + >>> # dst is a file + >>> src = "path/of/your/file" + >>> dst = "path/of/file1" # or "s3://path/of/file1" + >>> backend.copyfile_from_local(src, dst) + 'path/of/file1' # or "s3://path/of/file1" + + >>> # dst is a directory + >>> dst = "path/of/dir" + >>> backend.copyfile_from_local(src, dst) + 'path/of/dir/file' # or "s3://path/of/dir/file" + """ + if self.isdir(filepath=dst): + dst = self.join_path(dst, os.path.basename(src)) + + with open(src, "rb") as f: + self.put(obj=f.read(), filepath=dst) + + return self._translate_filepath(filepath=dst, translate_url=False) + + def copytree_from_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> str: + """Recursively copy an entire directory tree rooted at src to a + directory named dst and return the destination directory. + + Args: + src (str or Path): A local directory to be copied. + dst (str or Path): Copy directory to dst. + + Returns: + str: The destination directory. + + Raises: + FileExistsError: If dst had already existed, a FileExistsError will + be raised. + + Examples: + >>> backend = MSCBackend() + >>> src = "path/of/your/dir" + >>> dst = "path/of/dir1" # or "s3://path/of/dir1" + >>> backend.copytree_from_local(src, dst) + 'path/of/dir1' # or "s3://path/of/dir1" + """ + if self.exists(filepath=dst): + raise FileExistsError("dst should not exist") + + src = str(src) + + for cur_dir, _, files in os.walk(src): + for f in files: + src_path = os.path.join(cur_dir, f) + dst_path = self.join_path(dst, src_path.replace(src, "")) + self.copyfile_from_local(src=src_path, dst=dst_path) + + return self._translate_filepath(filepath=dst, translate_url=False) + + def copyfile_to_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + dst_type: str, # Choose from ["file", "dir"] + ) -> Union[str, Path]: + """Copy the file src to local dst and return the destination file. 
+ + If ``dst_type`` is "dir", the file will be copied into dst using + the base filename from src. If dst specifies a file that already + exists, it will be replaced. + + Args: + src (str or Path): A file to be copied. + dst (str or Path): Copy file to local dst. + dst_type (str): Whether ``dst`` is a file or a directory. + Must be "file" or "dir". + + Returns: + str or Path: The local destination path. If ``dst_type`` is "dir", + this is ``dst`` joined with the base filename from src. + + Examples: + >>> backend = MSCBackend() + >>> # dst is a file + >>> src = "path/of/file" # or "s3://path/of/file" + >>> dst = "path/of/your/file" + >>> backend.copyfile_to_local(src, dst, dst_type="file") + 'path/of/your/file' + + >>> # dst is a directory + >>> dst = "path/of/your/dir" + >>> backend.copyfile_to_local(src, dst, dst_type="dir") + 'path/of/your/dir/file' + """ + assert dst_type in ["file", "dir"] + # There is no good way to detect whether dst is a directory or a file, so we make dst_type required + if dst_type == "dir": + basename = os.path.basename(self._translate_filepath(filepath=src)) + if isinstance(dst, str): + dst = os.path.join(dst, basename) + else: + assert isinstance(dst, Path) + dst = dst / basename + + # Create parent directory if it doesn't exist + parent_dir = os.path.dirname(dst) + os.makedirs(parent_dir, exist_ok=True) + + try: + with open(dst, "wb") as f: + data = self.get(filepath=src) + f.write(data) + except Exception as e: + log.error(f"Failed to write file: {e}") + raise + + return dst + + def copytree_to_local( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> Union[str, Path]: + """Recursively copy an entire directory tree rooted at src to a local + directory named dst and return the destination directory. + + Args: + src (str or Path): A directory to be copied. + dst (str or Path): Copy directory to local dst. + + Returns: + str: The destination directory. 
+ + Examples: + >>> backend = MSCBackend() + >>> src = "path/of/dir" # or "s3://path/of/dir" + >>> dst = "path/of/your/dir" + >>> backend.copytree_to_local(src, dst) + 'path/of/your/dir' + """ + for path in self.list_dir_or_file(dir_path=src, list_dir=False, recursive=True): + dst_path = os.path.join(dst, path) + mkdir_or_exist(os.path.dirname(dst_path)) + with open(dst_path, "wb") as f: + f.write(self.get(filepath=self.join_path(src, path))) + + return dst + + def remove(self, filepath: Union[str, Path]) -> None: + """Remove a file. + + Args: + filepath (str or Path): Path to be removed. + + Raises: + FileNotFoundError: If filepath does not exist, a FileNotFoundError + will be raised. + IsADirectoryError: If filepath is a directory, an IsADirectoryError + will be raised. + + Examples: + >>> backend = MSCBackend() + >>> filepath = "path/of/file" # or "s3://path/of/file" + >>> backend.remove(filepath) + """ + if not self.exists(filepath=filepath): + raise FileNotFoundError(f"filepath {filepath} does not exist") + + if self.isdir(filepath=filepath): + raise IsADirectoryError("filepath should be a file") + + self._storage_client.delete(path=self._translate_filepath(filepath=filepath), recursive=False) + + def rmtree(self, dir_path: Union[str, Path]) -> None: + """Recursively delete a directory tree. + + Args: + dir_path (str or Path): A directory to be removed. + + Examples: + >>> backend = MSCBackend() + >>> dir_path = "path/of/dir" # or "s3://path/of/dir" + >>> backend.rmtree(dir_path) + """ + self._storage_client.delete(path=self._translate_filepath(filepath=dir_path), recursive=True) + + def copy_if_symlink_fails( + self, + src: Union[str, Path], + dst: Union[str, Path], + ) -> bool: + """Create a symbolic link pointing to src named dst. + + Directly copy src to dst because MSCBackend does not support creating + a symbolic link. + + Args: + src (str or Path): A file or directory to be copied. + dst (str or Path): Copy a file or directory to dst. 
+ + Returns: + bool: Return False because MSCBackend does not support creating + a symbolic link. + + Examples: + >>> backend = MSCBackend() + >>> src = "path/of/file" # or "s3://path/of/file" + >>> dst = "path/of/your/file" # or "s3://path/of/your/file" + >>> backend.copy_if_symlink_fails(src, dst) + False + >>> src = "path/of/dir" # or "s3://path/of/dir" + >>> dst = "path/of/your/dir" # or "s3://path/of/your/dir" + >>> backend.copy_if_symlink_fails(src, dst) + False + """ + if self.isfile(filepath=src): + self.copyfile(src=src, dst=dst) + else: + self.copytree(src=src, dst=dst) + return False + + def list_dir(self, dir_path: Union[str, Path]) -> Generator[str, None, None]: + """List all folders in a storage location with a given prefix. + + Args: + dir_path (str | Path): Path of the directory. + + Examples: + >>> backend = MSCBackend() + >>> dir_path = "path/of/dir" # or "s3://path/of/dir" + >>> list(backend.list_dir(dir_path)) + ["subdir1/", "subdir2/"] + """ + path = self._translate_filepath(filepath=dir_path).removesuffix("/") + "/" + for metadata in self._storage_client.list(path=path, include_directories=True, include_url_prefix=False): + if metadata.type == "directory": + yield metadata.key.removeprefix(path).removesuffix("/") + "/" + + def list_dir_or_file( # pylint: disable=too-many-arguments + self, + dir_path: Union[str, Path], + list_dir: bool = True, + list_file: bool = True, + suffix: Optional[Union[str, tuple[str]]] = None, + recursive: bool = False, + ) -> Iterator[str]: + """Scan a directory to find the directories or files of interest in + arbitrary order. + + Note: + Most object stores have no concept of directories, but they simulate + the directory hierarchy in the filesystem through public prefixes. + In addition, if the returned path ends with '/', it means the path + is a public prefix which is a logical directory. + + Note: + :meth:`list_dir_or_file` returns the path relative to ``dir_path``. 
+ In addition, the returned path of a directory will not contain the + suffix '/', which is consistent with other backends. + + Args: + dir_path (str | Path): Path of the directory. + list_dir (bool): List the directories. Defaults to True. + list_file (bool): List the path of files. Defaults to True. + suffix (str or tuple[str], optional): File suffix + that we are interested in. Defaults to None. + recursive (bool): If set to True, recursively scan the + directory. Defaults to False. + + Yields: + Iterable[str]: A relative path to ``dir_path``. + + Examples: + >>> backend = MSCBackend() + >>> dir_path = "path/of/dir" # or "s3://path/of/dir" + >>> # list those files and directories in current directory + >>> list(backend.list_dir_or_file(dir_path)) + ["file.txt", "subdir", "subdir/cat.png", "subdir/subsubdir/dog.jpg"] + >>> # only list files + >>> list(backend.list_dir_or_file(dir_path, list_dir=False)) + ["file.txt", "subdir/cat.png", "subdir/subsubdir/dog.jpg"] + >>> # only list directories (recursive; use `list_dir` for first-level directories) + >>> list(backend.list_dir_or_file(dir_path, list_file=False, recursive=True)) + ["subdir", "subdir/subsubdir"] + >>> # only list files ending with specified suffixes (requires list_dir=False) + >>> list(backend.list_dir_or_file(dir_path, list_dir=False, suffix=".txt")) + ["file.txt"] + >>> # list all files and directories recursively + >>> list(backend.list_dir_or_file(dir_path, recursive=True)) + ["file.txt", "subdir", "subdir/cat.png", "subdir/subsubdir", "subdir/subsubdir/dog.jpg"] + """ + dir_path = self._translate_filepath(filepath=dir_path).removesuffix("/") + "/" + + if list_dir and suffix is not None: + raise TypeError("`list_dir` should be False when `suffix` is not None") + + if list_dir and not list_file and not recursive: + raise TypeError( + "Please use `list_dir` instead of `list_dir_or_file` " + "when you only want to list the first level directories." 
+ ) + + if (suffix is not None) and not isinstance(suffix, (str, tuple)): + raise TypeError("`suffix` must be a string or tuple of strings") + + yielded_subdir_paths: set[str] = set() + # In the MSC, the `include_directories` option switches between flat and hierarchical for both files and "directories". + # + # In the Boto3Backend, however, the `recursive` option only applies to "directories" (seems like a bug). + # + # Construct directories from file paths to match the Boto3Backend behavior. + # + # If this behavior needs to be fixed, switch to `include_directories=(not recursive)` and adjust metadata processing. + for metadata in self._storage_client.list(path=dir_path, include_directories=False, include_url_prefix=False): + # Only files should be returned with `include_directories=False`, but just in case. + if metadata.type == "file": + rel_path: str = metadata.key.removeprefix(dir_path) + if list_dir: + rel_path_fragments = rel_path.split("/") + if len(rel_path_fragments) > 1: + for i in range(len(rel_path_fragments) - 1 if recursive else 1): + subdir_path = "/".join(rel_path_fragments[: i + 1]) + if subdir_path not in yielded_subdir_paths: + yielded_subdir_paths.add(subdir_path) + yield subdir_path + if list_file: + if suffix is None or rel_path.endswith(suffix): + yield rel_path + + def generate_presigned_url(self, url: str, client_method: str = "get_object", expires_in: int = 3600) -> str: + """Generate the presigned url of video stream which can be passed to + mmcv.VideoReader. Now only work on Boto3 backend. + + Note: + Now only work on Boto3 backend. + + Args: + url (str): Url of video stream. + client_method (str): Method of client, 'get_object' or + 'put_object'. Default: 'get_object'. + expires_in (int): expires, in seconds. Default: 3600. + + Returns: + str: Generated presigned url. 
+ """ + raise NotImplementedError("generate_presigned_url is not supported in MSCBackend") diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/registry_utils.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/registry_utils.py new file mode 100644 index 00000000..7ea93250 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/backends/registry_utils.py @@ -0,0 +1,134 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import inspect +from typing import Optional, Type, Union + +from cosmos3._src.imaginaire.flags import TRAINING +from cosmos3._src.imaginaire.utils.easy_io.backends.base_backend import BaseStorageBackend +from cosmos3._src.imaginaire.utils.easy_io.backends.http_backend import HTTPBackend +from cosmos3._src.imaginaire.utils.easy_io.backends.local_backend import LocalBackend + +backends: dict = {} +prefix_to_backends: dict = {} + + +def _register_backend( + name: str, + backend: Type[BaseStorageBackend], + force: bool = False, + prefixes: Union[str, list, tuple, None] = None, +): + """Register a backend. + + Args: + name (str): The name of the registered backend. + backend (BaseStorageBackend): The backend class to be registered, + which must be a subclass of :class:`BaseStorageBackend`. 
+ force (bool): Whether to override the backend if the name has already + been registered. Defaults to False. + prefixes (str or list[str] or tuple[str], optional): The prefix + of the registered storage backend. Defaults to None. + """ + global backends, prefix_to_backends + + if not isinstance(name, str): + raise TypeError(f"the backend name should be a string, but got {type(name)}") + + if not inspect.isclass(backend): + raise TypeError(f"backend should be a class, but got {type(backend)}") + if not issubclass(backend, BaseStorageBackend): + raise TypeError(f"backend {backend} is not a subclass of BaseStorageBackend") + + if name in backends and not force: + raise ValueError( + f'{name} is already registered as a storage backend, add "force=True" if you want to override it' + ) + backends[name] = backend + + if prefixes is not None: + if isinstance(prefixes, str): + prefixes = [prefixes] + else: + assert isinstance(prefixes, (list, tuple)) + + for prefix in prefixes: + if prefix in prefix_to_backends and not force: + raise ValueError( + f'{prefix} is already registered as a storage backend, add "force=True" if you want to override it' + ) + + prefix_to_backends[prefix] = backend + + +def register_backend( + name: str, + backend: Optional[Type[BaseStorageBackend]] = None, + force: bool = False, + prefixes: Union[str, list, tuple, None] = None, +): + """Register a backend. + + Args: + name (str): The name of the registered backend. + backend (class, optional): The backend class to be registered, + which must be a subclass of :class:`BaseStorageBackend`. + When this method is used as a decorator, backend is None. + Defaults to None. + force (bool): Whether to override the backend if the name has already + been registered. Defaults to False. + prefixes (str or list[str] or tuple[str], optional): The prefix + of the registered storage backend. Defaults to None. + + This method can be used as a normal method or a decorator. 
+ + Examples: + + >>> class NewBackend(BaseStorageBackend): + ... def get(self, filepath): + ... return filepath + ... + ... def get_text(self, filepath): + ... return filepath + >>> register_backend('new', NewBackend) + + >>> @register_backend('new') + ... class NewBackend(BaseStorageBackend): + ... def get(self, filepath): + ... return filepath + ... + ... def get_text(self, filepath): + ... return filepath + """ + if backend is not None: + _register_backend(name, backend, force=force, prefixes=prefixes) + return + + def _register(backend_cls): + _register_backend(name, backend_cls, force=force, prefixes=prefixes) + return backend_cls + + return _register + + +register_backend("local", LocalBackend, prefixes="") +register_backend("http", HTTPBackend, prefixes=["http", "https"]) + +if TRAINING: + from cosmos3._src.imaginaire.utils.easy_io.backends.msc_backend import MSCBackend + + # To avoid breaking backward Compatibility, 's3' is also used as a + # prefix for MSCBackend + register_backend("s3", MSCBackend, prefixes=["s3"]) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/easy_io.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/easy_io.py new file mode 100644 index 00000000..6b66d145 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/easy_io.py @@ -0,0 +1,1117 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +import json +import warnings +from contextlib import contextmanager +from io import BytesIO, StringIO +from pathlib import Path +from typing import IO, Any, Generator, Iterator, Optional, Tuple, Union + +from cosmos3._src.imaginaire.utils.easy_io.backends import backends, prefix_to_backends +from cosmos3._src.imaginaire.utils.easy_io.file_client import FileClient +from cosmos3._src.imaginaire.utils.easy_io.handlers import file_handlers + +backend_instances: dict = {} + + +def is_filepath(filepath): + return isinstance(filepath, (str, Path)) + + +def _parse_uri_prefix(uri: Union[str, Path]) -> str: + """Parse the prefix of uri. + + Args: + uri (str or Path): Uri to be parsed that contains the file prefix. + + Examples: + >>> _parse_uri_prefix('/home/path/of/your/file') + '' + >>> _parse_uri_prefix('s3://path/of/your/file') + 's3' + >>> _parse_uri_prefix('clusterName:s3://path/of/your/file') + 's3' + + Returns: + str: Return the prefix of uri if the uri contains '://'. Otherwise, + return ''. + """ + assert is_filepath(uri) + uri = str(uri) + # if uri does not contains '://', the uri will be handled by + # LocalBackend by default + if "://" not in uri: + return "" + else: + prefix, _ = uri.split("://") + # In the case of Boto3Backend, the prefix may contain the cluster + # name like clusterName:s3://path/of/your/file + if ":" in prefix: + _, prefix = prefix.split(":") + return prefix + + +def _get_file_backend(prefix: str, backend_args: dict): + """Return a file backend based on the prefix or backend_args. + + Args: + prefix (str): Prefix of uri. + backend_args (dict): Arguments to instantiate the corresponding + backend. 
+ """ + # backend name has a higher priority + if "backend" in backend_args: + # backend_args should not be modified + backend_args_bak = backend_args.copy() + backend_name = backend_args_bak.pop("backend") + backend = backends[backend_name](**backend_args_bak) + else: + backend = prefix_to_backends[prefix](**backend_args) + return backend + + +def set_s3_backend( + key: str = "s3:{}", + backend_args: Optional[dict] = None, +): + """register s3 backend. + + Args: + key str: The key to register the s3 backend. Defaults to s3. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + """ + global backend_instances + if backend_args is None: + backend_args = {} + backend = _get_file_backend(key, backend_args) + backend_instances[key] = backend + return backend + + +def get_file_backend( + uri: Union[str, Path, None] = None, + *, + backend_args: Optional[dict] = None, + enable_singleton: bool = False, + backend_key: Optional[str] = None, +): + """Return a file backend based on the prefix of uri or backend_args. + + Args: + uri (str or Path): Uri to be parsed that contains the file prefix. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + enable_singleton (bool): Whether to enable the singleton pattern. + If it is True, the backend created will be reused if the + signature is same with the previous one. Defaults to False. + backend_key: str: The key to register the backend. Defaults to None. + + Returns: + BaseStorageBackend: Instantiated Backend object. 
+ + Examples: + >>> # get file backend based on the prefix of uri + >>> uri = 's3://path/of/your/file' + >>> backend = get_file_backend(uri) + >>> # get file backend based on the backend_args + >>> backend = get_file_backend(backend_args={'backend': 's3'}) + >>> # backend name has a higher priority if 'backend' in backend_args + >>> backend = get_file_backend(uri, backend_args={'backend': 's3'}) + """ + global backend_instances + if backend_key is not None: + if backend_key in backend_instances: + return backend_instances[backend_key] + + if backend_args is None: + backend_args = {} + + if uri is None and "backend" not in backend_args and backend_key is None: + raise ValueError('uri should not be None when "backend" does not exist in backend_args and backend_key is None') + + if uri is not None: + prefix = _parse_uri_prefix(uri) + else: + prefix = "" + + if enable_singleton: + + unique_key = f"{prefix}:{json.dumps(backend_args)}" + if unique_key in backend_instances: + return backend_instances[unique_key] + + backend = _get_file_backend(prefix, backend_args) + backend_instances[unique_key] = backend + if backend_key is not None: + backend_instances[backend_key] = backend + return backend + else: + backend = _get_file_backend(prefix, backend_args) + return backend + + +def size( + filepath: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> int: + """Get the file size in bytes for a given ``filepath``. + + Args: + filepath (str or Path): Path to get file size in bytes. + + Returns: + int: File size in bytes for filepath. 
+ + Examples: + >>> filepath = 'path/of/file' + >>> size(filepath) # file containing 'hello world' + 11 + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + return backend.size(filepath) + + +def get( + filepath: Union[str, Path], + offset: Optional[int] = None, + size: Optional[int] = None, + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> bytes: + """Read bytes from a given ``filepath`` with 'rb' mode in range [offset, offset + size). + + Args: + filepath (str or Path): Path to read data. + offset (int, optional): Read offset in bytes (0-index). Defaults to 0. + size (int, optional): Read size in bytes. Defaults to the file size. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + backend_key (str, optional): The key to get the backend from register. + + Returns: + bytes: Expected bytes object. + + Examples: + >>> filepath = '/path/of/file' + >>> get(filepath) + b'hello world' + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + return backend.get(filepath, offset=offset, size=size) + + +def get_text( + filepath: Union[str, Path], + encoding="utf-8", + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> str: + """Read text from a given ``filepath`` with 'r' mode. + + Args: + filepath (str or Path): Path to read data. + encoding (str): The encoding format used to open the ``filepath``. + Defaults to 'utf-8'. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + backend_key (str, optional): The key to get the backend from register. + + Returns: + str: Expected text reading from ``filepath``. 
+ + Examples: + >>> filepath = '/path/of/file' + >>> get_text(filepath) + 'hello world' + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + return backend.get_text(filepath, encoding) + + +def put( + obj: bytes, + filepath: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> None: + """Write bytes to a given ``filepath`` with 'wb' mode. + + Note: + ``put`` should create a directory if the directory of + ``filepath`` does not exist. + + Args: + obj (bytes): Data to be written. + filepath (str or Path): Path to write data. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + backend_key (str, optional): The key to get the backend from register. + + Examples: + >>> filepath = '/path/of/file' + >>> put(b'hello world', filepath) + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + backend.put(obj, filepath) + + +def put_text( + obj: str, + filepath: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> None: + """Write text to a given ``filepath`` with 'w' mode. + + Note: + ``put_text`` should create a directory if the directory of + ``filepath`` does not exist. + + Args: + obj (str): Data to be written. + filepath (str or Path): Path to write data. + encoding (str, optional): The encoding format used to open the + ``filepath``. Defaults to 'utf-8'. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + backend_key (str, optional): The key to get the backend from register. 
+ + Examples: + >>> filepath = '/path/of/file' + >>> put_text('hello world', filepath) + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + backend.put_text(obj, filepath) + + +def exists( + filepath: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> bool: + """Check whether a file path exists. + + Args: + filepath (str or Path): Path to be checked whether exists. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + backend_key (str, optional): The key to get the backend from register. + + Returns: + bool: Return ``True`` if ``filepath`` exists, ``False`` otherwise. + + Examples: + >>> filepath = '/path/of/file' + >>> exists(filepath) + True + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + return backend.exists(filepath) + + +def isdir( + filepath: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> bool: + """Check whether a file path is a directory. + + Args: + filepath (str or Path): Path to be checked whether it is a + directory. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + backend_key (str, optional): The key to get the backend from register. + + Returns: + bool: Return ``True`` if ``filepath`` points to a directory, + ``False`` otherwise. + + Examples: + >>> filepath = '/path/of/dir' + >>> isdir(filepath) + True + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + return backend.isdir(filepath) + + +def isfile( + filepath: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> bool: + """Check whether a file path is a file. 
+ + Args: + filepath (str or Path): Path to be checked whether it is a file. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + backend_key (str, optional): The key to get the backend from the registry. + + Returns: + bool: Return ``True`` if ``filepath`` points to a file, ``False`` + otherwise. + + Examples: + >>> filepath = '/path/of/file' + >>> isfile(filepath) + True + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + return backend.isfile(filepath) + + +def join_path( + filepath: Union[str, Path], + *filepaths: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> Union[str, Path]: + r"""Concatenate all file paths. + + Join one or more filepath components intelligently. The return value + is the concatenation of filepath and any members of \*filepaths. + + Args: + filepath (str or Path): Path to be concatenated. + *filepaths (str or Path): Other paths to be concatenated. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + backend_key (str, optional): The key to get the backend from the registry. + + Returns: + str: The result of concatenation. + + Examples: + >>> filepath1 = '/path/of/dir1' + >>> filepath2 = 'dir2' + >>> filepath3 = 'path/of/file' + >>> join_path(filepath1, filepath2, filepath3) + '/path/of/dir1/dir2/path/of/file' + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + return backend.join_path(filepath, *filepaths) + + +@contextmanager +def get_local_path( + filepath: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> Generator[Union[str, Path], None, None]: + """Download data from ``filepath`` and write the data to a local path.
+ + ``get_local_path`` is decorated by :func:`contextlib.contextmanager`. It + can be used in a ``with`` statement, and on exit from the + ``with`` statement, the temporary path will be released. + + Note: + If the ``filepath`` is a local path, it is returned as-is and will + not be released (removed). + + Args: + filepath (str or Path): Path to read data from. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Yields: + Iterable[str]: Only yield one path. + + Examples: + >>> with get_local_path('s3://bucket/abc.jpg') as path: + ... # do something here + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + with backend.get_local_path(str(filepath)) as local_path: + yield local_path + + +def copyfile( + src: Union[str, Path], + dst: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> Union[str, Path]: + """Copy a file src to dst and return the destination file. + + src and dst should have the same prefix. If dst specifies a directory, + the file will be copied into dst using the base filename from src. If + dst specifies a file that already exists, it will be replaced. + + Args: + src (str or Path): A file to be copied. + dst (str or Path): Copy file to dst. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Returns: + str: The destination file. + + Raises: + SameFileError: If src and dst are the same file, a SameFileError will + be raised.
+ + Examples: + >>> # dst is a file + >>> src = '/path/of/file' + >>> dst = '/path1/of/file1' + >>> # src will be copied to '/path1/of/file1' + >>> copyfile(src, dst) + '/path1/of/file1' + + >>> # dst is a directory + >>> dst = '/path1/of/dir' + >>> # src will be copied to '/path1/of/dir/file' + >>> copyfile(src, dst) + '/path1/of/dir/file' + """ + backend = get_file_backend(src, backend_args=backend_args, enable_singleton=True, backend_key=backend_key) + return backend.copyfile(src, dst) + + +def copytree( + src: Union[str, Path], + dst: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> Union[str, Path]: + """Recursively copy an entire directory tree rooted at src to a directory + named dst and return the destination directory. + + src and dst should have the same prefix and dst must not already exist. + + Args: + src (str or Path): A directory to be copied. + dst (str or Path): Copy directory to dst. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + backend_key (str, optional): The key to get the backend from register. + + Returns: + str: The destination directory. + + Raises: + FileExistsError: If dst had already existed, a FileExistsError will be + raised. + + Examples: + >>> src = '/path/of/dir1' + >>> dst = '/path/of/dir2' + >>> copytree(src, dst) + '/path/of/dir2' + """ + backend = get_file_backend(src, backend_args=backend_args, enable_singleton=True, backend_key=backend_key) + return backend.copytree(src, dst) + + +def copyfile_from_local( + src: Union[str, Path], + dst: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> Union[str, Path]: + """Copy a local file src to dst and return the destination file. + + Note: + If the backend is the instance of LocalBackend, it does the same + thing with :func:`copyfile`. + + Args: + src (str or Path): A local file to be copied. 
+ dst (str or Path): Copy file to dst. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Returns: + str: If dst specifies a directory, the file will be copied into dst + using the base filename from src. + + Examples: + >>> # dst is a file + >>> src = '/path/of/file' + >>> dst = 's3://openmmlab/mmengine/file1' + >>> # src will be copied to 's3://openmmlab/mmengine/file1' + >>> copyfile_from_local(src, dst) + s3://openmmlab/mmengine/file1 + + >>> # dst is a directory + >>> dst = 's3://openmmlab/mmengine' + >>> # src will be copied to 's3://openmmlab/mmengine/file'' + >>> copyfile_from_local(src, dst) + 's3://openmmlab/mmengine/file' + """ + backend = get_file_backend(dst, backend_args=backend_args, enable_singleton=True, backend_key=backend_key) + return backend.copyfile_from_local(src, dst) + + +def copytree_from_local( + src: Union[str, Path], + dst: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> Union[str, Path]: + """Recursively copy an entire directory tree rooted at src to a directory + named dst and return the destination directory. + + Note: + If the backend is the instance of LocalBackend, it does the same + thing with :func:`copytree`. + + Args: + src (str or Path): A local directory to be copied. + dst (str or Path): Copy directory to dst. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Returns: + str: The destination directory. 
+ + Examples: + >>> src = '/path/of/dir' + >>> dst = 's3://openmmlab/mmengine/dir' + >>> copytree_from_local(src, dst) + 's3://openmmlab/mmengine/dir' + """ + backend = get_file_backend(dst, backend_args=backend_args, enable_singleton=True, backend_key=backend_key) + return backend.copytree_from_local(src, dst) + + +def copyfile_to_local( + src: Union[str, Path], + dst: Union[str, Path], + dst_type: str, # Choose from ["file", "dir"] + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> Union[str, Path]: + """Copy the file src to local dst and return the destination file. + + If dst specifies a directory, the file will be copied into dst using + the base filename from src. If dst specifies a file that already + exists, it will be replaced. + + Note: + If the backend is an instance of LocalBackend, this does the same + thing as :func:`copyfile`. + + Args: + src (str or Path): A file to be copied. + dst (str or Path): Copy file to local dst. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Returns: + str: If dst specifies a directory, the file will be copied into dst + using the base filename from src.
+ + Examples: + >>> # dst is a file + >>> src = 's3://openmmlab/mmengine/file' + >>> dst = '/path/of/file' + >>> # src will be copied to '/path/of/file' + >>> copyfile_to_local(src, dst, dst_type='file') + '/path/of/file' + + >>> # dst is a directory + >>> dst = '/path/of/dir' + >>> # src will be copied to '/path/of/dir/file' + >>> copyfile_to_local(src, dst, dst_type='dir') + '/path/of/dir/file' + """ + assert dst_type in ["file", "dir"] + Path(dst).parent.mkdir(parents=True, exist_ok=True) + backend = get_file_backend(src, backend_args=backend_args, enable_singleton=True, backend_key=backend_key) + return backend.copyfile_to_local(src, dst, dst_type=dst_type) + + +def copytree_to_local( + src: Union[str, Path], + dst: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> Union[str, Path]: + """Recursively copy an entire directory tree rooted at src to a local + directory named dst and return the destination directory. + + Note: + If the backend is an instance of LocalBackend, this does the same + thing as :func:`copytree`. + + Args: + src (str or Path): A directory to be copied. + dst (str or Path): Copy directory to local dst. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Returns: + str: The destination directory. + + Examples: + >>> src = 's3://openmmlab/mmengine/dir' + >>> dst = '/path/of/dir' + >>> copytree_to_local(src, dst) + '/path/of/dir' + """ + Path(dst).parent.mkdir(parents=True, exist_ok=True) + backend = get_file_backend(src, backend_args=backend_args, enable_singleton=True, backend_key=backend_key) + return backend.copytree_to_local(src, dst) + + +def remove( + filepath: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> None: + """Remove a file. + + Args: + filepath (str or Path): Path to be removed. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None.
+ + Raises: + FileNotFoundError: If filepath does not exist, a FileNotFoundError + will be raised. + IsADirectoryError: If filepath is a directory, an IsADirectoryError + will be raised. + + Examples: + >>> filepath = '/path/of/file' + >>> remove(filepath) + """ + backend = get_file_backend( + filepath, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + backend.remove(filepath) + + +def rmtree( + dir_path: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> None: + """Recursively delete a directory tree. + + Args: + dir_path (str or Path): A directory to be removed. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Examples: + >>> dir_path = '/path/of/dir' + >>> rmtree(dir_path) + """ + backend = get_file_backend( + dir_path, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + backend.rmtree(dir_path) + + +def copy_if_symlink_fails( + src: Union[str, Path], + dst: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> bool: + """Create a symbolic link pointing to src named dst. + + If creating the symbolic link fails, copy src to dst directly + instead. + + Args: + src (str or Path): Create a symbolic link pointing to src. + dst (str or Path): Create a symbolic link named dst. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Returns: + bool: Return True if the symbolic link pointing to src was created + successfully. Otherwise, return False.
+ + Examples: + >>> src = '/path/of/file' + >>> dst = '/path1/of/file1' + >>> copy_if_symlink_fails(src, dst) + True + >>> src = '/path/of/dir' + >>> dst = '/path1/of/dir1' + >>> copy_if_symlink_fails(src, dst) + True + """ + backend = get_file_backend(src, backend_args=backend_args, enable_singleton=True, backend_key=backend_key) + return backend.copy_if_symlink_fails(src, dst) + + +def list_dir( + dir_path: Union[str, Path], + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +): + """List all folders in a storage location with a given prefix. + + Args: + dir_path (str | Path): Path of the directory. + + Examples: + >>> dir_path = '/path/of/dir' + >>> for file_path in list_dir(dir_path): + ... print(file_path) + """ + if not str(dir_path).endswith("/"): + dir_path = str(dir_path) + "/" + backend = get_file_backend( + dir_path, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + + return backend.list_dir(dir_path) + + +def list_dir_or_file( + dir_path: Union[str, Path], + list_dir: bool = True, + list_file: bool = True, + suffix: Optional[Union[str, Tuple[str]]] = None, + recursive: bool = False, + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> Iterator[str]: + """Scan a directory to find the directories or files of interest in + arbitrary order. + + Note: + :meth:`list_dir_or_file` returns the path relative to ``dir_path``. + + Args: + dir_path (str or Path): Path of the directory. + list_dir (bool): List the directories. Defaults to True. + list_file (bool): List the path of files. Defaults to True. + suffix (str or tuple[str], optional): File suffix that we are + interested in. Defaults to None. + recursive (bool): If set to True, recursively scan the directory. + Defaults to False. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Yields: + Iterable[str]: A relative path to ``dir_path``.
+ + Examples: + >>> dir_path = '/path/of/dir' + >>> for file_path in list_dir_or_file(dir_path): + ... print(file_path) + >>> # list those files and directories in current directory + >>> for file_path in list_dir_or_file(dir_path): + ... print(file_path) + >>> # only list files + >>> for file_path in list_dir_or_file(dir_path, list_dir=False): + ... print(file_path) + >>> # only list directories + >>> for file_path in list_dir_or_file(dir_path, list_file=False): + ... print(file_path) + >>> # only list files ending with specified suffixes + >>> for file_path in list_dir_or_file(dir_path, suffix='.txt'): + ... print(file_path) + >>> # list all files and directory recursively + >>> for file_path in list_dir_or_file(dir_path, recursive=True): + ... print(file_path) + """ + backend = get_file_backend( + dir_path, + backend_args=backend_args, + enable_singleton=True, + backend_key=backend_key, + ) + yield from backend.list_dir_or_file(dir_path, list_dir, list_file, suffix, recursive) + + +def generate_presigned_url( + url: str, + client_method: str = "get_object", + expires_in: int = 3600, + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> str: + """Generate the presigned url of video stream which can be passed to + mmcv.VideoReader. Now only work on s3 backend. + + Note: + Now only work on s3 backend. + + Args: + url (str): Url of video stream. + client_method (str): Method of client, 'get_object' or + 'put_object'. Defaults to 'get_object'. + expires_in (int): expires, in seconds. Defaults to 3600. + backend_args (dict, optional): Arguments to instantiate the + corresponding backend. Defaults to None. + + Returns: + str: Generated presigned url. 
+ """ + backend = get_file_backend(url, backend_args=backend_args, enable_singleton=True, backend_key=backend_key) + return backend.generate_presigned_url(url, client_method, expires_in) + + +def load( + file: Union[str, Path, IO[Any]], + file_format: Optional[str] = None, + file_client_args: Optional[dict] = None, + fast_backend: bool = False, + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, + **kwargs, +): + """Load data from json/yaml/pickle files. + + This method provides a unified api for loading data from serialized files. + + ``load`` supports loading data from serialized files that can be stored + in different backends. + + Args: + file (str or :obj:`Path` or file-like object): Filename or a file-like + object. + file_format (str, optional): If not specified, the file format will be + inferred from the file extension, otherwise use the specified one. + Currently supported formats include "json", "yaml/yml" and + "pickle/pkl". + file_client_args (dict, optional): Arguments to instantiate a + FileClient. See :class:`mmengine.fileio.FileClient` for details. + Defaults to None. It will be deprecated in future. Please use + ``backend_args`` instead. + fast_backend: bool: Whether to use multiprocess. Defaults to False. + backend_args (dict, optional): Arguments to instantiate the + prefix of uri corresponding backend. Defaults to None. + New in v0.2.0. + + Examples: + >>> load('/path/of/your/file') # file is stored on disk + >>> load('https://path/of/your/file') # file is stored on the Internet + >>> load('s3://path/of/your/file') # file is stored in s3 + + Returns: + The content from the file.
+ """ + if isinstance(file, Path): + file = str(file) + if file_format is None and isinstance(file, str): + file_format = file.split(".")[-1] + # convert file_format to lower case + file_format = file_format.lower() + if file_format not in file_handlers: + raise TypeError(f"Unsupported format: {file_format}") + + if file_client_args is not None: + warnings.warn( + '"file_client_args" will be deprecated in future. Please use "backend_args" instead', + DeprecationWarning, + ) + if backend_args is not None: + raise ValueError('"file_client_args" and "backend_args" cannot be set at the same time.') + + handler = file_handlers[file_format] + if isinstance(file, str): + if file_client_args is not None: + file_client = FileClient.infer_client(file_client_args, file) + file_backend = file_client + else: + file_backend = get_file_backend( + file, + backend_args=backend_args, + backend_key=backend_key, + enable_singleton=True, + ) + + if handler.str_like: + with StringIO(file_backend.get_text(file)) as f: + obj = handler.load_from_fileobj(f, **kwargs) + else: + if fast_backend: + if hasattr(file_backend, "fast_get"): + with BytesIO(file_backend.fast_get(file)) as f: + obj = handler.load_from_fileobj(f, **kwargs) + else: + warnings.warn( + f"fast_backend is not supported by the backend, type {type(file_backend)} fallback to normal get" + ) + with BytesIO(file_backend.get(file)) as f: + obj = handler.load_from_fileobj(f, **kwargs) + else: + with BytesIO(file_backend.get(file)) as f: + obj = handler.load_from_fileobj(f, **kwargs) + elif hasattr(file, "read"): + obj = handler.load_from_fileobj(file, **kwargs) + else: + raise TypeError('"file" must be a filepath str or a file-object') + return obj + + +def dump( + obj: Any, + file: Union[str, Path, IO[Any], None] = None, + file_format: Optional[str] = None, + file_client_args: Optional[dict] = None, + fast_backend: bool = False, + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, + **kwargs, +): + """Dump
data to json/yaml/pickle strings or files. + + This method provides a unified api for dumping data as strings or to files, + and also supports custom arguments for each file format. + + ``dump`` supports dumping data as strings or to files which is saved to + different backends. + + Args: + obj (any): The python object to be dumped. + file (str or :obj:`Path` or file-like object, optional): If not + specified, then the object is dumped to a str, otherwise to a file + specified by the filename or file-like object. + file_format (str, optional): Same as :func:`load`. + file_client_args (dict, optional): Arguments to instantiate a + FileClient. See :class:`mmengine.fileio.FileClient` for details. + Defaults to None. It will be deprecated in future. Please use + ``backend_args`` instead. + fast_backend: bool: Whether to use multiprocess. Defaults to False. + backend_args (dict, optional): Arguments to instantiate the + prefix of uri corresponding backend. Defaults to None. + New in v0.2.0. + backend_key: str: The key to register the backend. Defaults to None. + + Examples: + >>> dump('hello world', '/path/of/your/file') # disk + >>> dump('hello world', 's3://path/of/your/file') # ceph or s3 + + Returns: + bool: True for success, False otherwise. + """ + if isinstance(file, Path): + file = str(file) + if file_format is None: + if isinstance(file, str): + file_format = file.split(".")[-1] + elif file is None: + raise ValueError("file_format must be specified since file is None") + # convert file_format to lower case + file_format = file_format.lower() + if file_format not in file_handlers: + raise TypeError(f"Unsupported format: {file_format}") + + if file_client_args is not None: + warnings.warn( + '"file_client_args" will be deprecated in future. 
Please use "backend_args" instead', + DeprecationWarning, + ) + if backend_args is not None: + raise ValueError('"file_client_args" and "backend_args" cannot be set at the same time.') + + handler = file_handlers[file_format] + if file is None: + return handler.dump_to_str(obj, **kwargs) + elif isinstance(file, str): + if file_client_args is not None: + file_client = FileClient.infer_client(file_client_args, file) + file_backend = file_client + else: + file_backend = get_file_backend( + file, + backend_args=backend_args, + backend_key=backend_key, + enable_singleton=True, + ) + + if handler.str_like: + with StringIO() as f: + handler.dump_to_fileobj(obj, f, **kwargs) + file_backend.put_text(f.getvalue(), file) + else: + with BytesIO() as f: + handler.dump_to_fileobj(obj, f, **kwargs) + if fast_backend: + if hasattr(file_backend, "fast_put"): + file_backend.fast_put(f, file) + else: + warnings.warn("fast_backend is not supported by the backend, fallback to normal put") + file_backend.put(f, file) + else: + file_backend.put(f, file) + elif hasattr(file, "write"): + handler.dump_to_fileobj(obj, file, **kwargs) + else: + raise TypeError('"file" must be a filename str or a file-object') diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/file_client.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/file_client.py new file mode 100644 index 00000000..b283d8bf --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/file_client.py @@ -0,0 +1,459 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import inspect +from contextlib import contextmanager +from pathlib import Path +from typing import Any, Generator, Iterator, Optional, Tuple, Union + +from cosmos3._src.imaginaire.flags import TRAINING +from cosmos3._src.imaginaire.utils.easy_io.backends import BaseStorageBackend, HTTPBackend, LocalBackend + + +def is_filepath(filepath): + return isinstance(filepath, (str, Path)) + + +class HardDiskBackend(LocalBackend): + """Raw hard disks storage backend.""" + + @property + def name(self): + return self.__class__.__name__ + + +class FileClient: + """A general file client to access files in different backends. + + The client loads a file or text in a specified backend from its path + and returns it as a binary or text file. There are two ways to choose a + backend, the name of backend and the prefix of path. Although both of them + can be used to choose a storage backend, ``backend`` has a higher priority + that is if they are all set, the storage backend will be chosen by the + backend argument. If they are all `None`, the disk backend will be chosen. + Note that It can also register other backend accessor with a given name, + prefixes, and backend class. In addition, We use the singleton pattern to + avoid repeated object creation. If the arguments are the same, the same + object will be returned. + + Warning: + `FileClient` will be deprecated in future. Please use io functions + in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io + + Args: + backend (str, optional): The storage backend type. 
Options are "disk", + "memcached", "lmdb", "http" and "s3". Defaults to None. + prefix (str, optional): The prefix of the registered storage backend. + Options are "s3", "http", "https". Defaults to None. + + Examples: + >>> # only set backend + >>> file_client = FileClient(backend='s3') + >>> # only set prefix + >>> file_client = FileClient(prefix='s3') + >>> # set both backend and prefix but use backend to choose client + >>> file_client = FileClient(backend='s3', prefix='s3') + >>> # if the arguments are the same, the same object is returned + >>> file_client1 = FileClient(backend='s3') + >>> file_client1 is file_client + True + + Attributes: + client (:obj:`BaseStorageBackend`): The backend object. + """ + + _backends = { + "disk": HardDiskBackend, + "http": HTTPBackend, + } + + _prefix_to_backends: dict = { + "http": HTTPBackend, + "https": HTTPBackend, + } + + if TRAINING: + from cosmos3._src.imaginaire.utils.easy_io.backends.msc_backend import MSCBackend + + _backends["s3"] = MSCBackend + _backends["msc"] = MSCBackend + _prefix_to_backends["s3"] = MSCBackend + + _instances: dict = {} + + client: Any + + def __new__(cls, backend=None, prefix=None, **kwargs): + if backend is None and prefix is None: + backend = "disk" + if backend is not None and backend not in cls._backends: + raise ValueError( + f"Backend {backend} is not supported. Currently supported ones are {list(cls._backends.keys())}" + ) + if prefix is not None and prefix not in cls._prefix_to_backends: + raise ValueError( + f"prefix {prefix} is not supported. 
Currently supported ones are {list(cls._prefix_to_backends.keys())}" + ) + + # concatenate the arguments to a unique key for determining whether + # objects with the same arguments were created + arg_key = f"{backend}:{prefix}" + for key, value in kwargs.items(): + arg_key += f":{key}:{value}" + + # if a backend was overridden, it will create a new object + if arg_key in cls._instances: + _instance = cls._instances[arg_key] + else: + # create a new object and put it to _instance + _instance = super().__new__(cls) + if backend is not None: + _instance.client = cls._backends[backend](**kwargs) + else: + _instance.client = cls._prefix_to_backends[prefix](**kwargs) + + cls._instances[arg_key] = _instance + + return _instance + + @property + def name(self): + return self.client.name + + @property + def allow_symlink(self): + return self.client.allow_symlink + + @staticmethod + def parse_uri_prefix(uri: Union[str, Path]) -> Optional[str]: + """Parse the prefix of a uri. + + Args: + uri (str | Path): Uri to be parsed that contains the file prefix. + + Examples: + >>> FileClient.parse_uri_prefix('s3://path/of/your/file') + 's3' + + Returns: + str | None: Return the prefix of uri if the uri contains '://' else + ``None``. + """ + assert is_filepath(uri) + uri = str(uri) + if "://" not in uri: + return None + else: + prefix, _ = uri.split("://") + # In the case of MSCBackend, the prefix may contain the cluster + # name like clusterName:s3 + if ":" in prefix: + _, prefix = prefix.split(":") + return prefix + + @classmethod + def infer_client( + cls, + file_client_args: Optional[dict] = None, + uri: Optional[Union[str, Path]] = None, + ) -> "FileClient": + """Infer a suitable file client based on the URI and arguments. + + Args: + file_client_args (dict, optional): Arguments to instantiate a + FileClient. Defaults to None. + uri (str | Path, optional): Uri to be parsed that contains the file + prefix. Defaults to None.
+ + Examples: + >>> uri = 's3://path/of/your/file' + >>> file_client = FileClient.infer_client(uri=uri) + >>> file_client_args = {'backend': 's3'} + >>> file_client = FileClient.infer_client(file_client_args) + + Returns: + FileClient: Instantiated FileClient object. + """ + assert file_client_args is not None or uri is not None + if file_client_args is None: + file_prefix = cls.parse_uri_prefix(uri) # type: ignore + return cls(prefix=file_prefix) + else: + return cls(**file_client_args) + + @classmethod + def _register_backend(cls, name, backend, force=False, prefixes=None): + if not isinstance(name, str): + raise TypeError(f"the backend name should be a string, but got {type(name)}") + if not inspect.isclass(backend): + raise TypeError(f"backend should be a class but got {type(backend)}") + if not issubclass(backend, BaseStorageBackend): + raise TypeError(f"backend {backend} is not a subclass of BaseStorageBackend") + if not force and name in cls._backends: + raise KeyError( + f'{name} is already registered as a storage backend, add "force=True" if you want to override it' + ) + + if name in cls._backends and force: + for arg_key, instance in list(cls._instances.items()): + if isinstance(instance.client, cls._backends[name]): + cls._instances.pop(arg_key) + cls._backends[name] = backend + + if prefixes is not None: + if isinstance(prefixes, str): + prefixes = [prefixes] + else: + assert isinstance(prefixes, (list, tuple)) + for prefix in prefixes: + if prefix not in cls._prefix_to_backends: + cls._prefix_to_backends[prefix] = backend + elif (prefix in cls._prefix_to_backends) and force: + overridden_backend = cls._prefix_to_backends[prefix] + for arg_key, instance in list(cls._instances.items()): + if isinstance(instance.client, overridden_backend): + cls._instances.pop(arg_key) + else: + raise KeyError( + f"{prefix} is already registered as a storage backend," + ' add "force=True" if you want to override it' + ) + + @classmethod + def register_backend(cls, name, 
backend=None, force=False, prefixes=None): + """Register a backend to FileClient. + + This method can be used as a normal class method or a decorator. + + .. code-block:: python + + class NewBackend(BaseStorageBackend): + + def get(self, filepath): + return filepath + + def get_text(self, filepath): + return filepath + + FileClient.register_backend('new', NewBackend) + + or + + .. code-block:: python + + @FileClient.register_backend('new') + class NewBackend(BaseStorageBackend): + + def get(self, filepath): + return filepath + + def get_text(self, filepath): + return filepath + + Args: + name (str): The name of the registered backend. + backend (class, optional): The backend class to be registered, + which must be a subclass of :class:`BaseStorageBackend`. + When this method is used as a decorator, backend is None. + Defaults to None. + force (bool, optional): Whether to override the backend if the name + has already been registered. Defaults to False. + prefixes (str or list[str] or tuple[str], optional): The prefixes + of the registered storage backend. Defaults to None. + `New in version 1.3.15.` + """ + if backend is not None: + cls._register_backend(name, backend, force=force, prefixes=prefixes) + return + + def _register(backend_cls): + cls._register_backend(name, backend_cls, force=force, prefixes=prefixes) + return backend_cls + + return _register + + def get(self, filepath: Union[str, Path]) -> Union[bytes, memoryview]: + """Read data from a given ``filepath`` with 'rb' mode. + + Note: + There are two types of return values for ``get``, one is ``bytes`` + and the other is ``memoryview``. The advantage of using memoryview + is that you can avoid copying, and if you want to convert it to + ``bytes``, you can use ``.tobytes()``. + + Args: + filepath (str or Path): Path to read data. + + Returns: + bytes | memoryview: Expected bytes object or a memory view of the + bytes object. 
+ """ + return self.client.get(filepath) + + def get_text(self, filepath: Union[str, Path], encoding="utf-8") -> str: + """Read data from a given ``filepath`` with 'r' mode. + + Args: + filepath (str or Path): Path to read data. + encoding (str): The encoding format used to open the ``filepath``. + Defaults to 'utf-8'. + + Returns: + str: Expected text reading from ``filepath``. + """ + return self.client.get_text(filepath, encoding) + + def put(self, obj: bytes, filepath: Union[str, Path]) -> None: + """Write data to a given ``filepath`` with 'wb' mode. + + Note: + ``put`` should create a directory if the directory of ``filepath`` + does not exist. + + Args: + obj (bytes): Data to be written. + filepath (str or Path): Path to write data. + """ + self.client.put(obj, filepath) + + def put_text(self, obj: str, filepath: Union[str, Path]) -> None: + """Write data to a given ``filepath`` with 'w' mode. + + Note: + ``put_text`` should create a directory if the directory of + ``filepath`` does not exist. + + Args: + obj (str): Data to be written. + filepath (str or Path): Path to write data. + encoding (str, optional): The encoding format used to open the + `filepath`. Defaults to 'utf-8'. + """ + self.client.put_text(obj, filepath) + + def remove(self, filepath: Union[str, Path]) -> None: + """Remove a file. + + Args: + filepath (str, Path): Path to be removed. + """ + self.client.remove(filepath) + + def exists(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path exists. + + Args: + filepath (str or Path): Path to be checked whether exists. + + Returns: + bool: Return ``True`` if ``filepath`` exists, ``False`` otherwise. + """ + return self.client.exists(filepath) + + def isdir(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path is a directory. + + Args: + filepath (str or Path): Path to be checked whether it is a + directory. + + Returns: + bool: Return ``True`` if ``filepath`` points to a directory, + ``False`` otherwise. 
+ """ + return self.client.isdir(filepath) + + def isfile(self, filepath: Union[str, Path]) -> bool: + """Check whether a file path is a file. + + Args: + filepath (str or Path): Path to be checked whether it is a file. + + Returns: + bool: Return ``True`` if ``filepath`` points to a file, ``False`` + otherwise. + """ + return self.client.isfile(filepath) + + def join_path(self, filepath: Union[str, Path], *filepaths: Union[str, Path]) -> str: + r"""Concatenate all file paths. + + Join one or more filepath components intelligently. The return value + is the concatenation of filepath and any members of \*filepaths. + + Args: + filepath (str or Path): Path to be concatenated. + + Returns: + str: The result of concatenation. + """ + return self.client.join_path(filepath, *filepaths) + + @contextmanager + def get_local_path(self, filepath: Union[str, Path]) -> Generator[Union[str, Path], None, None]: + """Download data from ``filepath`` and write the data to local path. + + ``get_local_path`` is decorated by :meth:`contextlib.contextmanager`. It + can be called with a ``with`` statement, and when exiting the + ``with`` statement, the temporary path will be released. + + Note: + If the ``filepath`` is a local path, just return itself. + + .. warning:: + ``get_local_path`` is an experimental interface that may change in + the future. + + Args: + filepath (str or Path): Path to read data from. + + Examples: + >>> file_client = FileClient(prefix='s3') + >>> with file_client.get_local_path('s3://bucket/abc.jpg') as path: + ... # do something here + + Yields: + Iterable[str]: Only yield one path.
+ """ + with self.client.get_local_path(str(filepath)) as local_path: + yield local_path + + def list_dir_or_file( # pylint: disable=too-many-arguments + self, + dir_path: Union[str, Path], + list_dir: bool = True, + list_file: bool = True, + suffix: Optional[Union[str, Tuple[str]]] = None, + recursive: bool = False, + ) -> Iterator[str]: + """Scan a directory to find the interested directories or files in + arbitrary order. + + Note: + :meth:`list_dir_or_file` returns the path relative to ``dir_path``. + + Args: + dir_path (str | Path): Path of the directory. + list_dir (bool): List the directories. Defaults to True. + list_file (bool): List the path of files. Defaults to True. + suffix (str or tuple[str], optional): File suffix + that we are interested in. Defaults to None. + recursive (bool): If set to True, recursively scan the + directory. Defaults to False. + + Yields: + Iterable[str]: A relative path to ``dir_path``. + """ + yield from self.client.list_dir_or_file(dir_path, list_dir, list_file, suffix, recursive) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/__init__.py new file mode 100644 index 00000000..6c135233 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/__init__.py @@ -0,0 +1,29 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.json_handler import JsonHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.pickle_handler import PickleHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.registry_utils import file_handlers, register_handler +from cosmos3._src.imaginaire.utils.easy_io.handlers.yaml_handler import YamlHandler + +__all__ = [ + "BaseFileHandler", + "JsonHandler", + "PickleHandler", + "YamlHandler", + "register_handler", + "file_handlers", +] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/base.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/base.py new file mode 100644 index 00000000..5e5dcbca --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/base.py @@ -0,0 +1,44 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from abc import ABCMeta, abstractmethod + + +class BaseFileHandler(metaclass=ABCMeta): + # `str_like` is a flag to indicate whether the type of file object is + # str-like object or bytes-like object. Pickle only processes bytes-like + # objects but json only processes str-like object. 
If it is str-like + # object, `StringIO` will be used to process the buffer. + str_like = True + + @abstractmethod + def load_from_fileobj(self, file, **kwargs): + pass + + @abstractmethod + def dump_to_fileobj(self, obj, file, **kwargs): + pass + + @abstractmethod + def dump_to_str(self, obj, **kwargs): + pass + + def load_from_path(self, filepath, mode="r", **kwargs): + with open(filepath, mode) as f: + return self.load_from_fileobj(f, **kwargs) + + def dump_to_path(self, obj, filepath, mode="w", **kwargs): + with open(filepath, mode) as f: + self.dump_to_fileobj(obj, f, **kwargs) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/byte_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/byte_handler.py new file mode 100644 index 00000000..269a504d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/byte_handler.py @@ -0,0 +1,39 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import IO + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class ByteHandler(BaseFileHandler): + str_like = False + + def load_from_fileobj(self, file: IO[bytes], **kwargs): + file.seek(0) + # read all bytes and return them + return file.read() + + def dump_to_fileobj( + self, + obj: bytes, + file: IO[bytes], + **kwargs, + ): + # write all bytes to file + file.write(obj) + + def dump_to_str(self, obj, **kwargs): + raise NotImplementedError diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/csv_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/csv_handler.py new file mode 100644 index 00000000..307a69df --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/csv_handler.py @@ -0,0 +1,42 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License.
+ +import csv +from io import StringIO + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class CsvHandler(BaseFileHandler): + def load_from_fileobj(self, file, **kwargs): + del kwargs + reader = csv.reader(file) + return list(reader) + + def dump_to_fileobj(self, obj, file, **kwargs): + del kwargs + writer = csv.writer(file) + if not all(isinstance(row, list) for row in obj): + raise ValueError("Each row must be a list") + writer.writerows(obj) + + def dump_to_str(self, obj, **kwargs): + del kwargs + output = StringIO() + writer = csv.writer(output) + if not all(isinstance(row, list) for row in obj): + raise ValueError("Each row must be a list") + writer.writerows(obj) + return output.getvalue() diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/gzip_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/gzip_handler.py new file mode 100644 index 00000000..880877b3 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/gzip_handler.py @@ -0,0 +1,33 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import gzip +import pickle +from io import BytesIO +from typing import Any + +from cosmos3._src.imaginaire.utils.easy_io.handlers.pickle_handler import PickleHandler + + +class GzipHandler(PickleHandler): + str_like = False + + def load_from_fileobj(self, file: BytesIO, **kwargs): + with gzip.GzipFile(fileobj=file, mode="rb") as f: + return pickle.load(f) + + def dump_to_fileobj(self, obj: Any, file: BytesIO, **kwargs): + with gzip.GzipFile(fileobj=file, mode="wb") as f: + pickle.dump(obj, f) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/imageio_video_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/imageio_video_handler.py new file mode 100644 index 00000000..2f402d3e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/imageio_video_handler.py @@ -0,0 +1,168 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import IO, Any, Dict, Tuple + +import imageio +import imageio.v3 as iio_v3 +import numpy as np +import torch + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class ImageioVideoHandler(BaseFileHandler): + str_like = False + + def load_from_fileobj( + self, file: IO[bytes], format: str = "mp4", mode: str = "rgb", **kwargs + ) -> Tuple[np.ndarray, Dict[str, Any]]: + """ + Load video from a file-like object using imageio.v3 with specified format and color mode. + + Parameters: + file (IO[bytes]): A file-like object containing video data. + format (str): Format of the video file (default 'mp4'). + mode (str): Color mode of the video, 'rgb' or 'gray' (default 'rgb'). + + Returns: + tuple: A tuple containing an array of video frames and metadata about the video. + """ + file.seek(0) + + # The plugin argument in v3 replaces the format argument in v2 + plugin = kwargs.pop("plugin", "pyav") + + # Load all frames at once using v3 API + video_frames = iio_v3.imread(file, plugin=plugin, **kwargs) + + # Handle grayscale conversion if needed + if mode == "gray": + import cv2 + + if len(video_frames.shape) == 4: # (frames, height, width, channels) + gray_frames = [] + for frame in video_frames: + gray_frame = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY) + gray_frame = np.expand_dims(gray_frame, axis=2) # Keep dimensions consistent + gray_frames.append(gray_frame) + video_frames = np.array(gray_frames) + + # Extract metadata + # Note: iio_v3.imread doesn't return metadata directly like v2 did + # We need to extract it separately + file.seek(0) + metadata = self._extract_metadata(file, plugin=plugin) + + return video_frames, metadata + + def _extract_metadata(self, file: IO[bytes], plugin: str = "pyav") -> Dict[str, Any]: + """ + Extract metadata from a video file. + + Parameters: + file (IO[bytes]): File-like object containing video data. + plugin (str): Plugin to use for reading. 
+ + Returns: + dict: Video metadata. + """ + try: + # Create a generator to read frames and metadata + metadata = iio_v3.immeta(file, plugin=plugin) + + # Add some standard fields similar to v2 metadata format + if "fps" not in metadata and "duration" in metadata: + # Read the first frame to get shape information + file.seek(0) + first_frame = iio_v3.imread(file, plugin=plugin, index=0) + metadata["size"] = first_frame.shape[1::-1] # (width, height) + metadata["source_size"] = metadata["size"] + + # Create a consistent metadata structure with v2 + metadata["plugin"] = plugin + if "codec" not in metadata: + metadata["codec"] = "unknown" + if "pix_fmt" not in metadata: + metadata["pix_fmt"] = "unknown" + + # Calculate nframes if possible + if "fps" in metadata and "duration" in metadata: + metadata["nframes"] = int(metadata["fps"] * metadata["duration"]) + else: + metadata["nframes"] = float("inf") + + return metadata + + except Exception as e: + # Fallback to basic metadata + return { + "plugin": plugin, + "nframes": float("inf"), + "codec": "unknown", + "fps": 30.0, # Default values + "duration": 0, + "size": (0, 0), + } + + def dump_to_fileobj( + self, + obj: np.ndarray | torch.Tensor, + file: IO[bytes], + format: str = "mp4", # pylint: disable=redefined-builtin + fps: int = 17, + quality: int = 5, + ffmpeg_params=None, + **kwargs, + ): + """ + Save an array of video frames to a file-like object using imageio. + + Parameters: + obj (Union[np.ndarray, torch.Tensor]): An array of frames to be saved as video. + file (IO[bytes]): A file-like object to which the video data will be written. + format (str): Format of the video file (default 'mp4'). + fps (int): Frames per second of the output video (default 17). + quality (int): Quality of the video (0-10, default 5). + ffmpeg_params (list): Additional parameters to pass to ffmpeg. 
+ + """ + if isinstance(obj, torch.Tensor): + assert obj.dtype == torch.uint8, "Tensor must be of type uint8" + obj = obj.cpu().numpy() + h, w = obj.shape[1:-1] + + # Default ffmpeg params that ensure width and height are set + default_ffmpeg_params = ["-s", f"{w}x{h}"] + + # Use provided ffmpeg_params if any, otherwise use defaults + final_ffmpeg_params = ffmpeg_params if ffmpeg_params is not None else default_ffmpeg_params + + mimsave_kwargs = { + "fps": fps, + "quality": quality, + "macro_block_size": 1, + "ffmpeg_params": final_ffmpeg_params, + "output_params": ["-f", "mp4"], + } + # Update with any other kwargs + mimsave_kwargs.update(kwargs) + log.debug(f"mimsave_kwargs: {mimsave_kwargs}") + + imageio.mimsave(file, obj, format, **mimsave_kwargs) + + def dump_to_str(self, obj, **kwargs): + raise NotImplementedError diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/json_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/json_handler.py new file mode 100644 index 00000000..360cf2e1 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/json_handler.py @@ -0,0 +1,49 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import json + +import numpy as np + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +def set_default(obj): + """Set default json values for non-serializable values. + + It helps convert ``set``, ``range`` and ``np.ndarray`` data types to list. + It also converts ``np.generic`` (including ``np.int32``, ``np.float32``, + etc.) into plain numbers of plain python built-in types. + """ + if isinstance(obj, (set, range)): + return list(obj) + elif isinstance(obj, np.ndarray): + return obj.tolist() + elif isinstance(obj, np.generic): + return obj.item() + raise TypeError(f"{type(obj)} is unsupported for json dump") + + +class JsonHandler(BaseFileHandler): + def load_from_fileobj(self, file): + return json.load(file) + + def dump_to_fileobj(self, obj, file, **kwargs): + kwargs.setdefault("default", set_default) + json.dump(obj, file, **kwargs) + + def dump_to_str(self, obj, **kwargs): + kwargs.setdefault("default", set_default) + return json.dumps(obj, **kwargs) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/jsonl_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/jsonl_handler.py new file mode 100644 index 00000000..cac37184 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/jsonl_handler.py @@ -0,0 +1,80 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +import json +from typing import IO + +import numpy as np + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +def set_default(obj): + """Set default json values for non-serializable values. + + It helps convert ``set``, ``range`` and ``np.ndarray`` data types to list. + It also converts ``np.generic`` (including ``np.int32``, ``np.float32``, + etc.) into plain numbers of plain python built-in types. + """ + if isinstance(obj, (set, range)): + return list(obj) + elif isinstance(obj, np.ndarray): + return obj.tolist() + elif isinstance(obj, np.generic): + return obj.item() + raise TypeError(f"{type(obj)} is unsupported for json dump") + + +class JsonlHandler(BaseFileHandler): + """Handler for JSON lines (JSONL) files.""" + + def load_from_fileobj(self, file: IO[bytes]): + """Load JSON objects from a newline-delimited JSON (JSONL) file object. + + Returns: + A list of Python objects loaded from each JSON line. + """ + data = [] + for line in file: + line = line.strip() + if not line: + continue # skip empty lines if any + data.append(json.loads(line)) + return data + + def dump_to_fileobj(self, obj, file: IO[str], **kwargs): + """Dump a list of objects to a newline-delimited JSON (JSONL) file object. + + Args: + obj: A list (or iterable) of objects to dump line by line.
+ """ + kwargs.setdefault("default", set_default) + for item in obj: + file.write(json.dumps(item, **kwargs) + "\n") + + def dump_to_str(self, obj, **kwargs): + """Dump a list of objects to a newline-delimited JSON (JSONL) string.""" + kwargs.setdefault("default", set_default) + lines = [json.dumps(item, **kwargs) for item in obj] + return "\n".join(lines) + + +if __name__ == "__main__": + from cosmos3._src.imaginaire.utils.easy_io import easy_io + + easy_io.dump([1, 2, 3], "test.jsonl", file_format="jsonl") + print(easy_io.load("test.jsonl")) + easy_io.dump([{"key1": 1, "key2": 2}, {"key1": 3, "key2": 4}], "test.jsonl", file_format="jsonl") + print(easy_io.load("test.jsonl")) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/np_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/np_handler.py new file mode 100644 index 00000000..3db6308e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/np_handler.py @@ -0,0 +1,89 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from io import BytesIO +from typing import IO, Any + +import numpy as np + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class NumpyHandler(BaseFileHandler): + str_like = False + + def load_from_fileobj(self, file: IO[bytes], **kwargs) -> Any: + """ + Load a NumPy array from a file-like object. + + Parameters: + file (IO[bytes]): The file-like object containing the NumPy array data. + **kwargs: Additional keyword arguments passed to `np.load`. + + Returns: + numpy.ndarray: The loaded NumPy array. + """ + return np.load(file, **kwargs) + + def load_from_path(self, filepath: str, **kwargs) -> Any: + """ + Load a NumPy array from a file path. + + Parameters: + filepath (str): The path to the file to load. + **kwargs: Additional keyword arguments passed to `np.load`. + + Returns: + numpy.ndarray: The loaded NumPy array. + """ + return super().load_from_path(filepath, mode="rb", **kwargs) + + def dump_to_str(self, obj: np.ndarray, **kwargs) -> bytes: + """ + Serialize a NumPy array to bytes in the binary ``.npy`` format. + + Parameters: + obj (np.ndarray): The NumPy array to serialize. + **kwargs: Additional keyword arguments passed to `np.save`. + + Returns: + bytes: The serialized NumPy array as a bytes object. + """ + with BytesIO() as f: + np.save(f, obj, **kwargs) + return f.getvalue() + + def dump_to_fileobj(self, obj: np.ndarray, file: IO[bytes], **kwargs): + """ + Dump a NumPy array to a file-like object. + + Parameters: + obj (np.ndarray): The NumPy array to dump. + file (IO[bytes]): The file-like object to which the array is dumped. + **kwargs: Additional keyword arguments passed to `np.save`. + """ + np.save(file, obj, **kwargs) + + def dump_to_path(self, obj: np.ndarray, filepath: str, **kwargs): + """ + Dump a NumPy array to a file path. + + Parameters: + obj (np.ndarray): The NumPy array to dump. + filepath (str): The file path where the array should be saved. + **kwargs: Additional keyword arguments passed to `np.save`.
+ """ + with open(filepath, "wb") as f: + np.save(f, obj, **kwargs) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/pandas_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/pandas_handler.py new file mode 100644 index 00000000..8d0b1395 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/pandas_handler.py @@ -0,0 +1,31 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import pandas as pd + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler # isort:skip + + +class PandasHandler(BaseFileHandler): + str_like = False + + def load_from_fileobj(self, file, **kwargs): + return pd.read_csv(file, **kwargs) + + def dump_to_fileobj(self, obj, file, **kwargs): + obj.to_csv(file, **kwargs) + + def dump_to_str(self, obj, **kwargs): + raise NotImplementedError("PandasHandler does not support dumping to str") diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/pickle_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/pickle_handler.py new file mode 100644 index 00000000..6f49c0e2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/pickle_handler.py @@ -0,0 +1,42 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. 
All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import pickle +from io import BytesIO +from typing import Any + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class PickleHandler(BaseFileHandler): + str_like = False + + def load_from_fileobj(self, file: BytesIO, **kwargs): + return pickle.load(file, **kwargs) + + def load_from_path(self, filepath, **kwargs): + return super().load_from_path(filepath, mode="rb", **kwargs) + + def dump_to_str(self, obj, **kwargs): + kwargs.setdefault("protocol", 2) + return pickle.dumps(obj, **kwargs) + + def dump_to_fileobj(self, obj: Any, file: BytesIO, **kwargs): + kwargs.setdefault("protocol", 2) + pickle.dump(obj, file, **kwargs) + + def dump_to_path(self, obj, filepath, **kwargs): + with open(filepath, "wb") as f: + pickle.dump(obj, f, **kwargs) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/pil_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/pil_handler.py new file mode 100644 index 00000000..54a7f753 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/pil_handler.py @@ -0,0 +1,96 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import IO, Optional, Tuple, Union + +import numpy as np + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + +try: + from PIL import Image +except ImportError: + Image = None + + +class PILHandler(BaseFileHandler): + format: str + str_like = False + + def load_from_fileobj( + self, + file: IO[bytes], + fmt: str = "pil", + size: Optional[Union[int, Tuple[int, int]]] = None, + **kwargs, + ): + """ + Load an image from a file-like object and return it in a specified format. + + Args: + file (IO[bytes]): A file-like object containing the image data. + fmt (str): The format to convert the image into. Options are \ + 'numpy', 'np', 'npy', 'type' (all return numpy arrays), \ + 'pil' (returns PIL Image), 'th', 'torch' (returns a torch tensor). + size (Optional[Union[int, Tuple[int, int]]]): The new size of the image as a single integer \ + or a tuple of (width, height). If specified, the image is resized accordingly. + **kwargs: Additional keyword arguments that can be passed to conversion functions. + + Returns: + Image data in the format specified by `fmt`. + + Raises: + IOError: If the image cannot be loaded or processed. + ValueError: If the specified format is unsupported. 
+ """ + try: + img = Image.open(file) + img.load() # Explicitly load the image data + if size is not None: + if isinstance(size, int): + size = ( + size, + size, + ) # create a tuple if only one integer is provided + img = img.resize(size, Image.LANCZOS) # Image.ANTIALIAS was removed in Pillow 10 + + # Return the image in the requested format + if fmt in ["numpy", "np", "npy"]: + return np.array(img, **kwargs) + if fmt == "pil": + return img + if fmt in ["th", "torch"]: + import torch + + # Convert to tensor + img_tensor = torch.from_numpy(np.array(img, **kwargs)) + # Convert image from HxWxC to CxHxW + if img_tensor.ndim == 3: + img_tensor = img_tensor.permute(2, 0, 1) + return img_tensor + raise ValueError( + "Unsupported format. Supported formats are 'numpy', 'np', 'npy', 'pil', 'th', and 'torch'." + ) + except Exception as e: + raise IOError(f"Unable to load image: {e}") from e + + def dump_to_fileobj(self, obj, file: IO[bytes], **kwargs): + if "format" not in kwargs: + kwargs["format"] = self.format + kwargs["format"] = "JPEG" if kwargs["format"].lower() == "jpg" else kwargs["format"].upper() + obj.save(file, **kwargs) + + def dump_to_str(self, obj, **kwargs): + raise NotImplementedError diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/registry_utils.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/registry_utils.py new file mode 100644 index 00000000..3ef04823 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/registry_utils.py @@ -0,0 +1,93 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from cosmos3._src.imaginaire.flags import TRAINING +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.byte_handler import ByteHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.csv_handler import CsvHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.gzip_handler import GzipHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.imageio_video_handler import ImageioVideoHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.json_handler import JsonHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.jsonl_handler import JsonlHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.np_handler import NumpyHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.pickle_handler import PickleHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.pil_handler import PILHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.tarfile_handler import TarHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.torch_handler import TorchHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.torchjit_handler import TorchJitHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.txt_handler import TxtHandler +from cosmos3._src.imaginaire.utils.easy_io.handlers.yaml_handler import YamlHandler + +file_handlers = { + "json": JsonHandler(), + "yaml": YamlHandler(), + "yml": YamlHandler(), + "pickle": PickleHandler(), + "pkl": PickleHandler(), + "tar": TarHandler(), + "jit": TorchJitHandler(), + "npy": NumpyHandler(), + 
"txt": TxtHandler(), + "csv": CsvHandler(), + "gz": GzipHandler(), + "jsonl": JsonlHandler(), + "byte": ByteHandler(), +} + +if TRAINING: + from cosmos3._src.imaginaire.utils.easy_io.handlers.pandas_handler import PandasHandler + + file_handlers["pandas"] = PandasHandler() + +for torch_type in ["pt", "pth", "ckpt"]: + file_handlers[torch_type] = TorchHandler() +for img_type in ["jpg", "jpeg", "png", "bmp", "gif"]: + file_handlers[img_type] = PILHandler() + file_handlers[img_type].format = img_type +try: + from cosmos3._src.imaginaire.utils.easy_io.handlers.trimesh_handler import TrimeshHandler + + for mesh_type in ["ply", "stl", "obj", "glb"]: + file_handlers[mesh_type] = TrimeshHandler() + file_handlers[mesh_type].format = mesh_type +except ImportError: + pass +for video_type in ["mp4", "avi", "mov", "webm", "flv", "wmv"]: + file_handlers[video_type] = ImageioVideoHandler() + + +def _register_handler(handler, file_formats): + """Register a handler for some file extensions. + + Args: + handler (:obj:`BaseFileHandler`): Handler to be registered. + file_formats (str or list[str]): File formats to be handled by this + handler. 
+ """ + if not isinstance(handler, BaseFileHandler): + raise TypeError(f"handler must be a child of BaseFileHandler, not {type(handler)}") + if isinstance(file_formats, str): + file_formats = [file_formats] + if not all([isinstance(item, str) for item in file_formats]): + raise TypeError("file_formats must be a str or a list of str") + for ext in file_formats: + file_handlers[ext] = handler + + +def register_handler(file_formats, **kwargs): + def wrap(cls): + _register_handler(cls(**kwargs), file_formats) + return cls + + return wrap diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/tarfile_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/tarfile_handler.py new file mode 100644 index 00000000..973146ad --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/tarfile_handler.py @@ -0,0 +1,39 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import tarfile + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class TarHandler(BaseFileHandler): + str_like = False + + def load_from_fileobj(self, file, mode="r|*", **kwargs): + return tarfile.open(fileobj=file, mode=mode, **kwargs) + + def load_from_path(self, filepath, mode="r|*", **kwargs): + return tarfile.open(filepath, mode=mode, **kwargs) + + def dump_to_fileobj(self, obj, file, mode="w", **kwargs): + with tarfile.open(fileobj=file, mode=mode) as tar: + tar.add(obj, **kwargs) + + def dump_to_path(self, obj, filepath, mode="w", **kwargs): + with tarfile.open(filepath, mode=mode) as tar: + tar.add(obj, **kwargs) + + def dump_to_str(self, obj, **kwargs): + raise NotImplementedError diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/torch_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/torch_handler.py new file mode 100644 index 00000000..8cd77e53 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/torch_handler.py @@ -0,0 +1,34 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +try: + import torch +except ImportError: + torch = None + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class TorchHandler(BaseFileHandler): + str_like = False + + def load_from_fileobj(self, file, **kwargs): + return torch.load(file, **kwargs) + + def dump_to_fileobj(self, obj, file, **kwargs): + torch.save(obj, file, **kwargs) + + def dump_to_str(self, obj, **kwargs): + raise NotImplementedError diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/torchjit_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/torchjit_handler.py new file mode 100644 index 00000000..e5eefe4f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/torchjit_handler.py @@ -0,0 +1,34 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +try: + import torch +except ImportError: + torch = None + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class TorchJitHandler(BaseFileHandler): + str_like = False + + def load_from_fileobj(self, file, **kwargs): + return torch.jit.load(file, **kwargs) + + def dump_to_fileobj(self, obj, file, **kwargs): + torch.jit.save(obj, file, **kwargs) + + def dump_to_str(self, obj, **kwargs): + raise NotImplementedError diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/trimesh_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/trimesh_handler.py new file mode 100644 index 00000000..0af255a6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/trimesh_handler.py @@ -0,0 +1,36 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import IO + +import trimesh + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class TrimeshHandler(BaseFileHandler): + format: str + str_like = False + + def load_from_fileobj(self, file: IO[bytes], **kwargs) -> trimesh.Trimesh: + file = trimesh.load(file_obj=file, file_type=self.format) + return file + + def dump_to_fileobj(self, obj, file: IO[bytes], **kwargs): + obj.export(file_obj=file, file_type=self.format) + return file + + def dump_to_str(self, obj, **kwargs): + raise NotImplementedError diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/txt_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/txt_handler.py new file mode 100644 index 00000000..c8d6d740 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/txt_handler.py @@ -0,0 +1,34 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler + + +class TxtHandler(BaseFileHandler): + def load_from_fileobj(self, file, **kwargs): + del kwargs + return file.read() + + def dump_to_fileobj(self, obj, file, **kwargs): + del kwargs + if not isinstance(obj, str): + obj = str(obj) + file.write(obj) + + def dump_to_str(self, obj, **kwargs): + del kwargs + if not isinstance(obj, str): + obj = str(obj) + return obj diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/yaml_handler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/yaml_handler.py new file mode 100644 index 00000000..07d6a8fd --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/easy_io/handlers/yaml_handler.py @@ -0,0 +1,38 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import yaml + +try: + from yaml import CDumper as Dumper # type: ignore + from yaml import CLoader as Loader # type: ignore +except ImportError: + from yaml import Dumper, Loader # type: ignore + +from cosmos3._src.imaginaire.utils.easy_io.handlers.base import BaseFileHandler # isort:skip + + +class YamlHandler(BaseFileHandler): + def load_from_fileobj(self, file, **kwargs): + kwargs.setdefault("Loader", Loader) + return yaml.load(file, **kwargs) + + def dump_to_fileobj(self, obj, file, **kwargs): + kwargs.setdefault("Dumper", Dumper) + yaml.dump(obj, file, **kwargs) + + def dump_to_str(self, obj, **kwargs): + kwargs.setdefault("Dumper", Dumper) + return yaml.dump(obj, **kwargs) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/ema.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/ema.py new file mode 100644 index 00000000..67810acb --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/ema.py @@ -0,0 +1,366 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +from contextlib import contextmanager, nullcontext +from typing import TYPE_CHECKING, Any, Generator, List, Optional, Union + +import numpy as np +import torch + +try: + from megatron.core import parallel_state + + USE_MEGATRON = True +except ImportError: + USE_MEGATRON = False + +from cosmos3._src.imaginaire.utils import distributed, log + +if TYPE_CHECKING: + from cosmos3._src.imaginaire.model import ImaginaireModel + + +class FastEmaModelUpdater: + """ + This class is used to update the target model (EMA) given the source model (regular model) and beta. + The method interface mimics :class:`EMAModelTracker` and :class:`PowerEMATracker`. + Unlike those two classes, this class does not maintain the EMA model weights as buffers. It expects the user to have two modules with the same architecture and weight shapes. + The class is intended to work with FSDP models, where the above two classes do not work as expected. Besides, it is strange to claim model weights as buffers and do unnecessary name changing in :class:`EMAModelTracker` and :class:`PowerEMATracker`. Moving forward, we should use this class instead of the above two classes. + """ + + def __init__(self): + # Flag to indicate whether the cache is taken or not. Useful to avoid cache overwrite + self.is_cached = False + + def update_average(self, src_model: torch.nn.Module, tgt_model: torch.nn.Module, beta: float = 0.9999) -> None: + target_list = [] + source_list = [] + for tgt_params, src_params in zip(tgt_model.parameters(), src_model.parameters()): + assert tgt_params.dtype == torch.float32, ( + f"EMA model only works in FP32 dtype, got {tgt_params.dtype} instead."
+ ) + target_list.append(tgt_params) + source_list.append(src_params.data) + torch._foreach_mul_(target_list, beta) + torch._foreach_add_(target_list, source_list, alpha=1.0 - beta) + + def copy_to(self, src_model: torch.nn.Module, tgt_model: torch.nn.Module) -> None: + for tgt_params, src_params in zip(tgt_model.parameters(), src_model.parameters()): + tgt_params.data.copy_(src_params.data) + + def cache(self, parameters: Any, is_cpu: bool = False) -> None: + """Save the current parameters for restoring later. + + Args: + parameters (iterable): Iterable of torch.nn.Parameter to be temporarily stored. + """ + assert self.is_cached is False, "EMA cache is already taken. Did you forget to restore it?" + device = "cpu" if is_cpu else "cuda" + self.collected_params = [param.clone().to(device) for param in parameters] + self.is_cached = True + + def restore(self, parameters: Any) -> None: + """Restore the parameters in self.collected_params. + + Useful to validate the model with EMA parameters without affecting the + original optimization process. Store the parameters before copy_to(). + After validation (or model saving), use this to restore the former parameters. + + Args: + parameters (iterable): Iterable of torch.nn.Parameter to be updated with the stored parameters. + """ + assert self.is_cached, "EMA cache is not taken yet." 
+ for c_param, param in zip(self.collected_params, parameters, strict=False): + param.data.copy_(c_param.data.type_as(param.data)) + self.collected_params = [] + # Release the cache after we call restore + self.is_cached = False + + +def get_buffer_name(param_name: str, torch_compile_buffer_renaming: bool = False) -> str: + """ + This function creates buffer name used by EMA from parameter's name + + Args: + param_name (str): Model's parameter name + Returns: + buffer_name (str): buffer name to be used for given parameter name + """ + + buffer_name = param_name.replace(".", "-") + + if torch_compile_buffer_renaming: + # torch.compile() adds _orig_mod to state dict names, this way we get original name + buffer_name = buffer_name.replace("_orig_mod-", "") + + return buffer_name + + +class EMAModelTracker(torch.nn.Module): + """This is a class to track the EMA model weights. + + The EMA weights are registered as buffers, which are extractable as state dicts. The names follow those of the + regular weights, except all "." are replaced with "-" (limitation of register_buffer()). This is similar to SDXL's + implementation of EMA. There are no optimizable parameters. + TODO(snah): multi-EMA weights. + + Attributes: + collected_params (list): temporarily stores the regular weights while in EMA mode. + beta (float): EMA decay rate. (default: 0.9999). + torch_compile_buffer_renaming (bool): whether to remove '_orig_mod-' from buffer names when torch.compile is used + """ + + def __init__(self, model: ImaginaireModel, beta: float = 0.9999, torch_compile_buffer_renaming: bool = False): + """Constructor of the EMA model weight tracker. + + Args: + model (ImaginaireModel): The PyTorch model. + beta (float): EMA decay rate. (default: 0.9999). 
+ """ + super().__init__() + self.torch_compile_buffer_renaming: bool = torch_compile_buffer_renaming + if not 0.0 <= beta <= 1.0: + raise ValueError("Decay must be between 0 and 1") + self.beta = beta + for name, param in model.named_parameters(): + if param.requires_grad: + buffer_name = get_buffer_name(name, self.torch_compile_buffer_renaming) + self.register_buffer(buffer_name, param.clone().detach().data) + self.collected_params = [] + # Flag to indicate whether the cache is taken or not. Useful to avoid cache overwrite + self.is_cached = False + + @torch.no_grad() + def update_average(self, model: ImaginaireModel, iteration: Optional[int] = None) -> None: + del iteration + target_list = [] + source_list = [] + ema_buffers = self.state_dict() + for name, param in model.named_parameters(): + if param.requires_grad: + buffer_name = get_buffer_name(name, self.torch_compile_buffer_renaming) + buffer = ema_buffers[buffer_name] + assert buffer.dtype == torch.float32, f"EMA model only works in FP32 dtype, got {buffer.dtype} instead." + target_list.append(buffer) + source_list.append(param.data) + torch._foreach_mul_(target_list, self.beta) + torch._foreach_add_(target_list, source_list, alpha=1.0 - self.beta) + + def copy_to(self, model: ImaginaireModel) -> None: + ema_buffers = self.state_dict() + for name, param in model.named_parameters(): + if param.requires_grad: + buffer_name = get_buffer_name(name, self.torch_compile_buffer_renaming) + buffer = ema_buffers[buffer_name] + param.data.copy_(buffer.data) + + def cache(self, parameters: Any, is_cpu: bool = False) -> None: + """Save the current parameters for restoring later. + + Args: + parameters (iterable): Iterable of torch.nn.Parameter to be temporarily stored. + """ + assert self.is_cached is False, "EMA cache is already taken. Did you forget to restore it?" 
+ device = "cpu" if is_cpu else "cuda" + self.collected_params = [param.clone().to(device) for param in parameters] + self.is_cached = True + + def restore(self, parameters: Any) -> None: + """Restore the parameters in self.collected_params. + + Useful to validate the model with EMA parameters without affecting the + original optimization process. Store the parameters before copy_to(). + After validation (or model saving), use this to restore the former parameters. + + Args: + parameters (iterable): Iterable of torch.nn.Parameter to be updated with the stored parameters. + """ + assert self.is_cached, "EMA cache is not taken yet." + for c_param, param in zip(self.collected_params, parameters, strict=False): + param.data.copy_(c_param.data.type_as(param.data)) + self.collected_params = [] + # Release the cache after we call restore + self.is_cached = False + + @classmethod + def initialize_multi_rank_ema( + cls, model: torch.nn.Module, rate: Union[float, List[float]], num: int = 1, enabled: bool = True + ) -> Optional[EMAModelTracker]: + """ + Class method to initialize per rank EMA Model Tracker with different rate. + Each rank will have a different rate based on the given configuration, resulting in different EMA weights. + + Args: + model (torch.nn.Module): The neural network model to be tracked. + rate (Union[float, List[float]]): The decay rate(s) for the EMA. If a list is provided, + it corresponds to rates for different ranks. + num (int, optional): The number of leading ranks to consider for different rates. + Defaults to 1. + enabled (bool, optional): Flag to enable or disable the creation of the tracker. + If False, returns None. Defaults to True. + + Returns: + Optional[EMAModelTracker]: An instance of EMAModelTracker if enabled, otherwise None. 
+ + Example: + >>> model = torch.nn.Linear(10, 2) + >>> tracker = EMAModelTracker.initialize_multi_rank_ema(model, rate=[0.1, 0.2], num=2) + >>> print(tracker) + + Notes: + If `rate` is a list and the current rank is less than `num`, the rate for the current rank + is used. If the current rank is greater than or equal to `num`, the first rate in the list is used by default. + """ + if not enabled: + return None + if USE_MEGATRON and parallel_state.is_initialized(): + cur_dp_rank = parallel_state.get_data_parallel_rank(with_context_parallel=True) + log.critical(f"using MCore parallel_state for EMA initialization. DP RANK: {cur_dp_rank}", rank0_only=False) + log.warning("It should not be used together with FSDP!") + else: + cur_dp_rank = distributed.get_rank() + log.critical(f"using torch.distributed for EMA initialization. DP RANK: {cur_dp_rank}", rank0_only=False) + rate = rate if isinstance(rate, list) else [rate] + num = min(num, len(rate)) + rate = rate[cur_dp_rank] if cur_dp_rank < num else rate[0] + if cur_dp_rank < num: + print(f"EMAModelTracker: rank {cur_dp_rank}, rate {rate}") + return cls(model, rate) + + +class PowerEMATracker(EMAModelTracker): + def __init__(self, model: ImaginaireModel, s: float = 0.1, torch_compile_buffer_renaming: bool = False): + """Constructor of the EMA model weight tracker. + + Args: + model (ImaginaireModel): The PyTorch model. + s (float): EMA decay rate.
See EDM2 paper + torch_compile_buffer_renaming (bool): whether to remove '_orig_mod-' from buffer names when torch.compile is used + """ + super().__init__(model=model, beta=0.0, torch_compile_buffer_renaming=torch_compile_buffer_renaming) + self.exp = np.roots([1, 7, 16 - s**-2, 12 - s**-2]).real.max() + + @torch.no_grad() + def update_average(self, model: ImaginaireModel, iteration: Optional[int] = None) -> None: + if iteration == 0: + beta = 0.0 + else: + i = iteration + 1 + beta = (1 - 1 / i) ** (self.exp + 1) + self.beta = beta + + super().update_average(model, iteration) + + @classmethod + def initialize_multi_rank_ema( + cls, model: torch.nn.Module, rate: float, num: int, enabled: bool = True + ) -> Optional[PowerEMATracker]: + """ + Class method to initialize per rank EMA Model Tracker with different rate. + Each rank will have a different rate based on the given configuration, resulting in different EMA weights. + + Args: + model (torch.nn.Module): The neural network model for which the EMA tracker is being set up. + num (int): The number of ranks for which the rate adjustment is applied. Beyond this, the rate remains unchanged. + rate (float): The base decay rate for the EMA calculation. + enabled (bool, optional): Flag to enable or disable the initialization of the tracker. If False, returns None. + Defaults to True. + + Returns: + Optional[PowerEMATracker]: An instance of PowerEMATracker with adjusted rate if enabled, otherwise None. + + Raises: + None + + Example: + >>> model = torch.nn.Linear(10, 2) + >>> tracker = PowerEMATracker.initialize_multi_rank_ema(model, num=3, rate=0.99) + >>> print(tracker) + + Notes: + The decay rate is modified by dividing it by 2 raised to the power of the rank for each rank less than `num`. + If the rank is greater than or equal to `num`, the base rate is used without modification. 
 This approach + allows higher ranked processes to have a less aggressive decay, potentially reflecting their delayed synchronization + in a distributed training scenario. + """ + if not enabled: + return None + if USE_MEGATRON and parallel_state.is_initialized(): + cur_dp_rank = parallel_state.get_data_parallel_rank(with_context_parallel=True) + log.critical(f"using MCore parallel_state for EMA initialization. DP RANK: {cur_dp_rank}", rank0_only=False) + log.warning("It should not be used together with FSDP!") + else: + cur_dp_rank = distributed.get_rank() + log.critical(f"using torch.distributed for EMA initialization. DP RANK: {cur_dp_rank}", rank0_only=False) + + divider = 2**cur_dp_rank if cur_dp_rank < num else 1 + if cur_dp_rank < num: + print(f"PowerEMATracker: rank {cur_dp_rank}, rate {rate / divider}") + return cls(model, rate / divider) + + +@contextmanager +def ema_scope(model: ImaginaireModel, enabled: bool = False, context: str | None = None) -> Generator[None, None, None]: + """Context manager for switching between regular and EMA model weights. + + This function is a dispatcher that handles two main cases: + 1. If the model has its own `ema_scope` method, it will be used. + This allows models to define custom EMA logic (e.g., for FSDP). + 2. If not, it falls back to a generic mechanism that expects the model + to have a `.ema` attribute containing an EMA tracker object. + + Args: + model (ImaginaireModel): The PyTorch model. + enabled (bool): Whether switching to EMA weights is enabled (default: False). + context (str | None): A logging context string, passed to the model's ema_scope if used.
+ """ + + def scope_function(): + if enabled: + has_custom_scope = hasattr(model, "ema_scope") and callable(model.ema_scope) + has_generic_ema = hasattr(model, "ema") and isinstance( + model.ema, (FastEmaModelUpdater, EMAModelTracker, PowerEMATracker) + ) + assert has_custom_scope or has_generic_ema + + if has_custom_scope: + return model.ema_scope(context=context) + else: + return ema_scope_generic(model) + else: + return nullcontext() + + with scope_function(): + yield + + +@contextmanager +def ema_scope_generic(model: ImaginaireModel) -> Generator[None, None, None]: + """Generic context manager for switching between regular and EMA model weights. + + Args: + model (ImaginaireModel): The PyTorch model, which must have a `.ema` attribute. + """ + model.ema.cache(model.parameters()) + model.ema.copy_to(model) + + log.info("EMA: switched to EMA weights.") + try: + yield + finally: + model.ema.restore(model.parameters()) + log.info("EMA: restored regular weights.") diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/embedding_concat_strategy.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/embedding_concat_strategy.py new file mode 100644 index 00000000..612f45a9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/embedding_concat_strategy.py @@ -0,0 +1,25 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from enum import Enum + + +class EmbeddingConcatStrategy(str, Enum): + FULL_CONCAT = "full_concat" # Concatenate embeddings from all layers + MEAN_POOLING = "mean_pooling" # Average-pool embeddings across all layers + POOL_EVERY_N_LAYERS_AND_CONCAT = "pool_every_n_layers_and_concat" # Pool every n layers and concatenate + + def __str__(self) -> str: + return self.value diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/cred_env_parser.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/cred_env_parser.py new file mode 100644 index 00000000..e39041ac --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/cred_env_parser.py @@ -0,0 +1,83 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from cosmos3._src.imaginaire.utils.env_parsers.env_parser import EnvParser +from cosmos3._src.imaginaire.utils.validator import String + + +class CredentialEnvParser(EnvParser): + APP_ENV = String(default="") + PROD_FT_AWS_CREDS_ACCESS_KEY_ID = String(default="") + PROD_FT_AWS_CREDS_SECRET_ACCESS_KEY = String(default="") + PROD_FT_AWS_CREDS_ENDPOINT_URL = String(default="https://s3.us-west-2.amazonaws.com") + PROD_FT_AWS_CREDS_REGION_NAME = String(default="us-west-2") + + PROD_S3_CHECKPOINT_ACCESS_KEY_ID = String(default="") + PROD_S3_CHECKPOINT_SECRET_ACCESS_KEY = String(default="") + PROD_S3_CHECKPOINT_ENDPOINT_URL = String(default="") + PROD_S3_CHECKPOINT_REGION_NAME = String(default="") + + PROD_GCP_CHECKPOINT_ACCESS_KEY_ID = String(default="") + PROD_GCP_CHECKPOINT_SECRET_ACCESS_KEY = String(default="") + PROD_GCP_CHECKPOINT_ENDPOINT_URL = String(default="") + PROD_GCP_CHECKPOINT_REGION_NAME = String(default="") + + PROD_PDX_BENCHMARK_ACCESS_KEY_ID = String(default="") + PROD_PDX_BENCHMARK_SECRET_ACCESS_KEY = String(default="") + PROD_PDX_BENCHMARK_ENDPOINT_URL = String(default="") + PROD_PDX_BENCHMARK_REGION_NAME = String(default="") + + PROD_TEAM_DIR_ACCESS_KEY_ID = String(default="") + PROD_TEAM_DIR_SECRET_ACCESS_KEY = String(default="") + PROD_TEAM_DIR_ENDPOINT_URL = String(default="") + PROD_TEAM_DIR_REGION_NAME = String(default="") + + PICASSO_AUTH_MODEL_REGISTRY_API_KEY = String(default="") + PICASSO_API_ENDPOINT_URL = String(default="https://invalid") + + +CRED_ENVS = CredentialEnvParser() +CRED_ENVS_DICT = { + "PROD_FT_AWS_CREDS": { + "aws_access_key_id": CRED_ENVS.PROD_FT_AWS_CREDS_ACCESS_KEY_ID, + "aws_secret_access_key": CRED_ENVS.PROD_FT_AWS_CREDS_SECRET_ACCESS_KEY, + "endpoint_url": CRED_ENVS.PROD_FT_AWS_CREDS_ENDPOINT_URL, + "region_name": CRED_ENVS.PROD_FT_AWS_CREDS_REGION_NAME, + }, + "PROD_S3_CHECKPOINT": { + "aws_access_key_id": CRED_ENVS.PROD_S3_CHECKPOINT_ACCESS_KEY_ID, + "aws_secret_access_key": 
CRED_ENVS.PROD_S3_CHECKPOINT_SECRET_ACCESS_KEY, + "endpoint_url": CRED_ENVS.PROD_S3_CHECKPOINT_ENDPOINT_URL, + "region_name": CRED_ENVS.PROD_S3_CHECKPOINT_REGION_NAME, + }, + "PROD_GCP_CHECKPOINT": { + "aws_access_key_id": CRED_ENVS.PROD_GCP_CHECKPOINT_ACCESS_KEY_ID, + "aws_secret_access_key": CRED_ENVS.PROD_GCP_CHECKPOINT_SECRET_ACCESS_KEY, + "endpoint_url": CRED_ENVS.PROD_GCP_CHECKPOINT_ENDPOINT_URL, + "region_name": CRED_ENVS.PROD_GCP_CHECKPOINT_REGION_NAME, + }, + "PROD_PDX_BENCHMARK": { + "aws_access_key_id": CRED_ENVS.PROD_PDX_BENCHMARK_ACCESS_KEY_ID, + "aws_secret_access_key": CRED_ENVS.PROD_PDX_BENCHMARK_SECRET_ACCESS_KEY, + "endpoint_url": CRED_ENVS.PROD_PDX_BENCHMARK_ENDPOINT_URL, + "region_name": CRED_ENVS.PROD_PDX_BENCHMARK_REGION_NAME, + }, + "PROD_TEAM_DIR": { + "aws_access_key_id": CRED_ENVS.PROD_TEAM_DIR_ACCESS_KEY_ID, + "aws_secret_access_key": CRED_ENVS.PROD_TEAM_DIR_SECRET_ACCESS_KEY, + "endpoint_url": CRED_ENVS.PROD_TEAM_DIR_ENDPOINT_URL, + "region_name": CRED_ENVS.PROD_TEAM_DIR_REGION_NAME, + }, +} diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/customization_env_parser.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/customization_env_parser.py new file mode 100644 index 00000000..fd369dba --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/customization_env_parser.py @@ -0,0 +1,31 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +from cosmos3._src.imaginaire.utils.env_parsers.env_parser import EnvParser +from cosmos3._src.imaginaire.utils.validator import Bool, String + + +class CustomizationEnvParser(EnvParser): + FLEET_FUNCTION = Bool(default=False) + CUSTOMIZATION_TYPE = String(default="") + DEBUG_SKIP_CUSTOMIZATION_DOWNLOAD = Bool(default=False) + FT_AWS_ACCESS_KEY_ID = String(default="") + FT_AWS_SECRET_ACCESS_KEY = String(default="") + FT_AWS_REGION_NAME = String(default="") + FT_AWS_GATEWAY_URL = String(default="") + LAMBDA_STAGE = String(default="prod") + + +CUSTOMIZATION_ENVS = CustomizationEnvParser() diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/env_parser.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/env_parser.py new file mode 100644 index 00000000..36c277f6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/env_parser.py @@ -0,0 +1,127 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import base64 +import json +import os + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.validator import JsonDict, Validator + +""" +Base class for parsing environment variables using validators. 
+The class goes through its list of validators and retrieves values from the same-named environment variables. +Validators provide: +- default value +- typed parsing +- enforcement of mandatory values + +Additionally, the environment variables can be passed as a single base64-encoded string. + +We cannot enforce that a component isn't directly using the environment variables, +so evaluation of params should throw an error to make sure the actual env var is correct. +""" + + +class EnvParser: + def __init__(self, b64_str=None): + if b64_str: + log.critical(f"b64_str received: {b64_str}") + self.from_b64(b64_str) + else: + self.from_env() + + def from_env(self): + validators = self.get_val_dict() + for key in validators.keys(): + val = os.getenv(key.upper()) + # log.debug(f"getting env var {key.upper()}: {val}") + if val: + setattr(self, key, val) + self.check_mandatory_values() + + def from_json(self, file_name): + with open(file_name, "r") as f: + log.info(f"Reading env params from {file_name}") + dict = json.load(f) + for key, value in dict.items(): + setattr(self, key, value) + self.check_mandatory_values() + + def to_b64(self): + json_str = self.to_json() + # create bytes-like object for b64 encoder + json_str_bytes = json_str.encode() + b64_str = base64.b64encode(json_str_bytes).decode() + + print(b64_str) + return b64_str + + def from_b64(self, b64_str): + json_str = base64.b64decode(b64_str).decode() + dict = json.loads(json_str) + for key, value in dict.items(): + setattr(self, key, value) + self.check_mandatory_values() + + def check_mandatory_values(self): + for key, validator in self.get_val_dict().items(): + if getattr(self, key) is None and validator.default is None: + raise ValueError(f"Missing mandatory env var: {key}") + + @classmethod + def get_val_dict(cls): + log.debug(f"getting val dict of {cls.__name__}") + val_dict = {} + val_dict.update({key: value for key, value in cls.__dict__.items() if isinstance(value, Validator)}) + + return val_dict + + def 
dump_validators(self): + validators = self.get_val_dict() + for key, value in validators.items(): + log.debug(f"{key}: {value.__get__(self)}") + + def to_json(self, file_name=None): + dict = { + key.upper(): value.__get__(self) + for key, value in EnvParser.__dict__.items() + if isinstance(value, Validator) + } + json_str = json.dumps(dict, indent=4) + print(json_str) + + if file_name: + with open(file_name, "w") as f: + log.info(f"Writing env params to {file_name}") + f.write(json_str) + + return json_str + + def to_string_dict(self): + result = {} + for key, validator in self.get_val_dict().items(): + value = getattr(self, key) + if value is None: + value = validator.default + if isinstance(validator, JsonDict): + value = json.dumps(value) + else: + value = str(value) + result[key] = value + return result + + def __str__(self): + return ", ".join(f"{key}={value}" for key, value in self.__dict__.items()) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/inference_env_parser.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/inference_env_parser.py new file mode 100644 index 00000000..18000901 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/env_parsers/inference_env_parser.py @@ -0,0 +1,45 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from cosmos3._src.imaginaire.utils.env_parsers.env_parser import EnvParser +from cosmos3._src.imaginaire.utils.validator import Bool, Int, String + + +class InferenceEnvParser(EnvParser): + MODEL_MODULE = String(default=None) + MODEL_CLASS = String(default=None) + TORCH_HOME = String(default="/config/models/checkpoints") + TRT_ENABLED = Bool(default=False) + PORT = Int(default=8000) + CP_SIZE = Int(default=1) + TP_SIZE = Int(default=1) + FSDP_ENABLED = Bool(default=False) + CUSTOMIZATION_TYPE = String(default="") + NIM_DEPLOYMENT = Bool(default=False) + RUNAI_DEPLOYMENT = Bool(default=False) + BLUR_CUDA = Bool(default=False) + RESIZE_CUDA = Bool(default=False) + + +INFERENCE_ENVS = InferenceEnvParser() + +if __name__ == "__main__": + INFERENCE_ENVS.to_json("env_params.json") + + b64 = INFERENCE_ENVS.to_b64() + + env_params_restored = InferenceEnvParser(b64) + + print(env_params_restored) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/fsdp_helper.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/fsdp_helper.py new file mode 100644 index 00000000..47f47ab9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/fsdp_helper.py @@ -0,0 +1,159 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +from contextlib import contextmanager +from functools import partial + +import torch +from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import ( + CheckpointImpl, + apply_activation_checkpointing, + checkpoint_wrapper, +) +from torch.distributed.device_mesh import init_device_mesh +from torch.distributed.fsdp import FullyShardedDataParallel as FSDP +from torch.distributed.fsdp._runtime_utils import ( + _post_forward, + _post_forward_reshard, + _pre_forward, + _pre_forward_unshard, + _root_pre_forward, +) +from torch.distributed.utils import _p_assert + +from cosmos3._src.imaginaire.utils import distributed, log + + +def apply_fsdp_checkpointing(model, list_block_cls): + """Apply activation checkpointing to the model. + Returns None, as the model is updated in place. + """ + log.critical("--> applying FSDP activation checkpointing...") + non_reentrant_wrapper = partial( + checkpoint_wrapper, + # offload_to_cpu=False, + checkpoint_impl=CheckpointImpl.NO_REENTRANT, + ) + + def check_fn(submodule): + result = False + for block_cls in list_block_cls: + if isinstance(submodule, block_cls): + result = True + break + return result + + apply_activation_checkpointing(model, checkpoint_wrapper_fn=non_reentrant_wrapper, check_fn=check_fn) + + +@contextmanager +def possible_fsdp_scope( + model: torch.nn.Module, +): + enabled = isinstance(model, FSDP) + if enabled: + assert not torch.is_grad_enabled(), "FSDP context should be entered with grad disabled" + handle = model._handle + args, kwargs = [0], dict(dummy=0) + with torch.autograd.profiler.record_function("FullyShardedDataParallel.possible_fsdp_scope"): + args, kwargs = _root_pre_forward(model, model, args, kwargs) + unused = None + args, kwargs = _pre_forward( + model, + handle, + _pre_forward_unshard, + model._fsdp_wrapped_module, + args, + kwargs, + ) + if handle: + _p_assert( + handle.flat_param.device == model.compute_device, + "Expected `FlatParameter` to be on the compute 
device " + f"{model.compute_device} but got {handle.flat_param.device}", + ) + try: + yield None + finally: + if enabled: + output = {"output": 1} + _post_forward(model, handle, _post_forward_reshard, model, unused, output) + + +def hsdp_device_mesh(replica_group_size=None, sharding_group_size=None, device=None): + """ + Initializes a device mesh for use with Hybrid Sharding strategy in FSDP (HSDP) training. + + Explicit sizes for the replica and sharding groups can be passed to accommodate models + whose GPU fit is unknown, providing flexibility in distributed training setups. + + Args: + replica_group_size (int, optional): The size of each replica group. If None, defaults to + world_size // sharding_group_size. + sharding_group_size (int, optional): The size of each sharding group that the model can fit. If None, + defaults to min(world_size, 8); values above the world size are clamped to it. + device (str, optional): The device type passed to init_device_mesh (e.g., "cuda"). + If None, defaults to "cuda". + + Returns: + A device mesh object compatible with FSDP. + + Raises: + ValueError: If the world size is not evenly divisible by the sharding group size, or if + the number of replica groups is not evenly divisible by replica_group_size. + RuntimeError: If a valid device mesh cannot be created. + + Usage: + If your model fits on 4 GPUs, and you have 3 nodes of 8 GPUs, then: + sharding_group_size = 4 + replica_group_size = 24 total GPUs / 4 per sharding group = 6 replica groups + >>> device_mesh = hsdp_device_mesh(replica_group_size, sharding_group_size) + >>> sharded_model = FSDP(model, device_mesh=device_mesh, ...) 
+ """ + + # world_size = int(os.getenv("WORLD_SIZE", "1")) + world_size = distributed.get_world_size() + if sharding_group_size is None: + sharding_group_size = min(world_size, 8) + sharding_group_size = min(sharding_group_size, world_size) + if replica_group_size is None: + replica_group_size = world_size // sharding_group_size + + device = device or "cuda" + + if world_size % sharding_group_size != 0: + raise ValueError( + f"World size {world_size} is not evenly divisible by sharding group size {sharding_group_size}." + ) + + if (world_size // sharding_group_size) % replica_group_size != 0: + raise ValueError( + f"The calculated number of replica groups is not evenly divisible by " + f"replica_group_size {replica_group_size}." + ) + + device_mesh = init_device_mesh( + device, (replica_group_size, sharding_group_size), mesh_dim_names=("replicate", "shard") + ) + if device_mesh is None: + raise RuntimeError("Failed to create a valid device mesh.") + + log.critical( + f"Device mesh initialized with replica group size {replica_group_size} and sharding group size {sharding_group_size}" + ) + + return device_mesh diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/fused_adam.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/fused_adam.py new file mode 100644 index 00000000..e8131608 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/fused_adam.py @@ -0,0 +1,383 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +import torch +import transformer_engine as te +import transformer_engine_torch as tex + +from cosmos3._src.imaginaire.utils import distributed, log + + +class FusedAdam(torch.optim.Optimizer): + """Implements Adam algorithm. + + Currently GPU-only. The fused kernels come from Transformer Engine, so + ``transformer_engine`` and ``transformer_engine_torch`` must be importable. + + This version of fused Adam implements 2 fusions. + + * Fusion of the Adam update's elementwise operations + * A multi-tensor apply launch that batches the elementwise updates applied to all the model's parameters + into one or a few kernel launches. + + :class:`FusedAdam` may be used as a drop-in replacement for ``torch.optim.AdamW``, + or ``torch.optim.Adam`` with ``adam_w_mode=False``:: + + opt = FusedAdam(model.parameters(), lr = ....) + ... + opt.step() + + .. warning:: + A previous version of :class:`FusedAdam` allowed a number of additional arguments to ``step``. + These additional arguments are now deprecated and unnecessary. + + Adam was proposed in `Adam: A Method for Stochastic Optimization`_. + + Arguments: + params (iterable): iterable of parameters to optimize or dicts defining + parameter groups. + lr (float, optional): learning rate. (default: 1e-3) + betas (Tuple[float, float], optional): coefficients used for computing + running averages of gradient and its square. (default: (0.9, 0.999)) + eps (float, optional): term added to the denominator to improve + numerical stability. (default: 1e-8) + weight_decay (float, optional): weight decay (L2 penalty) (default: 0) + amsgrad (boolean, optional): whether to use the AMSGrad variant of this + algorithm from the paper `On the Convergence of Adam and Beyond`_ + (default: False) NOT SUPPORTED in FusedAdam!
+ adam_w_mode (boolean, optional): Apply L2 regularization or weight decay + True for decoupled weight decay(also known as AdamW) (default: True) + capturable (bool, optional): whether to use the version of the optimizer + that can be used with CUDA Graphs. (default: False) + master_weights (bool, optional): whether to maintain FP32 master weights + in the optimizer with FP16 mixed precision training, currently can + only be used with capturable set to True. (default: False) + + .. _Adam - A Method for Stochastic Optimization: + https://arxiv.org/abs/1412.6980 + .. _On the Convergence of Adam and Beyond: + https://openreview.net/forum?id=ryQu7f-RZ + """ + + def __init__( + self, + params, + lr=1e-3, + bias_correction=True, + betas=(0.9, 0.999), + eps=1e-8, + adam_w_mode=True, + weight_decay=0.0, + amsgrad=False, + capturable=False, + master_weights=False, + ): + if amsgrad: + raise RuntimeError("FusedAdam does not support the AMSGrad variant.") + if master_weights and not capturable: + raise RuntimeError("Master weights is currently only supported with the capturable version.") + # If the optimizer is capturable then LR should be a tensor (on GPU) + log.warning(f"FusedAdam master_weights: {master_weights} capturable: {capturable}") + lr = torch.tensor(lr, dtype=torch.float32) if capturable else lr + defaults = dict(lr=lr, bias_correction=bias_correction, betas=betas, eps=eps, weight_decay=weight_decay) + super(FusedAdam, self).__init__(params, defaults) + self.adam_w_mode = 1 if adam_w_mode else 0 + + self.capturable = capturable + self.master_weights = master_weights + + self.param_groups_master = None + + if capturable: + for idx, group in enumerate(self.param_groups): + if len(group["params"]) == 0: + continue + device = group["params"][0].device + for item in ["lr"]: + if isinstance(group[item], float): + group[item] = torch.tensor(group[item], dtype=torch.float32) + self.param_groups[idx][item] = group[item].to(device=device) + + 
self._step_supports_amp_scaling = True + + # Skip buffer + self._dummy_overflow_buf = torch.tensor([0], dtype=torch.int, device="cuda") + self.multi_tensor_adam = tex.multi_tensor_adam + self.multi_tensor_adam_capturable = tex.multi_tensor_adam_capturable + self.multi_tensor_adam_capturable_master = tex.multi_tensor_adam_capturable_master + + def step(self, closure=None, grads=None, output_params=None, scale=None, grad_norms=None, grad_scaler=None): + """Performs a single optimization step. + + Arguments: + closure (callable, optional): A closure that reevaluates the model + and returns the loss. + + The remaining arguments are deprecated, and are only retained (for the moment) for error-checking purposes. + """ + if any(p is not None for p in [grads, output_params, scale, grad_norms]): + raise RuntimeError( + "FusedAdam has been updated. " + "Simply initialize it identically to torch.optim.Adam, and call step() with no arguments." + ) + loss = None + if closure is not None: + loss = closure() + + if self.param_groups_master is None: + # Create full precision master weights + self.param_groups_master = [] + for i, pg in enumerate(self.param_groups): + param_list = pg["params"] + self.param_groups_master.append( + { + "params": [p.clone().detach().float() if self.master_weights else None for p in param_list], + } + ) + + for group, group_master in zip(self.param_groups, self.param_groups_master): + if len(group["params"]) == 0: + continue + device = group["params"][0].device + bias_correction = 1 if "bias_correction" in group and group["bias_correction"] else 0 + beta1, beta2 = group["betas"] + + # assume same step across group now to simplify things + # per parameter step can be easily support by making it tensor, or pass list into kernel + if "step" in group: + if self.capturable: + group["step"] = ( + group["step"].to(device=device) + if isinstance(group["step"], torch.Tensor) + else torch.tensor(group["step"], dtype=torch.int32, device=device) + ) + 
group["step"] += (self._dummy_overflow_buf != 1).to(torch.int) + else: + group["step"] += 1 + else: + group["step"] = 1 if not self.capturable else torch.tensor([1], dtype=torch.int, device=device) + + if self.capturable: + group["lr"] = ( + group["lr"].to(device=device) + if isinstance(group["lr"], torch.Tensor) + else torch.tensor(group["lr"], dtype=torch.float32, device=device) + ) + + # create lists for multi-tensor apply + g_16, p_16, m_16, v_16 = [], [], [], [] + g_bf, p_bf, m_bf, v_bf = [], [], [], [] + g_32, p_32, m_32, v_32 = [], [], [], [] + p_16_master = [] + p_32_master = [] + bf16_master = [] + + for p, p_master in zip(group["params"], group_master["params"]): + if p.grad is None: + continue + if p.grad.data.is_sparse: + raise RuntimeError( + "FusedAdam does not support sparse gradients, please consider SparseAdam instead" + ) + + state = self.state[p] + # State initialization + if len(state) == 0: + # Exponential moving average of gradient values + state["exp_avg"] = torch.zeros_like(p.data).float() + # Exponential moving average of squared gradient values + state["exp_avg_sq"] = torch.zeros_like(p.data).float() + + if p.dtype == torch.float16: + if self.master_weights: + p_16_master.append(p_master.data) + g_16.append(p.grad.data) + p_16.append(p.data) + m_16.append(state["exp_avg"]) + v_16.append(state["exp_avg_sq"]) + elif p.dtype == torch.bfloat16: + if self.master_weights: + bf16_master.append(p_master.data) + g_bf.append(p.grad) + p_bf.append(p) + m_bf.append(state["exp_avg"]) + v_bf.append(state["exp_avg_sq"]) + elif p.dtype == torch.float32: + if self.master_weights: + p_32_master.append(p_master.data) + g_32.append(p.grad.data) + p_32.append(p.data) + m_32.append(state["exp_avg"]) + v_32.append(state["exp_avg_sq"]) + else: + raise RuntimeError("FusedAdam only supports fp16, bf16, and fp32.") + + # If the optimizer is capturable, then if there's a grad scaler it works + # on the GPU + a different multi_tensor_applier should be called + if
self.capturable: + # overflow check of gradients + found_inf = ( + grad_scaler._check_inf_per_device(self)[device] + if grad_scaler is not None + else torch.zeros((1,), device=device) + ) + self._dummy_overflow_buf.copy_(found_inf) + + # get unscale scale factor + scale, inv_scale = None, None + if grad_scaler: + scale = grad_scaler._get_scale_async() + inv_scale = scale.double().reciprocal().float() + else: + scale = torch.ones((1,), device=device, dtype=torch.float32) + inv_scale = torch.ones((1,), device=device, dtype=torch.float32) + + if len(g_16) > 0: + te.pytorch.optimizers.multi_tensor_applier( + ( + self.multi_tensor_adam_capturable_master + if self.master_weights + else self.multi_tensor_adam_capturable + ), + self._dummy_overflow_buf, + [g_16, p_16, m_16, v_16, p_16_master] if self.master_weights else [g_16, p_16, m_16, v_16], + group["lr"], + beta1, + beta2, + group["eps"], + group["step"], + self.adam_w_mode, + bias_correction, + group["weight_decay"], + inv_scale, + ) + + if len(g_bf) > 0: + te.pytorch.optimizers.multi_tensor_applier( + ( + self.multi_tensor_adam_capturable_master + if self.master_weights + else self.multi_tensor_adam_capturable + ), + self._dummy_overflow_buf, + [g_bf, p_bf, m_bf, v_bf, bf16_master] if self.master_weights else [g_bf, p_bf, m_bf, v_bf], + group["lr"], + beta1, + beta2, + group["eps"], + group["step"], + self.adam_w_mode, + bias_correction, + group["weight_decay"], + inv_scale, + ) + + if len(g_32) > 0: + te.pytorch.optimizers.multi_tensor_applier( + ( + self.multi_tensor_adam_capturable_master + if self.master_weights + else self.multi_tensor_adam_capturable + ), + self._dummy_overflow_buf, + [g_32, p_32, m_32, v_32, p_32_master] if self.master_weights else [g_32, p_32, m_32, v_32], + group["lr"], + beta1, + beta2, + group["eps"], + group["step"], + self.adam_w_mode, + bias_correction, + group["weight_decay"], + inv_scale, + ) + else: + if len(g_16) > 0: + te.pytorch.optimizers.multi_tensor_applier( + 
self.multi_tensor_adam, + self._dummy_overflow_buf, + [g_16, p_16, m_16, v_16], + group["lr"], + beta1, + beta2, + group["eps"], + group["step"], + self.adam_w_mode, + bias_correction, + group["weight_decay"], + ) + + if len(g_bf) > 0: + te.pytorch.optimizers.multi_tensor_applier( + self.multi_tensor_adam, + self._dummy_overflow_buf, + [g_bf, p_bf, m_bf, v_bf], + group["lr"], + beta1, + beta2, + group["eps"], + group["step"], + self.adam_w_mode, + bias_correction, + group["weight_decay"], + ) + + if len(g_32) > 0: + te.pytorch.optimizers.multi_tensor_applier( + self.multi_tensor_adam, + self._dummy_overflow_buf, + [g_32, p_32, m_32, v_32], + group["lr"], + beta1, + beta2, + group["eps"], + group["step"], + self.adam_w_mode, + bias_correction, + group["weight_decay"], + ) + + return loss + + def load_state_dict(self, state_dict): + super().load_state_dict(state_dict) + for group in self.param_groups: + if self.capturable: + group["lr"] = ( + group["lr"].cuda() + if isinstance(group["lr"], torch.Tensor) + else torch.tensor(group["lr"], dtype=torch.float32).cuda() + ) + + if "step" in group: + if self.capturable: + if distributed.get_rank() == 0: + step = ( + group["step"].cuda() + if isinstance(group["step"], torch.Tensor) + else torch.tensor([group["step"]], dtype=torch.int32).cuda() + ) + else: + step = torch.zeros(1, dtype=torch.int32).cuda() + # make it compatible with FSDP optimizer + distributed.broadcast(step, 0) + group["step"] = step + elif isinstance(group["step"], torch.Tensor): + group["step"] = group["step"].item() + for p in group["params"]: + state = self.state[p] + if "exp_avg" in state: + state["exp_avg"] = state["exp_avg"].float() + state["exp_avg_sq"] = state["exp_avg_sq"].float() diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/fused_nan_to_num.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/fused_nan_to_num.py new file mode 100644 index 00000000..520a83e1 --- /dev/null +++ 
b/cosmos-inference/cosmos3/_src/imaginaire/utils/fused_nan_to_num.py @@ -0,0 +1,24 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import List + +import torch + + +@torch.jit.script +def fused_nan_to_num(params: List[torch.Tensor]): + for param in params: + torch.nan_to_num(param, nan=0.0, posinf=0.0, neginf=0.0, out=param) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/graph.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/graph.py new file mode 100644 index 00000000..6c96deca --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/graph.py @@ -0,0 +1,444 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +"""A rework of the make_graphed_callables function from TransformerEngine so that it works for inference-only use.""" + +from typing import Any, Callable, Dict, Optional, Tuple, TypeVar, Union + +import torch +from torch._C import _graph_pool_handle +from torch.utils._pytree import tree_flatten as _tree_flatten +from torch.utils._pytree import tree_unflatten as _tree_unflatten +from transformer_engine.pytorch.distributed import get_all_rng_states, graph_safe_rng_available +from transformer_engine.pytorch.module.base import TransformerEngineBaseModule + +from cosmos3._src.imaginaire.utils import log + +__all__ = ["create_cuda_graph"] + + +_IS_GRAPH_CAPTURING = False + +_T = TypeVar("_T") +SingleOrTuple = Union[_T, Tuple[_T, ...]] + + +def set_capture_start() -> None: + """Record beginning of `make_graphed_callables`.""" + global _IS_GRAPH_CAPTURING + _IS_GRAPH_CAPTURING = True + + +def set_capture_end() -> None: + """Record end of `make_graphed_callables`.""" + global _IS_GRAPH_CAPTURING + _IS_GRAPH_CAPTURING = False + + +def is_graph_capturing() -> bool: + """Return whether within `make_graphed_callables`.""" + return _IS_GRAPH_CAPTURING + + +def graph_pool_handle(): + """ + Returns an opaque token representing the id of a graph memory pool. + """ + return _graph_pool_handle() + + +def _make_graphed_callables( + callables: SingleOrTuple[Callable], + sample_args: SingleOrTuple[Tuple[torch.Tensor, ...]], + num_warmup_iters: int = 3, + sample_kwargs: Optional[SingleOrTuple[Dict[str, Any]]] = None, + pool: Optional[Tuple[int, ...]] = None, +) -> SingleOrTuple[Callable]: + """ + Helper method for `make_graphed_callables` + """ + + if torch.is_autocast_enabled() and torch.is_autocast_cache_enabled(): + raise RuntimeError( + "make_graphed_callables does not support the autocast caching. Please set `cache_enabled=False`."
+ ) + + # Default is to pass no kwargs to callables + if sample_kwargs is None: + if isinstance(callables, tuple): + sample_kwargs = tuple({} for _ in range(len(sample_args))) + else: + sample_kwargs = {} + + # Canonicalize args as tuples + just_one_callable = False + if not isinstance(callables, tuple): + just_one_callable = True + callables = (callables,) + sample_args = (sample_args,) + sample_kwargs = (sample_kwargs,) + + # Check sizes of args + assert len(sample_args) == len(callables) + assert len(sample_kwargs) == len(callables) + + # Check callables + for c in callables: + if isinstance(c, torch.nn.Module): + assert len(c._backward_hooks) == 0 and len(c._forward_hooks) == 0 and len(c._forward_pre_hooks) == 0, ( + "Modules must not have hooks registered at the time they are passed. " + + "However, registering hooks on modules after passing them " + + "through make_graphed_callables is allowed." + ) + assert all(b.requires_grad is False for b in c.buffers()), ( + "In any :class:`~torch.nn.Module` passed to " + + ":func:`~make_graphed_callables`, only parameters may be trainable. " + + "All buffers must have ``requires_grad=False``." + ) + + # Flatten callable arguments + per_callable_kwargs_keys = [list(kwargs.keys()) for kwargs in sample_kwargs] + flatten_sample_args = [] + for args, kwargs, kwargs_keys in zip(sample_args, sample_kwargs, per_callable_kwargs_keys): + flatten_arg, _ = _tree_flatten(args) + flatten_kwarg, _ = _tree_flatten([kwargs[key] for key in kwargs_keys]) + flatten_sample_args.append(tuple(flatten_arg + flatten_kwarg)) + assert all(isinstance(arg, torch.Tensor) for arg in flatten_arg), ( + "In the beta API, sample_args " + + "for each callable must contain only Tensors. Other types are not allowed." + ) + + # If a callable is an nn.Module, its graph's full input surface is the args the user explicitly + # passes to forward (ie, its sample_args) AND the module's parameter attributes. 
+ per_callable_len_user_args = [len(args) for args in flatten_sample_args] + per_callable_module_params = [tuple(c.parameters()) if isinstance(c, torch.nn.Module) else () for c in callables] + per_callable_static_input_surfaces = [ + flatten_sample_args[i] + per_callable_module_params[i] for i in range(len(callables)) + ] + + fwd_graphs = [torch.cuda.CUDAGraph() for _ in range(len(flatten_sample_args))] + graph_callables = [None for _ in range(len(flatten_sample_args))] + + # For cases with multiple active RNG states, e.g. TP. + if graph_safe_rng_available(): + for _, state in get_all_rng_states().items(): + for fwd_graph in fwd_graphs: + fwd_graph.register_generator_state(state) + + mempool = graph_pool_handle() if pool is None else pool + + # Warmup + # Hopefully prevents cudnn benchmarking and other lazy-initialization cuda work + # from ending up in any captures. + torch.cuda.synchronize() + + # Get warmup func and func_idx. + warmup_func_idx = [] + warmup_func = [] + for func_idx, func in enumerate(callables): + warmup_func_idx.append(func_idx) + warmup_func.append(func) + assert len(warmup_func) == len(sample_args), f"Warmup runs {len(warmup_func)} don't match args {len(sample_args)}." + assert len(warmup_func_idx) == len(set(warmup_func_idx)), ( + f"Warmup runs {len(warmup_func)} but only {len(set(warmup_func_idx))} are unique." + ) + + # Filter the TE modules that cudagraph can access. + visited_te_modules = set() + + def hook_fn(module, inputs, outputs): # pylint: disable=unused-argument + if isinstance(module, TransformerEngineBaseModule): + visited_te_modules.add(module) + + # Run warmup and do the above filtering. 
+ with torch.cuda.stream(torch.cuda.Stream()): + for func_idx, func in zip(warmup_func_idx, warmup_func): + args = sample_args[func_idx] + kwargs = sample_kwargs[func_idx] + for _ in range(num_warmup_iters): + hooks = [] + for module in func.modules(): + hook = module.register_forward_hook(hook_fn) + hooks.append(hook) + outputs, _ = _tree_flatten(func(*args, **kwargs)) + for hook in hooks: + hook.remove() + del outputs + # The following code is added specifically for MCore's special requirements, + # aimed at preventing warmup from altering the control flow. + for module in func.modules(): + if hasattr(module, "is_first_microbatch"): + module.is_first_microbatch = True + torch.cuda.synchronize() + + # All captures here share a mempool. To avoid replays corrupting each other's memory, + # the safest approach is to capture all passes in the same order they'll run: + # Capture forward graphs + per_callable_static_outputs = [] + per_callable_output_unflatten_spec = [] + graph_id = 0 + for func, args, kwargs, fwd_graph in zip(callables, sample_args, sample_kwargs, fwd_graphs): + with torch.cuda.graph(fwd_graph, pool=mempool): + outputs = func(*args, **kwargs) + graph_callables[graph_id] = func + graph_id += 1 + + flatten_outputs, spec = _tree_flatten(outputs) + per_callable_static_outputs.append(tuple(flatten_outputs)) + per_callable_output_unflatten_spec.append(spec) + + def make_graphed_autograd_function( + fwd_graph, + module_params, + kwargs_keys, + len_user_args, + output_unflatten_spec, + static_input_surface, + static_outputs, + ): + class Graphed(torch.autograd.Function): + """Autograd function for graph replay.""" + + @staticmethod + def forward(ctx, *inputs): + # pylint: disable=missing-function-docstring + + # Copy values from new tensors into static tensors + for i in range(len_user_args): + if static_input_surface[i].data_ptr() != inputs[i].data_ptr(): + static_input_surface[i].copy_(inputs[i]) + + # Replay forward graph + fwd_graph.replay() + assert 
isinstance(static_outputs, tuple) + return tuple(o.detach() for o in static_outputs) + + def functionalized(*user_args, **user_kwargs): + # Check that required kwargs are provided + for key in kwargs_keys: + if key not in user_kwargs: + raise TypeError( + f"Graphed callable was initialized with kwarg {key}, but it was not provided in graph replay" + ) + + # Runs the autograd function with inputs == all inputs to + # the graph that might require grad (explicit user args + + # module parameters) + # Assumes module params didn't change since capture. + flatten_user_args, _ = _tree_flatten(user_args) + flatten_user_kwargs, _ = _tree_flatten([user_kwargs[key] for key in kwargs_keys]) + func_args = tuple(flatten_user_args) + tuple(flatten_user_kwargs) + module_params + out = Graphed.apply(*func_args) + return _tree_unflatten(out, output_unflatten_spec) + + return functionalized + + # Put together the final graphed callables + ret = [] + for i in range(len(sample_args)): + graphed = make_graphed_autograd_function( + fwd_graphs[i], + per_callable_module_params[i], + per_callable_kwargs_keys[i], + per_callable_len_user_args[i], + per_callable_output_unflatten_spec[i], + per_callable_static_input_surfaces[i], + per_callable_static_outputs[i], + ) + + func = graph_callables[i] + if isinstance(func, torch.nn.Module): + + def make_graphed_forward(func, graph_training_state, graphed, orig_fwd): + def new_fwd(*user_args, **user_kwargs): + # If the module's training-or-eval state matches what we graphed, + # run the graph, otherwise run the original forward method + if func.training == graph_training_state: + return graphed(*user_args, **user_kwargs) + return orig_fwd(*user_args, **user_kwargs) + + return new_fwd + + forward = make_graphed_forward(func, func.training, graphed, func.forward) + ret.append(forward) + else: + ret.append(graphed) + + if just_one_callable: + return ret[0] + + return tuple(ret) + + +def make_graphed_callables_forward( + modules: SingleOrTuple[Callable],
+ sample_args: SingleOrTuple[Tuple[torch.Tensor, ...]], + num_warmup_iters: int = 3, + sample_kwargs: Optional[SingleOrTuple[Dict[str, Any]]] = None, + pool: Optional[Tuple[int, ...]] = None, +) -> Union[Callable, Tuple[Callable, ...]]: + """ + Make CUDA graph version of Transformer Engine modules. + A variation of PyTorch's `make_graphed_callables` utility function. + See the original PyTorch implementation (`torch.cuda.make_graphed_callables`) + for more documentation. + Graphing parameters + ------------------- + modules: (tuple of) callable + Callable or callables to graph. + sample_args: (tuple of) tuple of torch.Tensor + Positional arguments to callable(s). + num_warmup_iters: int, default = 3 + Number of warmup iterations. + sample_kwargs: (tuple of) dict, optional + Keyword arguments to callable(s) + pool: (tuple of) int, default = `None`, optional + An instance returned from function `torch.cuda.graph_pool_handle` that hints + this graph may share memory with the indicated pool. + """ + set_capture_start() + + # Handle single module. + just_one_callable = False + if not isinstance(modules, tuple): + just_one_callable = True + modules = (modules,) + + forward_funcs = [] + for module in modules: + assert isinstance(module, torch.nn.Module), f"Graphing for {type(module)} is not supported." + forward_funcs.append(module) + + if just_one_callable: + forward_funcs = forward_funcs[0] + else: + forward_funcs = tuple(forward_funcs) + + # Save RNG state. + if graph_safe_rng_available(): + generators = [ + torch.cuda.default_generators[torch.cuda.current_device()], + *get_all_rng_states().values(), + ] + original_rng_states = [state.get_state() for state in generators] + else: + original_rng_states = torch.cuda.get_rng_state() + + graphed_callables = _make_graphed_callables( + forward_funcs, + sample_args, + num_warmup_iters=num_warmup_iters, + sample_kwargs=sample_kwargs, + pool=pool, + ) + + # Ensures warmup does not affect numerics for ops such as dropout.
+ if graph_safe_rng_available(): + for gen, state in zip(generators, original_rng_states): + gen.set_state(state) + else: + torch.cuda.set_rng_state(original_rng_states) + set_capture_end() + return graphed_callables + + +def create_cuda_graph( + cuda_graphs_storage: dict, + blocks: torch.nn.ModuleList, + tensor_args: list[Any], + tensor_kwargs: dict[str, Any], + extra_key: Optional[str] = None, +) -> str: + def _make_dummy_tensor_like(t: torch.Tensor) -> torch.Tensor: + if t.dtype.is_floating_point: + return torch.randn(t.shape, device=t.device, dtype=t.dtype) + if t.dtype == torch.bool: + return torch.zeros(t.shape, device=t.device, dtype=t.dtype) + if t.dtype in (torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64): + if t.numel() > 0: + low = int(t.min().item()) + high = int(t.max().item()) + if high == low: + high = low + 1 + else: + high = high + 1 + else: + low, high = 0, 1 + return torch.randint(low, high, t.shape, device=t.device, dtype=t.dtype) + # Fallback: use zeros for uncommon dtypes (e.g., complex) to avoid dtype/range pitfalls. + return torch.zeros(t.shape, device=t.device, dtype=t.dtype) + + def _make_dummy_tree(x: Any) -> Any: + flat, spec = _tree_flatten(x) + dummy_flat: list[torch.Tensor] = [] + for leaf in flat: + if not isinstance(leaf, torch.Tensor): + raise TypeError( + f"create_cuda_graph only supports pytrees of torch.Tensor leaves; got leaf type {type(leaf)}" + ) + dummy = _make_dummy_tensor_like(leaf) + dummy.requires_grad = leaf.requires_grad + dummy_flat.append(dummy) + return _tree_unflatten(dummy_flat, spec) + + real_args = [arg for arg in tensor_args if arg is not None] + real_kwargs = {k: v for k, v in tensor_kwargs.items() if v is not None} + + # Shapes key must reflect all tensor leaves (supports tuple/list/dict structures). 
+ flat_tensors: list[torch.Tensor] = [] + for arg in real_args: + flat, _ = _tree_flatten(arg) + for leaf in flat: + if not isinstance(leaf, torch.Tensor): + raise TypeError( + f"create_cuda_graph only supports pytrees of torch.Tensor leaves; got leaf type {type(leaf)}" + ) + flat_tensors.append(leaf) + for _, kwarg in real_kwargs.items(): + flat, _ = _tree_flatten(kwarg) + for leaf in flat: + if not isinstance(leaf, torch.Tensor): + raise TypeError( + f"create_cuda_graph only supports pytrees of torch.Tensor leaves; got leaf type {type(leaf)}" + ) + flat_tensors.append(leaf) + + shapes_key = "_".join(str(shape_component) for t in flat_tensors for shape_component in t.shape) + if extra_key: + shapes_key = f"{shapes_key}_{extra_key}" + if shapes_key not in cuda_graphs_storage: + callables = [] + sample_args = [] + sample_kwargs = [] + for block in blocks: + callables.append(block) + args = [] + kwargs = {} + for arg in real_args: + args.append(_make_dummy_tree(arg)) + for name, kwarg in real_kwargs.items(): + kwargs[name] = _make_dummy_tree(kwarg) + sample_args.append(tuple(args)) + sample_kwargs.append(kwargs) + + log.critical(f"Creating graph for shape {shapes_key}") + cuda_graphs_storage[shapes_key] = make_graphed_callables_forward( + tuple(callables), + tuple(sample_args), + sample_kwargs=tuple(sample_kwargs), + num_warmup_iters=11, + ) + log.critical(f"Created graph for shape {shapes_key}") + return shapes_key diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/high_sigma_strategy.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/high_sigma_strategy.py new file mode 100644 index 00000000..2e3f5ccd --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/high_sigma_strategy.py @@ -0,0 +1,28 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from enum import Enum + + +class HighSigmaStrategy(str, Enum): + NONE = "none" + UNIFORM80_2000 = "uniform80_2000" + LOGUNIFORM200_100000 = "LOGUNIFORM200_100000" + SHIFT24 = "shift24" + BALANCED_TWO_HEADS_V1 = "balanced_two_heads_v1" + HARDCODED_20steps = "hardcoded_20steps" + + def __str__(self) -> str: + return self.value diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/launch.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/launch.py new file mode 100644 index 00000000..6e87488e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/launch.py @@ -0,0 +1,179 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import argparse +import os +import sys +import time + +import torch +from omegaconf import OmegaConf + +from cosmos3._src.imaginaire.config import Config +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.cluster_env import get_cluster_env +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.imaginaire.utils.env_parsers.cred_env_parser import CRED_ENVS +from cosmos3._src.imaginaire.utils.wandb_util import set_wandb_job_info + +# Global variable to track S3 readiness +S3_READY = False + + +def log_reproducible_setup(config: Config, args: argparse.Namespace) -> None: + """ + Configures the environment for reproducibility of experiments by setting up + S3 backends for storage, logging important job details, and saving configuration and + environment details both locally and on S3. + This function is crucial for ensuring that all aspects of the computational environment are captured and can be + replicated for future runs or analysis. + + Parameters: + config (Config): A configuration object containing all the settings necessary + for the job, including paths and credentials. + args (argparse.Namespace): An argparse namespace containing the command line + arguments passed to the script. This includes configurations + and any overrides specified at runtime. + + Actions: + - Sets up S3 backend for storing user data and other outputs. + - Logs job paths and critical information regarding job execution. + - Saves the job configuration locally only for the main node in a distributed setting. + - Captures and logs command-line execution details. + - Optionally reads git commit and branch information if available and logs them. + - Saves both job environment information and launch details locally and syncs these to S3. + - Supports conditional integration with Weights & Biases (wandb) for experiment tracking. 
+ + Notes: + - The function is designed to run within a distributed environment where certain actions + (like saving configurations) are restricted to the main node (rank 0). + - It uses the 'easy_io' module for interacting with S3, ensuring files are written and + read correctly from the object store. + - It leverages OmegaConf for saving YAML configurations + - git information is read from 'git_commit.txt' and 'git_branch.txt' files if they exist. + - snapshot codebase is saved as 'codebase.zip' if it exists in the current directory. + + Raises: + FileNotFoundError: If specific files like 'git_commit.txt' or 'codebase.zip' are expected + but not found. + IOError: If there are issues in file handling operations, particularly with file + reading/writing. + """ + + run_timestamp = f"{time.strftime('%Y-%m-%d_%H-%M-%S')}" + time_tensor = torch.ByteTensor(bytearray(run_timestamp, "utf-8")).cuda() + distributed.broadcast(time_tensor, 0) + run_timestamp = time_tensor.cpu().numpy().tobytes().decode("utf-8") + + global S3_READY + if os.path.exists(config.checkpoint.save_to_object_store.credentials) or CRED_ENVS.APP_ENV in [ + "prod", + "dev", + "stg", + ]: + easy_io.set_s3_backend( + backend_args={ + "backend": "s3", + "path_mapping": { + "s3://timestamps_rundir/": f"s3://{config.checkpoint.save_to_object_store.bucket}/{config.job.path}/job_runs/{run_timestamp}/", + "s3://rundir/": f"s3://{config.checkpoint.save_to_object_store.bucket}/{config.job.path}/", + }, + "s3_credential_path": config.checkpoint.save_to_object_store.credentials, + } + ) + S3_READY = True + else: + log.warning("S3 credentials not found. 
Skipping easy_io S3 setup.") + + log.warning(f"Job path: {config.job.path}") + job_info = get_cluster_env() + # save cfg to local + if distributed.get_rank() == 0: + job_local_path = config.job.path_local + log.critical(f"Job local path: {job_local_path}") + os.makedirs(config.job.path_local, exist_ok=True) + launch_info = { + "cmd": " ".join(sys.argv), + "args_cfg_path": args.config, + "args_override": args.opts, + } + + job_info["job_local_path"] = str(job_local_path) + job_info["s3"] = f"s3://{config.checkpoint.save_to_object_store.bucket}/{config.job.path}/" + # optional read git_commit.txt and save git commit id + if os.path.exists("git_commit.txt"): + with open("git_commit.txt", "r") as f: + job_info["commit_id"] = f.read().strip() + log.critical(f"Commit id: {job_info['commit_id']}") + if os.path.exists("git_branch.txt"): + with open("git_branch.txt", "r") as f: + job_info["git_branch"] = f.read().strip() + log.critical(f"git branch: {job_info['git_branch']}") + if os.path.exists("git_diff.txt"): + with open("git_diff.txt", "r") as f: + job_info["git_diff"] = f.read().strip() + log.critical(f"git diff: {job_info['git_diff']}") + + with open(f"{job_local_path}/job_env.yaml", "w") as f: + OmegaConf.save(job_info, f) + with open(f"{job_local_path}/launch_info.yaml", "w") as f: + OmegaConf.save(launch_info, f) + set_wandb_job_info(job_info) + + # by default, we upload run in ngc and slurm + if config.upload_reproducible_setup: + # sync to s3 + if S3_READY: + log.critical( + f"Uploading reproducible setup to s3://{config.checkpoint.save_to_object_store.bucket}/{config.job.path}/job_runs/{run_timestamp}/" + ) + + config_pkl_save_fp = f"{config.job.path_local}/config.pkl" + easy_io.copyfile_from_local( + config_pkl_save_fp, f"s3://timestamps_rundir/{config_pkl_save_fp.split('/')[-1]}" + ) + config_yaml_save_fp = config_pkl_save_fp.replace(".pkl", ".yaml") + easy_io.copyfile_from_local( + config_yaml_save_fp, 
f"s3://timestamps_rundir/{config_yaml_save_fp.split('/')[-1]}" + ) + easy_io.copyfile_from_local(f"{job_local_path}/job_env.yaml", "s3://timestamps_rundir/job_env.yaml") + easy_io.copyfile_from_local( + f"{job_local_path}/launch_info.yaml", + "s3://timestamps_rundir/launch_info.yaml", + ) + if os.path.exists("codebase.zip"): + easy_io.copyfile_from_local("codebase.zip", "s3://timestamps_rundir/codebase.zip") + if os.path.exists("code.tar.gz"): + easy_io.copyfile_from_local("code.tar.gz", "s3://timestamps_rundir/code.tar.gz") + if os.path.exists("git_diff.txt"): + easy_io.copyfile_from_local("git_diff.txt", "s3://timestamps_rundir/git_diff.txt") + if easy_io.exists("s3://rundir/job_history.yaml"): + job_history = easy_io.load("s3://rundir/job_history.yaml") + else: + job_history = {} + job_history[len(job_history)] = { + "timestamp": run_timestamp, + "reproduce_dir": f"s3://{config.checkpoint.save_to_object_store.bucket}/{config.job.path}/job_runs/{run_timestamp}/", + **launch_info, + } + print(job_history) + easy_io.dump(job_history, "s3://rundir/job_history.yaml") + else: + log.warning("S3 credentials not found. Skipping upload of reproducible setup.") + + # save per rank cluster information to s3 + if config.upload_reproducible_setup: + if S3_READY: + easy_io.dump(job_info, f"s3://timestamps_rundir/cluster_env/RANK_{distributed.get_rank():06d}.yaml") diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/log.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/log.py new file mode 100644 index 00000000..45545d38 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/log.py @@ -0,0 +1,156 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import atexit +import os +import sys +from typing import Any + +import torch.distributed as dist +from loguru._logger import Core, Logger + +RANK0_ONLY = True +LEVEL = os.environ.get("LOGURU_LEVEL", "INFO") +RANK = int(os.environ.get("RANK", "0")) + + +def make_new_logger(depth: int = 1) -> Logger: + return Logger( + core=Core(), + exception=None, + depth=depth, + record=False, + lazy=False, + colors=False, + raw=False, + capture=True, + patchers=[], + extra={}, + ) + + +logger = make_new_logger(depth=1) +atexit.register(logger.remove) + + +def _add_relative_path(record: dict[str, Any]) -> None: + try: + start = os.getcwd() + record["extra"]["relative_path"] = os.path.relpath(record["file"].path, start) + except OSError: + # CWD may have been removed (e.g. on some ranks in distributed jobs). + # Fall back to the absolute path so logging still works. 
+ record["extra"]["relative_path"] = f":{record['file'].path}" + + +*options, _, extra = logger._options # type: ignore +logger._options = tuple([*options, [_add_relative_path], extra]) # type: ignore + + +def init_loguru_stdout() -> None: + logger.remove() + datetime_format = get_datetime_format() + machine_format = get_machine_format() + message_format = get_message_format() + logger.add( + sys.stdout, + level=LEVEL, + format=f"{datetime_format}{machine_format}{message_format}", + filter=_rank0_only_filter, + ) + + +def init_loguru_file(path: str) -> None: + datetime_format = get_datetime_format() + machine_format = get_machine_format() + message_format = get_message_format() + logger.add( + path, + encoding="utf8", + level=LEVEL, + format=f"{datetime_format}{machine_format}{message_format}", + rotation="100 MB", + filter=lambda result: _rank0_only_filter(result) or not RANK0_ONLY, + enqueue=True, + ) + + +def get_datetime_format() -> str: + return "[{time:MM-DD HH:mm:ss}|" + + +def get_machine_format() -> str: + node_id = os.environ.get("NGC_ARRAY_INDEX", "0") + num_nodes = int(os.environ.get("NGC_ARRAY_SIZE", "1")) + machine_format = "" + rank = 0 + if dist.is_available(): + if not RANK0_ONLY and dist.is_initialized(): + rank = dist.get_rank() + world_size = dist.get_world_size() + machine_format = ( + f"[Node{node_id:<3}/{num_nodes:<3}][RANK{rank:<5}/{world_size:<5}]" + "[{process.name:<8}]| " + ) + return machine_format + + +def get_message_format() -> str: + message_format = "{level}|{extra[relative_path]}:{line}:{function}] {message}" + return message_format + + +def _rank0_only_filter(record: Any) -> bool: + is_rank0 = record["extra"].get("rank0_only", True) + if RANK == 0 and is_rank0: + return True + if not is_rank0: + record["message"] = f"[RANK {RANK}] " + record["message"] + return not is_rank0 + + +def trace(message: str, rank0_only: bool = True) -> None: + logger.opt(depth=1).bind(rank0_only=rank0_only).trace(message) + + +def debug(message: str, 
rank0_only: bool = True) -> None: + logger.opt(depth=1).bind(rank0_only=rank0_only).debug(message) + + +def info(message: str, rank0_only: bool = True) -> None: + logger.opt(depth=1).bind(rank0_only=rank0_only).info(message) + + +def success(message: str, rank0_only: bool = True) -> None: + logger.opt(depth=1).bind(rank0_only=rank0_only).success(message) + + +def warning(message: str, rank0_only: bool = True) -> None: + logger.opt(depth=1).bind(rank0_only=rank0_only).warning(message) + + +def error(message: str, rank0_only: bool = True) -> None: + logger.opt(depth=1).bind(rank0_only=rank0_only).error(message) + + +def critical(message: str, rank0_only: bool = True) -> None: + logger.opt(depth=1).bind(rank0_only=rank0_only).critical(message) + + +def exception(message: str, rank0_only: bool = True) -> None: + logger.opt(depth=1).bind(rank0_only=rank0_only).exception(message) + + +# Execute at import time. +init_loguru_stdout() diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/misc.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/misc.py new file mode 100644 index 00000000..48c01e7f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/misc.py @@ -0,0 +1,689 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +import collections +import collections.abc +import functools +import json +import os +import random +from contextlib import ContextDecorator, nullcontext +from dataclasses import fields +from typing import Any, Callable, List, Tuple, TypeVar, Union + +import numpy as np +from loguru import logger as logging + +try: + # pyrefly: ignore # import-error + import straggler +except ImportError: + straggler = None +import termcolor +import torch +from torch.distributed._functional_collectives import AsyncCollectiveTensor +from torch.distributed._tensor.api import DTensor + +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.distributed import all_gather_tensor +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.imaginaire.utils.timer import Timer + + +def requires_grad(model: torch.nn.Module, value: bool = True) -> None: + """Set a model to require gradients or not. + + Args: + model (torch.nn.Module): Neural network model. + value (bool): Whether the network requires gradients or not. + """ + for p in model.parameters(): + p.requires_grad = value + + +def to( + data: Any, + device: str | torch.device | None = None, + dtype: torch.dtype | None = None, + memory_format: torch.memory_format = torch.preserve_format, +) -> Any: + """Recursively cast data into the specified device, dtype, and/or memory_format. + + The input data can be a tensor, a list of tensors, a dict of tensors. + See the documentation for torch.Tensor.to() for details. + + Args: + data (Any): Input data. + device (str | torch.device): GPU device (default: None). + dtype (torch.dtype): data type (default: None). + memory_format (torch.memory_format): memory organization format (default: torch.preserve_format). + + Returns: + data (Any): Data cast to the specified device, dtype, and/or memory_format. 
+ """ + assert device is not None or dtype is not None or memory_format is not None, ( + "at least one of device, dtype, memory_format should be specified" + ) + + if isinstance(data, torch.Tensor): + if ( + memory_format == torch.channels_last + and data.dim() != 4 + or memory_format == torch.channels_last_3d + and data.dim() != 5 + ): + memory_format = torch.preserve_format # do not change the memory format + is_cpu = (isinstance(device, str) and device == "cpu") or ( + isinstance(device, torch.device) and device.type == "cpu" + ) + data = data.to( + device=device, + dtype=dtype, + memory_format=memory_format, + non_blocking=(not is_cpu), + ) + return data + elif isinstance(data, collections.abc.Mapping): + return type(data)({key: to(data[key], device=device, dtype=dtype, memory_format=memory_format) for key in data}) + elif isinstance(data, collections.abc.Sequence) and not isinstance(data, (str, bytes)): + return type(data)([to(elem, device=device, dtype=dtype, memory_format=memory_format) for elem in data]) + else: + return data + + +def serialize(data: Any) -> Any: + """Serialize data by hierarchically traversing through iterables. + + Args: + data (Any): Input data. + + Returns: + data (Any): Serialized data. + """ + if isinstance(data, collections.abc.Mapping): + return type(data)({key: serialize(data[key]) for key in data}) + elif isinstance(data, collections.abc.Sequence) and not isinstance(data, (str, bytes)): + return type(data)([serialize(elem) for elem in data]) + else: + try: + json.dumps(data) + except TypeError: + data = str(data) + return data + + +def print_environ_variables(env_vars: list[str]) -> None: + """Print a specific list of environment variables. + + Args: + env_vars (list[str]): List of specified environment variables. 
+ """ + for env_var in env_vars: + if env_var in os.environ: + log.info(f"Environment variable {Color.green(env_var)}: {Color.yellow(os.environ[env_var])}") + else: + log.warning(f"Environment variable {Color.green(env_var)} not set!") + + +def set_random_seed(seed: int, by_rank: bool = False) -> None: + """Set random seed. This includes random, numpy, Pytorch. + + Args: + seed (int): Random seed. + by_rank (bool): if true, each GPU will use a different random seed. + """ + if by_rank: + seed += distributed.get_rank() + log.info(f"Using random seed {seed}.") + random.seed(seed) + np.random.seed(seed) + torch.manual_seed(seed) # sets seed on the current CPU & all GPUs + + +def arch_invariant_rand( + shape: List[int] | Tuple[int], dtype: torch.dtype, device: str | torch.device, seed: int | None = None +): + """Produce a GPU-architecture-invariant randomized Torch tensor. + + Args: + shape (list or tuple of ints): Output tensor shape. + dtype (torch.dtype): Output tensor type. + device (torch.device): Device holding the output. + seed (int): Optional randomization seed. + + Returns: + tensor (torch.tensor): Randomly-generated tensor. + """ + # Create a random number generator, optionally seeded + rng = np.random.RandomState(seed) + + # Generate random numbers using the generator + random_array = rng.standard_normal(shape).astype(np.float32) # Use standard_normal for normal distribution + + # Convert to torch tensor and return + return torch.from_numpy(random_array).to(dtype=dtype, device=device) + + +def get_data_batch_size(data: dict[str, torch.Tensor] | torch.Tensor) -> int: + """Get the batch size from a data batch, a (possibly hierarchical) dictionary of tensors. + + Args: + data (dict[str, torch.Tensor]): Data batch (dictionary of tensors). + + Returns: + batch_size (int): Data batch size. 
+ """ + + def _get_batch_size(input_data: Any) -> Union[int, None]: + """ + Helper function that recursively finds a tensor in the input data + (could be a nested dictionary or list of tensors) and returns its batch size. + """ + if isinstance(input_data, torch.Tensor): + return len(input_data) + elif isinstance(input_data, collections.abc.Mapping): + for key, value in input_data.items(): + batch_size = _get_batch_size(value) + if batch_size is not None: + return batch_size + elif isinstance(input_data, (list, tuple)) and len(input_data) > 0: + # Handle list/tuple of tensors (variable-length batches) + # The batch size is the length of the list + # We are verifying if input_data[0] is indeed a tensor. If so, return the length of the list. + if isinstance(input_data[0], torch.Tensor): + return len(input_data) + # Recurse into first element if it's a nested structure + return _get_batch_size(input_data[0]) + return None + + batch_size = _get_batch_size(data) + if not isinstance(batch_size, int): + raise ValueError(f"Batch size ({batch_size}) obtained from invalid data: {data}") + return batch_size + + +def parameters_to_buffer(module: torch.nn.Module, persistent: bool = True): + """Convert parameters in a module to buffers. + Buffers do not have its own gradients and thus not updated by backpropagation. + + Args: + module (torch.nn.Module): a module to convert parameters + persistent (bool): If True, buffers are included in state_dict. 
+ """ + named_params = dict() + + for name, param in module.named_parameters(): + named_params[name] = param + + for name, param in named_params.items(): + module_hierarchy = name.split(".") + submodule_name = ".".join(module_hierarchy[:-1]) + submodule = module.get_submodule(submodule_name) + subname = module_hierarchy[-1] + delattr(submodule, subname) + submodule.register_buffer(subname, param, persistent=persistent) + + return + + +T = TypeVar("T", bound=Callable[..., Any]) + + +class timer(Timer): + """Simple CPU timer for timing the execution of code. + + It can be used as either a context manager or a function decorator. The timing result will be logged upon exit. + + Example: + def func_a(): + time.sleep(1) + with timer("func_a"): + func_a() + + @timer("func_b") + def func_b(): + time.sleep(1) + func_b() + """ + + def __init__(self, context: str, debug: bool = False): + super().__init__( + tag=context, + measure_cpu=True, + measure_cuda=False, + unit="s", + debug=debug, + ) + + +class memory_checker(ContextDecorator): # noqa: N801 + """Simple memory checker for a given block of code. + + It can be used as either a context manager or a function decorator. The memory usage will be logged upon exit.
+ Example: + def func_a(): + torch.rand([int(1024**2)]).float().cuda() + with memory_checker("func_a"): + func_a() + >>> 0.004GB memory used + + @memory_checker("func_b") + def func_b(): + random_var = torch.rand([int(1024**2)]).cuda() + func_b() + """ + + def __init__(self, context: str, debug: bool = False): + self.context = context + self.debug = debug + + def __enter__(self) -> None: + torch.cuda.synchronize() + torch.cuda.reset_peak_memory_stats() + self.initial_memory = torch.cuda.max_memory_allocated() + + def __exit__(self, exc_type, exc_value, traceback) -> None: # noqa: ANN001 + torch.cuda.synchronize() + final_memory = torch.cuda.max_memory_allocated() + message = f"Memory used within {self.context}: {(final_memory - self.initial_memory) / 1024**3:.4f} GB" + if self.debug: + log.debug(message) + else: + log.info(message) + + def __call__(self, func: T) -> T: + @functools.wraps(func) + def wrapper(*args, **kwargs): # noqa: ANN202 + torch.cuda.synchronize() + torch.cuda.reset_peak_memory_stats() + initial_memory = torch.cuda.max_memory_allocated() + result = func(*args, **kwargs) + torch.cuda.synchronize() + final_memory = torch.cuda.max_memory_allocated() + message = f"Memory used within {self.context}: {(final_memory - initial_memory) / 1024**3:.4f} GB" + if self.debug: + log.debug(message) + else: + log.info(message) + return result + + return wrapper # type: ignore + + +class TrainingTimer: + """Timer for timing the execution of code, aggregating over multiple training iterations. + + It is used as a context manager to measure the execution time of code and store the timing results + for each function. The context managers can be nested. + + Attributes: + results (dict): A dictionary to store timing results for various code. 
+ + Example: + timer = TrainingTimer() + for i in range(100): + with timer("func_a"): + func_a() + avg_time = sum(timer.results["func_a"]) / len(timer.results["func_a"]) + print(f"func_a() took {avg_time} seconds.") + """ + + def __init__(self) -> None: + self.results = dict() + self.average_results = dict() + self.timers = [] + self.func_stack = [] + self.reset() + + def reset(self) -> None: + self.results = {key: [] for key in self.results} + + def __enter__(self) -> TrainingTimer: + timer = Timer(measure_cpu=True, measure_cuda=False, debug=True, unit="s") + self.timers.append(timer) + timer.start() + return self + + def __exit__(self, exc_type, exc_value, traceback) -> None: # noqa: ANN001 + timer = self.timers.pop() + timer.end() + result = timer.get_cpu_time() + key = self.func_stack.pop() + self.results.setdefault(key, []) + self.results[key].append(result) + + def __call__(self, func_name: str) -> TrainingTimer: + self.func_stack.append(func_name) + return self + + def __getattr__(self, func_name: str) -> TrainingTimer: + return self.__call__(func_name) + + def nested(self, func_name: str) -> TrainingTimer: + return self.__call__(func_name) + + def compute_average_results(self) -> dict[str, float]: + results = dict() + for key, value_list in self.results.items(): + results[key] = sum(value_list) / len(value_list) + return results + + +def timeout_handler(timeout_period: float, signum: int, frame: int) -> None: + # What to do when the process gets stuck. For now, we simply end the process. + error_message = f"Timeout error: more than {timeout_period} seconds passed since the last iteration." + if distributed.is_rank0(): + import wandb + + wandb.alert(title="Timeout error!", text=error_message, level=wandb.AlertLevel.ERROR) + raise TimeoutError(error_message) + + +class Color: + """A convenience class to colorize strings in the console.
+ + Example: + from cosmos3._src.imaginaire.utils.misc import Color + print(f"This is {Color.red('important')}.") + """ + + @staticmethod + def red(x: str) -> str: + return termcolor.colored(str(x), color="red") + + @staticmethod + def green(x: str) -> str: + return termcolor.colored(str(x), color="green") + + @staticmethod + def blue(x: str) -> str: + return termcolor.colored(str(x), color="blue") + + @staticmethod + def cyan(x: str) -> str: + return termcolor.colored(str(x), color="cyan") + + @staticmethod + def yellow(x: str) -> str: + return termcolor.colored(str(x), color="yellow") + + @staticmethod + def magenta(x: str) -> str: + return termcolor.colored(str(x), color="magenta") + + @staticmethod + def grey(x: str) -> str: + return termcolor.colored(str(x), color="grey") + + +class BufferCnt: + """ + Buffer counter which tracks a condition across calls and returns True once the condition has been met "thres" + times in a row, otherwise returns False. + + Example usage: + buf = BufferCnt(thres=3) + for _ in range(5): + if buf(random.random() > 0.5): + print("We got lucky 3 times in a row.") + + Args: + thres (int): The number of consecutive times the expression needs to be True before returning True. + reset_over_thres (bool): Whether to reset the buffer after returning True. + """ + + def __init__(self, thres=10, reset_over_thres=False): + self._cnt = 0 + self.thres = thres + self.reset_over_thres = reset_over_thres + + def __call__(self, expre, thres=None): + if expre is True: + self._cnt += 1 + else: + self._cnt = 0 + + if thres is None: + thres = self.thres + + if self._cnt >= thres: + if self.reset_over_thres: + self.reset() + return True + + return False + + @property + def cnt(self): + return self._cnt + + def reset(self): + self._cnt = 0 + + +def dataclass_instance_to_dict(dataclass: Any) -> dict: + """Convert a dataclass to a dictionary. + + Args: + dataclass (Any): Dataclass object. + + Returns: + dict: Dictionary representation of the dataclass.
+ """ + return {f.name: getattr(dataclass, f.name) for f in fields(dataclass)} + + +def get_local_tensor_if_DTensor(tensor: torch.Tensor | DTensor) -> torch.Tensor: + if isinstance(tensor, DTensor): + local = tensor.to_local() + # As per PyTorch documentation, if the communication is not finished yet, we need to wait for it to finish + # https://pytorch.org/docs/stable/distributed.tensor.html#torch.distributed.tensor.DTensor.to_local + if isinstance(local, AsyncCollectiveTensor): + return local.wait() + else: + return local + return tensor + + +def set_torch_compile_options(recompile_limit: int = 8, use_duck_shape: bool = True): + """ + Set some of the torch compile config options. The default values of arguments are default config values in PyTorch as of 2.10 version. + The value recompile_limit=32 is useful for Wan Tokenizer encoding compilation, as the standard value of 8 can easily overflow. + The value of use_duck_shape=False is useful for Cosmos3 MoT training to reduce recompilations. + + Args: + recompile_limit (int): Controls the maximum number of cache entries with a guard on same ID_MATCH'd object. + use_duck_shape (bool): This flag changes whether we should use the same symbolic variable to represent input sizes that are the same + """ + try: + # PyTorch >= 2.7 + torch._dynamo.config.recompile_limit = recompile_limit + torch.fx.experimental._config.use_duck_shape = use_duck_shape + except AttributeError: + try: + torch._dynamo.config.cache_size_limit = recompile_limit + torch.fx.experimental._config.use_duck_shape = use_duck_shape + except AttributeError as e: + log.warning("torch.compile is not available due to missing config options.") + raise e + + +class NVTXRangeContext: + """ + Context manager which inserts NVTX range around the current context and optionally calls torch.cuda.synchronize + at the start and the end of the context. + + Args: + name (str): Name of the NVTX range. + enabled (bool): Whether the context manager is enabled.
When disabled, it does nothing. Default: True. + synchronize (bool): Whether to call torch.cuda.synchronize() at the start and the end of the context. Default: True. + """ + + def __init__(self, name: str, enabled: bool = True, synchronize: bool = True): + self.name = name + self.enabled = enabled + self.synchronize = synchronize + + def __enter__(self): + if not self.enabled: + return + if self.synchronize: + torch.cuda.synchronize() + torch.cuda.nvtx.range_push(self.name) + + def __exit__(self, exc_type, exc_val, exc_tb): + if not self.enabled: + return + if self.synchronize: + torch.cuda.synchronize() + torch.cuda.nvtx.range_pop() + + +class StragglerDetectorV2: + """StragglerDetectorV2 is a class that allows you to easily integrate "straggler" tool: + https://gitlab-master.nvidia.com/dl/gwe/fault_tolerance_related/straggler/-/tree/cupti?ref_type=heads. + + This tool detects stragglers using low-level CUPTI tool, which can gather kernel execution time with very low overhead. + The execution times are compared across different ranks, as well as to the execution time of the exact same kernels in the past. + This tool can be easily integrated, as it's resilient to any synchronizations, since it captures kernels execution time. + It means that we can wrap the entire forward or backward passes and the stragglers will be identified regardless + of synchronizations happening during the iteration. + + Args: + enabled (bool): Whether the straggler detection is enabled. When disabled, it does nothing. Default: True. + report_freq (int): Generate a report each report_freq iterations that analyzes the GPUs performance. Defaults to 100. + profile_freq (int): Enable the CUPTI profiling each profile_freq iterations. Since the overhead is very low, + the default value is 1. + max_diff (float): Defines the maximum relative difference between the fastest and the slowest rank to determine the slowdown. 
Defaults to 2.0 + raise_error (bool): Whether to raise error when stragglers are detected enough times. Defaults to True.""" + + def __init__( + self, + enabled: bool = True, + report_freq: int = 100, + profile_freq: int = 1, + max_diff: float = 2.0, + raise_error: bool = True, + save_s3: bool = False, + ): + self.enabled = enabled + self.report_freq = report_freq + self.profile_freq = profile_freq + self.name = self.__class__.__name__ + self.slowdown_count = BufferCnt(thres=10, reset_over_thres=True) + self.max_diff = max_diff + self.raise_error = raise_error + self.save_s3 = save_s3 + + def initialize(self): + if self.enabled: + if not straggler: + + raise RuntimeError( + "Please install straggler package before using StragglerDetectorV2. " + "Package can be installed from here: https://gitlab-master.nvidia.com/dl/osiris/straggler" + ) + + straggler.Detector.initialize( + scores_to_compute=["relative_perf_scores", "individual_perf_scores"], + gather_on_rank0=False, # all ranks results will be available on rank 0 + profiling_interval=self.profile_freq, + ) + + def profile_section(self, name: str, section_enabled: bool, profile_cuda: bool = True): + if section_enabled and self.enabled: + return straggler.Detector.detection_section(name, profile_cuda=profile_cuda) + else: + return nullcontext() + + def _aggregate_section_results(self, local_section_summaries): + data = [] + for key in local_section_summaries: + # straggler reports time in ms + data.append(local_section_summaries[key][straggler.Statistic.MAX] / 1000) + return distributed.all_gather_tensor(torch.tensor(data).cuda()) + + def generate_report(self, iteration): + if self.enabled and iteration % self.report_freq == 0: + report = straggler.Detector.generate_report() + gpu_relative_perf_score = report.gpu_relative_perf_scores[distributed.get_rank()] + gpu_relative_perf_score_gather_list = distributed.all_gather_tensor( + torch.tensor([gpu_relative_perf_score]).cuda() + ) + local_section_data = 
self._aggregate_section_results(report.local_section_summaries) + if distributed.get_rank() == 0: + stragglers = report.identify_stragglers(gpu_rel_threshold=1 / self.max_diff) + wandb_info = { + f"{self.name}/relative_gpu_perf_{rank}": perf[0].item() + for rank, perf in enumerate(gpu_relative_perf_score_gather_list) + } + for key_id, key in enumerate(report.local_section_summaries): + wandb_info.update( + {f"{self.name}/{key}_{rank:03d}": v[key_id].item() for rank, v in enumerate(local_section_data)} + ) + + data_tensor = torch.tensor(gpu_relative_perf_score_gather_list) + slowest_rank_id = torch.argmin(data_tensor) + wandb_info.update( + { + f"slowest_rank/{self.name}_rank": slowest_rank_id.item(), + f"slowest_rank/{self.name}_relative_perf": torch.min(data_tensor).item(), + } + ) + + for key_id, key in enumerate(report.local_section_summaries): + data_tensor = torch.tensor([v[key_id] for v in local_section_data]) + wandb_info.update( + { + f"slowest_rank/slowest_{key}_rank": torch.argmax(data_tensor).item(), + f"slowest_rank/slowest_{key}_time": torch.max(data_tensor).item(), + } + ) + + import wandb + + if wandb.run: + wandb.log(wandb_info, step=iteration) + + import cosmos3._src.imaginaire.utils.launch + + if cosmos3._src.imaginaire.utils.launch.S3_READY and (iteration % (5 * self.report_freq) == 0) and self.save_s3: + easy_io.dump( + wandb_info, + f"s3://rundir/{self.__class__.__name__}/iter_{iteration:09d}.yaml", + ) + easy_io.dump( + report, + f"s3://rundir/{self.__class__.__name__}/report_iter_{iteration:09d}.pkl", + ) + + # Which GPUs are slower than other GPUs, based on the execution time of kernels + relative_stragglers = stragglers["straggler_gpus_relative"] + # Which GPUs are slower than itself in the past, based on the past execution time of kernels. 
+ individual_stragglers = stragglers["straggler_gpus_individual"] + is_slowdown = relative_stragglers or individual_stragglers + if is_slowdown: + hostname = torch.ByteTensor(bytearray(os.uname().nodename, "utf-8")).cuda() + whole_hostname = all_gather_tensor(hostname) + slowest_hostname = whole_hostname[slowest_rank_id].cpu().numpy().tobytes().decode("utf-8") + logging.critical(f"Slowest rank hostname: {slowest_hostname}") + + if self.slowdown_count(is_slowdown) and self.raise_error: + raise RuntimeError( + f"Detected GPU {slowest_rank_id} to be too slow compared to other GPUs." + f" The relative performance of {slowest_rank_id} rank was {report.gpu_relative_perf_scores[slowest_rank_id]}. Terminating the training." + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/nsys_wrapper.sh b/cosmos-inference/cosmos3/_src/imaginaire/utils/nsys_wrapper.sh new file mode 100755 index 00000000..18738ac2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/nsys_wrapper.sh @@ -0,0 +1,19 @@ +#!/bin/bash +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +GPU_METRIC=" --gpu-metrics-device=${SLURM_LOCALID} " +NSYSCMD="nsys profile --capture-range cudaProfilerApi --capture-range-end=stop --cuda-memory-usage=true --cudabacktrace=all --python-backtrace=cuda --trace=cuda,nvtx,ucx,mpi,osrt ${GPU_METRIC} --force-overwrite true --output nsys_rank_${SLURM_PROCID}.nsys-rep " +${NSYSCMD} "$@" diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/object_store.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/object_store.py new file mode 100644 index 00000000..45983fb2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/object_store.py @@ -0,0 +1,417 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +import io +import json +import os +import pickle +from pathlib import Path +from typing import TYPE_CHECKING, Any, Callable, Optional +from urllib.parse import urlparse + +import boto3 +import numpy as np +import torch +import yaml +from botocore.config import Config +from PIL import Image + +import cosmos3._src.imaginaire.utils.easy_io.backends.auto_auth as auto +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.easy_io import easy_io + +GLOBAL_S3_CONFIG = Config( + retries={"max_attempts": 20, "mode": "adaptive"}, + connect_timeout=10, + read_timeout=60, + request_checksum_calculation="when_required", + response_checksum_validation="when_required", +) +Image.MAX_IMAGE_PIXELS = None + +if TYPE_CHECKING: + from cosmos3._src.imaginaire.config import ObjectStoreConfig + + +class ObjectStore: + """This is the interface class for object store, used for interacting with PBSS/AWS (S3). + + **Deprecated**. Use `easy_io` directly instead. + + Attributes: + client (botocore.client.S3): Object store client object. + easy_io_backend: easy_io backend. + bucket (str): Object store bucket name. + """ + + def __init__(self, config_object_storage: ObjectStoreConfig): + + # extracts the easy_io backend instead of the boto3 S3 client. + with auto.open_auth(config_object_storage.credentials, "r") as file: + object_storage_config = auto.json_load_auth(file) + self.client = Boto3Wrapper( + "s3", + **object_storage_config, + ) + self.easy_io_backend = easy_io.get_file_backend( + backend_args={ + "backend": "s3", + "s3_credential_path": config_object_storage.credentials, + "path_mapping": None, + } + ) + self.bucket = config_object_storage.bucket + + def _translate_key(self, key: str) -> str: + """Translate an object key to an S3 URL for easy_io. + + Args: + key (str): The key of the object. + + Returns: + str: The object's S3 URL. 
+ """ + return f"s3://{self.bucket}/{key}" + + def load_object( + self, + key: str, + type: str | None = None, + load_func: Callable | None = None, + encoding: str = "UTF-8", + ) -> Any: + """Helper function for loading object from storage. + + Args: + key (str): The key of the object. + type (str): Specified for some common data types. If not provided, `load_func` should be specified. + The predefined types currently supported are: + - "torch": PyTorch model checkpoints, opened with torch.load(). + - "torch.jit": A JIT-compiled TorchScript model, loaded with torch.jit.load(). + - "image": Image objects, opened with PIL.Image.open(). + - "json": JSON files, opened with json.load(). + - "pickle": Picklable objects, opened with pickle.load(). + - "yaml": YAML files, opened with yaml.safe_load(). + - "text": Pure text files. + - "numpy": Numpy arrays, opened with np.load(). + - "bytes": Raw bytes. + load_func (Callable): a custom function for reading the buffer if `type` were not provided. + encoding (str): Text encoding standard (default: "UTF-8"). + + Returns: + object (Any): The downloaded object. + """ + assert type is not None or load_func is not None, "Either type or load_func should be specified." + + buffer = io.BytesIO(self.easy_io_backend.get(filepath=self._translate_key(key=key))) + buffer.seek(0) + + # Read from buffer for common data types. + if type == "torch": + return torch.load(buffer, map_location=lambda storage, loc: storage, weights_only=False) + elif type == "torch.jit": + return torch.jit.load(buffer) + elif type == "image": + image = Image.open(buffer) + image.load() + return image + elif type == "json": + return json.load(buffer) + elif type == "pickle": + return pickle.load(buffer) + elif type == "yaml": + return yaml.safe_load(buffer) + elif type == "text": + return buffer.read().decode(encoding) + elif type == "numpy": + return np.load(buffer, allow_pickle=True) + # Read from buffer as raw bytes. 
+ elif type == "bytes": + return buffer.read() + # Customized load_func should be provided. + else: + return load_func(buffer) + + def save_object( + self, object: Any, key: str, type: str | None = None, save_func: Callable | None = None, encoding: str = "UTF-8" + ) -> None: + """Helper function for saving object to storage. + + Args: + object (Any): The object to upload. + key (str): The key of the object. + type (str): Specified for some common data types. If not provided, `save_func` should be specified. + The predefined types currently supported are: + - "torch": PyTorch model checkpoints, saved with torch.save(). + - "torch.jit": A JIT-compiled TorchScript model, exported with torch.jit.save(). + - "image": Image objects, saved with PIL.Image.save(). + - "json": JSON files, saved with json.dumps(). + - "pickle": Picklable objects, saved with pickle.dump(). + - "yaml": YAML files, saved with yaml.safe_dump(). + - "text": Pure text files. + - "numpy": Numpy arrays, saved with np.save(). + - "bytes": Raw bytes. + save_func (Callable): a custom function for writing the buffer if `type` were not provided. + encoding (str): Text encoding standard (default: "UTF-8"). + """ + assert type is not None or save_func is not None + with io.BytesIO() as buffer: + + # Write to buffer for common data types. + if type == "torch": + torch.save(object, buffer) + elif type == "torch.jit": + torch.jit.save(object, buffer) + elif type == "image": + type = os.path.basename(key).split(".")[-1] + object.save(buffer, format=type) + elif type == "json": + buffer.write(json.dumps(object).encode(encoding)) + elif type == "pickle": + pickle.dump(object, buffer) + elif type == "yaml": + buffer.write(yaml.safe_dump(object).encode(encoding)) + elif type == "text": + buffer.write(object.encode(encoding)) + elif type == "numpy": + np.save(buffer, object) + # Write to buffer as raw bytes. + elif type == "bytes": + buffer.write(bytes(object)) + # Customized save_func should be provided. 
+ else: + save_func(object, buffer) + buffer.seek(0) + self.easy_io_backend.put(obj=buffer, filepath=self._translate_key(key=key)) + + def object_exists(self, key: str) -> bool: + """ + Check whether an object exists in the storage, with retry logic for transient errors. + + Args: + key (str): The key of the object. + + Returns: + bool: True if the object exists, False if not. + """ + return self.easy_io_backend.exists(filepath=self._translate_key(key=key)) + + +class Boto3Wrapper: + """ + This class serves as a wrapper around boto3.client in order to make boto3.client serializable. It's required to use + spawn method of creating DataLoader workers, which is in turn required to avoid segfaults when using Triton, e.g. + for torch.compile or custom kernels. + """ + + def __init__(self, *args, **kwargs): + self._args = args + self._kwargs = kwargs + self.client = None + + def __setstate__(self, state): + self.__dict__ = state + + def __getattr__(self, item): + is_worker = torch.utils.data.get_worker_info() is not None + client = ( + boto3.client(*self._args, **self._kwargs, config=GLOBAL_S3_CONFIG) if self.client is None else self.client + ) + if is_worker: + self.client = client + return getattr(client, item) + + +def sync_s3_dir_to_local( + s3_dir: str, + s3_credential_path: str, + cache_dir: Optional[str] = None, + rank_sync: bool = True, + local_rank_sync: bool = False, +) -> str: + """ + Download an entire directory from S3 to the local cache directory. + + Args: + s3_dir (str): The AWS S3 directory to download. + s3_credential_path (str): The path to the AWS S3 credentials file. + rank_sync (bool, optional): Whether to synchronize download across + ALL distributed workers using `distributed.barrier()`. Defaults to True. + cache_dir (str, optional): The cache folder to sync the S3 directory to. + If None, the environment variable `IMAGINAIRE_CACHE_DIR` (defaulting + to "~/.cache/imaginaire") will be used. 
+ local_rank_sync (bool, optional): Whether to synchronize download across + workers within the same node using a node-level barrier. This is useful + when the cache directory is not shared across nodes. Defaults to False. + Note: rank_sync and local_rank_sync cannot both be True. + + Returns: + local_dir (str): The path to the local directory. + """ + if local_rank_sync and rank_sync: + raise ValueError("rank_sync and local_rank_sync cannot be True at the same time.") + + if not s3_dir.startswith("s3://"): + # If the directory exists locally, return the local path + assert os.path.exists(s3_dir), f"{s3_dir} is not a S3 path or a local path." + return s3_dir + + # Get local rank for node-level synchronization + local_rank = int(os.getenv("LOCAL_RANK", 0)) if local_rank_sync else None + + easy_io_backend = easy_io.get_file_backend( + backend_args={ + "backend": "s3", + "s3_credential_path": s3_credential_path, + "path_mapping": None, + } + ) + + # Parse the S3 URL + parsed_url = urlparse(s3_dir) + obj_prefix = parsed_url.path.lstrip("/") + + # If the local directory is not specified, use the default cache directory + cache_dir = ( + os.environ.get("IMAGINAIRE_CACHE_DIR", os.path.expanduser("~/.cache/imaginaire")) + if cache_dir is None + else cache_dir + ) + cache_dir = os.path.expanduser(cache_dir) + Path(cache_dir).mkdir(parents=True, exist_ok=True) + + for obj_suffix in easy_io_backend.list_dir_or_file(dir_path=s3_dir, list_dir=False, list_file=True): + # Create the full path for the destination file, preserving the directory structure + dest_path = os.path.join(cache_dir, obj_prefix, obj_suffix) + + # Ensure the directory exists + os.makedirs(os.path.dirname(dest_path), exist_ok=True) + + # Check if the file already exists + if os.path.exists(dest_path): + continue + else: + s3_obj = f"{s3_dir.removesuffix('/')}/{obj_suffix}" + log.info(f"Downloading {s3_obj} to {dest_path}") + # Download the file + if rank_sync: + # Only rank 0 downloads when using global rank 
sync + if distributed.get_rank() == 0: + easy_io_backend.copyfile_to_local(src=s3_obj, dst=dest_path, dst_type="file") + elif local_rank_sync: + # Only local rank 0 (first rank on each node) downloads when using local rank sync + if local_rank == 0: + easy_io_backend.copyfile_to_local(src=s3_obj, dst=dest_path, dst_type="file") + else: + # No synchronization - every rank downloads + easy_io_backend.copyfile_to_local(src=s3_obj, dst=dest_path, dst_type="file") + # Synchronize after downloads complete + if rank_sync or local_rank_sync: + distributed.barrier() + + local_dir = os.path.join(cache_dir, obj_prefix) + return local_dir + + +def download_from_s3_with_cache( + s3_path: str, + s3_credential_path: str, + cache_fp: Optional[str] = None, + cache_dir: Optional[str] = None, + rank_sync: bool = True, + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> str: + """Download data from S3 with optional caching. + + This function first attempts to load the data from a local cache file. If + the cache file doesn't exist, it downloads the data from S3 to the cache + location. Caching is performed in a rank-aware manner + using `distributed.barrier()` to ensure only one download occurs across + distributed workers (if `rank_sync` is True). + + Args: + s3_path (str): The S3 path of the data to load. + cache_fp (str, optional): The path to the local cache file. If None, + a filename will be generated based on `s3_path` within `cache_dir`. + cache_dir (str, optional): The directory to store the cache file. If + None, the environment variable `IMAGINAIRE_CACHE_DIR` (defaulting + to "~/.cache/imaginaire") will be used. + rank_sync (bool, optional): Whether to synchronize download across + distributed workers using `distributed.barrier()`. Defaults to True. + backend_args (dict, optional): The backend arguments passed to easy_io to construct the backend. 
+ backend_key (str, optional): The backend key passed to easy_io to register the backend or retrieve the backend if it is already registered. + + Returns: + cache_fp (str): The path to the local cache file. + + Raises: + FileNotFoundError: If the data cannot be found in S3 or the cache. + """ + if not s3_path.startswith("s3://"): + # If the file exists locally, return the local path + assert os.path.exists(s3_path), f"{s3_path} is neither an S3 path nor a local path." + return s3_path + + easy_io_backend = easy_io.get_file_backend( + backend_args={ + "backend": "s3", + "s3_credential_path": s3_credential_path, + "path_mapping": None, + } + ) + cache_dir = ( + os.environ.get("IMAGINAIRE_CACHE_DIR", os.path.expanduser("~/.cache/imaginaire")) + if cache_dir is None + else cache_dir + ) + cache_dir = os.path.expanduser(cache_dir) + if cache_fp is None: + cache_fp = os.path.join(cache_dir, s3_path.replace("s3://", "")) + if not cache_fp.startswith("/"): + cache_fp = os.path.join(cache_dir, cache_fp) + + if rank_sync: + if distributed.get_rank() == 0: + if os.path.exists(cache_fp): + # check the size of cache_fp + if os.path.getsize(cache_fp) < 1: + os.remove(cache_fp) + log.warning(f"Removed empty cache file {cache_fp}.") + + if not os.path.exists(cache_fp): + easy_io_backend.copyfile_to_local( + s3_path, cache_fp, dst_type="file", backend_args=backend_args, backend_key=backend_key + ) + log.info(f"Downloaded {s3_path} to {cache_fp}.") + else: + log.info(f"The cache file {cache_fp} already exists.") + distributed.barrier() + else: + if os.path.exists(cache_fp): + # check the size of cache_fp + if os.path.getsize(cache_fp) < 1: + os.remove(cache_fp) + log.warning(f"Removed empty cache file {cache_fp}.") + if not os.path.exists(cache_fp): + easy_io_backend.copyfile_to_local( + s3_path, cache_fp, dst_type="file", backend_args=backend_args, backend_key=backend_key + ) + log.info(f"Downloaded {s3_path} to {cache_fp}.") + else: + log.info(f"The cache file {cache_fp} already 
exists") + return cache_fp diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/optim_instantiate.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/optim_instantiate.py new file mode 100644 index 00000000..6dde60bc --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/optim_instantiate.py @@ -0,0 +1,87 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import hydra +import torch +from torch import nn + +from cosmos3._src.imaginaire.utils import log + + +def get_regular_param_group(net: nn.Module): + """ + Separate the parameters of the network into two groups: decay and no_decay. + Based on the nanoGPT codebase. 
+ """ + param_dict = {pn: p for pn, p in net.named_parameters()} + param_dict = {pn: p for pn, p in param_dict.items() if p.requires_grad} + + decay_params = [p for n, p in param_dict.items() if p.dim() >= 2] + nodecay_params = [p for n, p in param_dict.items() if p.dim() < 2] + return decay_params, nodecay_params + + +def get_base_optimizer( + model: nn.Module, + lr: float, + weight_decay: float, + optim_type: str = "adamw", + sharding: bool = False, + **kwargs, +) -> torch.optim.Optimizer: + net_decay_param, net_nodecay_param = get_regular_param_group(model) + + num_decay_params = sum(p.numel() for p in net_decay_param) + num_nodecay_params = sum(p.numel() for p in net_nodecay_param) + net_param_total = num_decay_params + num_nodecay_params + log.critical(f"total num parameters : {net_param_total:,}") + + param_group = [ + { + "params": net_decay_param + net_nodecay_param, + "lr": lr, + "weight_decay": weight_decay, + }, + ] + + if optim_type == "adamw": + opt_cls = torch.optim.AdamW + elif optim_type == "fusedadam": + from cosmos3._src.imaginaire.utils.fused_adam import FusedAdam + + opt_cls = FusedAdam + else: + raise ValueError(f"Unknown optimizer type: {optim_type}") + + return opt_cls(param_group, **kwargs) + + +def get_base_scheduler( + optimizer: torch.optim.Optimizer, + model: nn.Module, + scheduler_config: dict, +): + net_scheduler = hydra.utils.instantiate(scheduler_config) + net_scheduler.model = model + + num_param_groups = len(optimizer.param_groups) + + return torch.optim.lr_scheduler.LambdaLR( + optimizer, + lr_lambda=[ + net_scheduler.schedule, + ] + * num_param_groups, + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/parallel_state_helper.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/parallel_state_helper.py new file mode 100644 index 00000000..ee27fb22 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/parallel_state_helper.py @@ -0,0 +1,35 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & 
AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +This module contains various helper functions designed to extend the functionality of parallel states within the MCore library. + +MCore is a third-party library that is infrequently updated and may introduce backward compatibility issues in our codebase, such as changes in function signatures or missing / new functions in new versions. + +To mitigate these issues, this module provides stable functions that ensure the cosmos3._src.imaginaire codebase remains compatible with different versions of MCore. +""" + +try: + from megatron.core import parallel_state +except ImportError: + print("Megatron is not installed, is_tp_cp_pp_rank0 functions will not work.") + + +def is_tp_cp_pp_rank0(): + return ( + parallel_state.get_tensor_model_parallel_rank() == 0 + and parallel_state.get_pipeline_model_parallel_rank() == 0 + and parallel_state.get_context_parallel_rank() == 0 + ) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/primitives.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/primitives.py new file mode 100644 index 00000000..2b85de49 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/primitives.py @@ -0,0 +1,29 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +def is_primitive(value): + return isinstance(value, (int, float, str, bool, type(None))) + + +def convert_to_primitive(value): + if isinstance(value, (list, tuple)): + return [convert_to_primitive(v) for v in value if is_primitive(v) or isinstance(v, (list, dict))] + elif isinstance(value, dict): + return {k: convert_to_primitive(v) for k, v in value.items() if is_primitive(v) or isinstance(v, (list, dict))} + elif is_primitive(value): + return value + else: + return "non-primitive" # Skip non-primitive types diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/profiling.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/profiling.py new file mode 100644 index 00000000..569caa45 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/profiling.py @@ -0,0 +1,188 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +import contextlib +import os +import time + +import torch + +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.easy_io import easy_io + +# (qsh 2024-11-23) credits +# https://github.com/pytorch/torchtitan/blob/main/torchtitan/profiling.py + +# how much memory allocation/free ops to record in memory snapshots +MEMORY_SNAPSHOT_MAX_ENTRIES = 100000 + + +@contextlib.contextmanager +def maybe_enable_profiling(config, *, global_step: int = 0): + # get user defined profiler settings + enable_profiling = config.trainer.profiling.enable_profiling + profile_freq = config.trainer.profiling.profile_freq + + if enable_profiling: + trace_dir = os.path.join(config.job.path_local, "torch_trace") + if distributed.get_rank() == 0: + os.makedirs(trace_dir, exist_ok=True) + + rank = distributed.get_rank() + + def trace_handler(prof): + curr_trace_dir_name = "iteration_" + str(prof.step_num) + curr_trace_dir = os.path.join(trace_dir, curr_trace_dir_name) + if not os.path.exists(curr_trace_dir): + os.makedirs(curr_trace_dir, exist_ok=True) + + log.info(f"Dumping traces at step {prof.step_num}") + begin = time.monotonic() + if rank in config.trainer.profiling.target_ranks: + prof.export_chrome_trace(f"{curr_trace_dir}/rank{rank}_trace.json.gz") + log.info(f"Finished dumping traces in {time.monotonic() - begin:.2f} seconds") + + log.info(f"Profiling active. 
Traces will be saved at {trace_dir}") + + if not os.path.exists(trace_dir): + os.makedirs(trace_dir, exist_ok=True) + + warmup, active = config.trainer.profiling.profile_warmup, 1 + wait = profile_freq - (active + warmup) + assert wait >= 0, "profile_freq must be greater than or equal to warmup + active" + + with torch.profiler.profile( + activities=[ + torch.profiler.ProfilerActivity.CPU, + torch.profiler.ProfilerActivity.CUDA, + ], + schedule=torch.profiler.schedule(wait=wait, warmup=warmup, active=active), + on_trace_ready=trace_handler, + record_shapes=config.trainer.profiling.record_shape, + profile_memory=config.trainer.profiling.profile_memory, + with_stack=config.trainer.profiling.with_stack, + with_modules=config.trainer.profiling.with_modules, + ) as torch_profiler: + torch_profiler.step_num = global_step + yield torch_profiler + else: + torch_profiler = contextlib.nullcontext() + yield None + + +@contextlib.contextmanager +def maybe_enable_memory_snapshot(config, *, global_step: int = 0): + enable_snapshot = config.trainer.profiling.enable_memory_snapshot + if enable_snapshot: + if config.trainer.profiling.save_s3: + snapshot_dir = "s3://rundir" + else: + snapshot_dir = os.path.join(config.job.path_local, "memory_snapshot") + if distributed.get_rank() == 0: + os.makedirs(snapshot_dir, exist_ok=True) + + rank = torch.distributed.get_rank() + + class MemoryProfiler: + def __init__(self, step_num: int, freq: int): + torch.cuda.memory._record_memory_history(max_entries=MEMORY_SNAPSHOT_MAX_ENTRIES) + # when resume training, we start from the last step + self.step_num = step_num + self.freq = freq + + def step(self, exit_ctx: bool = False): + self.step_num += 1 + if not exit_ctx and self.step_num % self.freq != 0: + return + if not exit_ctx: + curr_step = self.step_num + dir_name = f"iteration_{curr_step}" + else: + # dump as iteration_0_exit if OOM at iter 1 + curr_step = self.step_num - 1 + dir_name = f"iteration_{curr_step}_exit" + curr_snapshot_dir = 
os.path.join(snapshot_dir, dir_name) + if not config.trainer.profiling.save_s3 and not os.path.exists(curr_snapshot_dir): + os.makedirs(curr_snapshot_dir, exist_ok=True) + log.info(f"Dumping memory snapshot at step {curr_step}") + begin = time.monotonic() + + if rank in config.trainer.profiling.target_ranks: + easy_io.dump( + torch.cuda.memory._snapshot(), + f"{curr_snapshot_dir}/rank{rank}_memory_snapshot.pickle", + ) + log.info(f"Finished dumping memory snapshot in {time.monotonic() - begin:.2f} seconds") + + log.info(f"Memory profiler active. Snapshot will be saved at {snapshot_dir}") + profiler = MemoryProfiler(global_step, config.trainer.profiling.profile_freq) + try: + yield profiler + except torch.cuda.OutOfMemoryError as e: + profiler.step(exit_ctx=True) + else: + yield None + + +@contextlib.contextmanager +def maybe_enable_nsys_profiling(config, *, global_step: int = 0): + """Context manager for Nsight Systems profiling via cudaProfilerStart/Stop. + + Usage: launch training with + nsys profile --capture-range=cudaProfilerApi --capture-range-end=stop python ... + and set trainer.profiling.enable_nsys=true, profile_freq=. + + Reuses the torch-profile flags (profile_freq, target_ranks, profile_warmup). + The profiler is started `profile_warmup` iterations before the target and + stopped right after it. 
+ """ + enable_nsys = config.trainer.profiling.enable_nsys + if not enable_nsys: + yield None + return + + rank = distributed.get_rank() + target_ranks = config.trainer.profiling.target_ranks + freq = config.trainer.profiling.profile_freq + warmup = config.trainer.profiling.profile_warmup + + active_iter = freq - 1 # profile_freq=5001 profiles iter 5000 + start_iter = max(0, active_iter - warmup) + + class NsysProfiler: + def __init__(self, step_num: int): + self.step_num = step_num + self._profiling = False + + def step(self): + self.step_num += 1 + if rank not in target_ranks: + return + if self.step_num == start_iter and not self._profiling: + log.info(f"[Nsys] Starting CUDA profiler at iter {self.step_num} (active iter: {active_iter})") + torch.cuda.cudart().cudaProfilerStart() + self._profiling = True + if self.step_num == active_iter + 1 and self._profiling: + torch.cuda.cudart().cudaProfilerStop() + self._profiling = False + log.info(f"[Nsys] Stopped CUDA profiler at iter {self.step_num}") + + log.info(f"[Nsys] Profiling enabled. Will capture iter {start_iter}-{active_iter} on ranks {target_ranks}") + profiler = NsysProfiler(global_step) + try: + yield profiler + finally: + if profiler._profiling: + torch.cuda.cudart().cudaProfilerStop() diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/progress_bar.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/progress_bar.py new file mode 100644 index 00000000..3b0d6ec7 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/progress_bar.py @@ -0,0 +1,76 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Progress bar wrapper that gets automatically disabled when in a Timer region, or any other context +where we'd want to disable progress bars, including when TQDM is not present, or when user sets +DISABLE_TQDM=1. We can eventually add a simple ascii progress bar as fallback for missing +dependencies. +""" + +import os + +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.imaginaire.utils.timer import in_timer_region + +try: + import tqdm as _tqdm # noqa: F401 + + HAS_TQDM = True +except ImportError: + HAS_TQDM = False +except Exception as e: + HAS_TQDM = True + + +def _tqdm_wrapper(*args, **kwargs): + if HAS_TQDM: + import tqdm + + return tqdm.tqdm(*args, **kwargs) + + raise ImportError("TQDM is not installed. Please install it and try again.") + + +def progress_bar(fn, desc=None, total=None, force_display: bool = False): + """ + Progress bars are great, but they're not for everybody, and certainly not for everywhere. + They must be guarded against these cases: + * We're benchmarking performance (with Timer) + * If tqdm / other progress bars aren't available, skip instead of failing. + * If multi-process / GPU, only one (usually rank 0) must display it, just like prints. + * If the user just doesn't want progress bars (toggle via environment variables). 
+ + This function considers all of those cases. + """ + + disable_tqdm = os.environ.get("DISABLE_TQDM", "0") == "1" + is_in_timer_region = in_timer_region() + is_rank0 = True + + # Wide-scope try/except on determining rank, in case distributed context is uninitialized in a + # single-process program. If exception occurs, it's better to just assume single-process. + try: + is_rank0 = distributed.get_rank() == 0 + except Exception as e: + pass + + if not force_display and (not is_rank0 or is_in_timer_region or disable_tqdm): + return fn + + return _tqdm_wrapper(fn, desc=desc, total=total) + + +__all__ = ["progress_bar"] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/registry.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/registry.py new file mode 100644 index 00000000..6f27c1a0 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/registry.py @@ -0,0 +1,160 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Utilities for managing registries. 
+Credit: https://gitlab.com/qsh.zh/jam/-/blob/master/jammy/utils/registry.py with MIT License +""" + +import collections + +__all__ = [ + "Registry", + "DefaultRegistry", + "RegistryGroup", + "CallbackRegistry", +] + + +class Registry: + __FALLBACK_KEY__ = "__fallback__" + + _registry = None + + def __init__(self): + self._init_registry() + + def _init_registry(self): + self._registry = {} + + @property + def fallback(self): + return self._registry.get(self.__FALLBACK_KEY__, None) + + def set_fallback(self, value): + self._registry[self.__FALLBACK_KEY__] = value + return self + + def register(self, entry, value): + self._registry[entry] = value + return self + + def unregister(self, entry): + return self._registry.pop(entry, None) + + def has(self, entry): + return entry in self._registry + + def lookup(self, entry, fallback=True, default=None): + if fallback: + fallback_value = self._registry.get(self.__FALLBACK_KEY__, default) + else: + fallback_value = default + return self._registry.get(entry, fallback_value) + + def keys(self): + return list(self._registry.keys()) + + def items(self): + return list(self._registry.items()) + + +class DefaultRegistry(Registry): + __base_class__ = dict + + def _init_registry(self): + base_class = type(self).__base_class__ + self._registry = collections.defaultdict(base_class) + + def lookup(self, entry, fallback=False, default=None): + assert fallback is False and default is None + return self._registry[entry] + + def __getitem__(self, item): + return self.lookup(item) + + +class RegistryGroup: + __base_class__ = Registry + + def __init__(self): + self._init_registry_group() + + def _init_registry_group(self): + base_class = type(self).__base_class__ + self._registries = collections.defaultdict(base_class) + + def __getitem__(self, item): + return self._registries[item] + + def register(self, registry_name, entry, value, **kwargs): + return self._registries[registry_name].register(entry, value, **kwargs) + + def lookup(self, 
registry_name, entry, fallback=True, default=None): + return self._registries[registry_name].lookup(entry, fallback=fallback, default=default) + + +class CallbackRegistry(Registry): + """ + A callback manager utility. + + If there exists a super callback, it will block all callbacks. + A super callback will receive the called name as its first argument. + + Then the dispatcher will try to call the callback by name. + If no such name exists, a fallback callback will be called. + + The fallback callback will also receive the called name as its first argument. + + Examples: + + >>> registry = CallbackRegistry() + >>> callback_func = print + >>> registry.register('name', callback_func) # register a callback. + >>> registry.dispatch('name', 'arg1', 'arg2', kwarg1='kwarg1') # dispatch. + """ + + def __init__(self): + super().__init__() + self._super_callback = None + + @property + def super_callback(self): + return self._super_callback + + def set_super_callback(self, callback): + self._super_callback = callback + return self + + @property + def fallback_callback(self): + return self.fallback + + def set_fallback_callback(self, callback): + return self.set_fallback(callback) + + def dispatch(self, name, *args, **kwargs): + if self._super_callback is not None: + return self._super_callback(self, name, *args, **kwargs) + return self.dispatch_direct(name, *args, **kwargs) + + def dispatch_direct(self, name, *args, **kwargs): + """Dispatch by name, ignoring the super callback.""" + callback = self.lookup(name, fallback=False) + if callback is None: + if self.fallback_callback is None: + raise ValueError('Unknown callback entry: "{}".'.format(name)) + return self.fallback_callback(name, *args, **kwargs) + return callback(*args, **kwargs) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/replace_bg_color.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/replace_bg_color.py new file mode 100644 index 00000000..b48a810f --- /dev/null +++ 
b/cosmos-inference/cosmos3/_src/imaginaire/utils/replace_bg_color.py @@ -0,0 +1,124 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import io +import re + +import numpy as np +from PIL import Image + +_IMG_EXTENSIONS = "jpg jpeg png ppm pgm pbm pnm".split() + + +def lin2srgb(lin): + """Convert physically linear values to sRGB ones. The transformation is + uniform in RGB, so *lin* can be of any shape. + + *lin* values should range between 0 and 1, inclusively. + + """ + gamma = 1.055 * lin ** (1.0 / 2.4) - 0.055 + scale = 12.92 * lin + return np.where(lin > 0.0031308, gamma, scale) + + +def srgb2lin(srgb): + """Convert sRGB values to physically linear ones. The transformation is + uniform in RGB, so *srgb* can be of any shape. + + *srgb* values should range between 0 and 1, inclusively. + + """ + gamma = ((srgb + 0.055) / 1.055) ** 2.4 + scale = srgb / 12.92 + return np.where(srgb > 0.04045, gamma, scale) + + +def replace_bg_color_u8(fg: np.ndarray, fg_mask: np.ndarray, bg_color_old: list, bg_color_new: list): + r"""Given an image with background, as well as the foreground mask and old background color, + replace the old background color with the new one. + Assumes all inputs are uint8. + Args: + fg [..., 3] np.ndarray + fg_mask[..., 1] np.ndarray: 0 -> full background; 255 -> full foreground. 
+ bg_color_old [3] RGB 0-255: Old background. + bg_color_new [3] RGB 0-255: New background + """ + assert fg.dtype == np.uint8 and fg_mask.dtype == np.uint8 + fg_mask = fg_mask.astype(np.float32) / 255.0 + fg = fg.astype(np.float32) / 255.0 + bg_color_old = np.array(bg_color_old, dtype=np.float32) / 255.0 + bg_color_new = np.array(bg_color_new, dtype=np.float32) / 255.0 + bg_mask = 1.0 - fg_mask + result = srgb2lin(fg) + bg_mask * (srgb2lin(bg_color_new) - srgb2lin(bg_color_old)) + result = lin2srgb(result) + result = np.clip((result * 255.0).round(), 0, 255).astype(np.uint8) + return result + + +def replace_bg_color_pil(fg_pil: Image.Image, fg_mask_pil: Image.Image, bg_color_old: list, bg_color_new: list): + fg = np.array(fg_pil) + fg_mask = np.array(fg_mask_pil) + if fg_mask.ndim == 2: + fg_mask = fg_mask[..., None] + else: + fg_mask = fg_mask[..., :1] + result = replace_bg_color_u8(fg, fg_mask, bg_color_old, bg_color_new) + return Image.fromarray(result) + + +def pil_loader_with_mask(key, data, background_color_new=None, background_color_old=[255, 255, 255], mask=None): + r""" + Function to load an image. + If the image is corrupt, it returns a black image. + Args: + key: Image key. + data: Image data stream. + """ + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _IMG_EXTENSIONS: + return None + + with io.BytesIO(data) as stream: + img = Image.open(stream) + img = img.convert("RGB") + if background_color_new is not None: + assert mask is not None + with io.BytesIO(mask) as stream: + mask = Image.open(stream) + mask.load() + mask = mask.convert("L") + img = replace_bg_color_pil(img, mask, background_color_old, background_color_new) + return img + + +def pil_loader(key, data, type="RGB"): + r""" + Function to load an image. + If the image is corrupt, it returns a black image. + Args: + key: Image key. + data: Image data stream. 
+ """ + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _IMG_EXTENSIONS: + return None + + with io.BytesIO(data) as stream: + img = Image.open(stream) + img.load() + img = img.convert(type) + + return img diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/s3_utils.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/s3_utils.py new file mode 100644 index 00000000..e46eff16 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/s3_utils.py @@ -0,0 +1,139 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +from typing import Any, Optional + +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +def download_from_s3_with_cache( + s3_path: str, + cache_fp: Optional[str] = None, + cache_dir: Optional[str] = None, + rank_sync: bool = True, + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, +) -> str: + """download data from S3 with optional caching. + + This function first attempts to load the data from a local cache file. If + the cache file doesn't exist, it downloads the data from S3 to the cache + location. Caching is performed in a rank-aware manner + using `distributed.barrier()` to ensure only one download occurs across + distributed workers (if `rank_sync` is True). 
+ + Args: + s3_path (str): The S3 path of the data to load. + cache_fp (str, optional): The path to the local cache file. If None, + a filename will be generated based on `s3_path` within `cache_dir`. + cache_dir (str, optional): The directory to store the cache file. If + None, `TORCH_HOME` is used if set, otherwise `IMAGINAIRE_CACHE_DIR` + (defaulting to "~/.cache/imaginaire") is used. + rank_sync (bool, optional): Whether to synchronize download across + distributed workers using `distributed.barrier()`. Defaults to True. + backend_args (dict, optional): The backend arguments passed to easy_io to construct the backend. + backend_key (str, optional): The backend key passed to easy_io to register the backend or retrieve the backend if it is already registered. + + Returns: + cache_fp (str): The path to the local cache file. + + Raises: + FileNotFoundError: If the data cannot be found in S3 or the cache. + """ + cache_dir = os.environ.get("TORCH_HOME") if cache_dir is None else cache_dir + cache_dir = ( + os.environ.get("IMAGINAIRE_CACHE_DIR", os.path.expanduser("~/.cache/imaginaire")) + if cache_dir is None + else cache_dir + ) + cache_dir = os.path.expanduser(cache_dir) + if cache_fp is None: + cache_fp = os.path.join(cache_dir, s3_path.replace("s3://", "")) + if not cache_fp.startswith("/"): + cache_fp = os.path.join(cache_dir, cache_fp) + + if distributed.get_rank() == 0: + if os.path.exists(cache_fp): + # check the size of cache_fp + if os.path.getsize(cache_fp) < 1: + os.remove(cache_fp) + log.warning(f"Removed empty cache file {cache_fp}.") + + if rank_sync: + if not os.path.exists(cache_fp): + log.critical(f"Local cache {cache_fp} Not exist! 
Downloading {s3_path} to {cache_fp}.") + log.info(f"backend_args: {backend_args}") + log.info(f"backend_key: {backend_key}") + + easy_io.copyfile_to_local( + s3_path, cache_fp, dst_type="file", backend_args=backend_args, backend_key=backend_key + ) + log.info(f"Downloaded {s3_path} to {cache_fp}.") + else: + log.info(f"Local cache {cache_fp} already exists! {s3_path} -> {cache_fp}.") + + distributed.barrier() + else: + if not os.path.exists(cache_fp): + easy_io.copyfile_to_local( + s3_path, cache_fp, dst_type="file", backend_args=backend_args, backend_key=backend_key + ) + + log.info(f"Downloaded {s3_path} to {cache_fp}.") + return cache_fp + + +def load_from_s3_with_cache( + s3_path: str, + cache_fp: Optional[str] = None, + cache_dir: Optional[str] = None, + rank_sync: bool = True, + backend_args: Optional[dict] = None, + backend_key: Optional[str] = None, + easy_io_kwargs: Optional[dict] = None, +) -> Any: + """Loads data from S3 with optional caching. + + This function first attempts to load the data from a local cache file. If + the cache file doesn't exist, it downloads the data from S3 to the cache + location and then loads it. Caching is performed in a rank-aware manner + using `distributed.barrier()` to ensure only one download occurs across + distributed workers (if `rank_sync` is True). + + Args: + s3_path (str): The S3 path of the data to load. + cache_fp (str, optional): The path to the local cache file. If None, + a filename will be generated based on `s3_path` within `cache_dir`. + cache_dir (str, optional): The directory to store the cache file. If + None, `TORCH_HOME` is used if set, otherwise `IMAGINAIRE_CACHE_DIR` + (defaulting to "~/.cache/imaginaire") is used. + rank_sync (bool, optional): Whether to synchronize download across + distributed workers using `distributed.barrier()`. Defaults to True. + backend_args (dict, optional): The backend arguments passed to easy_io to construct the backend. 
+ backend_key (str, optional): The backend key passed to easy_io to registry the backend or retrieve the backend if it is already registered. + + Returns: + Any: The loaded data from the S3 path or cache file. + + Raises: + FileNotFoundError: If the data cannot be found in S3 or the cache. + """ + cache_fp = download_from_s3_with_cache(s3_path, cache_fp, cache_dir, rank_sync, backend_args, backend_key) + + if easy_io_kwargs is None: + easy_io_kwargs = {} + return easy_io.load(cache_fp, **easy_io_kwargs) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/scheduler.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/scheduler.py new file mode 100644 index 00000000..e45eb96a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/scheduler.py @@ -0,0 +1,64 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import math +from typing import List + +import torch + + +class WarmupLambdaLR(torch.optim.lr_scheduler.LambdaLR): + def __init__(self, optimizer, warmup, last_epoch=-1, verbose=False): + # Define the lambda function based on the warmup period + self.warmup = warmup + + def lr_lambda(epoch): + # Increase lr linearly for the first 'warmup' epochs + if epoch < warmup: + return float(epoch + 1) / warmup + # After 'warmup' epochs, keep lr constant + return 1.0 + + # Initialize the parent class with the generated lr_lambda + super(WarmupLambdaLR, self).__init__(optimizer, lr_lambda, last_epoch) + + +# cosine lr decay scheduler with warmup from https://github.com/karpathy/nanoGPT/blob/master/train.py#L228 +class WarmupCosineLR(torch.optim.lr_scheduler.LRScheduler): + def __init__( + self, + optimizer: torch.optim.Optimizer, + warmup_iters: int, + lr_decay_iters: int, + min_lr: float, + last_epoch: int = -1, + ): + self.warmup_iters = warmup_iters + self.lr_decay_iters = lr_decay_iters + self.min_lr = min_lr + super().__init__(optimizer, last_epoch) + + def get_lr(self) -> List[float]: + # 1) linear warmup for warmup_iters steps + if self.last_epoch < self.warmup_iters: + return [base_lr * self.last_epoch / self.warmup_iters for base_lr in self.base_lrs] + # 2) if it > lr_decay_iters, return min learning rate + if self.last_epoch > self.lr_decay_iters: + return [self.min_lr for _ in self.base_lrs] + # 3) in between, use cosine decay down to min learning rate + decay_ratio = (self.last_epoch - self.warmup_iters) / (self.lr_decay_iters - self.warmup_iters) + assert 0 <= decay_ratio <= 1 + coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio)) # coeff ranges 0..1 + return [self.min_lr + coeff * (base_lr - self.min_lr) for base_lr in self.base_lrs] diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/submit_job_helper.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/submit_job_helper.py new file mode 100644 index 00000000..5619aa66 --- /dev/null +++ 
b/cosmos-inference/cosmos3/_src/imaginaire/utils/submit_job_helper.py @@ -0,0 +1,37 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import os.path as osp + +import git +from loguru import logger as logging + + +def is_git(path): + try: + _ = git.Repo(path, search_parent_directories=True).git_dir + return True + except git.exc.InvalidGitRepositoryError: + return False + + +def git_rootdir(path=""): + if is_git(os.getcwd()): + git_repo = git.Repo(os.getcwd(), search_parent_directories=True) + root = git_repo.git.rev_parse("--show-toplevel") + return osp.join(root, path) + logging.info("not a git repo") + return osp.join(os.getcwd(), path) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/timer.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/timer.py new file mode 100644 index 00000000..f351b1d2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/timer.py @@ -0,0 +1,297 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Timer: helps measure CPU and CUDA times easily and reliably. +""" + +import time +from contextlib import ContextDecorator +from contextvars import ContextVar +from functools import wraps +from typing import Callable + +import torch + +from cosmos3._src.imaginaire.utils import log + +_timer_active = ContextVar("_timer_active", default=False) + + +def in_timer_region() -> bool: + return _timer_active.get() + + +def _autoformat_time_us(time_us: float) -> str: + """ + Automatically format a time given in microseconds. + """ + if time_us >= 1e6: + time_s = time_us * 1e-6 + return f"{time_s:.2f} s" + + if time_us >= 1e3: + time_ms = time_us * 1e-3 + return f"{time_ms:.2f} ms" + + return f"{time_us:.2f} us" + + +def format_time_str(time_us: float, unit: str | None = None) -> str: + """ + Format a time given in microseconds as a string, either automatically or + based on the desired unit. + """ + if unit is None: + return _autoformat_time_us(time_us) + + if unit == "us": + return f"{time_us:.2f} us" + + if unit == "ms": + return f"{time_us * 1e-3:.2f} ms" + + if unit == "s": + return f"{time_us * 1e-6:.2f} s" + + raise NotImplementedError(f"Time unit {unit} is not supported.") + + +def format_time(time_us: float, unit: str) -> float: + """ + Convert a time given in microseconds to the desired unit. + """ + + if unit == "us": + return time_us + + if unit == "ms": + return time_us * 1e-3 + + if unit == "s": + return time_us * 1e-6 + + raise NotImplementedError(f"Time unit {unit} is not supported.") + + +class Timer(ContextDecorator): + """ + Reliable CPU and CUDA Timer. 
+ + Args: + tag (str | None): Optional tag used in logs/prints. + + measure_cpu (bool): Whether to measure CPU time (using `time`). Default: `True`. + + measure_cuda (bool): Whether to measure CUDA time (using CUDA events). Default: `True`. + + unit (str | None): Optional time unit. Must be either "s" (seconds), "ms" (milliseconds), + "us" (microseconds), or None (format automatically based on value). + + debug (bool): Whether to log results in debug mode instead of info. Default is False. + + Examples: + ```python + with Timer(measure_cpu=True, measure_cuda=True, unit="ms"): + model(x) + ``` + + ```python + @Timer(measure_cpu=True, measure_cuda=True, unit="ms") + def func(x): + return model(x) + ``` + """ + + def __init__( + self, + tag: str | None = None, + measure_cpu: bool = True, + measure_cuda: bool = True, + unit: str | None = None, + debug: bool = False, + ): + self.measure_cpu = measure_cpu + self.measure_cuda = measure_cuda + + self.measured = False + self.cpu_time_us = 0 + self.cuda_time_us = 0 + + self.busy = False + self.cpu_time_start = None + self.cuda_start_event = None + self.cuda_end_event = None + self.cuda_stream = None + + self.tag = "unknown" if tag is None else tag + self.unit = unit + if self.unit is not None and self.unit not in ["s", "ms", "us"]: + raise NotImplementedError(f"Time unit {self.unit} is not supported.") + + self.debug = debug + + def _log(self, msg: str): + if self.debug: + log.debug(msg) + else: + log.info(msg) + + def __enter__(self): + self.token = _timer_active.set(True) + self.start() + + def __exit__(self, exc_type, exc_value, traceback): + self.end() + self.report() + _timer_active.reset(self.token) + + def __call__(self, func: Callable) -> Callable: + @wraps(func) + def wrapper(*args, **kwargs): # noqa: ANN202 + self.start() + result = func(*args, **kwargs) + self.end() + self.report() + return result + + return wrapper # type: ignore + + def report(self): + """ + Reports measurements. 
+ """ + if self.measure_cpu and self.measure_cuda: + self._log(f"Time spent on {self.tag}: CPU: {self.get_cpu_time_str()}, CUDA: {self.get_cuda_time_str()}") + elif self.measure_cpu: + self._log(f"Time spent on {self.tag}: {self.get_cpu_time_str()}") + elif self.measure_cuda: + self._log(f"CUDA time spent on {self.tag}: {self.get_cuda_time_str()}") + else: + raise NotImplementedError() + + def get_cpu_time(self) -> float: + """ + Returns CPU time measurement. + """ + if not self.measure_cpu: + raise RuntimeError(f"CPU timer is disabled ({self.measure_cpu=}).") + + if not self.measured: + raise RuntimeError("No measurements were made yet!") + + if self.unit is None: + raise RuntimeError("No unit was specified. Please use get_cpu_time_str() instead.") + + assert self.unit is not None + return format_time(self.cpu_time_us, unit=self.unit) + + def get_cuda_time(self) -> float: + """ + Returns CUDA time measurement. + """ + if not self.measure_cuda: + raise RuntimeError(f"CUDA timer is disabled ({self.measure_cuda=}).") + + if not self.measured: + raise RuntimeError("No measurements were made yet!") + + if self.unit is None: + raise RuntimeError("No unit was specified. Please use get_cuda_time_str() instead.") + + assert self.unit is not None + return format_time(self.cuda_time_us, unit=self.unit) + + def get_cpu_time_str(self) -> str: + """ + Returns CPU time measurement in string format. + """ + if not self.measure_cpu: + raise RuntimeError(f"CPU timer is disabled ({self.measure_cpu=}).") + + if not self.measured: + raise RuntimeError("No measurements were made yet!") + + return format_time_str(self.cpu_time_us, unit=self.unit) + + def get_cuda_time_str(self) -> str: + """ + Returns CUDA time measurement in string format. 
+ """ + if not self.measure_cuda: + raise RuntimeError(f"CUDA timer is disabled ({self.measure_cuda=}).") + + if not self.measured: + raise RuntimeError("No measurements were made yet!") + + return format_time_str(self.cuda_time_us, unit=self.unit) + + def reset(self): + """ + Resets recorded measurements + """ + self.measured = False + self.cpu_time_us = 0 + self.cuda_time_us = 0 + + def start(self, cuda_device: torch.device | None = None, cuda_stream: torch.cuda.Stream | None = None): + """ + Start time measurements. + + Args: + cuda_device (torch.device | None): CUDA device. Will use default CUDA device if not indicated. + + cuda_stream (torch.cuda.Stream | None): CUDA stream to use for CUDA time measurement. + Will use default stream for current CUDA device if not indicated. + """ + if self.busy: + raise RuntimeError("Already called Timer.start() once!") + + self.busy = True + + if self.measure_cuda: + self.cuda_stream = cuda_stream if cuda_stream is not None else torch.cuda.current_stream(cuda_device) + self.cuda_stream.synchronize() + + if self.measure_cpu: + self.cpu_time_start = time.time() + + if self.measure_cuda: + self.cuda_start_event = torch.cuda.Event(enable_timing=True) + self.cuda_end_event = torch.cuda.Event(enable_timing=True) + self.cuda_stream.record_event(self.cuda_start_event) + + def end(self): + """ + Ends time measurements. + + NOTE: must be done on the same CUDA device and stream as start(). 
+ """ + if not self.busy: + raise RuntimeError("Timer.start() must be called exactly once before end()!") + + if self.measure_cuda: + self.cuda_stream.record_event(self.cuda_end_event) + self.cuda_end_event.synchronize() + + if self.measure_cpu: + self.cpu_time_end = time.time() + self.cpu_time_us = (self.cpu_time_end - self.cpu_time_start) * 1e6 + + if self.measure_cuda: + self.cuda_time_us = self.cuda_start_event.elapsed_time(self.cuda_end_event) * 1e3 + + self.busy = False + self.measured = True diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/tone_curve.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/tone_curve.py new file mode 100644 index 00000000..2f394bca --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/tone_curve.py @@ -0,0 +1,197 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +from typing import Literal + +import numpy as np +from PIL import Image + + +def lin2srgb(lin): + """Convert physically linear values to sRGB ones. The transformation is + uniform in RGB, so *lin* can be of any shape. + + *lin* values should range between 0 and 1, inclusively. + + """ + gamma = 1.055 * lin ** (1.0 / 2.4) - 0.055 + scale = 12.92 * lin + return np.where(lin > 0.0031308, gamma, scale) + + +def srgb2lin(srgb): + """Convert sRGB values to physically linear ones. 
The transformation is + uniform in RGB, so *srgb* can be of any shape. + + *srgb* values should range between 0 and 1, inclusively. + + """ + gamma = ((srgb + 0.055) / 1.055) ** 2.4 + scale = srgb / 12.92 + return np.where(srgb > 0.04045, gamma, scale) + + +def commerce_tonemap(color): + startCompression = 0.8 - 0.04 + desaturation = 0.15 + + x = np.min(color, axis=-1, keepdims=True) + offset = np.where(x < 0.08, x - 6.25 * x * x, 0.04) + color -= offset + peak = np.max(color, axis=-1, keepdims=True) + uncompressed = color + + d = 1.0 - startCompression + newPeak = 1.0 - d * d / (peak + d - startCompression) + with np.errstate(divide="ignore", invalid="ignore"): # Avoid error print + color = color * (newPeak / peak) + + g = 1.0 - 1.0 / (desaturation * (peak - newPeak) + 1.0) + + compressed = color * (1 - g) + newPeak * g + + return np.where(peak < startCompression, uncompressed, compressed) + + +# https://github.com/RenderKit/oidn/blob/master/training/color.py + + +# Computes the luminance of an RGB color +def luminance(r, g, b): + return 0.212671 * r + 0.715160 * g + 0.072169 * b + + +# Computes an autoexposure value for a NumPy image +def autoexposure(image, mask, key=0.18): + maxBinSize = 16 # downsampling amount + eps = 1e-8 + + image = image * mask + # Compute the luminance of each pixel + r = image[..., 0] + g = image[..., 1] + b = image[..., 2] + L = luminance(r, g, b) + + # Center crop if the image size is not whole multiple of maxBinSize + crop_H = L.shape[0] // maxBinSize * maxBinSize + pad_top = round((L.shape[0] - crop_H) / 2) + crop_W = L.shape[1] // maxBinSize * maxBinSize + pad_left = round((L.shape[1] - crop_W) / 2) + L = L[pad_top : pad_top + crop_H, pad_left : pad_left + crop_W] + mask = mask[pad_top : pad_top + crop_H, pad_left : pad_left + crop_W] + + # Downsample the image to minimize sensitivity to noise + H = L.shape[0] # original height + W = L.shape[1] # original width + L = L.reshape(H // maxBinSize, maxBinSize, W // maxBinSize, 
maxBinSize) + L = np.mean(L, axis=(1, 3)) + mask = mask.reshape(H // maxBinSize, maxBinSize, W // maxBinSize, maxBinSize) + mask = np.mean(mask, axis=(1, 3)) + with np.errstate(divide="ignore", invalid="ignore"): # Avoid error print + L /= mask + L = L[mask > eps] + + # Keep only values greater than epsilon + L = L[L > eps] + if L.size == 0: + return 1.0 + + # Compute the exposure value + return float(key / np.exp2(np.log2(L).mean())) + + +# Default values changed to identity transformation, aka do nothing. +def apply_tone_curve( + imgs: list[Image.Image], + input_mapping: Literal["log", "straight"] = "log", + output_mapping: Literal["commerce", "straight", "log"] = "commerce", + exposure_bias: float = 1.5, + auto: bool = True, + ae_pregain: float = 1.0, + ae_key: float = 0.18, + ae_strength_below: float = 1.0, + ae_strength_above: float = 1.0, +) -> tuple[list[Image.Image], float]: + r"""Adjust the exposure of a list of images together. + For cam_v1 data, use input_mapping="log" + For cam_v2 data, use input_mapping="straight" + Some of the previous models are trained with output_mapping="commerce". This is a very forgiving curve. 
+ But to match the style of PixelSquid, use output_mapping="straight" + See https://docs.google.com/document/d/1z08rWvWzqd_tNPlh7_D4aIkdaLAerSSagXK4pQlQxCk/edit for detail + + Args: + imgs: list of PIL images + + Returns: + ret: list of PIL images with exposure adjusted + """ + num_imgs = len(imgs) + img = np.concatenate([np.asarray(x) for x in imgs], axis=0).astype(np.float32) / 255.0 + mask = img[..., 3:4].astype(np.float32) # H,W,1 + img = img[..., :3] # Remove alpha + + img = srgb2lin(img) + + if input_mapping == "log": + img = np.exp(img) - 1 + elif input_mapping == "straight": + pass + else: + raise NotImplementedError(f"Unknown input_mapping: {input_mapping}") + + if auto: + img *= ae_pregain + exposure = autoexposure(img, mask, key=ae_key) + log_exposure = math.log2(exposure) + if log_exposure <= 0: + log_exposure *= ae_strength_below + else: + log_exposure *= ae_strength_above + exposure = 2.0**log_exposure + else: + exposure = 1.0 + exposure *= exposure_bias + + img = img * exposure + + if output_mapping == "commerce": + img = commerce_tonemap(img) + elif output_mapping == "log": + img = np.log(img + 1) + elif output_mapping == "straight": + pass + else: + raise NotImplementedError(f"Unknown output_mapping: {output_mapping}") + + img = lin2srgb(img) + img = np.concatenate([img, mask], axis=-1) + img = np.clip((img * 255.0).round(), 0, 255).astype(np.uint8) + return [Image.fromarray(x) for x in np.split(img, num_imgs, axis=0)], exposure + + +def apply_exposure(img: Image, exposure: float) -> Image: + r"""Apply exposure adjustment to a PIL image. 
+ Args: + img: a PIL image, RGB or RGBA + exposure: exposure value + Returns: + img: PIL image with exposure adjusted + """ + img = np.asarray(img).astype(np.float32) / 255.0 + img[..., :3] = lin2srgb(srgb2lin(img[..., :3]) * exposure) + img = np.clip((img * 255.0).round(), 0, 255).astype(np.uint8) + return Image.fromarray(img) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/training.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/training.py new file mode 100644 index 00000000..2644a057 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/training.py @@ -0,0 +1,171 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import torch + +from cosmos3._src.imaginaire.functional.batch_ops import batch_mul +from cosmos3._src.imaginaire.utils import log + + +def random_dropout(embeddings, drop_rate): + r""" + Function to perform random dropout for embeddings. + When we drop embeddings, we zero them out. + Args: + embeddings (tensor): Input embeddings + drop_rate (float): Rate of dropping the embedding. + """ + num_samples = embeddings.shape[0] + # Create a shape (num_samples, 1, 1, 1, 1, ...) depending on embeddings dim. + # This is done to ensure we can broadcast the zero_flag to the embeddings. 
+ # embeddings.ndim is 3 for images, and 4 for videos, and the corresponding + # shapes are (num_samples, 1, 1) and (num_samples, 1, 1, 1) respectively. + tensor_shape = (num_samples,) + tuple([1] * (embeddings.ndim - 1)) + zero_flag = torch.ones(tensor_shape).to(embeddings.dtype) * (1 - drop_rate) + zero_flag = torch.bernoulli(zero_flag).to(embeddings.device) + embeddings = embeddings * zero_flag + return embeddings + + +def random_embed_replace(src_embed, tgt_embed, drop_rate, position=0): + r""" + Function to perform random embedding replacement. + With probability given by drop rate, we replace src_embed by tgt_embed + Args: + src_embed (tensor): Src embeddings + tgt_embed (tensor): Tgt embeddings + drop_rate (float): Rate of replacing the embedding. + position (int): Starting position to replace the sequence + """ + for i in range(src_embed.shape[0]): + coin_flip = torch.rand(1).item() + if coin_flip < drop_rate: + src_embed[i][position:] = tgt_embed + + return src_embed + + +def to255_round_uint8_append(vis_images, total_vis_images): + r""" + Map pixel values of vis_images to [0, 255] and quantize them to 256 bins + """ + if vis_images is not None: + vis_images = ((vis_images + 1) / 2).clamp_(0, 1).mul_(255).round_().type(torch.uint8) + total_vis_images.append(vis_images) + return vis_images + + +def no_round_append(vis_images, total_vis_images): + r""" + Append the images as is without type casting + """ + if vis_images is not None: + total_vis_images.append(vis_images) + return vis_images + + +def sample_sigma_and_xt( + sde, + target_data: torch.Tensor, + data_batch: dict = None, + use_same_noise_multiview: bool = False, + use_low_noise_first_view: bool = False, +): + """Sample perturbation noise levels and generate noisy observations.""" + # Sample perturbation noise levels + tensor_kwargs = {"device": "cuda", "dtype": target_data.dtype} + t = sde.sample_t(batch_size=target_data.size()[0]).to(**tensor_kwargs) # check precision and memory_format later + if
data_batch is not None and data_batch.get("num_view", None) is not None: + if use_same_noise_multiview and not use_low_noise_first_view: + t_shape = t.shape + t = t.view(-1, int(data_batch["num_view"].view(-1)[0].item())) + t[:, 1:] = t[:, 0:1] + t = t.view(t_shape) + elif use_low_noise_first_view and not use_same_noise_multiview: + t_shape = t.shape + t = t.view(-1, int(data_batch["num_view"].view(-1)[0].item())) + t[:, 0] = 0.02 + t = t.view(t_shape) + elif use_low_noise_first_view and use_same_noise_multiview: + t_shape = t.shape + t = t.view(-1, int(data_batch["num_view"].view(-1)[0].item())) + t[:, 0] = 0.02 + t[:, 2:] = t[:, 1:2] + t = t.view(t_shape) + # Generate an N(0,1) noise map. + epsilon = torch.randn_like(target_data, **tensor_kwargs) + # Get the mean and standard deviation of the marginal probability distribution. + mean, std = sde.marginal_prob(target_data, t) + # Generate noisy observations + xt = mean + batch_mul(std, epsilon) # corrupted data + + data_batch = {} + data_batch["t"] = t # between model.sde.eps to 1 + data_batch["epsilon"] = epsilon # Standard normal noise map + data_batch["mean"] = mean # mean of the marginal distribution + data_batch["std"] = std # std deviation of the marginal distribution + data_batch["xt"] = xt # corrupted data + data_batch["target"] = target_data + return data_batch + + +def form_loss_mask( + data_batch: dict, + x_shape: tuple, + dtype: torch.dtype, + device: torch.device, + loss_masking_cfg: dict = {"human_body_mask": 2, "human_face_mask": 4, "human_hand_mask": 4, "padding_mask": 0}, +) -> torch.Tensor: + r""" + Function to form a combined mask given several loss masks. + If there are overlapping regions between multiple masks, we assign the max value to the + overlapping region. For the unmasked regions, we assign a value of 1. + However, if there is a mask specifying zero value, we zero it out. + Zeroing is crucial for padded loss. + Copied from i3, kaal.py form_loss_mask function.
+ zero_mask: mask out some region by setting them as zero + For example, + mask1: [0, 1, 1, 1, 0, 0], weight: 2 + mask2: [1, 0, 1, 0, 0, 0], weight: 4 + mask3: [0, 1, 0, 0, 0, 0], weight: 0 + + Final loss mask: [4, 0, 4, 2, 1, 1] + """ + loss_mask = torch.ones(x_shape, dtype=dtype, device=device) + zero_mask = torch.ones(x_shape, dtype=dtype, device=device) + + for key in loss_masking_cfg: + if key not in data_batch: + if loss_masking_cfg[key] > 0: + log.warning(f"You set {key} to have larger loss, but there is no such mask data") + continue + # Repeat mask along channel's dimension. ndim=4 for images. + repeat_dims = (1, 3) + tuple([1] * (data_batch[key].ndim - 2)) + mask_key = torch.tile(data_batch[key], dims=repeat_dims) + weight_key = loss_masking_cfg[key] + + assert weight_key >= 0, "Current support only for weight >= 0" + + if key == "zero_mask": + zero_mask = zero_mask * mask_key + elif weight_key == 0: + zero_mask = zero_mask * (1 - mask_key) + else: + no_mask_region = (mask_key == 0).float() + loss_mask = mask_key * weight_key + no_mask_region * loss_mask + # loss_mask = torch.max(loss_mask_new, loss_mask) + + loss_mask_final = loss_mask * zero_mask + return loss_mask_final diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/validator.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/validator.py new file mode 100644 index 00000000..b61d1220 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/validator.py @@ -0,0 +1,512 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import base64 +import binascii +import itertools +import json +import os +from abc import ABC, abstractmethod +from io import BytesIO +from typing import Any + +# Sentinel value to indicate that no default was explicitly set by the user +# we want to mimic usage of function parameters: if no default is provided, the parameter is mandatory +_UNSET = object() + + +# from https://docs.python.org/3/howto/descriptor.html#validator-class + + +# validators can be customized to very specific needs, e.g. see HumanAttributes below +class Validator(ABC): + def __init__(self, default=_UNSET, hidden=False): + self.default = default + self.hidden = hidden + + # set name is called when the validator is created as class variable + # name is the name of the variable in the owner class, so here we create the name for the backing variable + def __set_name__(self, owner, name): + self.private_name = "_" + name + + def __get__(self, obj, objtype=None): + value = getattr(obj, self.private_name, self.default) + if value is _UNSET: + # If we reach here, it means a mandatory parameter was accessed without being set + attr_name = getattr(self, "private_name", "unknown").lstrip("_") + raise ValueError( + f"Parameter '{attr_name}' is mandatory but has not been set. " + f"No default value was provided and no value was assigned." 
+ ) + return value + + def __set__(self, obj, value): + value = self.validate(value) + setattr(obj, self.private_name, value) + + @abstractmethod + def validate(self, value): + pass + + def json(self): + pass + + +class Bool(Validator): + def __init__(self, default=_UNSET, hidden=False, tooltip=None): + super().__init__(default, hidden) + self.default = default + self.hidden = hidden + self.tooltip = tooltip + + def validate(self, value): + if isinstance(value, int): + value = value != 0 + elif isinstance(value, str): + value = value.lower() + if value in ["true", "1"]: + value = True + elif value in ["false", "0"]: + value = False + else: + raise ValueError(f"Expected {value!r} to be one of ['True', 'False', '1', '0']") + elif not isinstance(value, bool): + raise TypeError(f"Expected {value!r} to be an bool") + + return value + + def get_range_iterator(self): + return [True, False] + + def __repr__(self) -> str: + return f"Bool({self.private_name=} {self.default=} {self.hidden=})" + + def json(self): + return { + "type": bool.__name__, + "default": self.default, + "tooltip": self.tooltip, + } + + +class Int(Validator): + def __init__(self, default=_UNSET, min=None, max=None, step=1, hidden=False, tooltip=None): + self.min = min + self.max = max + self.default = default + self.step = step + self.hidden = hidden + self.tooltip = tooltip + + def validate(self, value): + if isinstance(value, str): + value = int(value) + elif not isinstance(value, int): + raise TypeError(f"Expected {value!r} to be an int") + + if self.min is not None and value < self.min: + raise ValueError(f"Expected {value!r} to be at least {self.min!r}") + if self.max is not None and value > self.max: + raise ValueError(f"Expected {value!r} to be no more than {self.max!r}") + return value + + def get_range_iterator(self): + if self.default is _UNSET: + default_val = 0 + else: + default_val = int(self.default) if isinstance(self.default, (int, float, str)) else 0 + iter_min = self.min if self.min is 
not None else default_val + iter_max = self.max if self.max is not None else (default_val + 100) + return itertools.takewhile(lambda x: x <= iter_max, itertools.count(iter_min, self.step)) + + def __repr__(self) -> str: + return f"Int({self.private_name=} {self.default=}, {self.min=}, {self.max=} {self.hidden=})" + + def json(self): + return { + "type": int.__name__, + "default": self.default, + "min": self.min, + "max": self.max, + "step": self.step, + "tooltip": self.tooltip, + } + + +class Float(Validator): + def __init__(self, default=_UNSET, min=None, max=None, step=0.5, hidden=False, tooltip=None): + self.min = min + self.max = max + self.default = default + self.step = step + self.hidden = hidden + self.tooltip = tooltip + + def validate(self, value): + if isinstance(value, str) or isinstance(value, int): + value = float(value) + elif not isinstance(value, float): + raise TypeError(f"Expected {value!r} to be float") + + if self.min is not None and value < self.min: + raise ValueError(f"Expected {value!r} to be at least {self.min!r}") + if self.max is not None and value > self.max: + raise ValueError(f"Expected {value!r} to be no more than {self.max!r}") + return value + + def get_range_iterator(self): + if self.default is _UNSET: + default_val = 0.0 + else: + default_val = float(self.default) if isinstance(self.default, (int, float, str)) else 0.0 + iter_min = self.min if self.min is not None else default_val + iter_max = self.max if self.max is not None else (default_val + 100.0) + return itertools.takewhile(lambda x: x <= iter_max, itertools.count(iter_min, self.step)) + + def __repr__(self) -> str: + return f"Float({self.private_name=} {self.default=}, {self.min=}, {self.max=} {self.hidden=})" + + def json(self): + return { + "type": float.__name__, + "default": self.default, + "min": self.min, + "max": self.max, + "step": self.step, + "tooltip": self.tooltip, + } + + +class String(Validator): + def __init__(self, default=_UNSET, min=None, max=None, 
predicate=None, hidden=False, tooltip=None): + self.min = min + self.max = max + self.predicate = predicate + self.default = default + self.hidden = hidden + self.tooltip = tooltip + + def validate(self, value): + if value is None: + return value # Allow None as a valid value to be compatible with existing code + # this breaks strict typing, so do this only for strings + if not isinstance(value, str): + raise TypeError(f"Expected {value!r} to be an str or None") + if self.min is not None and len(value) < self.min: + raise ValueError(f"Expected {value!r} to be no smaller than {self.min!r}") + if self.max is not None and len(value) > self.max: + raise ValueError(f"Expected {value!r} to be no bigger than {self.max!r}") + if self.predicate is not None and not self.predicate(value): + raise ValueError(f"Expected {self.predicate} to be true for {value!r}") + return value + + def get_range_iterator(self): + return iter([self.default]) + + def __repr__(self) -> str: + return f"String({self.private_name=} {self.default=}, {self.min=}, {self.max=} {self.hidden=})" + + def json(self): + return { + "type": str.__name__, + "default": self.default, + "tooltip": self.tooltip, + } + + +class Path(Validator): + def __init__(self, default=_UNSET, hidden=False, tooltip=None): + self.default = default + self.hidden = hidden + self.tooltip = tooltip + + def validate(self, value): + if value is None: + return value + if not isinstance(value, str): + raise TypeError(f"{self.private_name} validator: Expected {value!r} to be an str") + if not os.path.exists(value): + raise ValueError(f"{self.private_name} validator: Expected {value!r} to be a valid path") + + return value + + def get_range_iterator(self): + return iter([self.default]) + + def __repr__(self) -> str: + return f"String({self.private_name=} {self.default=}, {self.hidden=})" + + +class InputImage(Validator): + def __init__( + self, default=_UNSET, hidden=False, tooltip=None, supported_formats=["jpeg", "jpg", "png", "bmp", 
"gif"] + ): + self.default = default + self.hidden = hidden + self.tooltip = tooltip + self.supported_formats = supported_formats + + def validate(self, value): + ext = os.path.splitext(value)[1].lower() + + if ext not in self.supported_formats: + raise ValueError(f"Unsupported image format: {ext}") + + if not isinstance(value, str): + raise TypeError(f"Expected {value!r} to be an str") + if not os.path.exists(value): + raise ValueError(f"Expected {value!r} to be a valid path") + return value + + def get_range_iterator(self): + return iter([self.default]) + + def __repr__(self) -> str: + return f"String({self.private_name=} {self.default=} {self.hidden=})" + + def json(self): + return { + "type": InputImage.__name__, + "default": self.default, + "values": self.supported_formats, + "tooltip": self.tooltip, + } + + +class JsonDict(Validator): + """ + JSON stringified version of a python dict. + Example: '{"ema_customization_iter.pt": "ema_customization_iter.pt"}' + """ + + def __init__(self, default=_UNSET, hidden=False): + self.default = default + self.hidden = hidden + + def validate(self, value): + if not value: + return {} + try: + dict = json.loads(value) + return dict + except json.JSONDecodeError as e: + raise ValueError(f"Expected {value!r} to be json stringified dict. Error: {str(e)}") + + def __repr__(self) -> str: + return f"Dict({self.default=} {self.hidden=})" + + +class Dict(Validator): + """ + Python dict. + Example: {'key': 'value'} + + This allows a single level of parameter nesting, but not a full nested dict. + For now we validate the individual keys here and store the dict as is. + Alternatively, we could have a validator that gets/sets another ValidatorParams class. 
+ """ + + def __init__(self, default=_UNSET, hidden=False): + self.default = default + self.hidden = hidden + + def validate(self, value): + if not isinstance(value, dict): + raise TypeError(f"Expected {value!r} to be an dict") + return value + + def __repr__(self) -> str: + value = getattr(self, self.private_name, self.default) + + return f"Dict({self.private_name=} {self.default=} {self.hidden=} value={json.dumps(value, indent=4)})" + + +class OneOf(Validator): + def __init__(self, default=_UNSET, options=None, type_cast=None, hidden=False, tooltip=None): + self.options = set(options) if options is not None else set() + self.default = default + self.type_cast = type_cast # Cast the value to this type before checking if it's in options + self.tooltip = tooltip + self.hidden = hidden + + def validate(self, value): + if self.type_cast: + try: + value = self.type_cast(value) + except ValueError: + raise ValueError(f"Expected {value!r} to be castable to {self.type_cast!r}") + + if value not in self.options: + raise ValueError(f"Expected {value!r} to be one of {self.options!r}") + + return value + + def get_range_iterator(self): + return self.options + + def __repr__(self) -> str: + return f"OneOf({self.private_name=} {self.options=} {self.hidden=})" + + def json(self): + return { + "type": OneOf.__name__, + "default": self.default, + "values": list(self.options), + "tooltip": self.tooltip, + } + + +class MultipleOf(Validator): + def __init__(self, default=_UNSET, multiple_of: int = 1, type_cast=None, hidden=False, tooltip=None): + if type(multiple_of) is not int: + raise ValueError(f"Expected {multiple_of!r} to be an int") + self.multiple_of = multiple_of + self.default = default + self.type_cast = type_cast + + # if a parameter is hidden then probe() can't expose the param + # and the param can't be set anymore + self.hidden = hidden + self.tooltip = tooltip + + def validate(self, value): + if self.type_cast: + try: + value = self.type_cast(value) + except 
ValueError: + raise ValueError(f"Expected {value!r} to be castable to {self.type_cast!r}") + + if value % self.multiple_of != 0: + raise ValueError(f"Expected {value!r} to be a multiple of {self.multiple_of!r}") + + return value + + def get_range_iterator(self): + return itertools.count(0, self.multiple_of) + + def __repr__(self) -> str: + return f"MultipleOf({self.private_name=} {self.multiple_of=} {self.hidden=})" + + def json(self): + return { + "type": MultipleOf.__name__, + "default": self.default, + "multiple_of": self.multiple_of, + "tooltip": self.tooltip, + } + + +class HumanAttributes(Validator): + def __init__(self, default=_UNSET, hidden=False, tooltip=None): + self.default = default + self.hidden = hidden + self.tooltip = tooltip + + # hard code the options for now + # we extend this to init parameter as needed + valid_attributes = { + "emotion": ["angry", "contemptful", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"], + "race": ["asian", "indian", "black", "white", "middle eastern", "latino hispanic"], + "gender": ["male", "female"], + "age group": [ + "young", + "teen", + "adult early twenties", + "adult late twenties", + "adult early thirties", + "adult late thirties", + "adult middle aged", + "older adult", + ], + } + + def get_range_iterator(self): + # create a list of all possible combinations + l1 = self.valid_attributes["emotion"] + l2 = self.valid_attributes["race"] + l3 = self.valid_attributes["gender"] + l4 = self.valid_attributes["age group"] + all_combinations = list(itertools.product(l1, l2, l3, l4)) + return iter(all_combinations) + + def validate(self, value): + human_attributes = value.lower() + if human_attributes not in ["none", "random"]: + # In this case, we need for custom attribute string + + attr_string = human_attributes + for attr_key in ["emotion", "race", "gender", "age group"]: + attr_detected = False + for attr_label in self.valid_attributes[attr_key]: + if attr_string.startswith(attr_label): + attr_string 
= attr_string[len(attr_label) + 1 :] # noqa: E203 + attr_detected = True + break + + if attr_detected is False: + raise ValueError(f"Expected {value!r} to be one of {self.valid_attributes!r}") + + return value + + def __repr__(self) -> str: + return f"HumanAttributes({self.private_name=} {self.hidden=})" + + def json(self): + return { + "type": HumanAttributes.__name__, + "default": self.default, + "values": self.valid_attributes, + "tooltip": self.tooltip, + } + + +class BytesIOType(Validator): + """ + Validator class for BytesIO. Valid inputs are either: + - bytes + - objects of class BytesIO + - str which can be successfully decoded into BytesIO + """ + + def __init__(self, default=_UNSET, hidden=False, tooltip=None): + self.default = default + self.hidden = hidden + self.tooltip = tooltip + + def validate(self, value: Any) -> BytesIO: + if isinstance(value, str): + try: + # Decode the Base64 string + decoded_bytes = base64.b64decode(value) + # Create a BytesIO stream from the decoded bytes + return BytesIO(decoded_bytes) + except (binascii.Error, ValueError) as e: + raise ValueError(f"Invalid Base64 encoded string: {e}") + elif isinstance(value, bytes): + return BytesIO(value) + elif isinstance(value, BytesIO): + return value + else: + raise TypeError(f"Expected {value!r} to be a Base64 encoded string, bytes, or BytesIO") + + def __repr__(self) -> str: + return f"BytesIOValidator({self.default=}, {self.hidden=})" + + def json(self): + return { + "type": BytesIO.__name__, + "default": self.default, + "tooltip": self.tooltip, + } diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/validator_params.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/validator_params.py new file mode 100644 index 00000000..abdfbd58 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/validator_params.py @@ -0,0 +1,184 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import json +import pprint +import shlex + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.validator import _UNSET, Validator + +""" +Base class for all model parameter classes. + +The primary purpose is to fully validate any input parameter including type, range, etc. +By using custom validators, we can additionally validate complex parameters such as images, text, etc. +Additionally, the class can parse command line arguments into a dictionary of parameters +and create a model parameter class from a dictionary of parameters. + +If the default of a validator is _UNSET, the parameter is mandatory and must be provided by the user. +Hence validators without explicit defaults require user input.
+""" + + +class ValidatorParams: + """ + factory method to create a model params class from a given api and a dictionary of args + in comparison to createFromCmd, the server can first parse and modify some args, + finally use this factory method to create the model params + """ + + @classmethod + def create(cls, kwargs): + instance = cls() + log.info(f"creating model params class={cls}") + instance.from_kwargs(kwargs) + + val_dict = cls.get_val_dict() + + for key, validator in val_dict.items(): + # Check if validator has no user-provided default (_UNSET) and no value was assigned + if validator.default is _UNSET: + value = getattr(instance, key, _UNSET) + if value is _UNSET: + raise ValueError( + f"mandatory parameter {key} is missing - no default provided and no value assigned by user" + ) + + return instance + + """ + factory method to create a model params class from a command string + """ + + @classmethod + def createFromCmd(cls, cmd: str) -> object: + kwargs = cls.parse(cmd) + return cls.create(kwargs) + + def from_kwargs(self, kwargs): + # most attributes of this class are validators, + # but derived classes could add non-validators + # or some validators might be hidden + # therefore only allow exposed params to be set + for key, value in kwargs.items(): + if key in self.get_exposed_params(): + setattr(self, key, value) + else: + raise ValueError(f"unknown parameter {key} in command line") + + def to_kwargs(self) -> dict: + """for a given config return a dictionary of all the parameters and their values""" + param_names = self.get_exposed_params() + return {key: getattr(self, key) for key in param_names} + + @classmethod + def validate_kwargs(cls, kwargs) -> dict: + """validate a dictionary of args and return the validated dictionary""" + instance = cls.create(kwargs) + return instance.to_kwargs() + + @staticmethod + def parse(cmd: str) -> dict: + """parse a command string into an api command (e.g.
text2image) and a dictionary of args""" + args = {} + pairs = shlex.split(cmd) + + for arg in pairs: + key, value = arg.split("=", 1) # Split only on the first '=' + value = value.strip().strip("'") + key = key.strip("--") + args[key.strip()] = value + + log.debug(f"parsed cmd-line: {args}") + return args + + @classmethod + def probe(cls) -> list[str]: + params = cls.get_exposed_params() + log.info(f"exposed params for {cls}: {params}") + return params + + """ + The extended version of probe queries extended information from each validator. + This will include default parameters, min, max, step, etc. + """ + + @classmethod + def probe_ex(cls) -> dict: + validator_dict = cls.get_val_dict() + + parameter_info = {key: value.json() for key, value in validator_dict.items() if not value.hidden} + log.info(f"exposed params for {cls}: {json.dumps(parameter_info, indent=4)}") + return parameter_info + + # a model parameter class can also have non exposed parameters: + # we can hide parameters as needed from public API (compare to former exposed_params list in yaml configs in imaginaire3) + # class can also have non-validator attributes + @classmethod + def get_exposed_params(cls) -> list[str]: + # log.debug(f"getting exposed params of {cls.__name__}") + + # the exposed params are repeatedly used for parsing so we cache them + # note that we are caching the exposed params per class in the class hierarchy! + # each class has its own set of exposed params.
+ # instances of the class will have the same set of exposed params + if "_exposed_params" not in cls.__dict__: + # log.debug(f"creating cache exposed params of {cls.__name__}") + validator_dict = cls.get_val_dict() + + # if a parameter is hidden then probe() can't expose the param + # and the param can't be set anymore + cls._exposed_params = [key for key, value in validator_dict.items() if not value.hidden] + return cls._exposed_params + + def exposed_params_dict(self): + keys = self.get_exposed_params() + out_dict = {key: getattr(self, key) for key in keys} + return out_dict + + """ + returns a dictionary of all validators in the class hierarchy, e.g. for a string validator: + + prompt_validator = String() + + so prompt_validator is the instance of the String validator. the dictionary will be: + + {'prompt_validator': prompt_validator} + """ + + @classmethod + def get_val_dict(cls) -> dict[str, Validator]: + # log.debug(f"getting val dict of {cls.__name__}") + val_dict = {} + if cls is not ValidatorParams: + val_dict.update(cls.__bases__[0].get_val_dict()) + + val_dict.update({key: value for key, value in cls.__dict__.items() if isinstance(value, Validator)}) + + return val_dict + + @classmethod + def debug_print(cls): + pp = pprint.PrettyPrinter(indent=4) + print(f"*********** validator dict for {cls.__name__} ***********") + val_dict = cls.get_val_dict() + pp.pprint(val_dict) + + def __str__(self): + return ", ".join(f"{key}={value}" for key, value in self.__dict__.items()) + + def __repr__(self): + return ", ".join(f"{key}={value}" for key, value in self.__dict__.items()) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/utils/wandb_util.py b/cosmos-inference/cosmos3/_src/imaginaire/utils/wandb_util.py new file mode 100644 index 00000000..fd681a8a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/utils/wandb_util.py @@ -0,0 +1,172 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import os +from typing import TYPE_CHECKING + +import attrs +import wandb +import wandb.util +from omegaconf import DictConfig + +from cosmos3._src.imaginaire.lazy_config.lazy import LazyConfig +from cosmos3._src.imaginaire.utils import distributed, log, object_store +from cosmos3._src.imaginaire.utils.easy_io import easy_io + +if TYPE_CHECKING: + from cosmos3._src.imaginaire.config import CheckpointConfig, Config, JobConfig + from cosmos3._src.imaginaire.model import ImaginaireModel + +JOB_INFO = {} + + +def set_wandb_job_info(job_info: dict) -> None: + """Set the job info for the W&B logger. + + Args: + job_info (dict): The job info. + """ + JOB_INFO.update(job_info) + + +@distributed.rank0_only +def init_wandb(config: Config, model: ImaginaireModel) -> None: + """Initialize Weights & Biases (wandb) logger. + + Args: + config (Config): The config object for the Imaginaire codebase. + model (ImaginaireModel): The PyTorch model. + """ + if isinstance(config.job, DictConfig): + from cosmos3._src.imaginaire.config import JobConfig + + config_job = JobConfig(**config.job) + else: + config_job = config.job + config_checkpoint = config.checkpoint + # Try to fetch the W&B job ID for resuming training. + wandb_id = _read_wandb_id(config_job, config_checkpoint) + if wandb_id is None: + # Generate a new W&B job ID. 
+ wandb_id = wandb.util.generate_id() + _write_wandb_id(config_job, config_checkpoint, wandb_id=wandb_id) + log.info(f"Generating new wandb ID: {wandb_id}") + else: + log.info(f"Resuming with existing wandb ID: {wandb_id}") + # refactor config so that wandb better understands it + local_safe_yaml_fp = LazyConfig.save_yaml(config, os.path.join(config_job.path_local, "config.yaml")) + if os.path.exists(local_safe_yaml_fp): + config_resolved = easy_io.load(local_safe_yaml_fp) + else: + config_resolved = attrs.asdict(config) + # Initialize the wandb library. If we attempt to resume an existing run + # but the current user does not have permission to update that run + # (common when re-using an ID created by someone else), fall back to + # creating a fresh run ID and re-initializing. + try: + wandb.init( + force=True, + id=wandb_id, + project=config_job.project, + group=config_job.group, + name=config_job.name, + config=config_resolved, + dir=config_job.path_local, + resume="allow", + mode=config_job.wandb_mode, + ) + except Exception as e: + # Detect common permission / upload errors from wandb and recover + msg = str(e) + if ( + "member role does not have Update Run permission" in msg + or "Error uploading run" in msg + or "returned error 403" in msg + ): + log.warning("W&B run exists but current user lacks update permission; starting a new run instead.") + # Generate and persist a new wandb id, then create a fresh run. + wandb_id = wandb.util.generate_id() + _write_wandb_id(config_job, config_checkpoint, wandb_id=wandb_id) + wandb.init( + force=True, + id=wandb_id, + project=config_job.project, + group=config_job.group, + name=config_job.name, + config=config_resolved, + dir=config_job.path_local, + mode=config_job.wandb_mode, + ) + elif "returned error 401" in msg or "user is not logged in" in msg: + log.warning("W&B authentication failed (401); falling back to offline mode. 
Error: %s", msg) + wandb.init( + force=True, + id=wandb_id, + project=config_job.project, + group=config_job.group, + name=config_job.name, + config=config_resolved, + dir=config_job.path_local, + mode="offline", + ) + else: + raise + + if wandb.run: + wandb.run.config.update({f"JOB_INFO/{k}": v for k, v in JOB_INFO.items()}, allow_val_change=True) + + +def _read_wandb_id(config_job: JobConfig, config_checkpoint: CheckpointConfig) -> str | None: + """Read the W&B job ID. If it doesn't exist, return None. + + Args: + config_job (JobConfig): The job config object, which provides the paths where the W&B ID is stored. + config_checkpoint (CheckpointConfig): The config object for the checkpointer. + + Returns: + wandb_id (str | None): W&B job ID. + """ + wandb_id = None + if config_checkpoint.load_from_object_store.enabled: + object_store_loader = object_store.ObjectStore(config_checkpoint.load_from_object_store) + wandb_id_path = f"{config_job.path}/wandb_id.txt" + if object_store_loader.object_exists(key=wandb_id_path): + wandb_id = object_store_loader.load_object(key=wandb_id_path, type="text").strip() + else: + wandb_id_path = f"{config_job.path_local}/wandb_id.txt" + if os.path.isfile(wandb_id_path): + wandb_id = open(wandb_id_path).read().strip() + return wandb_id + + +def _write_wandb_id(config_job: JobConfig, config_checkpoint: CheckpointConfig, wandb_id: str) -> None: + """Write the generated W&B job ID. + + Args: + config_job (JobConfig): The job config object, which provides the paths where the W&B ID is stored. + config_checkpoint (CheckpointConfig): The config object for the checkpointer. + wandb_id (str): The W&B job ID. 
+ """ + content = f"{wandb_id}\n" + if config_checkpoint.save_to_object_store.enabled: + object_store_saver = object_store.ObjectStore(config_checkpoint.save_to_object_store) + wandb_id_path = f"{config_job.path}/wandb_id.txt" + object_store_saver.save_object(content, key=wandb_id_path, type="text") + else: + wandb_id_path = f"{config_job.path_local}/wandb_id.txt" + with open(wandb_id_path, "w") as file: + file.write(content) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/visualize/__init__.py b/cosmos-inference/cosmos3/_src/imaginaire/visualize/__init__.py new file mode 100644 index 00000000..f5f9b5f8 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/visualize/__init__.py @@ -0,0 +1,16 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from cosmos3._src.imaginaire.visualize.img import save_batch_img, show_batch_img # noqa: F401 diff --git a/cosmos-inference/cosmos3/_src/imaginaire/visualize/img.py b/cosmos-inference/cosmos3/_src/imaginaire/visualize/img.py new file mode 100644 index 00000000..b5cf296c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/visualize/img.py @@ -0,0 +1,149 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""image visualization utilities. + +based on https://gitlab.com/qsh.zh/jam/-/blob/master/jamviz/img.py MIT License +""" + +import os +from typing import Union + +import numpy as np +import torch +from einops import rearrange +from PIL import Image +from torchvision.utils import make_grid + +__all__ = [ + "show_batch_img", + "save_batch_img", +] + + +def _reshape_viz_batch_img(img_data: torch.Tensor | np.ndarray, shape: int | str = 7) -> tuple: + """ + Reshapes a batch of images for visualization, organizing them into a grid format. + + Args: + img_data (torch.Tensor | np.ndarray): The image data to be reshaped, can be either a PyTorch tensor or a NumPy array. + shape (int | str, optional): Defines the layout of the grid. If an integer is provided, it specifies both the number of rows and columns. If a string is provided in the format 'nrowxncol', it is parsed into separate row and column counts; a string without 'x' is parsed as a single integer used for both. Defaults to 7. + + Returns: + tuple: A tuple containing: + img (np.ndarray | torch.Tensor): The image data arranged in grid format. + nrow (int): Number of rows in the grid. + ncol (int): Number of columns in the grid. + + Raises: + RuntimeError: If the shape parameter is neither an int nor a string. 
+ + Example: + >>> tensor_images = torch.rand(64, 3, 28, 28) # Example tensor of 64 images + >>> img_grid, rows, cols = _reshape_viz_batch_img(tensor_images, '8x8') + >>> img_grid.shape + torch.Size([242, 242, 3]) + """ + if isinstance(shape, int): + nrow, ncol = shape, shape + elif isinstance(shape, str): + if "x" not in shape: + nrow, ncol = int(shape), int(shape) + else: + shape = shape.split("x") + nrow, ncol = int(shape[0]), int(shape[1]) + else: + raise RuntimeError(f"shape {shape} not supported") + if isinstance(img_data, torch.Tensor): + assert img_data.shape[1] in [1, 3] + grid_img = make_grid(img_data[: nrow * ncol].detach().cpu(), ncol) + img = grid_img.permute(1, 2, 0) + elif isinstance(img_data, np.ndarray): + if img_data.shape[1] in [1, 3]: + img = rearrange(img_data[: nrow * ncol], "(b t) c h w -> (b h) (t w) c", b=nrow) + else: + img = rearrange(img_data[: nrow * ncol], "(b t) h w c -> (b h) (t w) c", b=nrow) + return img, nrow, ncol + + +def show_batch_img( + img_data: torch.Tensor | np.ndarray, + shape: int | str = 7, + grid: int = 3, + is_n1p1: bool = False, + auto_n1p1: bool = True, +) -> None: + """ + Displays a batch of images using matplotlib after arranging them into a specified grid layout. + + Args: + img_data (torch.Tensor | np.ndarray): The image data to be displayed. + shape (int | str, optional): The grid shape to organize the images. Defaults to 7. + grid (int, optional): Scaling factor for each image in the grid, affecting the overall size of the displayed figure. Defaults to 3. + is_n1p1 (bool, optional): Whether to normalize the images from [-1, 1] to [0, 1] for visualization. Defaults to False. + auto_n1p1 (bool, optional): If true, automatically adjusts images from [-1, 1] to [0, 1] based on minimum pixel value detection. Defaults to True. + + Returns: + None: This function does not return anything but displays the image grid using matplotlib. 
+ + Example: + >>> tensor_images = torch.rand(64, 3, 28, 28) # Example tensor of 64 images + >>> show_batch_img(tensor_images, '8x8') + """ + import matplotlib.pyplot as plt + + if is_n1p1: + img_data = (img_data + 1) / 2 + else: + if auto_n1p1: + if isinstance(img_data, torch.Tensor): + if img_data.min().item() < -0.5: + img_data = (img_data + 1) / 2 + elif isinstance(img_data, np.ndarray): + if np.min(img_data) < -0.5: + img_data = (img_data + 1) / 2 + img, nrow, ncol = _reshape_viz_batch_img(img_data, shape) + plt.figure(figsize=(ncol * grid, nrow * grid)) + plt.axis("off") + plt.imshow(img) + + +def save_batch_img(fpath: str, img_data: Union[torch.Tensor, np.ndarray], shape: Union[int, str] = 7) -> None: + """ + Saves a batch of images to a file after arranging them into a grid format. Handles both PyTorch tensors and NumPy arrays as input. + + Args: + fpath (str): File path where the image will be saved. + img_data (Union[torch.Tensor, np.ndarray]): The image data to be saved. Can be a PyTorch tensor or a NumPy array. + shape (Union[int, str], optional): The grid shape to organize the images. Can be an integer specifying equal number of rows and columns, or a string specifying 'nrowxncol'. Defaults to 7. + + Returns: + None: This function does not return anything but saves the image to the specified file path. + + Raises: + RuntimeError: If the input shape is neither an integer nor a string. 
+ + Example: + >>> tensor_images = torch.rand(64, 3, 28, 28) # Example tensor of 64 images + >>> save_batch_img('path/to/save/image.png', tensor_images, '8x8') + # This saves the image grid to 'path/to/save/image.png' + """ + img, _, _ = _reshape_viz_batch_img(img_data, shape) + if isinstance(img, np.ndarray): + img = torch.from_numpy(img) + ndarr = img.mul(255).add_(0.5).clamp_(0, 255).to("cpu", torch.uint8).numpy() + im = Image.fromarray(ndarr) + os.makedirs(os.path.dirname(fpath) or ".", exist_ok=True) + im.save(fpath) diff --git a/cosmos-inference/cosmos3/_src/imaginaire/visualize/video.py b/cosmos-inference/cosmos3/_src/imaginaire/visualize/video.py new file mode 100644 index 00000000..20b8f403 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/imaginaire/visualize/video.py @@ -0,0 +1,96 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import IO, Any, Union + +import numpy as np +import torch +from einops import rearrange +from PIL import Image as PILImage +from torch import Tensor + +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +def save_video(grid, video_name, fps=30): + import cv2 + import ffmpegcv + + grid = (grid * 255).astype(np.uint8) + grid = np.transpose(grid, (1, 2, 3, 0)) + with ffmpegcv.VideoWriter(video_name, "h264", fps) as writer: + for frame in grid: + frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR) + + writer.write(frame) + + +def save_img_or_video( + sample: Tensor, # [C,T,H,W] in [0,1] range + save_fp_wo_ext: Union[str, IO[Any]], + fps: int = 24, + quality=None, + ffmpeg_params=None, + **kwargs, +) -> None: + """ + Save a tensor as an image or video file based on shape + + Args: + sample (Tensor): Input tensor with shape (C, T, H, W) in [0, 1] range. + save_fp_wo_ext (Union[str, IO[Any]]): File path without extension or file-like object. + fps (int): Frames per second for video. Default is 24. 
+ """ + assert sample.ndim == 4, "Only support 4D tensor" + assert isinstance(save_fp_wo_ext, str) or hasattr(save_fp_wo_ext, "write"), ( + "save_fp_wo_ext must be a string or file-like object" + ) + + if torch.is_floating_point(sample): + sample = sample.clamp(0, 1) + else: + assert sample.dtype == torch.uint8, "Only support uint8 tensor" + sample = sample.float().div(255) + + if ffmpeg_params is not None: + kwargs["ffmpeg_params"] = ffmpeg_params + + if sample.shape[1] == 1: + save_obj = PILImage.fromarray( + rearrange((sample.cpu().float().numpy() * 255), "c 1 h w -> h w c").astype(np.uint8), + mode="RGB", + ) + ext = ".jpg" if isinstance(save_fp_wo_ext, str) else "" + easy_io.dump( + save_obj, + f"{save_fp_wo_ext}{ext}" if isinstance(save_fp_wo_ext, str) else save_fp_wo_ext, + file_format="jpg", + format="JPEG", + quality=85 if quality is None else quality, + **kwargs, + ) + else: + if quality is not None: + kwargs["quality"] = quality + save_obj = rearrange((sample.cpu().float().numpy() * 255), "c t h w -> t h w c").astype(np.uint8) + ext = ".mp4" if isinstance(save_fp_wo_ext, str) else "" + easy_io.dump( + save_obj, + f"{save_fp_wo_ext}{ext}" if isinstance(save_fp_wo_ext, str) else save_fp_wo_ext, + file_format="mp4", + format="mp4", + fps=fps, + **kwargs, + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/__init__.py b/cosmos-inference/cosmos3/_src/vfm/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/algorithm/__init__.py b/cosmos-inference/cosmos3/_src/vfm/algorithm/__init__.py new file mode 100644 index 00000000..3159bfe6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/algorithm/__init__.py @@ -0,0 +1,14 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/__init__.py b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/__init__.py new file mode 100644 index 00000000..690db8b6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/__init__.py @@ -0,0 +1,33 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Loss functions used by VFM (rectified flow) and VLM (next-token CE) training paths.""" + +__all__: list[str] = [] + +from cosmos3._src.vfm.algorithm.loss.cross_entropy import cross_entropy_loss + +__all__ += ["cross_entropy_loss"] + +from cosmos3._src.vfm.algorithm.loss.load_balancing import compute_load_balancing_loss + +__all__ += ["compute_load_balancing_loss"] + +from cosmos3._src.vfm.algorithm.loss.time_weight import TrainTimeWeight + +__all__ += ["TrainTimeWeight"] + +from cosmos3._src.vfm.algorithm.loss.flow_matching import compute_flow_matching_loss + +__all__ += ["compute_flow_matching_loss"] diff --git a/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/cross_entropy.py b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/cross_entropy.py new file mode 100644 index 00000000..bf1dba75 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/cross_entropy.py @@ -0,0 +1,110 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +"""CE loss for VLM training. + +Ported from cosmos_rl.policy.trainer.llm_trainer.sft_trainer.async_safe_ce +(packages/cosmos-rl/cosmos_rl/policy/trainer/llm_trainer/sft_trainer.py). + +The reduction formula must match async_safe_ce exactly to preserve loss parity +between Phase 0 (cosmos-rl path) and Phase 2 (this module). + +Two reduction paths — determined by cp_group presence: + + CP enabled (cp_group.size() > 1): + Per-rank mean CE loss × loss_scaling_factor. + Rationale: each CP rank sees a different segment of the sequence; computing + a weighted-mean here would require knowing each rank's valid-token count, + which is expensive. The simpler per-rank mean × scaling is consistent with + cosmos-rl's implementation. + + CP disabled (cp_group is None or cp_group.size() == 1): + Sum CE loss / (global_n_valid_tokens + 1e-8) × (num_dp_workers × scaling). + The ×num_dp_workers compensates for FSDP's gradient averaging across DP + ranks, ensuring the effective gradient equals the gradient of the global + mean loss even with unbalanced per-rank token counts. + Reference: async_safe_ce:97-109 in the source file above. +""" + +from __future__ import annotations + +from typing import Optional + +import torch +import torch.distributed as dist +import torch.nn.functional as F + + +def cross_entropy_loss( + logits: torch.Tensor, + labels: torch.Tensor, + loss_scaling_factor: float = 1.0, + dp_group: Optional[dist.ProcessGroup] = None, + cp_group: Optional[dist.ProcessGroup] = None, + ignore_index: int = -100, +) -> torch.Tensor: + """Next-token-prediction CE loss with DP/CP group reduction. + + Matches the behavior of cosmos_rl.policy.trainer.llm_trainer.sft_trainer.async_safe_ce + with the TORCH_CROSS_ENTROPY backend (F.cross_entropy with float32 cast). + + Args: + logits: (B, T, V) float tensor — raw model output before softmax. 
+ labels: (B, T) long tensor — ground-truth token ids. + Positions equal to ignore_index are excluded from the loss. + loss_scaling_factor: scalar multiplied into the returned loss. + dp_group: FSDP data-parallel shard group for loss normalization. + None = no DP reduction (single-GPU or replicate-only). + cp_group: Context-parallel group. If size > 1, use per-rank mean. + None = no CP reduction. + ignore_index: label value to exclude (default -100). + + Returns: + Scalar loss tensor. + """ + # Shift for next-token prediction: predict token[t+1] using hidden state[t]. + # logits[:, :-1] aligns with labels[:, 1:]. + # Reference: async_safe_ce:63-73 (output[:, :-1], target[:, 1:]) + shifted_logits = logits[:, :-1].contiguous().view(-1, logits.size(-1)) + shifted_labels = labels[:, 1:].contiguous().view(-1) + + if cp_group is not None and cp_group.size() > 1: + # CP path: each rank sees a different sequence segment. + # Use simple mean reduction; nan_to_num handles fully-ignored batches. + # Reference: async_safe_ce:74-88 + loss = F.cross_entropy( + shifted_logits.float(), + shifted_labels, + ignore_index=ignore_index, + reduction="mean", + ) + loss = torch.nan_to_num(loss, nan=0.0) + return loss * loss_scaling_factor + + # No-CP path: per-token loss, then normalize over the global valid-token count. 
+ # Reference: async_safe_ce:89-109 + per_token_loss = F.cross_entropy( + shifted_logits.float(), + shifted_labels, + ignore_index=ignore_index, + reduction="none", + ) + n_valid_tokens = (shifted_labels != ignore_index).sum() + num_dp_workers = 1 + if dp_group is not None: + dist.all_reduce(n_valid_tokens, op=dist.ReduceOp.SUM, group=dp_group) + num_dp_workers = dist.get_world_size(group=dp_group) + + loss = per_token_loss.sum() / (n_valid_tokens + 1e-8) * (num_dp_workers * loss_scaling_factor) + return loss diff --git a/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/flow_matching.py b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/flow_matching.py new file mode 100644 index 00000000..e3f39b6a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/flow_matching.py @@ -0,0 +1,108 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Rectified-flow matching loss (vision / action / sound modalities). + +Extracted from OmniMoTModel._compute_flow_matching_loss. The loss math is +unchanged; the only structural change is that ``tensor_kwargs_fp32`` is now +passed explicitly instead of being read from ``self``. 
+""" + +from __future__ import annotations + +import torch + +from cosmos3._src.vfm.diffusion.rectified_flow import RectifiedFlow + + +def compute_flow_matching_loss( + pred: list[torch.Tensor], + target: list[torch.Tensor], + condition_mask: list[torch.Tensor], + timesteps: torch.Tensor, + has_valid_tokens: bool, + rectified_flow: RectifiedFlow, + tensor_kwargs_fp32: dict, + loss_scale: float | None = None, + raw_action_dim: list[torch.Tensor] | None = None, + normalize_by_active: bool = False, +) -> tuple[torch.Tensor, torch.Tensor]: + """Compute flow matching loss for a modality. + + Args: + pred: Predicted velocity field (list of tensors, one per sample). + target: Target velocity field (list of tensors, one per sample). + Under rectified flow the target is ``v = eps - x0``. + condition_mask: Mask where 1 = clean/conditioning, 0 = noisy/generation (list of tensors). + timesteps: Diffusion timesteps for time weighting. Shape [B,1] for + base/teacher_forcing (all frames share one timestep) or [B,T_max] + for diffusion_forcing (per-frame independent timesteps). Time weights + are applied per-frame before averaging, so non-uniform weight functions + are handled correctly. + has_valid_tokens: Whether this modality has valid noisy tokens. + rectified_flow: The rectified flow object for time weighting. + tensor_kwargs_fp32: Dict of dtype/device kwargs forwarded to + ``rectified_flow.train_time_weight``. + loss_scale: Optional per-modality loss scale. Falls back to the global + ``rectified_flow_training_config.loss_scale`` when *None*. + (Currently unused inside the function body — scaling is applied at the + call site in ``OmniMoTModel._compute_losses``. Kept in the signature + to preserve the original API.) + normalize_by_active: When True, normalize per-instance loss by the count of + active (noisy) elements rather than all elements. 
Preserves the + ``sum / active_count`` semantics needed for distillation critics where + conditioned frames contribute no signal and should not dilute the + denominator. + + Returns: + tuple: A tuple containing two elements: + - Flow matching loss (or dummy loss for gradient consistency). + - Per-instance loss (or dummy loss for gradient consistency). + """ + if not has_valid_tokens: + # Dummy loss to maintain backward graph consistency across ranks + dummy_loss = 0.0 * sum(p.sum() for p in pred) + return dummy_loss, dummy_loss.unsqueeze(0) # make per-instance loss 1-D + + # condition_mask[i] is T-first with trailing singletons: [T,1,1] vision, [T,1] action. + # tw_i gets the same shape so w(σ_t) broadcasts element-wise over non-T dims. + per_instance_losses = [] + per_instance_weighted_losses = [] + + for i in range(len(pred)): + T_i = condition_mask[i].shape[0] + sqerr_i = (pred[i] - target[i]) ** 2 # vision:[C,T,H,W] action/sound:[T,D] + noisy_mask_i = 1.0 - condition_mask[i] # vision:[T,1,1] action/sound:[T,1] + if raw_action_dim is not None and raw_action_dim[i] is not None: + sqerr_i = sqerr_i[:, : raw_action_dim[i]] + if normalize_by_active: + active_count = (noisy_mask_i.sum() * (sqerr_i.numel() // noisy_mask_i.numel())).clamp(min=1) + per_instance_losses.append((sqerr_i * noisy_mask_i).sum() / active_count) # [] + else: + per_instance_losses.append((sqerr_i * noisy_mask_i).mean()) # [] + + ts_i = timesteps[i, :T_i] if timesteps.dim() > 1 else timesteps[i] # DF:[T_i] TF:[1] + tw_i = rectified_flow.train_time_weight(ts_i, tensor_kwargs_fp32) # DF:[T_i] TF:[1] + tw_i = tw_i.reshape(-1, *([1] * (condition_mask[i].ndim - 1))) # vision:[T_i,1,1] action/sound:[T_i,1] + if normalize_by_active: + per_instance_weighted_losses.append((sqerr_i * tw_i * noisy_mask_i).sum() / active_count) + else: + per_instance_weighted_losses.append((sqerr_i * tw_i * noisy_mask_i).mean()) + + per_instance_loss = torch.stack(per_instance_losses) # [B] + per_instance_weighted_loss = 
torch.stack(per_instance_weighted_losses) # [B] + return ( + per_instance_weighted_loss.mean(), # [] + per_instance_loss, # [B] + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/load_balancing.py b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/load_balancing.py new file mode 100644 index 00000000..d09a445a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/load_balancing.py @@ -0,0 +1,82 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import torch +from torch.distributed.tensor import DTensor, Partial +from torch.distributed.tensor.device_mesh import DeviceMesh + +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.qwen3_vl_moe import LBLMetadata + + +def compute_load_balancing_loss( + lbl_metadata: LBLMetadata | None, + coeff: float | None, + method: str, + device_mesh: DeviceMesh | None, +) -> torch.Tensor | None: + """ + Compute the load balancing loss. We compute the load balancing loss + for each layer, and then average the loss across all layers. + + For computing the load balancing loss for each layer, we can either + use the fraction of tokens routed to each expert for this rank ("local" method), or + use the fraction of tokens routed to each expert across all ranks ("global" method). + + Args: + lbl_metadata: The load balancing metadata. 
Contains the following tensors: + - num_tokens_per_expert: [num_layers, num_experts] - The number of + tokens routed to each expert for this rank for each layer. + - num_tokens: [num_layers, 1] - The total number of tokens in the + batch for each layer. + - mean_router_prob_per_expert: [num_layers, num_experts] - The average + probability of routing to each expert for this rank for each layer. + coeff: The coefficient for the load balancing loss. + method: The method for the load balancing loss. Can be "local" or "global". + device_mesh: The device mesh. Only needed if method is "global". + + Returns: + The load balancing loss. None if lbl_metadata is None or coeff is None. + """ + if lbl_metadata is None or coeff is None: + return None + assert method in ["local", "global"], "Invalid method" + + num_tokens_per_expert = lbl_metadata.num_tokens_per_expert + num_experts = num_tokens_per_expert.shape[-1] + num_tokens = lbl_metadata.num_tokens + mean_router_prob_per_expert = lbl_metadata.mean_router_prob_per_expert + + if method == "global": + # Note that these collectives must be executed outside a torch compiled region + # since torch compile could reorder the collectives and cause deadlocks. + assert device_mesh is not None, "MoE models require multiple GPUs." + + num_tokens_per_expert = DTensor.from_local( + num_tokens_per_expert, + device_mesh=device_mesh, + placements=[Partial()] * device_mesh.ndim, + ).full_tensor() + num_tokens = DTensor.from_local( + num_tokens, + device_mesh=device_mesh, + placements=[Partial()] * device_mesh.ndim, + ).full_tensor() + + # Compute the fraction of tokens routed to each expert. + # Summing over all experts should equal top_k (the number of experts each token is routed to). 
+ mean_tokens_per_expert = num_tokens_per_expert.float() / num_tokens.float() + + lbl = torch.mean(torch.sum(mean_tokens_per_expert * mean_router_prob_per_expert, dim=-1) * num_experts) + return lbl * coeff diff --git a/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/time_weight.py b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/time_weight.py new file mode 100644 index 00000000..8766159d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/algorithm/loss/time_weight.py @@ -0,0 +1,40 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import torch + + +class TrainTimeWeight: + def __init__( + self, + noise_scheduler, + weight: str = "uniform", + ): + # Map reweighting -> uniform to support inference for existing checkpoints. 
+ if weight == "reweighting": + weight = "uniform" + + self.weight = weight + self.noise_scheduler = noise_scheduler + + assert self.weight == "uniform", "Only uniform loss weight is supported in RF" + + def __call__(self, t, tensor_kwargs) -> torch.Tensor: # t: [B], returns [B] + if self.weight == "uniform": + wts = torch.ones_like(t) # [B] + else: + raise NotImplementedError(f"Time weight '{self.weight}' is not implemented.") + + return wts diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/__init__.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/compile_tokenizer.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/compile_tokenizer.py new file mode 100644 index 00000000..70f8860f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/compile_tokenizer.py @@ -0,0 +1,118 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Training callback that defers AOT compilation of the VAE tokenizer. + +The actual compilation logic lives in +:meth:`~projects.cosmos3.vfm.tokenizers.wan2pt2_vae_4x16x16.Wan2pt2VAEInterface.compile_encode`. +This module provides a :class:`CompileTokenizer` callback that invokes it +at the right point during training (after ``compile_after_iterations`` +steps, to avoid NCCL timeouts during CUDA/cuDNN warm-up). + +Typical config usage +-------------------- +.. code-block:: python + + CompileTokenizer( + enabled=True, + compile_after_iterations=3, + warmup_resolutions=["256", "480", "720"], + ) +""" + +from collections.abc import Sequence + +import torch + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.callback import Callback +from cosmos3._src.vfm.models.omni_mot_model import OmniMoTModel + + +class CompileTokenizer(Callback): + """Training callback that defers AOT compilation of the VAE tokenizer. + + Hooks into ``on_training_step_start``. On the + ``compile_after_iterations``-th step it calls + ``Wan2pt2VAEInterface.compile_encode`` to compile and load all chunk + variants. Every subsequent step is a no-op. + """ + + def __init__( + self, + enabled: bool = False, + compile_after_iterations: int = 3, + warmup_resolutions: Sequence[str] | None = None, + ): + """ + Args: + enabled: Master switch. When ``False`` the callback is a + complete no-op and no compilation occurs. + compile_after_iterations: How many training steps to skip + before triggering compilation. 
The default (3) lets CUDA + context setup and Transformer compilation finish first. + warmup_resolutions: Resolution keys (e.g. ``["256", "480", "720"]``) + to AOT-compile. Should include every resolution used in + training. Must be a non-empty list when *enabled* is ``True``. + """ + super().__init__() + self.enabled: bool = enabled + self.compile_after_iterations: int = compile_after_iterations + self.skip_counter: int = 0 + self.warmup_resolutions: Sequence[str] | None = warmup_resolutions + + if self.enabled: + if self.warmup_resolutions is None: + raise ValueError("warmup_resolutions must be provided when enabled, got None") + if len(self.warmup_resolutions) == 0: + raise ValueError("warmup_resolutions must be a non-empty list when enabled, got an empty list") + + def on_training_step_start( + self, model: OmniMoTModel, data_batch: dict[str, torch.Tensor], iteration: int = 0 + ) -> None: + """Called at the start of every training step. + + On the ``compile_after_iterations``-th call, triggers AOT compilation + via ``tokenizer.compile_encode``. + + Args: + model: The OmniMoTModel whose ``tokenizer_vision_gen`` will be compiled. + data_batch: Current training batch (unused, required by Callback API). + iteration: Current training iteration (unused; we track our own counter + via ``skip_counter`` because this callback may be registered after + iteration 0). + """ + if not self.enabled: + return + + tokenizer = model.tokenizer_vision_gen + + if isinstance(tokenizer, torch.jit.ScriptModule): + log.critical( + f"The Tokenizer model {type(tokenizer)} is a JIT model, " + "which is not compilable. 
The Tokenizer will not be compiled.", + rank0_only=False, + ) + self.enabled = False + return + + if self.skip_counter == self.compile_after_iterations: + if self.warmup_resolutions is not None: + tokenizer.compile_encode( + self.warmup_resolutions, + output_dir=self.config.job.path_local, + ) + + self.skip_counter += 1 diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/dataloader_state.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/dataloader_state.py new file mode 100644 index 00000000..265ad3d7 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/dataloader_state.py @@ -0,0 +1,119 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +import os +from dataclasses import dataclass +from typing import Any + +import torch + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.callback import Callback + + +@dataclass +class NoReplaceShardlistState: + epoch: int = 0 + index: int = 0 + + +class DataLoaderStateCallback(Callback): + checkpoint_component: str = "dataloader" + + def __init__( + self, + distributor_type: str | None = None, + ) -> None: + super().__init__() + self.distributor_type = distributor_type + self.config: Any = None + self.state: dict[int, NoReplaceShardlistState] = {} + self.verbose = True + + def _update_state_from_batch(self, data_batch: dict[str, torch.Tensor]) -> None: + worker_ids = data_batch["sample_worker_id"].tolist() # [B] + epochs = data_batch["sample_epoch"].tolist() # [B] + indices = data_batch["sample_index"].tolist() # [B] + for worker_id, epoch, index in zip(worker_ids, epochs, indices, strict=True): + if worker_id not in self.state: + self.state[worker_id] = NoReplaceShardlistState(epoch=epoch, index=index) + + elif self.state[worker_id].epoch < epoch or ( + self.state[worker_id].index < index and self.state[worker_id].epoch == epoch + ): + self.state[worker_id] = NoReplaceShardlistState(epoch=epoch, index=index) + + def on_training_step_batch_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + if self.distributor_type == "no_replace": + self._update_state_from_batch(data_batch) + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + if self.distributor_type == "no_replace": + if self.verbose: + if iteration % self.config.trainer.logging_iter == 0: + msg = "\n" + for 
wid, state in self.state.items(): + msg += f"worker {wid}: epoch={state.epoch}, index={state.index}\n" + log.info(msg) + + def has_checkpoint_state(self) -> bool: + return self.distributor_type == "no_replace" + + def state_dict(self) -> dict[int, dict[str, int]]: + if self.distributor_type != "no_replace": + return {} + + state_dict: dict[int, dict[str, int]] = {} + for worker_id, per_worker_state in self.state.items(): + state_dict[worker_id] = {"epoch": per_worker_state.epoch, "index": per_worker_state.index} + log.info( + f"Saved dataloader state for worker {worker_id}: " + f"epoch={per_worker_state.epoch}, index={per_worker_state.index}" + ) + return state_dict + + def load_state_dict(self, state_dict: dict[int, dict[str, int]]) -> None: + if self.distributor_type != "no_replace": + return + + if not state_dict: + log.info("No dataloader state found in checkpoint") + return + + self.state = {} + for worker_id, per_worker_state in state_dict.items(): + epoch = per_worker_state["epoch"] + index = per_worker_state["index"] + self.state[worker_id] = NoReplaceShardlistState(epoch=epoch, index=index) + os.environ[f"NSL_STATE_WORKER_{worker_id}_EPOCH"] = str(epoch) + os.environ[f"NSL_STATE_WORKER_{worker_id}_INDEX"] = str(index) + log.info(f"Loaded no replace dataloader state for worker {worker_id}: epoch={epoch}, index={index}") diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/dataloading_monitor.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/dataloading_monitor.py new file mode 100644 index 00000000..77e3a1b4 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/dataloading_monitor.py @@ -0,0 +1,538 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import time +from collections import defaultdict + +import psutil +import torch +import torch.distributed as dist +import wandb + +from cosmos3._src.imaginaire.datasets.webdataset.utils.stream import ( + ENABLE_STREAM_WANDB, + WATCHDOG_ENABLED, + collect_throughput_ipc_stats, +) +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.imaginaire.utils.callback import Callback +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.vfm.datasets.joint_dataloader import _PackingMetrics + +_AGG_COUNT, _AGG_SUM, _AGG_MIN, _AGG_MAX = 0, 1, 2, 3 +_AGG_COLS = 4 + + +class DetailedDataLoadingSpeedMonitor(Callback): + def __init__( + self, + every_n: int, + step_size: int = 1, + save_s3: bool = False, + ): + self.every_n = every_n + self.step_size = step_size + self.should_run = False + self.start_dataloading_time = None + self.dataloading_time = None + self.name = self.__class__.__name__ + self.save_s3 = save_s3 + self.time_delta_list = [] + self.memory_consumption_list = [] + self.memory_consumption_percentage_list = [] + self._pending_time_delta = None + self.dataloading_time_per_dataset = {} + self._worker_batch_times = [] + self._worker_aug_times = [] + self._worker_io_times = [] + self._worker_aug_step_times: dict[str, list[float]] = defaultdict(list) + self._worker_times_by_ds_wid: dict[tuple[str, int], list[float]] = defaultdict(list) + self._dataset_scalar_stats: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int)) + self._dataset_list_stats: 
dict[str, dict[str, list[int]]] = defaultdict(lambda: defaultdict(list)) + + def on_before_dataloading(self, iteration: int = 0) -> None: + # We want to run it one iteration before on_training_step_start should_run is set to True. + global_step = iteration // self.step_size + self.should_run = (global_step + 1) % self.every_n == 0 + self.start_dataloading_time = time.time() + + def on_after_dataloading(self, iteration: int = 0) -> None: + self._pending_time_delta = time.time() - self.start_dataloading_time + self.time_delta_list.append(self._pending_time_delta) + memory = psutil.virtual_memory() + self.memory_consumption_list.append(memory.used / (1024**3)) + self.memory_consumption_percentage_list.append(memory.percent) + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + dataset_name = data_batch.get("dataset_name", ["default"])[0] + if self._pending_time_delta is not None: + if dataset_name not in self.dataloading_time_per_dataset: + self.dataloading_time_per_dataset[dataset_name] = [] + self.dataloading_time_per_dataset[dataset_name].append(self._pending_time_delta) + self._pending_time_delta = None + + for batch_key, _, agg_type in _PackingMetrics.STATS_SPEC: + if batch_key not in data_batch: + continue + val = int(data_batch[batch_key]) + if agg_type == "scalar": + self._dataset_scalar_stats[batch_key][dataset_name] += val + else: + self._dataset_list_stats[batch_key][dataset_name].append(val) + + if "_worker_batch_time" in data_batch: + bt = float(data_batch["_worker_batch_time"]) + self._worker_batch_times.append(bt) + wid = int(data_batch.get("_worker_id", 0)) + self._worker_times_by_ds_wid[(dataset_name, wid)].append(bt) + if "_worker_aug_time" in data_batch: + self._worker_aug_times.append(float(data_batch["_worker_aug_time"])) + if "_worker_io_time" in data_batch: + 
self._worker_io_times.append(float(data_batch["_worker_io_time"])) + if "_worker_aug_step_times" in data_batch: + for step_name, t in data_batch["_worker_aug_step_times"].items(): + self._worker_aug_step_times[step_name].append(float(t)) + + if self.should_run: + # Convert list to tensor on GPU for gathering + local_times = torch.tensor(self.time_delta_list).cuda() # [num_iters] + local_memory_consumption = torch.tensor(self.memory_consumption_list).cuda() # [num_iters] + local_memory_consumption_percentage = torch.tensor( + self.memory_consumption_percentage_list + ).cuda() # [num_iters] + iteration_count = len(self.time_delta_list) + self.time_delta_list = [] # Reset the list + self.memory_consumption_list = [] + self.memory_consumption_percentage_list = [] + + # Gather all times from all ranks + # Each tensor in the list has shape (num_iterations,) + gathered_times_list = distributed.all_gather_tensor(local_times) # list of [num_iters], len=world_size + + # Stack to get shape (world_size, num_iterations) + all_times = torch.stack(gathered_times_list) # [world_size,num_iters] + + # Calculate per-rank statistics + # dim=1 is across iterations + rank_means = torch.mean(all_times, dim=1) # [world_size] + rank_maxes = torch.max(all_times, dim=1).values # [world_size] + + wandb_info = {f"{self.name}_mean/dataloading_{k:03d}": v.item() for k, v in enumerate(rank_means)} + wandb_info.update({f"{self.name}_max/dataloading_{k:03d}": v.item() for k, v in enumerate(rank_maxes)}) + + gathered_memory_consumption = distributed.all_gather_tensor( + local_memory_consumption + ) # list of [num_iters], len=world_size + gathered_memory_consumption_percentage = distributed.all_gather_tensor( + local_memory_consumption_percentage + ) # list of [num_iters], len=world_size + + wandb_info.update( + { + f"{self.name}_mean/memory_consumption_gb_{k:03d}": v.mean().item() + for k, v in enumerate(gathered_memory_consumption) + } + ) + wandb_info.update( + { + 
f"{self.name}_mean/memory_consumption_percentage_{k:03d}": v.mean().item() + for k, v in enumerate(gathered_memory_consumption_percentage) + } + ) + + wandb_info[f"{self.name}_mean/memory_consumption_gb_mean"] = ( + torch.stack(gathered_memory_consumption).mean().item() # [world_size,num_iters] + ) + wandb_info[f"{self.name}_mean/memory_consumption_percentage_mean"] = ( + torch.stack(gathered_memory_consumption_percentage).mean().item() # [world_size,num_iters] + ) + wandb_info[f"{self.name}_max/memory_consumption_gb_max"] = ( + torch.stack(gathered_memory_consumption).max().item() # [world_size,num_iters] + ) + wandb_info[f"{self.name}_max/memory_consumption_percentage_max"] = ( + torch.stack(gathered_memory_consumption_percentage).max().item() # [world_size,num_iters] + ) + + # Identify slowest rank based on mean time + slowest_dataloading_rank_id = torch.argmax(rank_means) # [] + max_dataloading = torch.max(rank_means) # [] + + # Calculate sum of max times across all iterations (new metric) + # Max across ranks for each iteration (dim=0) + max_per_iter = torch.max(all_times, dim=0).values # [num_iters] + sum_of_max_times = torch.sum(max_per_iter).item() / iteration_count + + wandb_info.update( + { + "slowest_rank/slowest_dataloading_rank": slowest_dataloading_rank_id.item(), + "slowest_rank/slowest_dataloading_time": max_dataloading.item(), + "slowest_rank/sum_of_max_dataloading_time_per_iteration": sum_of_max_times, + } + ) + + # 1. 
Gather and log stream throughput and watchdog reconnect stats for `stream_throughput` metrics + self._gather_and_log_stream_throughput(wandb_info) + + # Use all_gather_object only to exchange name indices (dataset names, aug-step names, worker-balance keys) across all ranks. + # Steps 2-4 then use the more efficient all_gather_tensor to gather tensor data before computing statistics and logging metrics. + ds_index, aug_index, dswid_index = self._discover_name_indices() + + # 2. Gather and log per-dataset dataloading wait times for `dl_wait_time_per_dataset` metrics + self._gather_and_log_per_dataset_time(wandb_info, ds_index) + + # 3. Gather and log per-dataset sampling stats for `dl_packing_stats` metrics + self._gather_and_log_packing_stats(wandb_info, ds_index) + + # 4. Gather and log worker timing metrics for `dl_worker_batch_time`, `dl_worker_balance_per_dataset`, `dl_worker_augmentation` metrics + self._gather_and_log_worker_timing(wandb_info, dswid_index, aug_index) + + if wandb.run: + wandb.log(wandb_info, step=iteration) + + if self.save_s3 and distributed.is_rank0(): + easy_io.dump( + wandb_info, + f"s3://rundir/{self.name}/iter_{iteration:09d}.yaml", + ) + + def _discover_name_indices( + self, + ) -> tuple[dict[str, int], dict[str, int], dict[str, int]]: + """Discover the global union of dataset, aug-step, and worker-balance names. + + Performs a single ``all_gather_object`` call to exchange short string + lists across all ranks and returns deterministic index mappings. + + Returns: + ds_index: ``{dataset_name: col_idx}`` for per-dataset tensors. + aug_index: ``{step_name: col_idx}`` for augmentation step tensors. + dswid_index: ``{"ds|wid": col_idx}`` for worker-balance tensors. 
+ """ + local_ds_names: set[str] = set(self.dataloading_time_per_dataset.keys()) + for key_dict in self._dataset_scalar_stats.values(): + local_ds_names.update(key_dict.keys()) + for key_dict in self._dataset_list_stats.values(): + local_ds_names.update(key_dict.keys()) + + local_names = { + "datasets": sorted(local_ds_names), + "aug_steps": sorted(self._worker_aug_step_times.keys()), + "ds_wid": sorted(f"{ds}|{wid}" for ds, wid in self._worker_times_by_ds_wid.keys()), + } + all_names: list[dict] = [{} for _ in range(dist.get_world_size())] # len=world_size + dist.all_gather_object(all_names, local_names) + + union_ds = sorted({n for r in all_names for n in r.get("datasets", [])}) + ds_index = {name: i for i, name in enumerate(union_ds)} + + union_aug = sorted({n for r in all_names for n in r.get("aug_steps", [])}) + aug_index = {name: i for i, name in enumerate(union_aug)} + + union_dswid = sorted({n for r in all_names for n in r.get("ds_wid", [])}) + dswid_index = {name: i for i, name in enumerate(union_dswid)} + + return ds_index, aug_index, dswid_index + + def _gather_and_log_per_dataset_time(self, wandb_info: dict, ds_index: dict[str, int]) -> None: + """Gather per-dataset dataloading wait times via ``all_gather_tensor``.""" + N = len(ds_index) + if N == 0: + self.dataloading_time_per_dataset = {} + return + + local_ds_time = torch.full((N,), float("nan"), dtype=torch.float64).cuda() # [num_datasets] + for ds, times in self.dataloading_time_per_dataset.items(): + if ds in ds_index: + local_ds_time[ds_index[ds]] = sum(times) / len(times) + + all_ds_time = self._gather_list_stats(local_ds_time) # [world_size, num_datasets] + for ds, i in ds_index.items(): + col = all_ds_time[:, i] # [world_size] + valid = col[~col.isnan()] # [<=world_size] + if len(valid) > 0: + wandb_info[f"dl_wait_time_per_dataset/{ds}_mean"] = valid.mean().item() + wandb_info[f"dl_wait_time_per_dataset/{ds}_max"] = valid.max().item() + + self.dataloading_time_per_dataset = {} + + def 
_gather_and_log_packing_stats(self, wandb_info: dict, ds_index: dict[str, int]) -> None: + """Gather packing diagnostics via ``all_gather_tensor``, driven by ``_PackingMetrics.STATS_SPEC``.""" + _STATS = "dl_packing_stats" + N = len(ds_index) + if N == 0: + self._dataset_scalar_stats = defaultdict(lambda: defaultdict(int)) + self._dataset_list_stats = defaultdict(lambda: defaultdict(list)) + return + + for batch_key, wandb_suffix, _ in _PackingMetrics.STATS_SPEC: + if batch_key == "_num_tokens": + # Token fraction: gather per-rank token sums, compute each dataset's share of total + local_v = torch.zeros(N, dtype=torch.float64).cuda() # [num_datasets] + for ds, i in ds_index.items(): + local_v[i] = self._dataset_scalar_stats.get(batch_key, {}).get(ds, 0) + all_v = self._gather_list_stats(local_v) # [world_size, num_datasets] + global_tokens = all_v.sum(dim=0) # [num_datasets] + total = global_tokens.sum().item() + for ds, i in ds_index.items(): + wandb_info[f"{_STATS}/{ds}_{wandb_suffix}"] = global_tokens[i].item() / total if total > 0 else 0.0 + + elif batch_key == "_dropped_count": + # Dropped samples: gather per-rank counts, report global total per dataset + local_v = torch.zeros(N, dtype=torch.float64).cuda() # [num_datasets] + for ds, i in ds_index.items(): + local_v[i] = self._dataset_scalar_stats.get(batch_key, {}).get(ds, 0) + all_v = self._gather_list_stats(local_v) # [world_size, num_datasets] + for ds, i in ds_index.items(): + wandb_info[f"{_STATS}/{ds}_{wandb_suffix}_total"] = int(all_v[:, i].sum().item()) + + else: + # Per-batch distributions (_num_samples, _from_buffer, _from_workers, _buffer_size). + # Each rank packs [count, sum, min, max]; reduce to weighted global mean/min/max. 
+ local_t = torch.full( + (N, _AGG_COLS), float("nan"), dtype=torch.float64 + ).cuda() # [num_datasets, _AGG_COLS] + for ds, i in ds_index.items(): + vals = self._dataset_list_stats.get(batch_key, {}).get(ds, []) + if vals: + local_t[i] = torch.tensor([len(vals), sum(vals), min(vals), max(vals)], dtype=torch.float64) + all_t = self._gather_list_stats(local_t) # [world_size, num_datasets, _AGG_COLS] + for ds, i in ds_index.items(): + result = self._reduce_agg_column(all_t[:, i, :]) + if result: + mean_val, min_val, max_val = result + wandb_info[f"{_STATS}/{ds}_{wandb_suffix}_mean"] = mean_val + wandb_info[f"{_STATS}/{ds}_{wandb_suffix}_min"] = min_val + wandb_info[f"{_STATS}/{ds}_{wandb_suffix}_max"] = max_val + + self._dataset_scalar_stats = defaultdict(lambda: defaultdict(int)) + self._dataset_list_stats = defaultdict(lambda: defaultdict(list)) + + def _gather_and_log_stream_throughput(self, wandb_info: dict) -> None: + """Gather stream throughput and watchdog reconnect stats via IPC files.""" + if not ENABLE_STREAM_WANDB: + return + + tp_keys = ["MBps"] + if WATCHDOG_ENABLED: + tp_keys.append("watchdog_reconnects") + tp_stats = collect_throughput_ipc_stats() + local_tp = torch.tensor([tp_stats.get(k, 0.0) for k in tp_keys]).cuda() # [num_metrics] + gathered_tp = distributed.all_gather_tensor(local_tp) # list of [num_metrics], len=world_size + all_tp = torch.stack(gathered_tp) # [world_size, num_metrics] + + for ki, k in enumerate(tp_keys): + col = all_tp[:, ki] # [world_size] + wandb_info[f"stream_throughput/{k}_mean"] = col.mean().item() + wandb_info[f"stream_throughput/{k}_min"] = col.min().item() + wandb_info[f"stream_throughput/{k}_max"] = col.max().item() + if k == "watchdog_reconnects": + wandb_info[f"stream_throughput/{k}_sum"] = col.sum().item() + + mbps_col = all_tp[:, 0] # [world_size] + slowest_throughput_rank = mbps_col.argmin().item() + wandb_info["slowest_rank/slowest_stream_throughput_rank"] = slowest_throughput_rank + + @staticmethod + def 
_gather_list_stats(local: torch.Tensor) -> torch.Tensor: + """all_gather_tensor + stack, returning [world_size, *local.shape].""" + return torch.stack(distributed.all_gather_tensor(local)) + + @staticmethod + def _reduce_agg_column(col: torch.Tensor) -> tuple[float, float, float] | None: + """From a [world_size, _AGG_COLS] slice, return (mean, min, max) or None if empty. + + Each row is [count, sum, min, max] from one rank. Rows with NaN count + are ranks that had no data for this key. + + Used for metrics where each rank accumulates a variable-length list of + values (e.g. samples_per_batch, buffer_size, per-aug-step times) and we + need a correct weighted global mean rather than a simple average of + per-rank means. The sum/count columns make this possible. + + Callers: ``_gather_and_log_packing_stats`` (list-type metrics) and + ``_gather_and_log_worker_timing`` (per-aug-step breakdown). + """ + valid = col[~col[:, _AGG_COUNT].isnan()] + if len(valid) == 0: + return None + total_count = valid[:, _AGG_COUNT].sum().item() + total_sum = valid[:, _AGG_SUM].sum().item() + if total_count == 0: + return None + return ( + total_sum / total_count, + valid[:, _AGG_MIN].min().item(), + valid[:, _AGG_MAX].max().item(), + ) + + def _gather_and_log_worker_timing( + self, wandb_info: dict, dswid_index: dict[str, int], aug_index: dict[str, int] + ) -> None: + """Gather worker timing from all ranks and log percentile metrics. + + All metrics here are worker-side measurements — time spent inside + DataLoader worker processes producing batches. 
This is different from the + DetailedDataLoadingSpeedMonitor and dl_wait_time_per_dataset/ metrics, which measure main-process wall-clock time. + This can help identify whether the bottleneck is in the dataloader worker processes or in the main process + (for example, waiting for a packed output batch from the JointDataLoader). + + Logged metrics: + Section 1 – dl_worker_batch_time/ + Every individual batch time from every worker from every rank, all + thrown into one pool. One data point = one batch produced by one + worker at one step. Computes p50/p90/p99/max/mean of that pool. + Answers: What is the tail latency to produce a batch? + + Section 2 – dl_worker_balance_per_dataset/ + First computes each worker's average batch time over the logging + window. One data point = one worker's mean over several batches. + Then gathers these per-worker averages across all ranks, grouped by + dataset. Computes mean/std/min/max of those averages. + Answers: Are some workers consistently slower than others within + each sub-dataloader? + + Section 3 – dl_worker_augmentation/ + Unified augmentation profiling. Contains: + - total_aug_mean|min|max – total augmentation time per batch + - total_io_mean|min|max – I/O time per batch (batch_time minus aug_time) + - aug_fraction_mean, io_fraction_mean – what fraction of batch time is spent in augmentation vs I/O + - aug_steps/{StepName}_mean|min|max – per-augmentor-step breakdown + (e.g. VideoParsingWithFullFrames for video decode, + TextTokenizerTransform for text tokenization). + All use mean/min/max globally across all ranks. + Answers: Is the bottleneck in augmentations or downloads, and + which augmentor step dominates? + + Note: dl_packing_stats/ is logged from on_training_step_end (not here). + It reports token_fraction, samples_per_batch, from_buffer, from_workers, buffer_size, and dropped_total per dataset, which is useful for tuning num_workers/batch_size/prefetch per dataloader. 
+ """ + if not self._worker_batch_times: + self._worker_aug_times = [] + self._worker_io_times = [] + self._worker_aug_step_times = defaultdict(list) + self._worker_times_by_ds_wid = defaultdict(list) + return + + _PERCENTILES = [0.50, 0.90, 0.99] + _PNAMES = ["p50", "p90", "p99"] + + # Gather raw batch times across all ranks + local_bt = torch.tensor(self._worker_batch_times, dtype=torch.float32).cuda() # [num_batches_local] + gathered_bt = distributed.all_gather_tensor(local_bt) # list of [num_batches_local], len=world_size + all_bt = torch.cat(gathered_bt) # [num_batches_all_ranks] + + # Section 1: global batch time percentiles + _BATCH_PREFIX = "dl_worker_batch_time" + for pval, pname in zip(_PERCENTILES, _PNAMES): + wandb_info[f"{_BATCH_PREFIX}/{pname}"] = all_bt.quantile(pval).item() + wandb_info[f"{_BATCH_PREFIX}/max"] = all_bt.max().item() + wandb_info[f"{_BATCH_PREFIX}/mean"] = all_bt.mean().item() + + # Section 2: per-dataloader worker balance + # Each rank fills its (dataset, worker_id) slots with that worker's + # mean batch time; NaN marks absent slots. After all_gather we group + # by dataset and compute cross-rank statistics. 
+ + _BALANCE_PREFIX = "dl_worker_balance_per_dataset" + if dswid_index: + N_dswid = len(dswid_index) + local_pw = torch.full((N_dswid,), float("nan"), dtype=torch.float64).cuda() # [num_ds_worker_pairs] + for (ds_name, wid), ts in self._worker_times_by_ds_wid.items(): + key = f"{ds_name}|{wid}" + if key in dswid_index: + local_pw[dswid_index[key]] = sum(ts) / len(ts) + + all_pw = self._gather_list_stats(local_pw) # [world_size, num_ds_worker_pairs] + + # Pass 1: collect all valid per-worker means, grouped by dataset + ds_worker_vals: dict[str, list[float]] = defaultdict(list) + for key, idx in dswid_index.items(): + ds_name = key.rsplit("|", 1)[0] + col = all_pw[:, idx] # [world_size] + valid = col[~col.isnan()] # [<=world_size] + ds_worker_vals[ds_name].extend(valid.tolist()) + + # Pass 2: log per-dataset worker balance statistics + for ds_name in sorted(ds_worker_vals): + pw_means = ds_worker_vals[ds_name] + if not pw_means: + continue + pw_t = torch.tensor(pw_means, dtype=torch.float32).cuda() # [num_workers_for_ds] + wandb_info[f"{_BALANCE_PREFIX}/{ds_name}_mean"] = pw_t.mean().item() + wandb_info[f"{_BALANCE_PREFIX}/{ds_name}_std"] = pw_t.std().item() + wandb_info[f"{_BALANCE_PREFIX}/{ds_name}_min"] = pw_t.min().item() + wandb_info[f"{_BALANCE_PREFIX}/{ds_name}_max"] = pw_t.max().item() + + # Section 3: augmentation profiling (total aug/io + per-step breakdown) + _AUG_PREFIX = "dl_worker_augmentation" + + if self._worker_aug_times: + local_aug = torch.tensor(self._worker_aug_times, dtype=torch.float32).cuda() # [num_batches_local] + all_aug = torch.cat(distributed.all_gather_tensor(local_aug)) # [num_batches_all_ranks] + wandb_info[f"{_AUG_PREFIX}/total_aug_mean"] = all_aug.mean().item() + wandb_info[f"{_AUG_PREFIX}/total_aug_min"] = all_aug.min().item() + wandb_info[f"{_AUG_PREFIX}/total_aug_max"] = all_aug.max().item() + + if self._worker_io_times: + local_io = torch.tensor(self._worker_io_times, dtype=torch.float32).cuda() # [num_batches_local] + all_io = 
torch.cat(distributed.all_gather_tensor(local_io)) # [num_batches_all_ranks] + wandb_info[f"{_AUG_PREFIX}/total_io_mean"] = all_io.mean().item() + wandb_info[f"{_AUG_PREFIX}/total_io_min"] = all_io.min().item() + wandb_info[f"{_AUG_PREFIX}/total_io_max"] = all_io.max().item() + + if self._worker_aug_times and self._worker_batch_times: + aug_fracs = [ + a / b for a, b in zip(self._worker_aug_times, self._worker_batch_times) if b > 0 + ] # [num_valid_batches_local] + if aug_fracs: + local_fracs = torch.tensor(aug_fracs, dtype=torch.float32).cuda() # [num_valid_batches_local] + all_fracs = torch.cat(distributed.all_gather_tensor(local_fracs)) # [num_valid_batches_all_ranks] + wandb_info[f"{_AUG_PREFIX}/aug_fraction_mean"] = all_fracs.mean().item() + wandb_info[f"{_AUG_PREFIX}/io_fraction_mean"] = 1.0 - all_fracs.mean().item() + + # Per-augmentor-step breakdown (converted to all_gather_tensor) + if aug_index: + N_aug = len(aug_index) + local_aug_steps = torch.full( + (N_aug, _AGG_COLS), float("nan"), dtype=torch.float64 + ).cuda() # [num_aug_steps, _AGG_COLS] + for step_name, ts in self._worker_aug_step_times.items(): + if step_name in aug_index and ts: + local_aug_steps[aug_index[step_name]] = torch.tensor( + [len(ts), sum(ts), min(ts), max(ts)], dtype=torch.float64 + ) + + all_aug_steps = self._gather_list_stats(local_aug_steps) # [world_size, num_aug_steps, _AGG_COLS] + for step_name, idx in aug_index.items(): + result = self._reduce_agg_column(all_aug_steps[:, idx, :]) + if result: + mean_val, min_val, max_val = result + wandb_info[f"{_AUG_PREFIX}/aug_steps/{step_name}_mean"] = mean_val + wandb_info[f"{_AUG_PREFIX}/aug_steps/{step_name}_min"] = min_val + wandb_info[f"{_AUG_PREFIX}/aug_steps/{step_name}_max"] = max_val + + self._worker_batch_times = [] + self._worker_aug_times = [] + self._worker_io_times = [] + self._worker_aug_step_times = defaultdict(list) + self._worker_times_by_ds_wid = defaultdict(list) diff --git 
a/cosmos-inference/cosmos3/_src/vfm/callbacks/device_monitor.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/device_monitor.py new file mode 100644 index 00000000..80365485 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/device_monitor.py @@ -0,0 +1,201 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +from typing import Any, Dict, List, Tuple + +import pandas as pd +import psutil +import pynvml +import torch +import wandb + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +def log_prof_data( + data_list: List[Dict[str, Any]], + iteration: int, +) -> Tuple[pd.DataFrame, pd.DataFrame]: + # Create a table to log data with rank information + columns = ["iteration", "rank"] + list(data_list[0].keys()) + data = [] + + # Initialize dictionaries to store min, max, and sum values for each metric + min_values = {key: float("inf") for key in columns[2:]} + max_values = {key: float("-inf") for key in columns[2:]} + sum_values = {key: 0.0 for key in columns[2:]} + + count = 0 + + for _rank, prof_data in enumerate(data_list): + row = [iteration, _rank] + [prof_data[key] for key in 
columns[2:]] + data.append(row) + count += 1 + + # Update min, max, and sum values + for key in columns[2:]: + min_values[key] = min(min_values[key], prof_data[key]) + max_values[key] = max(max_values[key], prof_data[key]) + sum_values[key] += prof_data[key] + + # Calculate average values + avg_values = {key: sum_values[key] / count for key in columns[2:]} + + df = pd.DataFrame(data, columns=columns) + summary_df = pd.DataFrame({"Avg": avg_values, "Max": max_values, "Min": min_values}) + + if wandb.run: + # Log the table + table = wandb.Table(dataframe=df) + wandb.log({"DeviceMonitor/prof_data": table}, step=iteration) + + # Log summary statistics + summary = {} + for key in columns[2:]: + summary[f"DeviceMonitor/min_{key}"] = min_values[key] + summary[f"DeviceMonitor/max_{key}"] = max_values[key] + summary[f"DeviceMonitor/avg_{key}"] = avg_values[key] + + wandb.log(summary, step=iteration) + return df, summary_df + + +class DeviceMonitor(EveryN): + """ + A callback to monitor device (CPU/GPU) usage and log it at regular intervals. + + Args: + every_n (int, optional): The frequency at which the callback is invoked. Defaults to 200. + step_size (int, optional): The step size for the callback. Defaults to 1. + save_s3 (bool, optional): Whether to save the monitoring data to S3. Defaults to False. 
+ """ + + def __init__( + self, + every_n: int = 200, + step_size: int = 1, + save_s3: bool = False, + upload_every_n_mul: int = 1, + log_memory_detail: bool = True, + ): + super().__init__(every_n=every_n, step_size=step_size) + self.name = self.__class__.__name__ + self.save_s3 = save_s3 + self.s3_save_fp = f"s3://rundir/{self.name}" + self.upload_every_n = upload_every_n_mul * every_n + + self.log_memory_detail = log_memory_detail + + def on_train_start(self, model, iteration=0): + torch.cuda.reset_peak_memory_stats() + self.world_size = distributed.get_world_size() + self.rank = distributed.get_rank() + config_job = self.config.job + self.local_dir = f"{config_job.path_local}/{self.name}" + if self.rank == 0: + os.makedirs(self.local_dir, exist_ok=True) + log.info(f"{self.name} callback: local_dir: {self.local_dir}") + + local_rank = int(os.getenv("LOCAL_RANK", 0)) + self.handle = pynvml.nvmlDeviceGetHandleByIndex(local_rank) + + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: + cur_process = psutil.Process(os.getpid()) + # cur_process.children(recursive=True) can crash if the dataloader is constantly creating and destroying processes (e.g. calling FFmpeg). + try: + cpu_memory_usage = sum(p.memory_info().rss for p in [cur_process] + cur_process.children(recursive=True)) + except Exception as e: # e.g. 
psutil.NoSuchProcess + log.warning(f"Failed to get CPU memory usage with error {e}") + cpu_memory_usage = 0 + cpu_mem_gb = cpu_memory_usage / (1024**3) + + peak_gpu_mem_gb = torch.cuda.max_memory_allocated() / (1024**3) + peak_gpu_mem_reserved_gb = torch.cuda.max_memory_reserved() / (1024**3) + temp = torch.cuda.temperature() + try: + power = torch.cuda.power_draw() + except Exception as e: + log.warning(f"Failed to get power draw with error {e}") + power = 0 + util = torch.cuda.utilization() + clock = torch.cuda.clock_rate() + + memory_info = pynvml.nvmlDeviceGetMemoryInfo(self.handle) + nvml_used_gpu_mem_gb = memory_info.used / (1024**3) + nvml_free_gpu_mem_gb = memory_info.free / (1024**3) + + prof_data = { + "cpu_mem_gb": cpu_mem_gb, + "peak_gpu_mem_gb": peak_gpu_mem_gb, + "peak_gpu_mem_reserved_gb": peak_gpu_mem_reserved_gb, + "nvml_used_gpu_mem_gb": nvml_used_gpu_mem_gb, + "nvml_free_gpu_mem_gb": nvml_free_gpu_mem_gb, + "temp": temp, + "power": power, + "util": util, + "clock": clock, + } + + data_list = [prof_data] * self.world_size + # this is blocking by default + if self.world_size > 1: + torch.distributed.all_gather_object(data_list, prof_data) + torch.distributed.barrier() + + df, summary_df = log_prof_data(data_list, iteration) + if self.save_s3 and self.rank == 0: + global_step = iteration // self.step_size + should_run = global_step % self.upload_every_n == 0 + if should_run: + df.to_csv(os.path.join(self.local_dir, f"prof_data_{iteration:09d}.csv"), index=False) + summary_df.to_csv(os.path.join(self.local_dir, f"summary_{iteration:09d}.csv"), index=True) + easy_io.copyfile_from_local( + os.path.join(self.local_dir, f"prof_data_{iteration:09d}.csv"), + os.path.join(self.s3_save_fp, f"prof_data_{iteration:09d}.csv"), + ) + easy_io.copyfile_from_local( + os.path.join(self.local_dir, f"summary_{iteration:09d}.csv"), + os.path.join(self.s3_save_fp, f"summary_{iteration:09d}.csv"), + ) + if self.rank == 0: + log.info(f"{self.name} 
Stats:\n{summary_df.to_string()}") + if self.log_memory_detail: + memory_stats = torch.cuda.memory_stats() + if wandb.run: + wandb_memory_info = {f"mem/{key}": memory_stats[key] for key in memory_stats.keys()} + wandb.log(wandb_memory_info, step=iteration) + if self.save_s3: + global_step = iteration // self.step_size + should_run = global_step % self.upload_every_n == 0 + if should_run: + easy_io.dump( + memory_stats, + os.path.join(self.s3_save_fp, f"memory_stats_{iteration:09d}.yaml"), + ) + + torch.cuda.reset_peak_memory_stats() diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/every_n_draw_audio_video_sample.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/every_n_draw_audio_video_sample.py new file mode 100644 index 00000000..77cd4556 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/every_n_draw_audio_video_sample.py @@ -0,0 +1,429 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Callback for sampling and visualizing joint audio-video generation. 
+ +Extends the video sampling callback to also handle sound: +- Generates video and audio samples via model.generate_samples_from_batch() +- Logs video frames as image grids to WandB (same as EveryNDrawSample) +- Logs audio as WandB Audio objects +- Optionally creates combined video+audio MP4 files via ffmpeg +""" + +import os +import subprocess +from contextlib import nullcontext +from functools import partial + +import torch +import torch.distributed as dist +import torchvision +import wandb +from einops import rearrange + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed, log, misc +from cosmos3._src.vfm.callbacks.every_n_draw_sample import pad_images_and_cat, resize_image +from cosmos3._src.vfm.utils.data_utils import slice_data_batch + + +class EveryNDrawAudioVideoSample(EveryN): + """Callback for sampling and visualizing joint audio-video generation. + + Samples from the model and logs both video frames and audio to WandB. 
+ + Args: + every_n: Frequency at which the callback is invoked + step_size: Step size for the callback (default: 1) + n_viz_sample: Number of samples to visualize in WandB (default: 3) + num_sampling_step: Number of ODE integration steps (default: 35) + guidance: List of guidance scales to try (default: [1.0, 3.0, 7.0]) + save_s3: Whether to save to S3 (default: False) + is_ema: Whether to use EMA model (default: False) + video_fps: FPS for video visualization (default: 24) + """ + + def __init__( + self, + every_n: int, + step_size: int = 1, + n_viz_sample: int = 3, + num_sampling_step: int = 35, + guidance: list[float] | None = None, + save_s3: bool = False, + is_ema: bool = False, + video_fps: int = 24, + generation_mode: str = "t2vs", + run_at_start: bool = False, + ): + super().__init__(every_n, step_size, run_at_start=run_at_start) + self.n_viz_sample = n_viz_sample + self.save_s3 = save_s3 + self.name = self.__class__.__name__ + self.is_ema = is_ema + self.guidance = guidance if guidance is not None else [1.0, 3.0, 7.0] + self.num_sampling_step = num_sampling_step + self.rank = distributed.get_rank() + self.video_fps = video_fps + self.generation_mode = generation_mode + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + config_job = self.config.job + self.local_dir = f"{config_job.path_local}/{self.name}" + if distributed.get_rank() == 0: + os.makedirs(self.local_dir, exist_ok=True) + log.info(f"Callback: local_dir: {self.local_dir}") + + self.data_parallel_id = self.rank + + # Check if model has sound tokenizer + self.has_sound = hasattr(model, "tokenizer_sound_gen") and model.tokenizer_sound_gen is not None + if self.has_sound: + self.audio_sample_rate = model.tokenizer_sound_gen.sample_rate + log.info( + f"[{self.name}] Audio-video callback initialized: " + f"audio_sample_rate={self.audio_sample_rate}, is_ema={self.is_ema}" + ) + else: + self.audio_sample_rate = 48000 + log.warning(f"[{self.name}] Model does not have 
tokenizer_sound_gen, audio sampling disabled.") + + @torch.no_grad() + def every_n_impl(self, trainer, model, data_batch, output_batch, loss, iteration): + if self.is_ema: + if not model.config.ema.enabled: + return + context = partial(model.ema_scope, "every_n_av_sampling") + else: + context = nullcontext + + tag = "ema" if self.is_ema else "reg" + + with context(): + sample_results = self.sample(trainer, model, data_batch, output_batch, loss, iteration) + dist.barrier() + + if wandb.run and self.rank == 0: + info = {"trainer/global_step": iteration} + + # Log video grid + if sample_results.get("video_grid_path"): + info[f"{self.name}/{tag}_video"] = wandb.Image( + sample_results["video_grid_path"], + caption=f"iter={iteration}, guidance={self.guidance}", + ) + + # Log video+audio MP4s + for i, (mp4_path, caption) in enumerate(sample_results.get("video_audio_samples", [])): + if os.path.exists(mp4_path): + info[f"{self.name}/{tag}_video_audio_{i}"] = wandb.Video( + mp4_path, fps=self.video_fps, caption=caption + ) + + # Log standalone audio + for i, (audio_path, caption) in enumerate(sample_results.get("audio_samples", [])): + if os.path.exists(audio_path): + info[f"{self.name}/{tag}_audio_{i}"] = wandb.Audio( + audio_path, sample_rate=self.audio_sample_rate, caption=caption + ) + + # Log captions + if sample_results.get("captions"): + captions_text = "\n".join([f"{i}: {c}" for i, c in enumerate(sample_results["captions"])]) + info[f"{self.name}/{tag}_captions"] = wandb.Html(f"
<pre>{captions_text}</pre>
") + + wandb.log(info, step=iteration) + + torch.cuda.empty_cache() + + @misc.timer("EveryNDrawAudioVideoSample: sample") + def sample(self, trainer, model, data_batch, output_batch, loss, iteration): + """Generate audio-video samples and save results. + + Mode-aware behavior: + - t2vs/ti2sv: Decode both video and sound from model output + - tv2s: Use raw conditioning video for visualization, only decode sound + - ts2v: Only decode video, skip sound visualization (conditioned) + """ + data_batch = slice_data_batch(data_batch, start=0, limit=self.n_viz_sample) + + tag = "ema" if self.is_ema else "reg" + mode = self.generation_mode + results = {} + + # Get conditioning data + gen_data_clean = model.get_data_and_condition(data_batch) + raw_data = gen_data_clean.raw_state_vision + x0 = gen_data_clean.x0_tokens_vision + batch_size = len(x0) + + # Get captions for logging + captions = data_batch.get(model.input_caption_key, [""] * batch_size) + if isinstance(captions, torch.Tensor): + captions = ["[tensor]"] * batch_size + results["captions"] = captions[: self.n_viz_sample] + + # Determine what to decode based on mode + # tv2s: Video is conditioning input (use raw_data), only sound is generated + # ts2v: Sound is conditioning input, only video is generated + # t2vs/ti2sv: Both are generated + decode_video = mode not in ("tv2s",) # Skip video decode when video is conditioning + decode_sound = mode not in ("ts2v",) # Skip sound decode when sound is conditioning + + video_samples_all = [] + audio_samples_all = [] + + max_w = max(image.shape[-1] for image in raw_data) + max_h = max(image.shape[-2] for image in raw_data) + t_crop = min(image.shape[-3] for image in raw_data) + + for guidance in self.guidance: + sample_output = model.generate_samples_from_batch( + data_batch, + guidance=guidance, + n_sample=self.n_viz_sample, + num_steps=self.num_sampling_step, + seed=list(range(iteration, iteration + self.n_viz_sample)), + ) + + # Video handling based on mode — decode one at a 
time and move to CPU immediately + if decode_video: + sample_vision = sample_output.get("vision", []) + if sample_vision and hasattr(model, "decode"): + decoded_cpu = [] + for i in range(len(sample_vision)): + dec = model.decode(sample_vision[i]).float().cpu() + decoded_cpu.append(dec) + video_samples_all.append(pad_images_and_cat(decoded_cpu, max_w, max_h, t_crop)) + del decoded_cpu + + # Sound handling based on mode — decode one at a time and move to CPU immediately + if decode_sound: + sound_latents = sample_output.get("sound", []) + if sound_latents and self.has_sound: + audio_cpu = [] + for s in sound_latents: + dec = model.decode_sound(s).float().cpu() + audio_cpu.append(dec) + audio_samples_all.append(audio_cpu) + del audio_cpu + + # Free all GPU memory from this guidance iteration before next one + del sample_output + torch.cuda.empty_cache() + + # For tv2s: Use raw conditioning video instead of decoded (avoids wasteful VAE round-trip) + if mode == "tv2s": + conditioning_video = ( + pad_images_and_cat(raw_data[: min(self.n_viz_sample, batch_size)], max_w, max_h, t_crop).float().cpu() + ) + # Use conditioning video for all guidance scales (same video, different audio) + video_for_mp4 = conditioning_video + else: + video_for_mp4 = None # Will use per-guidance decoded video below + + # Add ground truth video for comparison (skip for tv2s where video isn't generated) + if decode_video: + video_samples_all.append( + pad_images_and_cat(raw_data[: min(self.n_viz_sample, batch_size)], max_w, max_h, t_crop).float().cpu() + ) + + # Save video grid (skip for tv2s — video evaluation should be done separately) + if video_samples_all and decode_video: + video_grid_path = self._save_video_grid(video_samples_all, batch_size, tag, iteration) + if video_grid_path: + results["video_grid_path"] = video_grid_path + + # Save audio samples and video+audio MP4s + if audio_samples_all and self.rank == 0: + audio_paths = [] + video_audio_paths = [] + + # Get conditioning FPS for 
video playback + conditioning_fps = data_batch.get("conditioning_fps", None) + if conditioning_fps is not None and isinstance(conditioning_fps, (torch.Tensor, list)): + video_write_fps = float( + conditioning_fps[0].item() if isinstance(conditioning_fps, torch.Tensor) else conditioning_fps[0] + ) + else: + video_write_fps = self.video_fps + + for g_idx, audio_batch in enumerate(audio_samples_all): + # Determine which video to pair with audio for MP4 + if video_for_mp4 is not None: + # tv2s: Use conditioning video for all guidance scales + video_batch = video_for_mp4 + elif g_idx < len(video_samples_all) - 1: + # t2vs/ti2sv: Use decoded video from this guidance scale (exclude GT at end) + video_batch = video_samples_all[g_idx] + else: + video_batch = None + + for sample_idx in range(min(self.n_viz_sample, len(audio_batch))): + audio_waveform = audio_batch[sample_idx] # [C,N_samples] + + # Save standalone audio (the caption is built first because the MP4 entry reuses it) + caption = f"mode={mode}, guidance={self.guidance[g_idx]}, sample={sample_idx}" + if sample_idx < len(captions): + caption += f", caption: {captions[sample_idx][:100]}" + audio_path = self._save_audio(audio_waveform, tag, iteration, g_idx, sample_idx) + if audio_path: + audio_paths.append((audio_path, caption)) + + # Create video+audio MP4 + if video_batch is not None and sample_idx < video_batch.shape[0]: + video_tensor = video_batch[sample_idx] # [C,T,H,W] + mp4_path = self._save_video_with_audio( + video_tensor, + audio_waveform, + tag, + iteration, + g_idx, + sample_idx, + fps=video_write_fps, + ) + if mp4_path: + video_audio_paths.append((mp4_path, caption)) + + results["audio_samples"] = audio_paths + results["video_audio_samples"] = video_audio_paths + + return results + + def _save_video_grid( + self, video_samples: list[torch.Tensor], batch_size: int, tag: str, iteration: int + ) -> str | None: + """Save video samples as image grid for WandB.""" + if self.rank != 0 or not wandb.run: + return None + + to_show = (1.0 + torch.stack(video_samples, 
dim=0).clamp(-1, 1)) / 2.0 # [N_rows,B,C,T,H,W] range [0,1] + n_viz_sample = min(self.n_viz_sample, batch_size) + is_single_frame = to_show.shape[3] == 1 + + file_base_fp = f"{tag}_AV_Video_Iter{iteration:09d}.jpg" + local_path = f"{self.local_dir}/{file_base_fp}" + + if is_single_frame: + to_show = rearrange( + to_show[:, :n_viz_sample], "n b c t h w -> t c (n h) (b w)" + ) # [1,C,N_rows*H,B*W] (t=1) + image_grid = torchvision.utils.make_grid(to_show, nrow=1, padding=0, normalize=False) + torchvision.utils.save_image(resize_image(image_grid, 1024), local_path) + else: + to_show = to_show[:, :n_viz_sample] # [N_rows,B,C,T,H,W] + _T = to_show.shape[3] + three_frames_list = [0, _T // 2, _T - 1] + to_show = to_show[:, :, :, three_frames_list] # [N_rows,B,C,3,H,W] + to_show = rearrange(to_show, "n b c t h w -> 1 c (n h) (b t w)") # [1,C,N_rows*H,B*3*W] (t=3) + image_grid = torchvision.utils.make_grid(to_show, nrow=1, padding=0, normalize=False) + torchvision.utils.save_image(resize_image(image_grid, 1024), local_path) + + return local_path + + def _save_audio( + self, audio_waveform: torch.Tensor, tag: str, iteration: int, guidance_idx: int, sample_idx: int + ) -> str | None: + """Save audio waveform as WAV file.""" + if self.rank != 0: + return None + try: + import soundfile as sf + + file_name = f"{tag}_Audio_Iter{iteration:09d}_g{guidance_idx}_s{sample_idx}.wav" + local_path = f"{self.local_dir}/{file_name}" + + audio_np = audio_waveform.clamp(-1, 1).numpy() + if audio_np.ndim == 2: + audio_np = audio_np.T # [C, N] → [N, C] for soundfile + + sf.write(local_path, audio_np, self.audio_sample_rate) + return local_path + except Exception as e: + log.warning(f"Failed to save audio: {e}", rank0_only=False) + return None + + def _save_video_with_audio( + self, + video_tensor: torch.Tensor, + audio_tensor: torch.Tensor, + tag: str, + iteration: int, + guidance_idx: int, + sample_idx: int, + fps: float | None = None, + ) -> str | None: + """Create MP4 video with audio using 
ffmpeg.""" + video_fps = fps if fps is not None else self.video_fps + if self.rank != 0: + return None + try: + import soundfile as sf + + file_base = f"{tag}_VideoAudio_Iter{iteration:09d}_g{guidance_idx}_s{sample_idx}" + mp4_path = f"{self.local_dir}/{file_base}.mp4" + temp_video_path = f"{self.local_dir}/{file_base}_temp.mp4" + temp_audio_path = f"{self.local_dir}/{file_base}_temp.wav" + + # Save video frames as temp MP4 + video_frames = video_tensor.permute(1, 0, 2, 3) # [T,C,H,W] + video_frames = (video_frames.clamp(-1, 1) + 1) / 2 # [T,C,H,W] range [0,1] + video_frames = (video_frames * 255).to(torch.uint8) # [T,C,H,W] + + torchvision.io.write_video( + temp_video_path, + video_frames.permute(0, 2, 3, 1).cpu(), # [T,H,W,C] + fps=video_fps, + video_codec="libx264", + ) + + # Save audio as temp WAV + audio_np = audio_tensor.clamp(-1, 1).numpy() + if audio_np.ndim == 2: + audio_np = audio_np.T + sf.write(temp_audio_path, audio_np, self.audio_sample_rate) + + # Combine with ffmpeg + cmd = [ + "ffmpeg", + "-y", + "-i", + temp_video_path, + "-i", + temp_audio_path, + "-c:v", + "libx264", + "-c:a", + "aac", + "-shortest", + "-loglevel", + "error", + mp4_path, + ] + result = subprocess.run(cmd, capture_output=True, text=True) + if result.returncode != 0: + log.warning(f"ffmpeg failed: {result.stderr}") + return None + + # Cleanup + for f in [temp_video_path, temp_audio_path]: + if os.path.exists(f): + os.remove(f) + + return mp4_path + except Exception as e: + log.warning(f"Failed to create video with audio: {e}", rank0_only=False) + return None diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/every_n_draw_sample.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/every_n_draw_sample.py new file mode 100644 index 00000000..1863a915 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/every_n_draw_sample.py @@ -0,0 +1,438 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import math +import os +from contextlib import nullcontext +from functools import partial +from typing import List, Optional + +import numpy as np +import torch +import torch.distributed as dist +import torch.nn.functional as F +import torchvision +import torchvision.transforms.functional as torchvision_F +import wandb +from einops import rearrange + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed, log, misc +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.imaginaire.visualize.video import save_img_or_video +from cosmos3._src.vfm.utils.data_utils import slice_data_batch + + +def resize_image(image: torch.Tensor, size: int = 1024) -> torch.Tensor: + """ + Resize the image to the given size. This is done so that wandb can display the image correctly. 
+ """ + _, h, w = image.shape + ratio = size / max(h, w) + new_h, new_w = int(ratio * h), int(ratio * w) + return torchvision_F.resize(image, (new_h, new_w)) + + +def is_primitive(value): + return isinstance(value, (int, float, str, bool, type(None))) + + +def convert_to_primitive(value): + if isinstance(value, (list, tuple)): + return [convert_to_primitive(v) for v in value if is_primitive(v) or isinstance(v, (list, dict))] + elif isinstance(value, dict): + return {k: convert_to_primitive(v) for k, v in value.items() if is_primitive(v) or isinstance(v, (list, dict))} + elif is_primitive(value): + return value + else: + return "non-primitive" # Skip non-primitive types + + +def pad_images_and_cat(images: List[torch.Tensor], max_w: int, max_h: int, t_crop: int = 1) -> torch.Tensor: + """ + Pad images to a common size and concatenate them along the batch dimension. + + This function is needed because different samples in a batch can have different resolutions. + To create a unified visualization grid, all images must be padded to the same dimensions. + Images are center-padded to preserve their visual content in the middle. + + Args: + images: List of image/video tensors with shape [B, C, T, H, W]. + max_w: Target width to pad all images to. + max_h: Target height to pad all images to. + t_crop: Number of temporal frames to keep for videos. If > 1 and the image + has more than 1 frame, only the first t_crop frames are retained. + + Returns: + Concatenated tensor of padded images with shape [total_B, C, T, max_h, max_w]. 
+ """ + padded_images = [] + for image in images: + # Center-pad the image to the target size + padding_h = (max_h - image.shape[-2]) // 2 + padding_w = (max_w - image.shape[-1]) // 2 + padded_image = torch.nn.functional.pad( + image, (padding_w, max_w - image.shape[-1] - padding_w, padding_h, max_h - image.shape[-2] - padding_h) + ) # [B,C,T,max_h,max_w] + # Handle video case + if image.shape[2] > 1 and t_crop > 1: + padded_image = padded_image[:, :, 0:t_crop, :, :] + + padded_images.append(padded_image) + return torch.cat(padded_images, dim=0) # [total_B,C,T,max_h,max_w] (total_B = sum of batch dims) + + +class EveryNDrawSample(EveryN): + """ + This callback samples conditioning inputs from the training data, runs inference, and saves the results to wandb and S3. + + Args: + every_n (int): The frequency at which the callback is invoked. + step_size (int, optional): The step size for the callback. Defaults to 1. + n_viz_sample (int, optional): for each batch, min(n_viz_sample, batch_size) samples will be saved to wandb. Defaults to 2. + n_sample_to_save (int, optional): number of samples to save. The actual number of samples to save is min(n_sample_to_save, data parallel instances). Defaults to 128. + num_sampling_step (int, optional): number of sampling steps. Defaults to 35. + guidance (List[float], optional): guidance scales to sample with. Defaults to [0.0, 3.0, 7.0]. + do_x0_prediction (bool, optional): whether to do x0 prediction. Defaults to True. + n_sigmas_for_x0_prediction (int, optional): number of sigmas to use for x0 prediction. Defaults to 4. + save_s3 (bool, optional): whether to save to S3. Defaults to False. + is_ema (bool, optional): whether the callback is run for the EMA model. Defaults to False. + use_negative_prompt (bool, optional): whether to use a negative prompt. Defaults to False. + fps (int, optional): frames per second when saving the video. Defaults to 16. 
+ """ + + def __init__( + self, + every_n: int, + step_size: int = 1, + n_viz_sample: int = 2, + n_sample_to_save: int = 128, + num_sampling_step: int = 35, + guidance: Optional[List[float]] = None, + do_x0_prediction: bool = True, + n_sigmas_for_x0_prediction: int = 4, + save_s3: bool = False, + save_local: bool = False, + is_ema: bool = False, + use_negative_prompt: bool = False, + prompt_type: str = "t5_xxl", + fps: int = 16, + run_at_start: bool = False, + ): + # s3: number of files = min(n_sample_to_save, data parallel instances); samples per file = min(batch_size, n_viz_sample) + # wandb: 1 file; samples per file = min(batch_size, n_viz_sample) + super().__init__(every_n, step_size, run_at_start=run_at_start) + + self.n_viz_sample = n_viz_sample + self.n_sample_to_save = n_sample_to_save + self.save_s3 = save_s3 + self.save_local = save_local + self.do_x0_prediction = do_x0_prediction + self.n_sigmas_for_x0_prediction = n_sigmas_for_x0_prediction + self.name = self.__class__.__name__ + self.is_ema = is_ema + self.use_negative_prompt = use_negative_prompt + self.prompt_type = prompt_type + self.guidance = guidance if guidance is not None else [0.0, 3.0, 7.0] + self.num_sampling_step = num_sampling_step + self.rank = distributed.get_rank() + self.fps = fps + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + config_job = self.config.job + self.local_dir = f"{config_job.path_local}/{self.name}" + if distributed.get_rank() == 0: + os.makedirs(self.local_dir, exist_ok=True) + log.info(f"Callback: local_dir: {self.local_dir}") + + self.data_parallel_id = self.rank + + @misc.timer("EveryNDrawSample: x0") + @torch.no_grad() + def x0_pred(self, trainer, model, data_batch, output_batch, loss, iteration): + tag = "ema" if self.is_ema else "reg" + + log.debug("starting data and condition model", rank0_only=False) + + + data_clean = model.get_data_and_condition(data_batch) + raw_data = data_clean.raw_state_vision + x0 = data_clean.x0_tokens_vision + + # Handle model parallelism if available (legacy models) + if 
hasattr(model, "broadcast_split_for_model_parallelsim"): + _, condition, x0, _ = model.broadcast_split_for_model_parallelsim(None, None, x0, None) + + log.debug("done data and condition model", rank0_only=False) + batch_size = len(x0) + sigmas = np.exp( + np.linspace( + math.log(model.sde.sigma_min), math.log(model.sde.sigma_max), self.n_sigmas_for_x0_prediction + 1 + )[1:] + ) + + to_show = [] + generator = torch.Generator(device="cuda") + generator.manual_seed(0) + random_noise = torch.randn(*x0.shape, generator=generator, **model.tensor_kwargs) # same shape as x0 + _ones = torch.ones(batch_size, **model.tensor_kwargs) # [B] + mse_loss_list = [] + for _, sigma in enumerate(sigmas): + x_sigma = sigma * random_noise + x0 + log.debug(f"starting denoising {sigma}", rank0_only=False) + sample = model.denoise(x_sigma, None).x0 + log.debug(f"done denoising {sigma}", rank0_only=False) + mse_loss = distributed.dist_reduce_tensor(F.mse_loss(sample, x0)) + mse_loss_list.append(mse_loss) + + if hasattr(model, "decode"): + sample = model.decode(sample) + to_show.append(sample.float().cpu()) + to_show.append( + raw_data.float().cpu(), + ) + + base_fp_wo_ext = f"{tag}_ReplicateID{self.data_parallel_id:04d}_x0_Iter{iteration:09d}" + + local_path = self.run_save(to_show, batch_size, base_fp_wo_ext) + return local_path, torch.tensor(mse_loss_list).cuda(), sigmas # [N_sigmas] + + @torch.no_grad() + def every_n_impl(self, trainer, model, data_batch, output_batch, loss, iteration): + if self.is_ema: + if not model.config.ema.enabled: + return + context = partial(model.ema_scope, "every_n_sampling") + else: + context = nullcontext + + tag = "ema" if self.is_ema else "reg" + sample_counter = getattr(trainer, "sample_counter", iteration) + batch_info = { + "data": { + k: convert_to_primitive(v) + for k, v in data_batch.items() + if is_primitive(v) or isinstance(v, (list, dict)) + }, + "sample_counter": sample_counter, + "iteration": iteration, + } + if self.save_s3 and 
self.data_parallel_id < self.n_sample_to_save: + easy_io.dump( + batch_info, + f"s3://rundir/{self.name}/Iter{iteration:09d}/BatchInfo_ReplicateID{self.data_parallel_id:04d}_Iter{iteration:09d}.json", + ) + + log.debug("entering, every_n_impl", rank0_only=False) + with context(): + if self.do_x0_prediction: + log.debug("entering, x0_pred", rank0_only=False) + x0_img_fp, mse_loss, sigmas = self.x0_pred( + trainer, + model, + data_batch, + output_batch, + loss, + iteration, + ) + log.debug("done, x0_pred", rank0_only=False) + if self.save_s3 and self.rank == 0: + easy_io.dump( + { + "mse_loss": mse_loss.tolist(), + "sigmas": sigmas.tolist(), + "iteration": iteration, + }, + f"s3://rundir/{self.name}/{tag}_MSE_Iter{iteration:09d}.json", + ) + + log.debug("entering, sample", rank0_only=False) + sample_img_fp = self.sample( + trainer, + model, + data_batch, + output_batch, + loss, + iteration, + ) + log.debug("done, sample", rank0_only=False) + + log.debug("waiting for all ranks to finish", rank0_only=False) + dist.barrier() + if wandb.run: + sample_counter = getattr(trainer, "sample_counter", iteration) + data_type = "image" if model.is_image_batch(data_batch) else "video" + tag += f"_{data_type}" + info = { + "trainer/global_step": iteration, + "sample_counter": sample_counter, + } + if self.do_x0_prediction: + info[f"{self.name}/{tag}_x0"] = wandb.Image(x0_img_fp, caption=f"{sample_counter}") + # convert mse_loss to a dict + mse_loss = mse_loss.tolist() + info.update({f"x0_pred_mse_{tag}/Sigma{sigmas[i]:0.5f}": mse_loss[i] for i in range(len(mse_loss))}) + + info[f"{self.name}/{tag}_sample"] = wandb.Image(sample_img_fp, caption=f"{sample_counter}") + wandb.log( + info, + step=iteration, + ) + torch.cuda.empty_cache() + + @misc.timer("EveryNDrawSample: sample") + def sample(self, trainer, model, data_batch, output_batch, loss, iteration): + data_batch = slice_data_batch(data_batch, start=0, limit=self.n_viz_sample) + + tag = "ema" if self.is_ema else "reg" + + # 
Obtain text embeddings online + text_encoder_config = getattr(model.config, "text_encoder_config", None) + if text_encoder_config is not None and text_encoder_config.compute_online: + text_embeddings = model.text_encoder.compute_text_embeddings_online(data_batch, model.input_caption_key) + data_batch["t5_text_embeddings"] = text_embeddings + data_batch["t5_text_mask"] = torch.ones( + text_embeddings.shape[0], text_embeddings.shape[1], device="cuda" + ) # [B,N_tokens] (all tokens valid) + + data_clean = model.get_data_and_condition(data_batch) + raw_data = data_clean.raw_state_vision + x0 = data_clean.x0_tokens_vision + + # determine the number of visualized samples + n_viz_sample = min(self.n_viz_sample, data_clean.batch_size) + + # Check if this is a multi-item vision batch (image editing) + num_items = data_clean.num_vision_items_per_sample + is_multi_item = num_items is not None + + if is_multi_item: + # Image editing: raw_data is flat [src1, tgt1, src2, tgt2, ...]. + # Split into per-sample condition (source) and GT target images. + condition_images: list[torch.Tensor] = [] + gt_target_images: list[torch.Tensor] = [] + vis_offset = 0 + for sample_idx in range(data_clean.batch_size): + n_vis = num_items[sample_idx] + # First item(s) are condition, last item is generation target + + # but we need to support multiple conditions per sample in the future. Current code + # can handle this without throwing an error. 
+ condition_images.append(raw_data[vis_offset]) # source image (1, C, 1, H, W) + gt_target_images.append(raw_data[vis_offset + n_vis - 1]) # target image (1, C, 1, H, W) + vis_offset += n_vis + + # Use target images for max_w/max_h/t_crop (generated samples match target size) + max_w = max(img.shape[-1] for img in gt_target_images) + max_h = max(img.shape[-2] for img in gt_target_images) + t_crop = min(img.shape[-3] for img in gt_target_images) + else: + max_w = max(image.shape[-1] for image in raw_data) + max_h = max(image.shape[-2] for image in raw_data) + t_crop = min(image.shape[-3] for image in raw_data) + + to_show = [] + + # Row 0 (image editing only): condition (source) images + if is_multi_item: + to_show.append(pad_images_and_cat(condition_images[:n_viz_sample], max_w, max_h, t_crop).float().cpu()) + + for guidance in self.guidance: + sample = model.generate_samples_from_batch( + data_batch, + guidance=guidance, + n_sample=n_viz_sample, + num_steps=self.num_sampling_step, + has_negative_prompt=True if self.use_negative_prompt else False, + seed=list(range(iteration, iteration + n_viz_sample)), + ) + sample_vision = sample["vision"] + assert hasattr(model, "decode") + sample_vision_decoded = [model.decode(sample_vision_i) for sample_vision_i in sample_vision] + assert len(sample_vision_decoded) == n_viz_sample + to_show.append(pad_images_and_cat(sample_vision_decoded, max_w, max_h, t_crop).float().cpu()) + + # Last row: ground truth + if is_multi_item: + # Image editing: show GT target images (not the flat raw_data which mixes src + tgt) + assert len(gt_target_images) == n_viz_sample + to_show.append(pad_images_and_cat(gt_target_images, max_w, max_h, t_crop).float().cpu()) + else: + assert len(raw_data) == n_viz_sample + to_show.append(pad_images_and_cat(raw_data, max_w, max_h, t_crop).float().cpu()) + + base_fp_wo_ext = f"{tag}_ReplicateID{self.data_parallel_id:04d}_Sample_Iter{iteration:09d}" + base_fp_wo_ext = f"Iter{iteration:09d}/{base_fp_wo_ext}" + + 
+ batch_size = data_clean.batch_size + local_path = self.run_save(to_show, batch_size, base_fp_wo_ext) + return local_path + + def run_save(self, to_show, batch_size, base_fp_wo_ext) -> Optional[str]: + to_show = (1.0 + torch.stack(to_show, dim=0).clamp(-1, 1)) / 2.0 # [N_rows,B,C,T,H,W] range [0,1] + is_single_frame = to_show.shape[3] == 1 + n_viz_sample = min(self.n_viz_sample, batch_size) + to_show = to_show[:, :n_viz_sample] + + # Only replicas with data_parallel_id < n_sample_to_save save a video. + video_grid = rearrange(to_show, "n b c t h w -> c t (n h) (b w)") # [C,T,N_rows*H,B*W] + if self.save_s3 and self.data_parallel_id < self.n_sample_to_save: + save_img_or_video( + video_grid, + f"s3://rundir/{self.name}/{base_fp_wo_ext}", + fps=self.fps, + ) + if self.save_local and self.data_parallel_id < self.n_sample_to_save: + local_video_path = f"{self.local_dir}/{base_fp_wo_ext}" + os.makedirs(os.path.dirname(local_video_path), exist_ok=True) + save_img_or_video(video_grid, local_video_path, fps=self.fps) + + file_base_fp = f"{base_fp_wo_ext}_resize.jpg" + local_path = f"{self.local_dir}/{file_base_fp}" + + if self.rank == 0 and wandb.run: + if is_single_frame: # image case + to_show = rearrange( + to_show[:, :n_viz_sample], + "n b c t h w -> t c (n h) (b w)", + ) # [1,C,N_rows*H,B*W] (t=1 for images) + image_grid = torchvision.utils.make_grid(to_show, nrow=1, padding=0, normalize=False) + # resize so that wandb can handle it + os.makedirs(os.path.dirname(local_path), exist_ok=True) + torchvision.utils.save_image(resize_image(image_grid, 1024), local_path, nrow=1, scale_each=True) + else: + to_show = to_show[:, :n_viz_sample] # [N_rows,B,C,T,H,W] + + # sample 3 frames (first, middle, last) so that we can display them on wandb + _T = to_show.shape[3] + three_frames_list = [0, _T // 2, _T - 1] + to_show = to_show[:, :, :, three_frames_list] # [N_rows,B,C,3,H,W] (3 sampled frames) + log_image_size = 1024 + to_show = rearrange( + to_show, + "n b c t h w -> 1 c (n h) (b t w)", + ) # [1,C,N_rows*H,B*3*W] 
(t=3 sampled frames) + + os.makedirs(os.path.dirname(local_path), exist_ok=True) + # resize so that wandb can handle it + image_grid = torchvision.utils.make_grid(to_show, nrow=1, padding=0, normalize=False) + os.makedirs(os.path.dirname(local_path), exist_ok=True) + torchvision.utils.save_image( + resize_image(image_grid, log_image_size), local_path, nrow=1, scale_each=True + ) + + return local_path + return None diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/every_n_dur_fps_draw_sample.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/every_n_dur_fps_draw_sample.py new file mode 100644 index 00000000..24d1a3b2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/every_n_dur_fps_draw_sample.py @@ -0,0 +1,333 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import re +from contextlib import nullcontext +from functools import partial + +import torch +import torch.distributed as dist +import torchvision +import wandb +from einops import rearrange, repeat + +from cosmos3._src.imaginaire.utils import log, misc +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.imaginaire.visualize.video import save_img_or_video +from cosmos3._src.vfm.callbacks.every_n_draw_sample import ( + EveryNDrawSample, + convert_to_primitive, + is_primitive, + pad_images_and_cat, + resize_image, +) +from cosmos3._src.vfm.utils.data_utils import slice_data_batch + + +class EveryNDurationFPSDrawSample(EveryNDrawSample): + """ + Callback to visualize samples with specific Duration/FPS metadata control. + It performs two types of generation: + 1. Standard generation (using the batch as-is). + 2. "Consistent" generation: Rewrites the duration/FPS metadata in the caption + to match the actual video FPS and generated frame count, then generates again. + The "Consistent" results are logged as individual windows per sample to show + the exact caption used. 
+ """ + + @misc.timer("EveryNDurationFPSDrawSample: sample") + def sample(self, trainer, model, data_batch, output_batch, loss, iteration): + data_batch = slice_data_batch(data_batch, start=0, limit=self.n_viz_sample) + + tag = "ema" if self.is_ema else "reg" + results = {} + + # Obtain text embeddings online + text_encoder_config = getattr(model.config, "text_encoder_config", None) + if text_encoder_config is not None and text_encoder_config.compute_online: + text_embeddings = model.text_encoder.compute_text_embeddings_online(data_batch, model.input_caption_key) + data_batch["t5_text_embeddings"] = text_embeddings + data_batch["t5_text_mask"] = torch.ones( + text_embeddings.shape[0], text_embeddings.shape[1], device="cuda" + ) # [B,N_tokens] (all tokens valid) + + data_clean = model.get_data_and_condition(data_batch) + raw_data = data_clean.raw_state_vision + x0 = data_clean.x0_tokens_vision + + # Setup negative prompts if needed + if self.use_negative_prompt: + batch_size = len(x0) + if self.negative_prompt_data["t5_text_embeddings"].shape != data_batch["t5_text_embeddings"].shape: + data_batch["neg_t5_text_embeddings"] = misc.to( + repeat( + self.negative_prompt_data["t5_text_embeddings"], + "... 
-> b ...", + b=batch_size, + ), # [B,N_tokens,D] + **model.tensor_kwargs, + ) + else: + data_batch["neg_t5_text_embeddings"] = misc.to( + self.negative_prompt_data["t5_text_embeddings"], + **model.tensor_kwargs, + ) + data_batch["neg_t5_text_mask"] = data_batch["t5_text_mask"] + + # Compute max dimensions for padding (supports variable shapes) + max_w = max(image.shape[-1] for image in raw_data) + max_h = max(image.shape[-2] for image in raw_data) + t_crop = min(image.shape[-3] for image in raw_data) + + # Helper to run generation for a specific data batch configuration + def _generate_and_save(batch_to_use, suffix="", split_batch=False, save_fps=None): + to_show = [] + for guidance in self.guidance: + sample = model.generate_samples_from_batch( + batch_to_use, + guidance=guidance, + n_sample=self.n_viz_sample, + num_steps=self.num_sampling_step, + has_negative_prompt=True if self.use_negative_prompt else False, + seed=list(range(iteration, iteration + self.n_viz_sample)), + ) + sample_vision = sample["vision"] + if hasattr(model, "decode"): + sample_vision_decoded = [model.decode(sample_vision[i]) for i in range(len(sample_vision))] + else: + sample_vision_decoded = sample_vision + to_show.append(pad_images_and_cat(sample_vision_decoded, max_w, max_h, t_crop).float().cpu()) + + to_show.append( + pad_images_and_cat(raw_data[: len(sample_vision_decoded)], max_w, max_h, t_crop).float().cpu() + ) + + base_fp_wo_ext = f"{tag}_ReplicateID{self.data_parallel_id:04d}_Sample_Iter{iteration:09d}{suffix}" + batch_size = len(x0) + + if split_batch: + # When splitting, run_save_split returns keys like "_0", "_1" + # We need to prepend the suffix (e.g. "_consistent") to these keys + # so they become "_consistent_0", "_consistent_1" + split_results = self.run_save_split(to_show, batch_size, base_fp_wo_ext, save_fps=save_fps) + return {f"{suffix}{k}": v for k, v in split_results.items()} + else: + return self.run_save(to_show, batch_size, base_fp_wo_ext) + + # 1. 
Standard generation + results[""] = _generate_and_save(data_batch) + + # 2. "Consistent Duration/FPS" Variation + is_video = not model.is_image_batch(data_batch) + input_caption_key = getattr(model, "input_caption_key", "ai_caption") + + if is_video and input_caption_key in data_batch and "conditioning_fps" in data_batch: + original_captions = data_batch[input_caption_key] + + batch_copy = data_batch.copy() + fps_values = data_batch["conditioning_fps"] + + new_captions = [] + for i, cap in enumerate(original_captions): + new_captions.append(cap) + + batch_copy[input_caption_key] = new_captions + + # Re-compute embeddings + if text_encoder_config is not None and text_encoder_config.compute_online: + text_embeddings = model.text_encoder.compute_text_embeddings_online(batch_copy, input_caption_key) + batch_copy["t5_text_embeddings"] = text_embeddings + batch_copy["t5_text_mask"] = torch.ones( + text_embeddings.shape[0], text_embeddings.shape[1], device="cuda" + ) # [B,N_tokens] (all tokens valid) + if "neg_t5_text_embeddings" in batch_copy: + pass + + # Pass captions back so we can log them + batch_result = _generate_and_save(batch_copy, suffix="_consistent", split_batch=True, save_fps=fps_values) + # Attach captions to the result dictionary for logging + batch_result["__captions__"] = new_captions + results.update(batch_result) + + return results + + def run_save_split(self, to_show, batch_size, base_fp_wo_ext, save_fps=None) -> dict: + """ + Similar to run_save but splits the batch into individual images. 
+ """ + to_show = (1.0 + torch.stack(to_show, dim=0).clamp(-1, 1)) / 2.0 # [N_rows,B,C,T,H,W] range [0,1] + n_viz_sample = min(self.n_viz_sample, batch_size) + + # We assume video here since we checked is_video + to_show_full = to_show[:, :n_viz_sample] # [N_rows,B,C,T,H,W] - Keep full video for S3 saving + _T = to_show_full.shape[3] + + # Save individual FULL videos to S3 if enabled (before 3-frame reduction) + if self.save_s3 and self.data_parallel_id < self.n_sample_to_save: + for i in range(n_viz_sample): + # Extract individual FULL video from batch + individual_video = to_show_full[:, i : i + 1] # [N_rows,1,C,T,H,W] + + # Get FPS for this specific batch item + item_fps = self.fps # fallback + if save_fps is not None: + if isinstance(save_fps, torch.Tensor): + item_fps = save_fps[i].item() if save_fps.ndim > 0 else save_fps.item() + elif isinstance(save_fps, (list, tuple)): + item_fps = save_fps[i] + else: + item_fps = float(save_fps) + + # Save individual FULL video to S3 + save_img_or_video( + rearrange(individual_video, "n b c t h w -> c t (n h) (b w)"), # [C,T,N_rows*H,W] + f"s3://rundir/{self.name}/{base_fp_wo_ext}_{i}", + fps=item_fps, + ) + + # NOW reduce to 3 frames for WandB visualization only + three_frames_list = [0, _T // 2, _T - 1] + to_show_3frames = to_show_full[:, :, :, three_frames_list] # [N_rows,B,C,3,H,W] + log_image_size = 1024 + 
+ paths = {} + for i in range(n_viz_sample): + sample_data = to_show_3frames[:, i : i + 1] # [N_rows,1,C,3,H,W] + sample_grid_data = rearrange(sample_data, "n b c t h w -> 1 c (n h) (b t w)") # [1,C,N_rows*H,3*W] (t=3) + + sample_path = f"{self.local_dir}/{base_fp_wo_ext}_{i}_resize.jpg" + if self.rank == 0: + image_grid = torchvision.utils.make_grid(sample_grid_data, nrow=1, padding=0, normalize=False) + torchvision.utils.save_image( + resize_image(image_grid, log_image_size), sample_path, nrow=1, scale_each=True + ) + paths[f"_{i}"] = sample_path + return paths + + @torch.no_grad() + def every_n_impl(self, trainer, model, data_batch, output_batch, loss, iteration): + if self.is_ema: + if not model.config.ema.enabled: + return + context = partial(model.ema_scope, "every_n_sampling") + else: + context = nullcontext + + tag = "ema" if self.is_ema else "reg" + sample_counter = getattr(trainer, "sample_counter", iteration) + # Log batch info logic from base class... 
+ batch_info = { + "data": { + k: convert_to_primitive(v) + for k, v in data_batch.items() + if is_primitive(v) or isinstance(v, (list, dict)) + }, + "sample_counter": sample_counter, + "iteration": iteration, + } + if self.save_s3 and self.data_parallel_id < self.n_sample_to_save: + easy_io.dump( + batch_info, + f"s3://rundir/{self.name}/BatchInfo_ReplicateID{self.data_parallel_id:04d}_Iter{iteration:09d}.json", + ) + + log.debug("entering, every_n_impl", rank0_only=False) + with context(): + # Skipping x0_pred for brevity in this specialized callback + log.debug("entering, sample", rank0_only=False) + sample_img_paths = self.sample( + trainer, + model, + data_batch, + output_batch, + loss, + iteration, + ) + log.debug("done, sample", rank0_only=False) + dist.barrier() + + if wandb.run: + data_type = "image" if model.is_image_batch(data_batch) else "video" + tag += f"_{data_type}" + info = { + "trainer/global_step": iteration, + "sample_counter": sample_counter, + } + # Handle dictionary of paths + if isinstance(sample_img_paths, dict): + # Retrieve captions if available + consistent_captions = sample_img_paths.get("__captions__", []) + + # Log standard (key "") + if "" in sample_img_paths: + info[f"{self.name}/{tag}_sample"] = wandb.Image(sample_img_paths[""], caption=f"{sample_counter}") + + # Log consistent variations (keys "_consistent_0", etc) + for suffix, path in sample_img_paths.items(): + if suffix == "" or suffix == "__captions__": + continue + + caption_text = f"{sample_counter}{suffix}" + + if "_consistent_" in suffix: + try: + idx = int(suffix.split("_")[-1]) + if idx < len(consistent_captions): + full_caption = consistent_captions[idx] + # Extract duration and FPS values and prepend for WandB display + duration_match = re.search(r"(\d+\.?\d*)\s+seconds?", full_caption) + fps_match = re.search(r"(\d+\.?\d*)\s+FPS", full_caption, re.IGNORECASE) + + if duration_match and fps_match: + duration = duration_match.group(1) + fps = fps_match.group(1) + 
caption_text = f"(Dur: {duration}s, FPS: {fps}fps) {full_caption}" + else: + caption_text = full_caption # No metadata found, use as-is + except Exception as e: + log.warning(f"Failed to parse suffix '{suffix}' for caption lookup: {e}") + + info[f"{self.name}/{tag}_sample{suffix}"] = wandb.Image(path, caption=caption_text) + + wandb.log(info, step=iteration) + torch.cuda.empty_cache() diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/expert_heatmap.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/expert_heatmap.py new file mode 100644 index 00000000..e30d022d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/expert_heatmap.py @@ -0,0 +1,120 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import matplotlib.pyplot as plt +import torch +import wandb +from torch.distributed.tensor import DTensor, Partial + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.qwen3_vl_moe import Qwen3VLMoeTextSparseMoeBlock + + +def compute_expert_heatmap(vfm: torch.nn.Module) -> dict[str, torch.Tensor]: + """ + Compute the heatmap for the MoE blocks in the language model. 
+ + The heatmap is a dictionary keyed by tower name ("und", "gen"); each value is + a tensor of shape (num_layers, num_experts). + + Each element is the fraction of tokens routed to a given expert in a given layer, + accumulated over all iterations. Each row should therefore sum to the number + of experts per token for the MoE model (config.num_experts_per_tok). + + For dense models, the heatmap is an empty dictionary. + """ + with torch.no_grad(): + num_layers = len(vfm.language_model.model.layers) + + example_dtensor = vfm.language_model.model.layers[0].self_attn.q_proj.weight + if isinstance(example_dtensor, DTensor): + assert hasattr(example_dtensor, "device_mesh") + device_mesh = example_dtensor.device_mesh + else: + device_mesh = None + + expert_heatmaps = {} + for tower in ["und", "gen"]: + expert_heatmaps_per_layer = [] + + for layer_idx in range(num_layers): + layer_module = vfm.language_model.model.layers[layer_idx] + mlp_module = layer_module.mlp if tower == "und" else layer_module.mlp_moe_gen + if isinstance(mlp_module, Qwen3VLMoeTextSparseMoeBlock): + # This is accumulated across all iterations. + total_tokens_per_expert = mlp_module.get_total_tokens_per_expert() + total_tokens = mlp_module.get_total_tokens() + + # Compute the average across all ranks. + assert device_mesh is not None, "MoE models require multiple GPUs." 
+ total_tokens_per_expert = DTensor.from_local( + total_tokens_per_expert, + device_mesh=device_mesh, + placements=[Partial()] * device_mesh.ndim, + ).full_tensor() + total_tokens = DTensor.from_local( + total_tokens, + device_mesh=device_mesh, + placements=[Partial()] * device_mesh.ndim, + ).full_tensor() + + mean_tokens_per_expert = total_tokens_per_expert.float() / total_tokens.float() # [num_experts] + expert_heatmaps_per_layer.append(mean_tokens_per_expert) + + if len(expert_heatmaps_per_layer) > 0: + expert_heatmaps[tower] = torch.stack(expert_heatmaps_per_layer, dim=0) # [num_layers,num_experts] + + return expert_heatmaps + + +class ExpertHeatmap(EveryN): + """ + Plots the expert heatmap for the MoE blocks in the language model. + + Args: + every_n (int): Number of iterations to log the expert heatmap. + """ + + def __init__(self, every_n: int = 1000): + super().__init__(every_n=every_n) + + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: + expert_heatmaps = compute_expert_heatmap(model.net) + + if distributed.is_rank0() and wandb.run: + for tower, heatmap in expert_heatmaps.items(): + fig, ax = plt.subplots() + im = ax.imshow(heatmap.cpu().numpy()) + ax.set_xlabel("Experts") + ax.set_ylabel("Layers") + plt.colorbar(im, ax=ax) + wandb.log( + { + f"expert_heatmap/{tower}": fig, + }, + step=iteration, + ) + plt.close(fig) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/generation.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/generation.py new file mode 100644 index 00000000..5e045b95 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/generation.py @@ -0,0 +1,144 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import glob +import os +from pathlib import Path + +import einops +import numpy as np +import torch +import torchvision +import wandb +from PIL import Image + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import distributed, log + + +class Generation(EveryN): + def __init__( + self, + every_n: int = 500, + num_vis: int = 10, + ): + r""" + This callback runs full generation on the current data batch. + The generated images are saved under the job's local directory and logged to wandb. 
+ + Args: + every_n (int): Call this callback every_n steps + num_vis (int): Number of visualizations to save + """ + super().__init__(every_n) + self.num_vis = num_vis + + def on_train_start(self, model: torch.nn.Module, iteration: int = 0) -> None: + config_job = self.config.job + self.local_dir = f"{config_job.path_local}/generation" + if distributed.get_rank() == 0: + os.makedirs(self.local_dir, exist_ok=True) + log.info(f"Callback: local_dir: {self.local_dir}") + + @torch.inference_mode() + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: torch.nn.Module, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: + if not hasattr(model, "run_pipe_for_data_batch"): + log.warning("Model does not have run_pipe_for_data_batch method, skipping generation") + return + if model.config.train_mllm_only: + log.warning("Skipping generation in MLLM only mode") + return + assert ( + len(data_batch["diffusion_media_input"].shape) == 5 and data_batch["diffusion_media_input"].shape[1] == 3 + ), ( + f"`diffusion_media_input` must have the shape of (bs, 3, T, H, W), current shape is {data_batch['diffusion_media_input'].shape}" + ) + + log.info(f"Generating video for iteration {iteration}, data_batch keys: {data_batch.keys()}") + video = model.run_pipe_for_data_batch(data_batch) + + input_video = data_batch["diffusion_media_input"] # [B,3,T,H,W] + + log.info(f"Video list length: {len(video)}") + rank = distributed.get_rank() + output_path = os.path.join(self.local_dir, f"iter_{iteration}") + os.makedirs(os.path.dirname(output_path), exist_ok=True) + + B, _, T, height, width = input_video.shape + gt_image = einops.rearrange(input_video, "B C T H W -> (B T) C H W") # [BT,3,H,W] + gt_image = ((gt_image + 1) / 2).clamp(0, 1).float() # [BT,3,H,W], range: 0-1 + gt_grid = torchvision.utils.make_grid(gt_image, nrow=B * T, padding=0).cpu() # [3,H,W*BT] + + video = torch.stack( + 
[torch.from_numpy(np.array(image.resize((width, height))) / 255.0) for image in video], dim=0 + ) # [BT,H,W,3] + video = einops.rearrange(video, "BT H W C -> BT C H W") # [BT,3,H,W] + video_grid = torchvision.utils.make_grid(video, nrow=B * T, padding=0).cpu() # [3,H,W*BT] + if video_grid.shape[2] < gt_grid.shape[2]: + # the output from sampling function is less than the ground truth, so we need to pad the video_grid on the left + pad_width = gt_grid.shape[2] - video_grid.shape[2] + video_grid = torch.nn.functional.pad(video_grid, (pad_width, 0)) # [3,H,W*BT] + video_grid[:, :, :pad_width] = gt_grid[ + :, :, :pad_width + ] # Pad the generated grid with the ground truth images + + log.info(f"video_grid: {video_grid.shape}, gt_grid: {gt_grid.shape}") + display_image = torch.stack([video_grid, gt_grid], dim=0) # [2,3,H,W*BT] + display_image = torchvision.utils.make_grid( + display_image, nrow=1, padding=2, pad_value=1.0 + ) # [3,H_total,W_total] + + log.info( + f"Generated image: {video[0].shape} -> {video_grid.shape}, gt_image: {gt_image[0].shape} -> {gt_grid.shape} | display_image: {display_image.shape}" + ) + if rank <= self.num_vis: + display_image = einops.rearrange(display_image.numpy(), "C H W -> H W C") * 255.0 # [H,W,3] + display_image = display_image.astype(np.uint8) + display_image = Image.fromarray(display_image) + current_width, current_height = display_image.size + # reduce the image size to half + display_image = display_image.resize((current_width // 2, current_height // 2)) + display_image.save(output_path + f"_rank_{rank}.jpg") + caption_list = data_batch["raw_captions"][0] + with open(output_path + f"_rank_{rank}.txt", "w") as f: + f.write("top: generation, bottom: ground truth. 
Left to right: condition, generation\n") + f.write(caption_list) + + # barrier + distributed.barrier() + if rank == 0 and wandb.run is not None: + file_list = ( + sorted(glob.glob(output_path + "*.jpg"))[: self.num_vis] + + sorted(glob.glob(output_path + "*.mp4"))[: self.num_vis] + ) + caption_file_list = [file.replace(".jpg", ".txt").replace(".mp4", ".txt") for file in file_list] + caption_list = [Path(caption_file).read_text() for caption_file in caption_file_list] + wandb.log( + { + "vis/generation": [ + wandb.Image(file, caption=caption) for file, caption in zip(file_list, caption_list) + ] + }, + step=iteration, + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/grad_clip.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/grad_clip.py new file mode 100644 index 00000000..5f0e3d03 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/grad_clip.py @@ -0,0 +1,340 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import math +from collections import defaultdict + +import torch +import wandb +from torch.distributed.tensor import DTensor +from torch.nn.parallel import DistributedDataParallel + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.callback import Callback + + +@torch.compile +def _fused_nan_to_num(grads: list[torch.Tensor]) -> None: + """Replace NaN/Inf entries with 0.0 in every floating-point grad in-place. + + The Python-level loop over ``grads`` is wrapped in ``@torch.compile`` so + Inductor can fuse the per-tensor ``nan_to_num`` ops into a single CUDA + kernel. This is NOT the ``torch._foreach_*`` API; it is fusion-via-compile + and depends on the grad list structure (length, dtypes, shapes) staying + stable across iterations so Dynamo can reuse its specialized graph. + """ + grads = [g for g in grads if torch.is_floating_point(g)] + for g in grads: + torch.nan_to_num(g, nan=0.0, posinf=0.0, neginf=0.0, out=g) + + +class _MagnitudeRecord: + def __init__(self) -> None: + self.state: torch.Tensor | None = None + self.iter_count: int = 0 + + def reset(self) -> None: + self.state = None + self.iter_count = 0 + + def update(self, cur_state: torch.Tensor) -> None: + if self.state is None: + self.state = cur_state.detach().clone() + else: + self.state.add_(cur_state) + self.iter_count += 1 + + def get_stat(self) -> float: + if self.state is not None and self.iter_count > 0: + avg_state = (self.state / self.iter_count).item() + else: + avg_state = 0.0 + self.reset() + return avg_state + + +@torch.no_grad() +def _clip_grad( + parameters: list[torch.Tensor], + max_norm: float, + norm_type: float = 2.0, + error_if_nonfinite: bool = False, + foreach: bool | None = None, + return_norm_only: bool = False, +) -> tuple[torch.Tensor, dict[str, torch.Tensor]]: + """ + Clip the gradient norm of an iterable of parameters. + + Gradient norm clipping requires computing the gradient norm over the entire model. 
+ `torch.nn.utils.clip_grad_norm_` only computes gradient norm along DP/FSDP/TP dimensions. + We need to manually reduce the gradient norm across PP stages. + See https://github.com/pytorch/torchtitan/issues/596 for details. + + Params are grouped by their ``device_mesh`` (by mesh-dim-names string — + plain (non-DTensor) params map to ``"default"``). A scalar L2 norm is + computed per mesh group, DTensor results are reduced to local scalars + via ``.full_tensor()``, the per-mesh scalars are combined into one + global norm, and (unless ``return_norm_only=True``) every mesh group + is rescaled with that single global scalar. + + Args: + parameters: an iterable of Tensors or a single Tensor that will have gradients normalized + max_norm (float): max norm of the gradients + norm_type (float): type of the used p-norm. Can be ``'inf'`` for + infinity norm. + error_if_nonfinite (bool): if True, an error is thrown if the total + norm of the gradients from :attr:`parameters` is ``nan``, + ``inf``, or ``-inf``. Default: False (will switch to True in the future) + foreach (bool): use the faster foreach-based implementation. + If ``None``, use the foreach implementation for CUDA and CPU native tensors and silently + fall back to the slow implementation for other device types. + Default: ``None`` + return_norm_only: if True, skip in-place rescaling of grads and only + return the computed norms. + + Returns: + ``(total_norm, per_mesh_norms)`` where ``total_norm`` is the global + scalar norm used for the rescale, and ``per_mesh_norms`` maps each + mesh-dim-names key (or ``"default"`` for plain params) to its + pre-clip per-mesh L2 norm. + + """ + # Group the parameters by their device meshes. 
+ parameters_by_mesh: dict[str, list[torch.Tensor]] = defaultdict(list) + for param in parameters: + if param.grad is None: + raise ValueError( + f"_clip_grad received a parameter with no gradient " + f"(shape={tuple(param.shape)}, dtype={param.dtype}); " + "callers are expected to pre-filter." + ) + + # If one parameter belongs to multiple meshes, use a flattened mesh name + # by concatenating all the mesh-dim names together. ``mesh_dim_names`` + # is ``tuple[str, ...] | None`` on DeviceMesh — fall back to ``default`` + # when names weren't assigned. + if hasattr(param, "device_mesh"): + names = param.device_mesh.mesh_dim_names + device_mesh_str = "-".join(names) if names else "default" + else: + device_mesh_str = "default" + parameters_by_mesh[device_mesh_str].append(param) + + # Compute the norm for each mesh group + per_mesh_norms: dict[str, torch.Tensor] = {} + per_mesh_norm_list = [] + for mesh, params in parameters_by_mesh.items(): + # Every param reached here passed the ``param.grad is None`` check in + # the grouping loop above, so this list comprehension is total. + grads = [p.grad for p in params] + mesh_norm = torch.nn.utils.get_total_norm(grads, norm_type, error_if_nonfinite, foreach) + + # If mesh_norm is a DTensor, the placements must be + # `torch.distributed._tensor.ops.math_ops._NormPartial`. + # We can simply reduce the DTensor to get the total norm in this + # tensor's process group and then convert it to a local tensor. + + # 1. to make sure the total norm is computed correctly when PP is used (see below) + # 2. to return a reduced mesh_norm tensor whose .item() would return the correct value + if isinstance(mesh_norm, DTensor): + # Will reach here if any non-PP parallelism is used. + # If only using PP, mesh_norm will be a local tensor. + + # Remove FT replicate dimension if it exists. + mesh_norm = mesh_norm.full_tensor() + # Expose the (rank-replicated) per-mesh scalar for diagnostic logging. 
+ per_mesh_norms[mesh] = mesh_norm + + # Make the norm to be a 1D tensor so we can call cat() later. + if mesh_norm.ndim == 0: + mesh_norm = mesh_norm.reshape(1) + per_mesh_norm_list.append(mesh_norm) + + # Compute the total norm among all meshes. + if len(per_mesh_norm_list) > 1: + per_mesh_norm_tensor = torch.cat(per_mesh_norm_list) + if math.isinf(norm_type): + total_norm = torch.max(per_mesh_norm_tensor) + else: + per_mesh_norm_tensor **= norm_type + total_norm = torch.sum(per_mesh_norm_tensor) + total_norm **= 1.0 / norm_type + else: + assert per_mesh_norm_list[0].numel() == 1, "total_norm should be a scalar" + total_norm = per_mesh_norm_list[0].view(-1)[0] + + if not return_norm_only: + # Perform clipping on each mesh group + for mesh, params in parameters_by_mesh.items(): + torch.nn.utils.clip_grads_with_norm_(params, max_norm, total_norm, foreach) + + return total_norm, per_mesh_norms + + +class GradClip(Callback): + """Unified gradient-clipping callback for both VFM (diffusion) and VLM training. + + The heavy lifting is delegated to ``_clip_grad``: it groups + params by their ``device_mesh`` (using mesh-dim-names as the key), + computes a scalar L2 norm per mesh group (reducing any DTensor result + via ``.full_tensor()``), combines the per-mesh scalars into ONE global + norm via ``sqrt(sum(per_mesh_norm**2))``, and applies + ``torch.nn.utils.clip_grads_with_norm_`` per mesh group with the SAME + global scalar — a SINGLE GLOBAL rescale across every parameter. + + This is necessary for correctness when parameters live on multiple device + meshes (e.g. dense FSDP-shard + EP-shard MoE experts): clipping each + mesh independently with stock ``torch.nn.utils.clip_grad_norm_`` would + assign a different rescale factor per mesh and distort the relative + magnitudes of dense vs MoE updates. 
Under VFM's current FSDP-only + setup the math reduces to a single mesh group and is identical to + stock ``clip_grad_norm_``; this implementation is forward-correct + once EP is enabled. + + For diagnostics, the callback ALSO records pre-clip per-mesh sub-norms + alongside the actual global norm. When ``track_per_modality=True`` (VFM), + samples are bucketed by image/video via ``model.is_image_batch(data_batch)``, + producing wandb keys ``clip_grad_norm/{image|video}/{mesh_key}`` plus a + ``.../global`` synthetic key carrying the actual rescale norm. When False + (VLM), keys are ``clip_grad_norm/{mesh_key}`` plus ``clip_grad_norm/global``. + + Param-source semantics: + * ``track_per_modality=True`` (VFM): caller passes the ``OmniMoTModel``; + only ``model_ddp.net.parameters()`` is iterated, matching legacy VFM + behavior (the optimizer is built from ``self.net``). + * ``track_per_modality=False`` (VLM): caller passes a single + ``nn.Module`` or a list of model parts; each is unwrapped from + ``DistributedDataParallel`` if needed, then ``parameters()`` is + iterated and filtered by grad-presence. + + Args: + clip_norm: max norm to clip to. + force_finite: if True, NaN/Inf in any grad is zeroed in-place before + the norm computation. + track_per_modality: if True, route stats into image/video buckets via + ``model.is_image_batch(data_batch)``. If False, accumulate into a + single un-bucketed log group. + """ + + def __init__( + self, + clip_norm: float = 1.0, + force_finite: bool = True, + track_per_modality: bool = False, + ): + self.clip_norm = clip_norm + self.force_finite = force_finite + self.track_per_modality = track_per_modality + + # Outer key: modality bucket name. For VLM we use a single bucket "" so + # wandb keys are short (`clip_grad_norm/{mesh}`); for VFM the bucket is + # "image" or "video" (`clip_grad_norm/image/{mesh}`). + # Inner key: mesh string, plus the synthetic "global" key for the + # actual rescale norm returned by _clip_grad. 
+ self._states: dict[str, dict[str, _MagnitudeRecord]] = defaultdict(lambda: defaultdict(_MagnitudeRecord)) + self._state_key: str = "" + + def on_training_step_start( + self, + model: torch.nn.Module, + data_batch: dict[str, torch.Tensor], + iteration: int = 0, + ) -> None: + if not self.track_per_modality: + return + self._state_key = "image" if model.is_image_batch(data_batch) else "video" + + def on_before_optimizer_step( + self, + model_ddp: torch.nn.Module | list[torch.nn.Module], + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int = 0, + ) -> None: + del optimizer, scheduler, grad_scaler + + # 1. Resolve which parameters to clip. + if self.track_per_modality: + # VFM: only clip `.net` params, matching legacy semantics + the + # optimizer's actual param set. + assert not isinstance(model_ddp, list), "track_per_modality=True expects a single OmniMoTModel, not a list" + model = model_ddp.module if isinstance(model_ddp, DistributedDataParallel) else model_ddp + model_parts = [model.net] + else: + # VLM: list of model parts (or single module). Unwrap DDP per part. + model_parts = model_ddp if isinstance(model_ddp, list) else [model_ddp] + model_parts = [m.module if isinstance(m, DistributedDataParallel) else m for m in model_parts] + + # 2. Collect params with grads. + all_params: list[torch.Tensor] = [] + for part in model_parts: + for p in part.parameters(): + if p.grad is not None: + all_params.append(p) + + # 3. No-grad / all-frozen step → skip. _clip_grad's empty + # fallback uses torch.cuda.current_device() and would crash on CPU. + if not all_params: + return + + # 4. Optionally zero NaN/Inf in grads. + if self.force_finite: + _fused_nan_to_num([p.grad for p in all_params]) + + # 5. Compute per-mesh norms, the global rescale norm, and clip in + # one call. 
``_clip_grad`` groups params by mesh, + # computes per-mesh L2 norms (reducing DTensor results to local + # scalars), combines them into a single global norm, and + # rescales every mesh group with that scalar. + # + # When ``force_finite`` is False we did NOT sanitize the grads, so + # ask ``get_total_norm`` to raise rather than silently producing a + # NaN ``total_norm`` that would taint every parameter on rescale. + global_norm, per_mesh_norms = _clip_grad( + all_params, + self.clip_norm, + error_if_nonfinite=not self.force_finite, + foreach=True, + ) + + # 6. Record diagnostic stats: pre-clip per-mesh sub-norms plus the + # actual global rescale norm. + cur_state = self._states[self._state_key] + for mesh_str, mesh_norm in per_mesh_norms.items(): + cur_state[mesh_str].update(mesh_norm) + cur_state["global"].update(global_norm) + + # 7. Log every logging_iter. The reset is intentionally *outside* + # the ``wandb.run`` gate: ``_MagnitudeRecord.get_stat`` is the + # consumer that flushes the windowed accumulator, so coupling it + # to wandb being live would let stats accumulate unboundedly + # whenever wandb is disabled (smoke tests, ``job.wandb_mode=disabled``, + # wandb init failure) and would back-fill any later wandb enablement + # with the entire pre-enable history. 
+ if iteration % self.config.trainer.logging_iter == 0: + log_dict: dict[str, float | int] = {"iteration": iteration} + for modality, state in self._states.items(): + for mesh_str, record in state.items(): + avg = record.get_stat() + if self.track_per_modality: + key = f"clip_grad_norm/{modality}/{mesh_str}" + else: + key = f"clip_grad_norm/{mesh_str}" + log_dict[key] = avg + if mesh_str == "global": + log.info(f"{key}: {avg:.5f} (iteration {iteration})", rank0_only=False) + if wandb.run: + wandb.log(log_dict, step=iteration) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/heart_beat.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/heart_beat.py new file mode 100644 index 00000000..e39a0970 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/heart_beat.py @@ -0,0 +1,106 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import time +from datetime import datetime + +import pytz +import torch + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +class HeartBeat(EveryN): + """ + A callback that logs a heartbeat message at regular intervals to indicate that the training process is still running. + + Args: + every_n (int): The frequency at which the callback is invoked. + step_size (int, optional): The step size for the callback. Defaults to 1. + update_interval_in_minute (int, optional): The interval in minutes for logging the heartbeat. Defaults to 20 minutes. + save_s3 (bool, optional): Whether to save the heartbeat information to S3. Defaults to False. + """ + + def __init__(self, every_n: int, step_size: int = 1, update_interval_in_minute: int = 20, save_s3: bool = False): + super().__init__(every_n=every_n, step_size=step_size) + self.name = self.__class__.__name__ + self.update_interval_in_minute = update_interval_in_minute + self.save_s3 = save_s3 + self.pst = pytz.timezone("America/Los_Angeles") + self.is_hitted = False + + @distributed.rank0_only + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + self.time = time.time() + if self.save_s3: + current_time_pst = datetime.now(self.pst).strftime("%Y_%m_%d-%H_%M_%S") + info = { + "iteration": iteration, + "time": current_time_pst, + } + easy_io.dump(info, f"s3://rundir/{self.name}_start.yaml") + easy_io.dump(info, f"s3://timestamps_rundir/{self.name}_start.yaml") + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + if not self.is_hitted: + self.is_hitted = True + if distributed.get_rank() == 0: + 
self.report(iteration) + super().on_training_step_end(model, data_batch, output_batch, loss, iteration) + + @distributed.rank0_only + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: + if time.time() - self.time > 60 * self.update_interval_in_minute: + self.report(iteration) + + def report(self, iteration: int = 0): + self.time = time.time() + if self.save_s3: + current_time_pst = datetime.now(self.pst).strftime("%Y_%m_%d-%H_%M_%S") + info = { + "iteration": iteration, + "time": current_time_pst, + } + easy_io.dump(info, f"s3://rundir/{self.name}.yaml") + + @distributed.rank0_only + def on_train_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + if self.save_s3: + current_time_pst = datetime.now(self.pst).strftime("%Y_%m_%d-%H_%M_%S") + info = { + "iteration": iteration, + "time": current_time_pst, + } + easy_io.dump(info, f"s3://rundir/{self.name}_end.yaml") + easy_io.dump(info, f"s3://timestamps_rundir/{self.name}_end.yaml") diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/hf_export.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/hf_export.py new file mode 100644 index 00000000..494d7033 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/hf_export.py @@ -0,0 +1,471 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +"""HFExportCallback: export VLM DCP checkpoints to HuggingFace safetensors format. + +Design notes +------------ +- Hooks into ``on_save_checkpoint`` (called by DistributedCheckpointer.save() before I/O). +- All ranks participate in the weight-gather phase (DTensor.full_tensor() all-gathers). +- Rank 0 accumulates CPU tensors, writes shards, and uploads — other ranks exit early. +- File I/O and upload run in a background thread on rank 0 to avoid blocking training. +- Worker exceptions are stored in ``_worker_exception`` and re-raised on the next + checkpoint or at train end, so failures are never silently swallowed. +- Controlled entirely via ``config.checkpoint.hf_export`` (HFExportConfig). + +Phase 2+ note +------------- +Weight parameters are iterated via ``model.model._model.named_parameters()`` where +``model.model`` is the ``HFModel`` wrapper and ``model.model._model`` is the underlying +HuggingFace transformer. Parameter names are already HF-native — no weight_mapper +remapping is required. +""" + +import json +import os +import shutil +import threading +from typing import Any + +import torch + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.callback import Callback +from cosmos3._src.imaginaire.utils.distributed import is_rank0 + +try: + from safetensors.torch import save_file as _safetensors_save_file +except ImportError: + _safetensors_save_file = None + +try: + from transformers import AutoTokenizer, GenerationConfig +except ImportError: + AutoTokenizer = None + GenerationConfig = None + +# Map string dtype names (as stored in ParallelismConfig.precision) to torch dtypes. 
+_DTYPE_MAP: dict[str, torch.dtype] = { + "float32": torch.float32, + "float16": torch.float16, + "bfloat16": torch.bfloat16, + "float64": torch.float64, +} + + +def _upload_folder_to_s3(local_folder: str, bucket: str, s3_prefix: str, credential_path: str) -> None: + """Upload every file under *local_folder* to ``s3://{bucket}/{s3_prefix}/...``. + + Uses the i4 ``easy_io`` S3 backend (Boto3Backend), which reads credentials from + *credential_path*. Files are uploaded as streaming transfers via boto3's + ``upload_file()`` — the full shard is never loaded into memory. + """ + from cosmos3._src.imaginaire.utils.easy_io import easy_io + + backend = easy_io.get_file_backend( + backend_args={ + "backend": "s3", + "s3_credential_path": credential_path, + "path_mapping": None, + } + ) + for root, _, files in os.walk(local_folder): + for fname in sorted(files): + local_path = os.path.join(root, fname) + rel = os.path.relpath(local_path, local_folder) + s3_path = f"s3://{bucket}/{s3_prefix}/{rel}" + # Pass the local path string so Boto3Backend uses upload_file() — + # a streaming transfer that avoids reading the whole shard into memory. + backend.put(local_path, s3_path) + log.info(f"[HFExportCallback] Uploaded {local_path} → {s3_path}") + + +class HFExportCallback(Callback): + """Export VLM weights to HuggingFace-compatible safetensors after each DCP checkpoint. + + Enabled / configured via ``config.checkpoint.hf_export`` (HFExportConfig). Disabled + by default — add this callback and set ``hf_export.enabled = True`` to activate. + + Exports written to:: + + {job.path_local}/hf_exports/iter_{iteration:09d}/ + 00000.safetensors + ... + model.safetensors.index.json + config.json + tokenizer.json (if tokenizer can be loaded from model_name_or_path) + + Optionally uploads to: + - S3 (``hf_export.upload_to_object_store``) + - HuggingFace Hub (``hf_export.hf_repo_id``) + + Args: + dtype: Export weight dtype (e.g. ``"bfloat16"``). 
Use + ``"${model.config.policy.parallelism.precision}"`` in the Hydra callback config to + inherit from the training precision. + """ + + # HuggingFace convention: max 4 GB per shard file. + _MAX_SHARD_BYTES: int = 4 * 1024**3 + + def __init__(self, dtype: str = "bfloat16") -> None: + self._export_dtype: torch.dtype | None = _DTYPE_MAP.get(dtype) + self._current_iteration: int = 0 + self._export_thread: threading.Thread | None = None + # Stores any exception raised inside the background worker so it can be + # re-raised on the main thread at the next checkpoint or train end. + self._worker_exception: BaseException | None = None + + # ------------------------------------------------------------------ + # Callback hooks + # ------------------------------------------------------------------ + + def on_save_checkpoint_start(self, model: Any, iteration: int = 0) -> None: + self._current_iteration = iteration + + def on_save_checkpoint(self, model: Any, state_dict: dict[str, Any]) -> None: + hf_cfg = self.config.checkpoint.hf_export + if not hf_cfg.enabled: + return + + iteration = self._current_iteration + if iteration % hf_cfg.export_every_n != 0: + return + + # Deferred import to avoid circular dependency at module load time. + from cosmos3._src.vfm.models.vlm_model import VLMModel + + if not isinstance(model, VLMModel): + # The legacy vlm/train.py path passes model_parts: list[nn.Module] (raw HF + # models without the VLMModel attribute structure). HF export requires the + # VLMModel wrapper, which is only available via the unified scripts/train.py path. + if isinstance(model, list): + log.warning( + "[HFExportCallback] Received model_parts (list) instead of VLMModel. " + "HF export requires the unified training path (scripts/train.py). Skipping." 
+ ) + else: + log.warning( + "[HFExportCallback] model is not VLMModel (got %s); skipping HF export.", + type(model).__name__, + ) + return + + if _safetensors_save_file is None: + raise ImportError("safetensors is required for HFExportCallback. Install it with: pip install safetensors") + + output_dir = os.path.join(self.config.job.path_local, "hf_exports", f"iter_{iteration:09d}") + + # ---------------------------------------------------------------- + # Phase 1 (all ranks): gather sharded parameters into CPU chunks. + # full_tensor() is a collective operation — all ranks must participate. + # ---------------------------------------------------------------- + cpu_chunks, manifest, total_size = self._gather_weights(model) + + # ---------------------------------------------------------------- + # Phase 2 (rank 0, background thread): file I/O + optional upload. + # ---------------------------------------------------------------- + if not is_rank0(): + return + + # Block on any still-running export from the previous checkpoint and + # propagate any worker exception before starting a new export. 
+ if self._export_thread is not None and self._export_thread.is_alive(): + log.warning( + "[HFExportCallback] Previous export thread still running; waiting before starting export for iter %d.", + iteration, + ) + self._export_thread.join() + + if self._worker_exception is not None: + exc = self._worker_exception + self._worker_exception = None + raise RuntimeError(f"[HFExportCallback] Previous export failed with: {exc}") from exc + + self._export_thread = threading.Thread( + target=self._save_and_upload, + args=(cpu_chunks, manifest, total_size, model.hf_config, model.model_name_or_path, output_dir, iteration), + daemon=True, + ) + self._export_thread.start() + + def on_train_end(self, model: Any, iteration: int = 0) -> None: + """Wait for the final export thread so the process does not exit prematurely.""" + if self._export_thread is not None and self._export_thread.is_alive(): + log.info("[HFExportCallback] Waiting for export thread to finish...") + self._export_thread.join() + log.info("[HFExportCallback] Export thread done.") + + if self._worker_exception is not None: + exc = self._worker_exception + self._worker_exception = None + raise RuntimeError(f"[HFExportCallback] Export thread failed with: {exc}") from exc + + # ------------------------------------------------------------------ + # Internal helpers + # ------------------------------------------------------------------ + + def _gather_weights(self, model: Any) -> tuple[list[dict[str, torch.Tensor]], dict[str, str], int]: + """Iterate model parameters, all-gather DTensor shards, and build CPU chunks. + + Must be called on **all ranks**. Only rank 0 populates the returned + ``cpu_chunks`` and ``manifest``; other ranks return empty structures but still + participate in the distributed all-gathers. + + Returns: + cpu_chunks: List of ``{weight_name: cpu_tensor}`` dicts, one per shard file. + manifest: Mapping of ``weight_name → shard_filename``. 
+ total_size: Total byte count of all exported tensors (for the index JSON). + """ + cpu_chunks: list[dict[str, torch.Tensor]] = [] + manifest: dict[str, str] = {} + current_chunk: dict[str, torch.Tensor] = {} + current_chunk_bytes: int = 0 + total_size: int = 0 + file_idx: int = 0 + + for name, param in model.model._model.named_parameters(): + # Phase 2+: HFModel initialises _model via AutoModelForImageTextToText / + # AutoModelForCausalLM, so parameter names are HF-native and match the + # safetensors checkpoint keys loaded by _load_vlm_weights(). + # + # MoE note: Qwen3VLMoeTextExpertsGroupedMm stores expert weights in HF-native + # grouped layout — gate_up_proj: [E, H, 2F], down_proj: [E, F, H] — matching + # the checkpoint format exactly. No transposition or per-expert fan-out is + # needed. (The legacy Phase 0 path stored tensors in a transposed internal + # format [E, 2F, H] under the name "gate_and_up_projs" and required + # weight_mapper.policy_map_local_key_for_export_tensor() to transpose back on + # export. Phase 2 uses HFModel and has no such internal reformat.) + # + # torch.compile and gradient-checkpointing wrappers inject prefixes into + # named_parameters() output. Strip them so exported keys are HF-native, + # matching what HFModel._load_vlm_weights() does for the in-memory state dict. + name = name.replace("_orig_mod.", "").replace("_checkpoint_wrapped_module.", "") + + # Gather across FSDP / TP / CP ranks (collective — all ranks must call). + if isinstance(param, torch.distributed.tensor.DTensor): + param = param.full_tensor() + param = param.detach() + if self._export_dtype is not None: + param = param.to(dtype=self._export_dtype) + + tensor_bytes = param.element_size() * param.numel() + + # Flush the current chunk when the shard size limit would be exceeded. + # current_chunk_bytes is tracked on ALL ranks so shard boundaries are + # consistent (the shard_name written into manifest must agree everywhere). 
+ if current_chunk_bytes + tensor_bytes > self._MAX_SHARD_BYTES and current_chunk_bytes > 0: + if is_rank0(): + cpu_chunks.append(current_chunk) + current_chunk = {} + file_idx += 1 + current_chunk_bytes = 0 + + shard_name = f"{file_idx:05d}.safetensors" + if is_rank0(): + current_chunk[name] = param.cpu() + manifest[name] = shard_name + total_size += tensor_bytes + current_chunk_bytes += tensor_bytes + + # Flush the final (possibly partial) chunk. + if current_chunk_bytes > 0 and is_rank0() and current_chunk: + cpu_chunks.append(current_chunk) + + return cpu_chunks, manifest, total_size + + def _save_and_upload( + self, + cpu_chunks: list[dict[str, torch.Tensor]], + manifest: dict[str, str], + total_size: int, + hf_config: Any, + model_name_or_path: str, + output_dir: str, + iteration: int, + ) -> None: + """Write safetensors shards, HF config, tokenizer; upload to S3 / HF Hub. + + Runs on rank 0 inside a background thread. Any exception is stored in + ``self._worker_exception`` so the main thread can re-raise it. + """ + try: + self._do_save_and_upload( + cpu_chunks, manifest, total_size, hf_config, model_name_or_path, output_dir, iteration + ) + except Exception as exc: + log.error( + "[HFExportCallback] Export worker for iter %d raised an exception: %s", + iteration, + exc, + exc_info=True, + ) + self._worker_exception = exc + + def _do_save_and_upload( + self, + cpu_chunks: list[dict[str, torch.Tensor]], + manifest: dict[str, str], + total_size: int, + hf_config: Any, + model_name_or_path: str, + output_dir: str, + iteration: int, + ) -> None: + """Core export logic (called from the background thread via ``_save_and_upload``). + + Error handling is tiered: + - Steps 1-4 (shards, index JSON, HF config, source-model file copy): any exception + propagates to the outer ``_save_and_upload`` wrapper so the main thread is notified. 
+ A failed file copy leaves the checkpoint unusable for trust_remote_code models, so + it is treated as a hard failure like the shard writes. + - Steps 5-8 (tokenizer, generation_config, S3 upload, HF Hub upload): failures are + treated as soft warnings. The tokenizer and generation config are best-effort; upload + failures do not invalidate the local safetensors export, so an outage must not abort + training. + """ + hf_cfg = self.config.checkpoint.hf_export + os.makedirs(output_dir, exist_ok=True) + log.info(f"[HFExportCallback] Writing iter {iteration} export to {output_dir}") + + # 1. Safetensors shards — one file per chunk (ordered by file_idx). + # Each chunk is cleared after writing so its tensors can be GC'd + # incrementally rather than being held until the whole loop completes. + for i in range(len(cpu_chunks)): + chunk = cpu_chunks[i] + shard_path = os.path.join(output_dir, f"{i:05d}.safetensors") + _safetensors_save_file(chunk, shard_path) + log.info(f"[HFExportCallback] Wrote {shard_path}") + cpu_chunks[i] = {} # release tensor references for GC + + # 2. model.safetensors.index.json + # total_size is pre-computed in _gather_weights to avoid needing chunks here. + index_json = {"metadata": {"total_size": total_size}, "weight_map": manifest} + index_path = os.path.join(output_dir, "model.safetensors.index.json") + with open(index_path, "w") as fh: + json.dump(index_json, fh, indent=4) + + # 3. HuggingFace model config. + hf_config.save_pretrained(output_dir) + + # 4. Copy missing .py/.json files for trust_remote_code models. + # Only applicable when model_name_or_path is a local directory. + # The full directory layout is preserved so nested packages referenced by + # auto_map are included (mirroring convert_checkpoint.py's copytree approach). + # Files already present in the export dir (e.g., config.json written by + # hf_config.save_pretrained) are never overwritten. 
+ # HARD failure: a broken copy leaves the checkpoint unloadable, so any I/O error + # propagates to the background-worker wrapper (same as shard writes). + if model_name_or_path and os.path.isdir(model_name_or_path): + real_src = os.path.realpath(model_name_or_path) + real_out = os.path.realpath(output_dir) + copied = [] + for root, dirs, files in os.walk(real_src): + real_root = os.path.realpath(root) + # Prune any subtree that is, leads to, or is inside output_dir. + # This prevents recursing into previously written export dirs when + # output_dir (or a parent of it) lives inside model_name_or_path. + dirs[:] = [ + d + for d in dirs + if not ( + (p := os.path.realpath(os.path.join(real_root, d))) == real_out + or p.startswith(real_out + os.sep) + or real_out.startswith(p + os.sep) + ) + ] + if real_root == real_out or real_root.startswith(real_out + os.sep): + continue + rel_dir = os.path.relpath(real_root, real_src) + for fname in files: + if not (fname.endswith(".py") or fname.endswith(".json")): + continue + src = os.path.join(real_root, fname) + dst_dir = output_dir if rel_dir == "." else os.path.join(output_dir, rel_dir) + dst = os.path.join(dst_dir, fname) + if not os.path.exists(dst): + os.makedirs(dst_dir, exist_ok=True) + shutil.copy2(src, dst) + copied.append(os.path.join(rel_dir, fname) if rel_dir != "." else fname) + if copied: + log.info(f"[HFExportCallback] Copied missing files from source model: {copied}") + + # 5. Tokenizer (best-effort — may fail for custom / gated models). + if AutoTokenizer is not None and model_name_or_path: + try: + tok = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True) + tok.save_pretrained(output_dir) + except Exception as exc: + log.warning(f"[HFExportCallback] Tokenizer save skipped: {exc}") + + # 6. Generation config (best-effort — not all models expose one). 
+ if GenerationConfig is not None and model_name_or_path: + try: + gen_cfg = GenerationConfig.from_pretrained(model_name_or_path, trust_remote_code=True) + gen_cfg.save_pretrained(output_dir) + except Exception as exc: + log.warning(f"[HFExportCallback] generation_config save skipped: {exc}") + + # 7. S3 upload — soft failure: local export is intact regardless of upload outcome. + obj_store = hf_cfg.upload_to_object_store + if obj_store.enabled: + s3_prefix = f"{self.config.job.path}/hf_exports/iter_{iteration:09d}" + try: + _upload_folder_to_s3(output_dir, obj_store.bucket, s3_prefix, obj_store.credentials) + log.info(f"[HFExportCallback] S3 upload done: s3://{obj_store.bucket}/{s3_prefix}") + except Exception as exc: + # Intentionally soft: an upload outage must not crash training. + log.warning(f"[HFExportCallback] S3 upload failed (local export intact): {exc}") + + # 8. HuggingFace Hub upload — soft failure: see comment above. + if hf_cfg.hf_repo_id: + self._upload_to_hf_hub(output_dir, hf_cfg.hf_repo_id) + + log.info(f"[HFExportCallback] Export complete for iter {iteration}.") + + @staticmethod + def _upload_to_hf_hub(output_dir: str, repo_id: str, max_retries: int = 3) -> None: + try: + from huggingface_hub import HfApi + except ImportError: + log.warning("[HFExportCallback] huggingface_hub not installed; skipping HF Hub upload.") + return + + api = HfApi() + for attempt in range(1, max_retries + 1): + try: + api.create_repo(repo_id=repo_id, exist_ok=True) + break + except Exception as exc: + log.warning(f"[HFExportCallback] create_repo attempt {attempt}/{max_retries} failed: {exc}") + if attempt == max_retries: + log.warning( + f"[HFExportCallback] Could not create HF Hub repo '{repo_id}' after " + f"{max_retries} attempts; skipping upload." 
+ ) + return + + for attempt in range(1, max_retries + 1): + try: + api.upload_folder( + folder_path=output_dir, + repo_id=repo_id, + commit_message=f"Upload checkpoint from {os.path.basename(output_dir)}", + ) + log.info(f"[HFExportCallback] Uploaded to HF Hub: {repo_id}") + return + except Exception as exc: + log.warning(f"[HFExportCallback] HF Hub upload attempt {attempt}/{max_retries} failed: {exc}") + + log.warning(f"[HFExportCallback] All {max_retries} HF Hub upload attempts failed for {repo_id}.") diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/iter_speed.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/iter_speed.py new file mode 100644 index 00000000..5d36ec19 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/iter_speed.py @@ -0,0 +1,120 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import time + +import torch +import wandb +from torch import Tensor + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.distributed import rank0_only +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +class IterSpeed(EveryN): + """ + Args: + hit_thres (int): Number of iterations to wait before logging. + save_s3 (bool): Whether to save to S3. + save_s3_every_log_n (int): Save to S3 every n-th log event, i.e. every save_s3_every_log_n * every_n global iterations. + """ + + def __init__(self, *args, hit_thres: int = 5, save_s3: bool = True, save_s3_every_log_n: int = 10, **kwargs): + super().__init__(*args, **kwargs) + self.time = None + self.hit_counter = 0 + self.hit_thres = hit_thres + self.save_s3 = save_s3 + self.save_s3_every_log_n = save_s3_every_log_n + self.name = self.__class__.__name__ + self.last_hit_time = time.time() + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + if self.hit_counter < self.hit_thres: + log.info( + f"Iteration {iteration}: " + f"Hit counter: {self.hit_counter + 1}/{self.hit_thres} | " + f"Loss: {loss.detach().item():.4f} | " + f"Time: {time.time() - self.last_hit_time:.2f}s", + rank0_only=False, + ) + self.hit_counter += 1 + self.last_hit_time = time.time() + # Useful for large-scale training: avoids OOM crashes in the first two iterations.
+ torch.cuda.synchronize() + return + super().on_training_step_end(model, data_batch, output_batch, loss, iteration) + + @rank0_only + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, Tensor], + output_batch: dict[str, Tensor], + loss: Tensor, + iteration: int, + ) -> None: + if self.time is None: + self.time = time.time() + return + cur_time = time.time() + iter_speed = (cur_time - self.time) / self.every_n / self.step_size + + log.info( + f"{iteration} : iter_speed {iter_speed:.2f} seconds per iteration | Loss: {loss.detach().item():.4f}", + rank0_only=False, + ) + + is_image_batch = model.is_image_batch(data_batch) + per_sample_batch_counter = dict() + if is_image_batch: + image_batch_size = len(data_batch[model.input_image_key]) + per_sample_batch_counter["image_batch_size"] = image_batch_size + else: + video_batch_size = len(data_batch[model.input_video_key]) + per_sample_batch_counter["video_batch_size"] = video_batch_size + + if wandb.run: + sample_counter = getattr(trainer, "sample_counter", iteration) + wandb.log( + { + "timer/iter_speed": iter_speed, + "sample_counter": sample_counter, + } + | per_sample_batch_counter, + step=iteration, + ) + self.time = cur_time + if self.save_s3: + if iteration % (self.save_s3_every_log_n * self.every_n) == 0: + easy_io.dump( + { + "iter_speed": iter_speed, + "iteration": iteration, + }, + f"s3://rundir/{self.name}/iter_{iteration:09d}.yaml", + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/learning_rate_logger.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/learning_rate_logger.py new file mode 100644 index 00000000..fa154188 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/learning_rate_logger.py @@ -0,0 +1,59 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import torch +import wandb + +from cosmos3._src.imaginaire.utils.callback import Callback + + +class LearningRateLogger(Callback): + """Logs per-model-part learning rate every ``every_n × logging_iter`` steps. + + Designed for VLM training where the optimizer is an + ``OptimizersContainer`` exposing ``.optimizers`` (list of single-element + optimizer lists) paired with ``.model_part_names``. Silently no-ops when + those attributes are absent so it can be registered alongside plain + ``torch.optim.Optimizer`` setups without harm. 
+ """ + + def __init__(self, every_n: int = 10): + self.every_n = every_n + + def on_before_optimizer_step( + self, + model_ddp: torch.nn.Module | list[torch.nn.Module], + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int = 0, + ) -> None: + del model_ddp, scheduler, grad_scaler + gate = self.config.trainer.logging_iter * self.every_n + if not (iteration == 1 or (gate > 0 and iteration % gate == 0)): + return + if not wandb.run: + return + if not (hasattr(optimizer, "optimizers") and hasattr(optimizer, "model_part_names")): + return + unique_lr: dict[str, float] = {} + for optim_per_model, name in zip(optimizer.optimizers, optimizer.model_part_names): + if not optim_per_model: + continue + for pg in optim_per_model[0].param_groups: + unique_lr[f"optim/lr_{name}"] = pg["lr"] + if not unique_lr: + return + wandb.log(unique_lr, step=iteration) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/load_pretrained.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/load_pretrained.py new file mode 100644 index 00000000..d140088d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/load_pretrained.py @@ -0,0 +1,29 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils.callback import Callback + + +class LoadPretrained(Callback): + def __init__(self): + r""" + This callback enables us to load pretrained model weights if needed. + Model weights are initialized from safetensors if not loaded already from DCP checkpoint. + """ + super().__init__() + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + model.load_pretrained_model_if_needed() diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/low_precision.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/low_precision.py new file mode 100644 index 00000000..0a0438fc --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/low_precision.py @@ -0,0 +1,53 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import sys +from typing import Union + +import torch + +from cosmos3._src.imaginaire.config import Config +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.callback import LowPrecisionCallback as BaseLowPrecisionCallback +from cosmos3._src.vfm.models.omni_mot_model import OmniMoTModel + + +class LowPrecisionCallback(BaseLowPrecisionCallback): + """ + A config with a non-primitive type makes the option difficult to override. + The callback gets precision from model.precision instead. + It is also automatically disabled when using fp32. + """ + + def __init__(self, config: Config, trainer: ImaginaireTrainer, update_iter: int): + self.config = config + self.trainer = trainer + self.update_iter = update_iter + + def on_train_start(self, model: Union[OmniMoTModel, list[OmniMoTModel]], iteration: int = 0) -> None: + if not isinstance(model, list): + model = [model] + for model_part in model: + if model_part.precision == torch.float32: + log.critical("Using fp32, should disable master weights.") + self.update_iter = sys.maxsize + else: + assert model_part.precision in [ + torch.bfloat16, + torch.float16, + torch.half, + ], "LowPrecisionCallback must use a low precision dtype." + self.precision_type = model_part.precision diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/mfu.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/mfu.py new file mode 100644 index 00000000..71a5a1be --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/mfu.py @@ -0,0 +1,318 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""MFU (Model FLOPs Utilization) callback for OmniMoT training. + +Computes and logs MFU metrics for specified hardware targets (e.g. H100, GB200) +by calculating the actual training FLOPs per step and comparing against +theoretical peak throughput. +""" + +from __future__ import annotations + +import time +from dataclasses import dataclass +from decimal import Decimal + +import torch +import wandb + +from cosmos3._src.imaginaire.attention.utils import is_blackwell_dc +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.flops import ( + OmniMoTModelDescriptor, + compute_omni_mot_flops_per_batch, + compute_wan_vae_encoder_flops, + get_omni_mot_model_descriptor, +) +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.distributed import rank0_only + + +@dataclass +class HardwareTarget: + """Specification of a hardware target for MFU computation. + + Attributes: + name: Human-readable name (used as W&B tag, e.g. "H100"). + peak_tflops: Theoretical peak throughput in TFLOPS (e.g. 989 for H100 BF16). + """ + + name: str + peak_tflops: float + + +# Pre-defined hardware targets +H100 = HardwareTarget(name="H100", peak_tflops=989.0) +GB200 = HardwareTarget(name="GB200", peak_tflops=2250.0) + + +class MFUCallback(EveryN): + """Callback that computes and logs Model FLOPs Utilization (MFU) to W&B. 
+ + MFU is defined as: + MFU = achieved_tflops_per_gpu / peak_tflops_per_gpu + + where achieved_tflops_per_gpu is computed from the model's theoretical + training FLOPs (forward + backward) divided by the measured wall-clock + time per step. + + This callback accumulates per-step FLOPs between logging intervals and + reports the average MFU over that window. + + Args: + backwardpass_ratio: Ratio of backward-to-forward FLOPs (default 2.0). + hit_thres: Number of warm-up iterations before logging begins. + include_vae_encoder: If True (default), include the Wan 2.2 VAE encoder + forward-pass FLOPs in the per-step total. The VAE is frozen during + training so only forward FLOPs are counted. + include_padding: If True, include FLOPs spent on padding tokens (the + causal split appended by sequence-packing finalize()). Gives a + ``total GPU FLOPs`` view instead of ``useful FLOPs`` only. + grad_accum_iter: Number of gradient accumulation steps per optimizer + update (default 1). When > 1, ``on_training_step_end`` is called + once per optimizer step but the wall-clock time covers all + micro-batches, so per-step FLOPs are multiplied by this count. 
+ """ + + def __init__( + self, + *args, + backwardpass_ratio: float = 2.0, + hit_thres: int = 5, + include_vae_encoder: bool = True, + include_padding: bool = True, + grad_accum_iter: int = 1, + **kwargs, + ) -> None: + super().__init__(*args, **kwargs) + self.hardware_target = GB200 if is_blackwell_dc() else H100 + self.backwardpass_ratio = backwardpass_ratio + self.hit_thres = hit_thres + self.include_vae_encoder = include_vae_encoder + self.include_padding = include_padding + self.grad_accum_iter = grad_accum_iter + + # Lazily initialised from model config on first call + self._model_descriptor: OmniMoTModelDescriptor | None = None + self._freeze_und: bool = False + self._vision_gen: bool = True + self._action_gen: bool = False + self._sound_gen: bool = False + self._world_size: int = 1 + self._use_activation_checkpointing: bool = False + + # Accumulation state between every_n windows + self._accumulated_flops = Decimal(0) + self._accumulated_flops_vae = Decimal(0) + self._steps_in_window: int = 0 + self._window_start_time: float | None = None + + # Warm-up counter + self._hit_counter: int = 0 + + # ------------------------------------------------------------------ # + # Lazy initialisation from model + # ------------------------------------------------------------------ # + + def _ensure_initialised(self, model: ImaginaireModel) -> None: + """Build the ``OmniMoTModelDescriptor`` from the live model config.""" + if self._model_descriptor is not None: + return + + # Access VLM config from the language model inside the network + vlm_cfg = model.net.language_model.config # type: ignore[attr-defined] + net_cfg = model.net.config # type: ignore[attr-defined] + + self._freeze_und = getattr(vlm_cfg, "freeze_und", False) + self._vision_gen = getattr(net_cfg, "vision_gen", True) + self._action_gen = getattr(net_cfg, "action_gen", False) + self._sound_gen = getattr(net_cfg, "sound_gen", False) + + # Read activation checkpointing from the model's parallelism config + 
model_cfg = getattr(model, "config", None) + parallelism_cfg = getattr(model_cfg, "parallelism", None) + self._use_activation_checkpointing = getattr(parallelism_cfg, "use_activation_checkpointing", False) + + # MoE fields (may not exist for dense-only configs) + use_moe = getattr(vlm_cfg, "use_moe", False) + num_experts = getattr(vlm_cfg, "num_experts", 0) + num_experts_per_tok = getattr(vlm_cfg, "num_experts_per_tok", 0) + moe_intermediate_size = getattr(vlm_cfg, "moe_intermediate_size", 0) + decoder_sparse_step = getattr(vlm_cfg, "decoder_sparse_step", 1) + mlp_only_layers = list(getattr(vlm_cfg, "mlp_only_layers", [])) + + self._model_descriptor = get_omni_mot_model_descriptor( + hidden_size=vlm_cfg.hidden_size, + num_hidden_layers=vlm_cfg.num_hidden_layers, + num_attention_heads=vlm_cfg.num_attention_heads, + num_key_value_heads=vlm_cfg.num_key_value_heads, + head_dim=getattr(vlm_cfg, "head_dim", None), + intermediate_size=vlm_cfg.intermediate_size, + vocab_size=vlm_cfg.vocab_size, + use_moe=use_moe, + num_experts=num_experts, + num_experts_per_tok=num_experts_per_tok, + moe_intermediate_size=moe_intermediate_size, + decoder_sparse_step=decoder_sparse_step, + mlp_only_layers=mlp_only_layers, + latent_patch_size=getattr(net_cfg, "latent_patch_size", 2), + latent_channel_size=getattr(net_cfg, "latent_channel_size", 48), + action_dim=getattr(net_cfg, "action_dim", 32), + sound_dim=getattr(net_cfg, "sound_dim", 64), + frequency_embedding_size=getattr(net_cfg, "frequency_embedding_size", 256), + predict_text_tokens=getattr(net_cfg, "predict_text_tokens", False), + ) + + self._world_size = torch.distributed.get_world_size() if torch.distributed.is_initialized() else 1 + + # ------------------------------------------------------------------ # + # Per-step accumulation + # ------------------------------------------------------------------ # + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, 
torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + # Warm-up: skip first few iterations (compilation, allocation, etc.) + if self._hit_counter < self.hit_thres: + self._hit_counter += 1 + return + + self._ensure_initialised(model) + + # Start the timing window on the first post-warmup step + if self._window_start_time is None: + self._window_start_time = time.monotonic() + + # Extract per-modality token counts from output_batch + und_token_length = output_batch.get("und_token_length") + if und_token_length is None: + return + + und_tokens = int(und_token_length) + vision_tokens = int(output_batch.get("vision_token_length", 0)) + action_tokens = int(output_batch.get("action_token_length", 0)) + sound_tokens = int(output_batch.get("sound_token_length", 0)) + + # Per-split attention metadata for packed sequences + split_lens: list[int] | None = output_batch.get("split_lens") + attn_modes_list: list[str] | None = output_batch.get("attn_modes") + + # Compute FLOPs for this per-device micro-batch. + # B = 1 because token counts are already summed across all samples in + # the packed sequence on this device. + assert self._model_descriptor is not None + step_flops = compute_omni_mot_flops_per_batch( + cfg=self._model_descriptor, + B=1, + text_tokens=und_tokens, + vision_tokens=vision_tokens, + action_tokens=action_tokens, + sound_tokens=sound_tokens, + freeze_und=self._freeze_und, + vision_gen=self._vision_gen, + action_gen=self._action_gen, + sound_gen=self._sound_gen, + backwardpass_ratio=self.backwardpass_ratio, + split_lens=split_lens, + attn_modes=attn_modes_list, + include_padding=self.include_padding, + use_activation_checkpointing=self._use_activation_checkpointing, + ) + + # VAE encoder forward-pass FLOPs (frozen, no backward). 
+ if self.include_vae_encoder: + vae_pixel_shapes = output_batch.get("vae_pixel_shapes") + if vae_pixel_shapes: + for pT, pH, pW in vae_pixel_shapes: + vae_flops = compute_wan_vae_encoder_flops(B=1, T=pT, H=pH, W=pW) + self._accumulated_flops_vae += vae_flops + step_flops += vae_flops + + # When gradient accumulation is used, on_training_step_end is called + # once per optimizer step (not per micro-batch). Multiply by the + # accumulation count so the FLOPs cover all micro-batches in the step. + # For VAE with gradient accumulation we assume all micro-batches have the same FLOP count + if self.grad_accum_iter > 1: + step_flops *= self.grad_accum_iter + + self._accumulated_flops += step_flops + self._steps_in_window += 1 + + # Delegate to EveryN for the periodic reporting logic + super().on_training_step_end(model, data_batch, output_batch, loss, iteration) + + # ------------------------------------------------------------------ # + # Periodic reporting + # ------------------------------------------------------------------ # + + @rank0_only + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: + if self._window_start_time is None or self._steps_in_window == 0: + return + + elapsed = time.monotonic() - self._window_start_time + if elapsed <= 0: + return + + if self._accumulated_flops <= 0: + log.warning( + f"Number of calculated FLOPs must be more than 0, got {self._accumulated_flops} at iteration {iteration} for {self._steps_in_window} steps." 
+ ) + + # Achieved TFLOPS *per GPU* over the window + # accumulated_flops is the total per-device FLOPs over all steps in window + achieved_tflops_per_gpu = float(self._accumulated_flops) / elapsed / 1e12 + + avg_flops_per_step = float(self._accumulated_flops) / self._steps_in_window + avg_time_per_step = elapsed / self._steps_in_window + + log_info: dict[str, float] = { + "mfu/achieved_tflops_per_gpu": achieved_tflops_per_gpu, + "mfu/avg_flops_per_step": avg_flops_per_step, + "mfu/avg_time_per_step_s": avg_time_per_step, + "mfu/steps_in_window": float(self._steps_in_window), + "mfu/vae_flops_percentage": (float(self._accumulated_flops_vae / self._accumulated_flops) * 100.0 if self._accumulated_flops > 0 else 0.0), + } + + mfu = ( + achieved_tflops_per_gpu / self.hardware_target.peak_tflops if self.hardware_target.peak_tflops > 0 else 0.0 + ) + log_info[f"mfu/{self.hardware_target.name}"] = mfu + + # W&B log + if wandb.run is not None: + wandb.log(log_info, step=iteration) + + # Reset accumulation window + self._accumulated_flops = Decimal(0) + self._accumulated_flops_vae = Decimal(0) + self._steps_in_window = 0 + self._window_start_time = time.monotonic() diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/moe_specialization_callback.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/moe_specialization_callback.py new file mode 100644 index 00000000..5954c059 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/moe_specialization_callback.py @@ -0,0 +1,203 @@ +# ----------------------------------------------------------------------------- +# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. +# All rights reserved. +# +# This codebase constitutes NVIDIA proprietary technology and is strictly +# confidential. Any unauthorized reproduction, distribution, or disclosure +# of this code, in whole or in part, outside NVIDIA is strictly prohibited +# without prior written consent.
+# + +# For inquiries regarding the use of this code in other NVIDIA proprietary +# projects, please contact the Deep Imagination Research Team at +# dir@exchange.nvidia.com. +# ----------------------------------------------------------------------------- + +""" +MoE Specialization Callback +============================ +Monitors whether MoE experts are developing distinct, stable roles over training. +A well-trained MoE should have experts that specialize — each processing a different +kind of input — rather than a few generalist experts doing everything while the rest +idle. + + Expert Co-activation Rate + ------------------------- + If two experts frequently fire together on the same token (both in the top-K + selected), they are likely learning redundant representations. Ideally experts + specialize on non-overlapping token types, so co-activation should stay close + to the chance baseline of K/N (e.g. 8/128 = 0.0625 for the 235B model). + + For each layer and each unique expert pair (i, j), we compute: + CoAct(i, j) = N_{i,j} / N_i + where N_{i,j} = number of tokens where both i and j were selected, and N_i = + total tokens routed to expert i. We then summarize across all pairs as max and + mean. A rising mean_coact, especially well above the chance baseline, signals + that the router is collapsing onto a small correlated cluster of experts. + +Buffer ownership +---------------- + coactivation_counts is reset here (in compute_moe_coactivation_metrics). + Per-expert token counts are derived from coactivation_counts itself + (row_sum + col_sum) / (K-1), so this callback is fully independent of + ExpertHeatmap's reset cycle for total_tokens_per_expert.
+""" + +import torch +import wandb +from torch.distributed.tensor import DTensor, Partial + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.qwen3_vl_moe import Qwen3VLMoeTextSparseMoeBlock + + +def _get_device_mesh(vfm: torch.nn.Module): + weight = vfm.language_model.model.layers[0].self_attn.q_proj.weight + return weight.device_mesh if isinstance(weight, DTensor) else None + + +def _allreduce_dtensor(t: torch.Tensor, device_mesh) -> torch.Tensor: + """Sum-reduce a local tensor across all FSDP ranks and return the global tensor.""" + return DTensor.from_local( + t, + device_mesh=device_mesh, + placements=[Partial()] * device_mesh.ndim, + ).full_tensor() + + +def compute_moe_coactivation_metrics(vfm: torch.nn.Module) -> dict[str, dict]: + """ + Compute per-layer Expert Co-activation metrics for both towers. + + For each unique expert pair (i < j) in the upper triangle of the N×N + coactivation matrix, computes: + CoAct(i, j) = N_{i,j} / N_i + where N_{i,j} is the count of tokens where both i and j were in the top-K, + and N_i is the total token count for expert i (the row expert, i.e. the + lower-indexed expert in the pair). + + N_i is derived directly from the co-activation matrix rather than from + the shared total_tokens_per_expert buffer, so this metric is independent + of ExpertHeatmap's reset cycle. Each token routed to expert i contributes + to (K-1) co-activation pairs, so N_i = (row_sum_i + col_sum_i) / (K-1). + + High co-activation relative to the chance baseline (K/N) indicates that + certain expert pairs are systematically selected together — a sign of + redundancy rather than specialization. 
+ + Returns a dict: tower -> { + "layer_indices": list[int] — actual model layer positions + "max_coact": Tensor[num_moe_layers] — worst pair per layer + "mean_coact": Tensor[num_moe_layers] — average over all pairs + "chance_baseline": float — K/N, same for all layers (reference) + } + """ + with torch.no_grad(): + device_mesh = _get_device_mesh(vfm) + if device_mesh is None: + return {} + + results: dict[str, dict] = {} + for tower in ["und", "gen"]: + layer_indices, max_coacts, mean_coacts, chance_baselines = [], [], [], [] + + num_layers = len(vfm.language_model.model.layers) + for layer_idx in range(num_layers): + layer = vfm.language_model.model.layers[layer_idx] + mlp = layer.mlp if tower == "und" else getattr(layer, "mlp_moe_gen", None) + if not isinstance(mlp, Qwen3VLMoeTextSparseMoeBlock): + continue + + coact_counts = _allreduce_dtensor(mlp.get_coactivation_counts(reset=True), device_mesh) # [N, N] + + n = mlp.num_experts + k = mlp.top_k + + # Derive per-expert token counts directly from the co-activation + # matrix so we don't depend on ExpertHeatmap's reset cycle. + # Each token that routes to expert i contributes (K-1) entries + # across row i and column i of the upper-triangle matrix. + tokens_per_expert = (coact_counts.sum(dim=1) + coact_counts.sum(dim=0)).float() / (k - 1) + + mask = torch.triu(torch.ones(n, n, dtype=torch.bool, device=coact_counts.device), diagonal=1) + # CoAct(i, j) = N_{i,j} / N_i — normalise by how often expert i fires overall. + denom = tokens_per_expert.unsqueeze(1).clamp(min=1) # [N, 1] + coact_rates = (coact_counts.float() / denom)[mask] # [N*(N-1)/2] + + layer_indices.append(layer_idx) + max_coacts.append(coact_rates.max()) + mean_coacts.append(coact_rates.mean()) + # Chance baseline = probability two randomly-chosen top-K slots land on the + # same pair under uniform routing = K/N. Constant across layers and steps, + # logged once per tower as a reference line. 
+ chance_baselines.append(k / n) + + if layer_indices: + results[tower] = { + "layer_indices": layer_indices, + "max_coact": torch.stack(max_coacts), + "mean_coact": torch.stack(mean_coacts), + "chance_baseline": chance_baselines[0], # same value for all layers + } + + return results + + +class MoESpecializationCallback(EveryN): + """ + Logs per-layer MoE specialization metrics to W&B every N training steps. + + What it captures + ---------------- + Whether MoE experts are developing distinct routing identities: + + Expert Co-activation (logged every N steps) + - mean_coact / max_coact per layer: how often expert pairs fire together + relative to the chance_baseline (K/N). Values well above the baseline + suggest the router is selecting a redundant cluster of experts rather + than a diverse set. + + W&B layout + ---------- + moe_specialization/coact_chance_baseline/{tower} — flat reference (K/N) + moe_specialization/max_coact/{tower}/layer_NNN|mean|max + moe_specialization/mean_coact/{tower}/layer_NNN|mean|max + + Args: + every_n (int): Logging interval in training steps.
+ """ + + def __init__(self, every_n: int = 100): + super().__init__(every_n=every_n) + + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: + vfm = model.net + + coact_results = compute_moe_coactivation_metrics(vfm) + + if not (distributed.is_rank0() and wandb.run): + return + + log_dict: dict[str, float] = {} + + for tower, tower_metrics in coact_results.items(): + layer_indices = tower_metrics.pop("layer_indices") + chance_baseline = tower_metrics.pop("chance_baseline") + log_dict[f"moe_specialization/coact_chance_baseline/{tower}"] = chance_baseline + for metric_name, values in tower_metrics.items(): + for layer_idx, val in zip(layer_indices, values): + log_dict[f"moe_specialization/{metric_name}/{tower}/layer_{layer_idx:03d}"] = val.item() + log_dict[f"moe_specialization/{metric_name}/{tower}/mean"] = values.mean().item() + log_dict[f"moe_specialization/{metric_name}/{tower}/max"] = values.max().item() + + wandb.log(log_dict, step=iteration) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/moe_stability_callback.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/moe_stability_callback.py new file mode 100644 index 00000000..84746774 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/moe_stability_callback.py @@ -0,0 +1,323 @@ +# ----------------------------------------------------------------------------- +# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. +# All rights reserved. +# +# This codebase constitutes NVIDIA proprietary technology and is strictly +# confidential. Any unauthorized reproduction, distribution, or disclosure +# of this code, in whole or in part, outside NVIDIA is strictly prohibited +# without prior written consent. 
+# +# For inquiries regarding the use of this code in other NVIDIA proprietary +# projects, please contact the Deep Imagination Research Team at +# dir@exchange.nvidia.com. +# ----------------------------------------------------------------------------- + +""" +MoE Stability Callback +====================== +Monitors whether the MoE router is staying healthy over the course of training. +A healthy router distributes tokens reasonably evenly, keeps all experts alive, +and remains uncertain enough (high entropy) that it is still learning to route. + +Five metrics are tracked per layer, per tower (und / gen): + + Dead Expert Rate + ---------------- + Fraction of experts receiving fewer than 10% of their fair-share of tokens + (i.e. load fraction f_i < 0.1 * K / N). A dead expert has been effectively shut + out by the router: it gets no gradient signal and its capacity is wasted. + Ideal = 0. A rising dead-expert rate in the gen tower during early training + is a common failure mode. + + Load Imbalance Factor (LIF) + --------------------------- + N * max(f_i) / K, where f_i is the fraction of tokens routed to expert i. + Measures how much the busiest expert is overloaded relative to uniform. + LIF = 1.0 is perfect balance; <= 1.3 is healthy; > 3.0 indicates severe + collapse onto a small set of experts. This is the same quantity watched by + the load-balancing loss, but measured empirically rather than from the loss + objective. + + Router Entropy (normalized) + --------------------------- + Mean per-token Shannon entropy of the full routing distribution, divided by + log(N) to put it on a [0, 1] scale. H = 1 means the router is maximally + uncertain (uniform over all experts); H = 0 means it always picks the same + expert. Early in training entropy is high; we want it to stay reasonably + high (> ~0.7) so the router continues to explore. A sudden drop signals + routing collapse.
+ + Soft-vs-Hard Effective Experts (normalized) + ------------------------------------------- + Soft and hard effective experts separate what the router *considers* (full + probability distribution, before dispatch) from what top-k dispatch *actually + uses* (empirical token-to-expert assignment, after dispatch). Both are + expressed as a fraction of N, so they sit on the same axis as + router_entropy_normalized. Their lower bounds differ slightly: + soft_eff_normalized is bounded in [1/N, 1]. + hard_eff_normalized is bounded in [K/N, 1] — top-K dispatch always engages + at least K experts in aggregate (the floor case is when every token + picks the same K-expert subset). + + soft_eff_normalized = mean_t exp(H(p_t)) / N + Average per-token router perplexity, divided by N. Asks: what fraction + of experts is the router *considering* on a typical token? Computed + as sum_per_token_soft_eff / total_tokens / N. Note: the unnormalized + numerator is NOT exp of the mean entropy — by Jensen, + mean_t exp(H_t) >= exp(mean_t H_t), and the gap matters when + per-token entropies are heterogeneous. + + hard_eff_normalized = exp(H(f)) / N + where f_i is the empirical fraction of *expert assignments* (not + tokens) that went to expert i: f_i = tokens_per_expert_i / (T * K). + Perplexity of the buffer-wide dispatch distribution, divided by N. + Asks: what fraction of experts is top-k *actually* engaging across the + buffer? A smoother sibling of LIF: where LIF watches the busiest + expert, hard_eff watches the spread of the whole load distribution. + + Interpretation (high/low refer to values close to 1 vs close to 1/N): + + high soft_eff, high hard_eff + Router considers many experts; top-k dispatch also uses many experts. + Broadly healthy routing. + low soft_eff, low hard_eff + Router is confident or collapsed in probability space; dispatch is + also concentrated. Entropy, LIF, and hard usage all agree that + routing is narrow. 
+ high soft_eff, low hard_eff + Router distribution is broad, but top-k dispatch is concentrated — + the "hidden top-k concentration" case where entropy can look healthy + while LIF and co-activation are high. + low soft_eff, high hard_eff + Less common: each token has a sharp router distribution, but + different tokens choose different experts. Per-token confidence with + buffer-wide diversity. + +Buffer ownership +---------------- + This callback is fully self-contained: it reads and resets its own dedicated + buffers (stability_tokens_per_expert, stability_total_tokens, sum_token_entropy, + sum_per_token_soft_eff). It does not depend on ExpertHeatmap's reset cycle. +""" + +import math + +# Fraction of uniform fair-share below which an expert is considered "dead" (e.g. 0.1 → < 10% of K/N). +DEAD_EXPERT_THRESHOLD_MULTIPLIER = 0.1 + +# Smoothing added inside log() to avoid log(0) for experts that received zero +# tokens in the current buffer window. Matches the constant used inside the +# MoE block when accumulating router entropy. +ENTROPY_EPSILON = 1e-9 + +import torch +import wandb +from torch.distributed.tensor import DTensor, Partial + +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.qwen3_vl_moe import Qwen3VLMoeTextSparseMoeBlock + + +def _effective_experts( + sum_per_token_soft_eff: torch.Tensor, + total_tokens: torch.Tensor, + tokens_per_expert: torch.Tensor, +) -> tuple[torch.Tensor, torch.Tensor]: + """Compute (soft_eff, hard_eff) from already-reduced stability buffers. + + Extracted as a pure-tensor function so it can be unit-tested without + instantiating any MoE module or distributed state. + + Args: + sum_per_token_soft_eff: 0-d or [1] tensor holding sum_t exp(H(p_t)) + accumulated across the buffer window. 
+ total_tokens: 0-d or [1] tensor holding the number of tokens seen + since the last reset. + tokens_per_expert: [N] tensor of per-expert token counts over the + same buffer window. + + Returns: + soft_eff: scalar tensor, mean_t exp(H(p_t)) in [1, N]. + hard_eff: scalar tensor, exp(H(f)) over the empirical dispatch + distribution f_i = tokens_per_expert_i / sum_i tokens_per_expert_i. + Bounded in [K, N] (not [1, N]) because top-K dispatch always + engages at least K experts in aggregate. + + Note on hard_eff normalization: + tokens_per_expert is a histogram over the K top-k slots per token, so + it sums to T * K rather than T. We must divide by its own sum (== T*K) + to get a true probability distribution before taking entropy. + Dividing by total_tokens (== T) instead would give a vector summing to + K, producing exp(H) values up to (N/K)^K — orders of magnitude beyond + the intended [K, N] range. + """ + total = total_tokens.float().clamp(min=1) + soft_eff = (sum_per_token_soft_eff.float() / total).squeeze() + + total_assignments = tokens_per_expert.sum().float().clamp(min=1) + f_i = (tokens_per_expert.float() / total_assignments).clamp(min=ENTROPY_EPSILON) + hard_entropy = -(f_i * f_i.log()).sum() + hard_eff = hard_entropy.exp() + + return soft_eff, hard_eff + + +def compute_moe_stability_metrics(vfm: torch.nn.Module) -> dict[str, dict]: + """ + Compute per-layer MoE stability metrics for both towers. + + Iterates over all model layers, skipping any that do not use + Qwen3VLMoeTextSparseMoeBlock (e.g. dense layers when decoder_sparse_step > 1). + Actual model layer indices are preserved so W&B keys (layer_000, layer_042, ...) + always refer to the correct transformer layer regardless of MoE sparsity pattern. 
+ + Returns a dict: tower -> { + "layer_indices": list[int] - actual model layer positions + "dead_expert_rate": Tensor[num_moe_layers] + "lif": Tensor[num_moe_layers] + "router_entropy_normalized": Tensor[num_moe_layers] + "soft_eff_normalized": Tensor[num_moe_layers] - mean_t exp(H(p_t)) / N, in [1/N, 1] + "hard_eff_normalized": Tensor[num_moe_layers] - exp(H(f)) / N, in [K/N, 1] + } + """ + with torch.no_grad(): + num_layers = len(vfm.language_model.model.layers) + + example_weight = vfm.language_model.model.layers[0].self_attn.q_proj.weight + device_mesh = example_weight.device_mesh if isinstance(example_weight, DTensor) else None + + if device_mesh is None: + return {} + + def _allreduce(t: torch.Tensor) -> torch.Tensor: + return DTensor.from_local( + t, + device_mesh=device_mesh, + placements=[Partial()] * device_mesh.ndim, + ).full_tensor() + + results: dict[str, dict] = {} + for tower in ["und", "gen"]: + layer_indices: list[int] = [] + dead_rates: list[torch.Tensor] = [] + lifs: list[torch.Tensor] = [] + entropies: list[torch.Tensor] = [] + soft_effs_norm: list[torch.Tensor] = [] + hard_effs_norm: list[torch.Tensor] = [] + + for layer_idx in range(num_layers): + layer_module = vfm.language_model.model.layers[layer_idx] + # "und" tower uses layer.mlp; "gen" tower uses layer.mlp_moe_gen. + # Both attributes exist on every layer (set in unified_mot.py), but only + # layers where (layer_idx+1) % decoder_sparse_step == 0 are MoE blocks.
+ mlp_module = layer_module.mlp if tower == "und" else getattr(layer_module, "mlp_moe_gen", None) + if not isinstance(mlp_module, Qwen3VLMoeTextSparseMoeBlock): + continue + + total_tokens_per_expert = _allreduce(mlp_module.get_stability_tokens_per_expert(reset=True)) + total_tokens = _allreduce(mlp_module.get_stability_total_tokens(reset=True)) + sum_token_entropy = _allreduce(mlp_module.get_sum_token_entropy(reset=True)) + sum_per_token_soft_eff = _allreduce(mlp_module.get_sum_per_token_soft_eff(reset=True)) + + n = mlp_module.num_experts + total = total_tokens.float().clamp(min=1) + f_i = total_tokens_per_expert.float() / total # [N] load fraction per expert + + k = mlp_module.top_k + + layer_indices.append(layer_idx) + # Uniform fair share per expert is K/N. "Dead" = below 10% of that. + dead_rates.append((f_i < DEAD_EXPERT_THRESHOLD_MULTIPLIER * k / n).float().mean()) + # LIF = max(f_i) * N / K. Interpretation: + # 1.0 = perfectly balanced (every expert gets its fair share) + # 2.0 = busiest expert handles 2x its fair share + # >3.0 = severe imbalance, consider tuning load-balancing loss + lifs.append(f_i.max() * n / k) + # Mean per-token entropy, normalized to [0, 1] by log(N). + # squeeze() collapses the [1] buffer shape to a 0-d scalar. 
+ entropies.append((sum_token_entropy.float() / total / math.log(n)).squeeze()) + + soft_eff, hard_eff = _effective_experts( + sum_per_token_soft_eff=sum_per_token_soft_eff, + total_tokens=total_tokens, + tokens_per_expert=total_tokens_per_expert, + ) + soft_effs_norm.append(soft_eff / n) + hard_effs_norm.append(hard_eff / n) + + if layer_indices: + results[tower] = { + "layer_indices": layer_indices, + "dead_expert_rate": torch.stack(dead_rates), + "lif": torch.stack(lifs), + "router_entropy_normalized": torch.stack(entropies), + "soft_eff_normalized": torch.stack(soft_effs_norm), + "hard_eff_normalized": torch.stack(hard_effs_norm), + } + + return results + + +class MoEStabilityCallback(EveryN): + """ + Logs per-layer MoE stability metrics to W&B every N training steps. + + What it captures + ---------------- + Whether the MoE router remains in a healthy, balanced state over training. + The metrics collectively answer: are all experts still being used + (dead_expert_rate), is load spread evenly (lif), is the router still + making uncertain, exploratory decisions (router_entropy_normalized), and + do the experts the router considers (soft_eff) match the experts top-k + dispatch actually engages (hard_eff)? + + W&B layout + ---------- + For each metric and each tower, two kinds of series are logged: + - moe_stability///layer_NNN — per model layer time series + - moe_stability///mean|max — summary across all MoE layers + + Metrics logged: dead_expert_rate, lif, router_entropy_normalized, + soft_eff_normalized, hard_eff_normalized. + + Typical healthy ranges: + dead_expert_rate → 0 (any sustained non-zero value is a concern) + lif → <= 1.3 (alarm at > 3.0) + router_entropy_normalized → > 0.7 (collapse if it drops sharply) + soft_eff_normalized, hard_eff_normalized → high; a large gap between + them (e.g. soft high, hard low) indicates hidden top-k concentration + + Args: + every_n (int): Logging interval in training steps. 
+ """ + + def __init__(self, every_n: int = 100): + super().__init__(every_n=every_n) + + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: + metrics = compute_moe_stability_metrics(model.net) + + if not (distributed.is_rank0() and wandb.run): + return + + log_dict: dict[str, float] = {} + for tower, tower_metrics in metrics.items(): + layer_indices = tower_metrics.pop("layer_indices") + for metric_name, values in tower_metrics.items(): + for layer_idx, val in zip(layer_indices, values): + log_dict[f"moe_stability/{metric_name}/{tower}/layer_{layer_idx:03d}"] = val.item() + log_dict[f"moe_stability/{metric_name}/{tower}/mean"] = values.mean().item() + log_dict[f"moe_stability/{metric_name}/{tower}/max"] = values.max().item() + + wandb.log(log_dict, step=iteration) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/norm_monitor.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/norm_monitor.py new file mode 100644 index 00000000..429496ef --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/norm_monitor.py @@ -0,0 +1,345 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +from typing import Optional, Union + +import torch +import torch.distributed as dist +import wandb +from torch import nn +from torch.distributed.tensor import DTensor + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed, log, misc +from cosmos3._src.imaginaire.utils.callback import Callback +from cosmos3._src.imaginaire.utils.distributed import DistributedDataParallel +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.vfm.datasets.sequence_packing import get_gen_seq + +try: + from apex.contrib.layer_norm import FastLayerNorm +except ImportError: + FastLayerNorm = None + + +class NormMonitor(Callback): + def __init__( + self, + every_n: Optional[int] = None, + step_size: int = 1, + layer_norm_only: bool = False, + model_key: Optional[str] = None, + log_stat_wandb: bool = False, + save_s3: bool = False, + track_activations: bool = False, + ): + """Monitor and log parameter/gradient/activation norms during training. + + Args: + every_n: Log statistics every N global steps. If None, logging is disabled. + step_size: Number of micro-steps per global step (for gradient accumulation). + layer_norm_only: If True, only track LayerNorm and Embedding parameters. + If False, track all parameters. + model_key: Attribute name to access the model (e.g., "diffusion_model"). + If None, use the model directly. + log_stat_wandb: If True, log per-parameter statistics to wandb. + If False, only log aggregate norms. + save_s3: If True, save statistics to S3 bucket. + track_activations: If True, track activation norms + and gradients of activations at each transformer block. If set to False, only + weight norms and weight gradient norms will be tracked. 
+ """ + self.every_n = every_n + self.step_size = step_size + self.model_key = model_key + self.layer_norm_only = layer_norm_only + self.log_stat_wandb = log_stat_wandb + self.save_s3 = save_s3 + self.track_activations = track_activations + self.name = self.__class__.__name__ + + # Storage for activation statistics (populated by hooks) + self._activation_stats: dict[str, dict[str, torch.Tensor]] = {} + self._activation_grad_stats: dict[str, dict[str, torch.Tensor]] = {} + self._hooks: list[torch.utils.hooks.RemovableHandle] = [] + self._should_record = False + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + config_job = self.config.job + self.local_dir = f"{config_job.path_local}/norm_monitor" + if distributed.get_rank() == 0: + os.makedirs(self.local_dir, exist_ok=True) + log.info(f"{self.__class__.__name__} callback: local_dir: {self.local_dir}") + + # Register activation hooks if enabled + if self.track_activations: + self._register_activation_hooks(model) + + def _register_activation_hooks(self, model: ImaginaireModel) -> None: + """Register forward and backward hooks on transformer blocks to capture activation statistics. + + Hooks are registered at the block level (on model.model.layers children) rather than + on individual modules inside blocks. This is compatible with torch.compile since + compile is applied per-block, and hooks on the outer block fire outside the compiled graph. + """ + if self.model_key is not None: + model = getattr(model, self.model_key) + + # Get the transformer layers - hooks are registered on each block + if not hasattr(model.net.language_model.model, "layers"): + log.warning( + f"{self.__class__.__name__}: Could not find model.net.language_model.model.layers. " + "Activation tracking requires model structure with model.net.language_model.model.layers." 
+ ) + return + + layers = model.net.language_model.model.layers + + for layer_id, block in layers.named_children(): + block_name = f"blocks.{layer_id}" + + # Forward hook to capture activation norms (block output) + # Also registers a tensor hook for gradient tracking + def make_forward_hook(name: str): + def forward_hook( + mod: nn.Module, inp: tuple[torch.Tensor, ...], out: torch.Tensor | tuple[torch.Tensor, ...] + ) -> None: + if not self._should_record: + return + # We track activation norms of only generation sequences. + activation = get_gen_seq(out[0]) + + # Certain algorithms do more than one pass through the model. + # (E.g. teacher forcing). We merge stats in that case. + new_stats = self._compute_l2_stats(activation) + existing = self._activation_stats.get(name) + if existing is not None: + existing["sq_sum"] += new_stats["sq_sum"] + existing["max"] = torch.max(existing["max"], new_stats["max"]) + else: + self._activation_stats[name] = new_stats + + # Register tensor hook for gradient tracking. + # This works with activation checkpointing (unlike module backward hooks). + def make_tensor_grad_hook(hook_name: str): + def tensor_grad_hook(grad: torch.Tensor | None) -> None: + # The block may get gradients internally via attention, + # even if the output is unused. + if grad is None: + return + + # If there is more than one pass through the model + # (e.g. teacher forcing), then merge the stats. 
+ new_stats = self._compute_l2_stats(grad) + existing = self._activation_grad_stats.get(hook_name) + if existing is not None: + existing["sq_sum"] += new_stats["sq_sum"] + existing["max"] = torch.max(existing["max"], new_stats["max"]) + else: + self._activation_grad_stats[hook_name] = new_stats + + return tensor_grad_hook + + if activation.requires_grad: + activation.register_hook(make_tensor_grad_hook(name)) + + return forward_hook + + forward_handle = block.register_forward_hook(make_forward_hook(block_name)) + self._hooks.append(forward_handle) + + if distributed.is_rank0(): + num_blocks = len(list(layers.named_children())) + log.info(f"{self.__class__.__name__}: Registered activation hooks on {num_blocks} transformer blocks") + + def on_train_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + """Clean up hooks when training ends.""" + for hook in self._hooks: + hook.remove() + self._hooks.clear() + + def on_before_forward( + self, + iteration: int = 0, + ) -> None: + """Enable activation recording before forward pass if this iteration should be logged.""" + if not self.track_activations: + return + global_step = iteration // self.step_size + should_run = self.every_n is not None and global_step % self.every_n == 0 # every_n=None disables logging + self._should_record = should_run + if should_run: + # Clear previous activation stats + self._activation_stats.clear() + self._activation_grad_stats.clear() + + def on_before_optimizer_step( + self, + model_ddp: Union[DistributedDataParallel, ImaginaireModel], + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int = 0, + ) -> None: + global_step = iteration // self.step_size + should_run = self.every_n is not None and global_step % self.every_n == 0 # every_n=None disables logging + if not should_run: + return + + if isinstance(model_ddp, DistributedDataParallel): + model = model_ddp.module + else: + model = model_ddp + if self.model_key is not None: + model = getattr(model, self.model_key) + + self._compute_and_log_stats(model, iteration)
+ + # Disable recording after logging + self._should_record = False + + def _get_named_parameters(self, model: nn.Module) -> dict[str, nn.Parameter]: + """Get named parameters, optionally filtered to layer norm only.""" + named_parameters = {} + if self.layer_norm_only: + ln_modules = (nn.LayerNorm, nn.Embedding) + if FastLayerNorm is not None: + ln_modules += (FastLayerNorm,) + for mn, m in model.named_modules(): + if isinstance(m, ln_modules): + for pn, p in m.named_parameters(): + fpn = f"{mn}.{pn}" if mn else pn + named_parameters[fpn] = p + else: + named_parameters = dict(model.named_parameters()) + return named_parameters + + def _should_track_param(self, param_name: str) -> bool: + """Check if parameter should be tracked based on naming conventions.""" + # Track only generation tower params, exclude EMA params + return "moe_gen" in param_name and "net_ema" not in param_name + + def _compute_l2_stats(self, tensor: torch.Tensor, detach: bool = True) -> dict[str, torch.Tensor]: + """Compute statistics (squared sum and max) for a tensor. + + Args: + tensor: Input tensor to compute statistics for. + detach: If True, detach the tensor before computing stats. + + Returns: + Dictionary with "sq_sum" (squared sum for L2 norm) and "max" (absolute max). + """ + data = tensor.detach() if detach else tensor + if isinstance(data, DTensor): + data = data.to_local() + + return { + "sq_sum": (data.float() ** 2).sum(), + "max": data.abs().max(), + } + + @misc.timer("norm_monitor") + def _compute_and_log_stats(self, model: nn.Module, iteration: int = 0) -> None: + """FSDP-efficient implementation using local shards + all_reduce. + + Instead of gathering full parameters with summon_full_params (expensive), + we compute local statistics on each rank's shard and use all_reduce to + aggregate them across all ranks. 
+ """ + named_parameters = self._get_named_parameters(model) + + # Accumulators for local shard statistics (squared sum for L2 norm) + local_param_sq_sum = torch.tensor(0.0, device="cuda", dtype=torch.float32) + local_grad_sq_sum = torch.tensor(0.0, device="cuda", dtype=torch.float32) + + # Per-parameter stats: {param_name: [local_sq_sum, local_max]} + per_param_stats: dict[str, dict[str, torch.Tensor]] = {} + per_grad_stats: dict[str, dict[str, torch.Tensor]] = {} + + for param_name, param in named_parameters.items(): + if not self._should_track_param(param_name): + continue + + # Compute local statistics on this rank's shard + per_param_stats[param_name] = self._compute_l2_stats(param) + local_param_sq_sum += per_param_stats[param_name]["sq_sum"] + + if param.grad is not None: + per_grad_stats[param_name] = self._compute_l2_stats(param.grad, detach=False) + local_grad_sq_sum += per_grad_stats[param_name]["sq_sum"] + + # All-reduce to aggregate statistics across all FSDP ranks + dist.all_reduce(local_param_sq_sum, op=dist.ReduceOp.SUM) + dist.all_reduce(local_grad_sq_sum, op=dist.ReduceOp.SUM) + + # All-reduce per-parameter stats + for param_name, stats_dict in per_param_stats.items(): + dist.all_reduce(stats_dict["sq_sum"], op=dist.ReduceOp.SUM) + dist.all_reduce(stats_dict["max"], op=dist.ReduceOp.MAX) + + for param_name, stats_dict in per_grad_stats.items(): + dist.all_reduce(stats_dict["sq_sum"], op=dist.ReduceOp.SUM) + dist.all_reduce(stats_dict["max"], op=dist.ReduceOp.MAX) + + # All-reduce activation stats (activations are replicated, so reduce across all ranks for consistency) + for module_name, stats_dict in self._activation_stats.items(): + dist.all_reduce(stats_dict["sq_sum"], op=dist.ReduceOp.SUM) + dist.all_reduce(stats_dict["max"], op=dist.ReduceOp.MAX) + + for module_name, stats_dict in self._activation_grad_stats.items(): + dist.all_reduce(stats_dict["sq_sum"], op=dist.ReduceOp.SUM) + dist.all_reduce(stats_dict["max"], op=dist.ReduceOp.MAX) + + # 
Only rank 0 logs the results + if distributed.is_rank0(): + important_info = { + "trainer/global_step": iteration, + "sample_counter": getattr(self.trainer, "sample_counter", iteration), + "total_param_l2_norm": local_param_sq_sum.sqrt().item(), + } + if local_grad_sq_sum > 0: + important_info["total_grad_l2_norm"] = local_grad_sq_sum.sqrt().item() + + stats = {} + for param_name, stats_dict in per_param_stats.items(): + l2_norm = stats_dict["sq_sum"].sqrt() + stats[f"stats/weight_norm/{param_name}"] = l2_norm.item() + stats[f"stats/weight_max/{param_name}"] = stats_dict["max"].item() + + for param_name, stats_dict in per_grad_stats.items(): + l2_norm = stats_dict["sq_sum"].sqrt() + stats[f"stats/grad_norm/{param_name}"] = l2_norm.item() + stats[f"stats/grad_max/{param_name}"] = stats_dict["max"].item() + + # Add activation stats + for module_name, stats_dict in self._activation_stats.items(): + l2_norm = stats_dict["sq_sum"].sqrt() + stats[f"stats/act_norm/{module_name}"] = l2_norm.item() + stats[f"stats/act_max/{module_name}"] = stats_dict["max"].item() + + for module_name, stats_dict in self._activation_grad_stats.items(): + l2_norm = stats_dict["sq_sum"].sqrt() + stats[f"stats/act_grad_norm/{module_name}"] = l2_norm.item() + stats[f"stats/act_grad_max/{module_name}"] = stats_dict["max"].item() + + if wandb.run is not None: + if self.log_stat_wandb: + wandb.log({**stats, **important_info}, step=iteration) + else: + wandb.log(important_info, step=iteration) + + if self.save_s3: + easy_io.dump({**stats, **important_info}, f"s3://rundir/{self.name}/stats_{iteration:09d}.pt") diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/ofu.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/ofu.py new file mode 100644 index 00000000..f18c9127 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/ofu.py @@ -0,0 +1,293 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""OFU (Operational FLOPs Utilization) callback for OmniMoT training. + +Computes and logs OFU metrics by launching ``nvidia-smi dmon`` as a background +subprocess and parsing the Tensor Core activity (mmaact) and processor clock +(pclk) columns. OFU is defined as:: + + OFU = mmaact * (pclk / max_pclk) + +where ``max_pclk`` is the max boost clock for the detected hardware (e.g. +1980 MHz for H100, 2062 MHz for GB200). The result is in the 0-100 range. +""" + +from __future__ import annotations + +import subprocess +import threading +from collections import defaultdict +from dataclasses import dataclass + +import torch +import wandb + +from cosmos3._src.imaginaire.attention.utils import is_blackwell_dc +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.distributed import is_rank0, rank0_only + + +@dataclass +class HardwareTarget: + """Hardware-specific constants for OFU normalisation. + + Attributes: + name: Human-readable name (used as W&B tag, e.g. "H100"). + max_pclk_mhz: Max boost SM clock in MHz used to normalise OFU. 
+ """ + + name: str + max_pclk_mhz: float + + +# Pre-defined hardware targets +H100 = HardwareTarget(name="H100", max_pclk_mhz=1980.0) +GB200 = HardwareTarget(name="GB200", max_pclk_mhz=2062.0) + + +class OFUCallback(EveryN): + """Callback that computes and logs Operational FLOPs Utilization (OFU) to W&B. + + OFU = mmaact * (pclk / max_pclk), where mmaact is the MMA activity + percentage and pclk is the current processor clock from ``nvidia-smi dmon``. + ``max_pclk`` is determined from the detected hardware (H100 or GB200). + The result is in the 0-100 range. + + The callback launches ``nvidia-smi dmon`` as a background subprocess on + ``on_train_start`` and a daemon thread continuously reads its output. + At every logging interval, accumulated samples are consumed, averaged per GPU + and overall, and logged to W&B under ``ofu/{hardware_name}``. + + Args: + hit_thres: Number of warm-up training iterations to skip before logging. + """ + + def __init__( + self, + *args, + hit_thres: int = 5, + **kwargs, + ) -> None: + super().__init__(*args, **kwargs) + self.hardware_target = GB200 if is_blackwell_dc() else H100 + self.hit_thres = hit_thres + + # Subprocess state + self._process: subprocess.Popen | None = None + self._reader_thread: threading.Thread | None = None + self._stop_event = threading.Event() + + # Buffered samples protected by a lock: list of (gpu_idx, mmaact, pclk) + self._lock = threading.Lock() + self._samples: list[tuple[int, float, float]] = [] + + # Column indices parsed from the header (set by _reader_loop) + self._col_gpu: int | None = None + self._col_mmaact: int | None = None + self._col_pclk: int | None = None + + # Warm-up counter + self._hit_counter: int = 0 + + # ------------------------------------------------------------------ # + # Background reader + # ------------------------------------------------------------------ # + + def _parse_header(self, line: str) -> bool: + """Parse a dmon header line to locate column indices. 
+ + Called on every ``#`` line because nvidia-smi dmon reprints the header + every few seconds. Returns True if ``gpu``, ``mmaact``, and ``pclk`` + columns are all found; warns only when the column-names line (identified + by the presence of ``gpu``) lacks a required column. Silently ignores + the units line (``# Idx W C ...``) which does not contain ``gpu``. + """ + cols = line.lstrip("#").strip().split() + col_map = {name.lower(): idx for idx, name in enumerate(cols)} + gpu_idx = col_map.get("gpu") + mmaact_idx = col_map.get("mmaact") + pclk_idx = col_map.get("pclk") + if gpu_idx is not None and mmaact_idx is not None and pclk_idx is not None: + if self._col_mmaact is None: + log.info(f"OFUCallback: found mmaact at column {mmaact_idx}, pclk at column {pclk_idx}") + self._col_gpu = gpu_idx + self._col_mmaact = mmaact_idx + self._col_pclk = pclk_idx + return True + if gpu_idx is not None: + missing = [name for name, idx in [("mmaact", mmaact_idx), ("pclk", pclk_idx)] if idx is None] + log.warning( + f"OFUCallback: column(s) {missing} not found in nvidia-smi dmon header: {cols}. " + "OFU metrics will not be available." + ) + return False + + def _reader_loop(self) -> None: + """Background thread that reads nvidia-smi dmon output line-by-line.""" + assert self._process is not None and self._process.stdout is not None + + for line in self._process.stdout: + if self._stop_event.is_set(): + break + line = line.strip() + if not line: + continue + + # Header lines repeat every few seconds — always re-parse so that a + # missed or failed first parse is recovered on the next occurrence. 
+ if line.startswith("#"): + self._parse_header(line) + continue + + # Skip data lines until we have column indices + if self._col_gpu is None or self._col_mmaact is None or self._col_pclk is None: + continue + + parts = line.split() + try: + gpu_idx = int(parts[self._col_gpu]) + mmaact = float(parts[self._col_mmaact]) + pclk = float(parts[self._col_pclk]) + except (ValueError, IndexError): + continue + + with self._lock: + self._samples.append((gpu_idx, mmaact, pclk)) + + # ------------------------------------------------------------------ # + # Lifecycle hooks + # ------------------------------------------------------------------ # + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + if not is_rank0(): + return + + try: + # --gpm-metrics 5 exposes Tensor Core activity in the mmaact column. + # -d 5 samples the data every 5 seconds. + cmd = ["nvidia-smi", "dmon", "--gpm-metrics", "5", "-d", "5"] + self._process = subprocess.Popen( + cmd, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + text=True, + bufsize=1, # line-buffered + ) + self._reader_thread = threading.Thread(target=self._reader_loop, daemon=True) + self._reader_thread.start() + log.info("OFUCallback: launched nvidia-smi dmon --gpm-metrics 5") + except FileNotFoundError: + log.warning("OFUCallback: nvidia-smi not found, OFU metrics will not be available") + except Exception as e: + log.warning(f"OFUCallback: failed to launch nvidia-smi dmon: {e}") + + def on_train_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + if not is_rank0(): + return + self._stop_event.set() + if self._process is not None: + try: + self._process.terminate() + self._process.wait(timeout=5) + except ProcessLookupError: + pass # already exited + except subprocess.TimeoutExpired: + self._process.kill() + self._process = None + if self._reader_thread is not None: + self._reader_thread.join(timeout=5) + self._reader_thread = None + + # 
------------------------------------------------------------------ # + # Per-step gating + # ------------------------------------------------------------------ # + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + # All ranks must enter super().on_training_step_end() so they reach the + # distributed barrier inside EveryN. Only rank 0 has samples to clear. + if self._hit_counter < self.hit_thres: + self._hit_counter += 1 + if self._hit_counter == self.hit_thres: + # Discard samples collected during warm-up (compilation, allocation, etc.) + with self._lock: + self._samples.clear() + return + # Delegate to EveryN for the periodic reporting logic + super().on_training_step_end(model, data_batch, output_batch, loss, iteration) + + # ------------------------------------------------------------------ # + # Periodic reporting + # ------------------------------------------------------------------ # + + @rank0_only + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: + if self._process is None: + return + + # Drain buffered samples + with self._lock: + samples = list(self._samples) + self._samples.clear() + + if not samples: + log.warning( + f"OFUCallback: no nvidia-smi samples collected at iteration {iteration}. " + "Check that the dmon subprocess launched and that the mmaact column is present." 
+ ) + return + + # Compute per-GPU OFU: mmaact * (pclk / max_pclk) + max_pclk = self.hardware_target.max_pclk_mhz + gpu_ofu: dict[int, list[float]] = defaultdict(list) + gpu_mmaact: dict[int, list[float]] = defaultdict(list) + gpu_pclk: dict[int, list[float]] = defaultdict(list) + for gpu_idx, mmaact, pclk in samples: + gpu_ofu[gpu_idx].append(mmaact * (pclk / max_pclk)) + gpu_mmaact[gpu_idx].append(mmaact) + gpu_pclk[gpu_idx].append(pclk) + + # Overall averages across all GPUs and samples + all_ofu = [v for vals in gpu_ofu.values() for v in vals] + all_mmaact = [v for vals in gpu_mmaact.values() for v in vals] + all_pclk = [v for vals in gpu_pclk.values() for v in vals] + + log_info: dict[str, float] = { + f"ofu/{self.hardware_target.name}": sum(all_ofu) / len(all_ofu), + "ofu/mmaact": sum(all_mmaact) / len(all_mmaact), + "ofu/avg_pclk_mhz": sum(all_pclk) / len(all_pclk), + "ofu/num_samples": float(len(samples)), + } + + if wandb.run is not None: + wandb.log(log_info, step=iteration) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/param_count.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/param_count.py new file mode 100644 index 00000000..cc4a1263 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/param_count.py @@ -0,0 +1,50 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Union + +import torch.nn as nn + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.callback import Callback +from cosmos3._src.imaginaire.utils.count_params import count_params +from cosmos3._src.imaginaire.utils.distributed import rank0_only +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +class ParamCount(Callback): + def __init__( + self, + save_s3: bool = False, + ): + self.save_s3 = save_s3 + self.name = self.__class__.__name__ + + @rank0_only + def on_train_start(self, model: Union[ImaginaireModel, list[nn.Module]], iteration: int = 0) -> None: + if isinstance(model, list): + num_param = sum([count_params(m) for m in model]) + else: + num_param = count_params(model) + + log.info(f"Total number of parameters on current rank: {num_param}", rank0_only=False) + info = { + "num_parameters": num_param, + } + + if self.save_s3: + rank = distributed.get_rank() + easy_io.dump(info, f"s3://rundir/{self.name}_{rank}.yaml") diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/per_stream_timing.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/per_stream_timing.py new file mode 100644 index 00000000..e264e32e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/per_stream_timing.py @@ -0,0 +1,117 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +"""Per-stream timing callback. + +Logs ``forward``, ``backward``, ``optimizer_step`` and ``dataloader_train`` +wall-clock time broken down by the data stream that produced each iteration's +batch (``data_batch["dataset_name"]``). + +Useful for verifying load-balance hypotheses such as "action_data_slow drives +the long ``optimizer_step`` time observed at large node counts". Because +:class:`IterativeJointDataLoader` synchronises stream selection across all +ranks via ``seed + global_id``, every rank processes the same stream at the +same iteration, so logging on rank 0 is representative of the global cost. +""" + +from __future__ import annotations + +from collections import defaultdict + +import torch +import wandb + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.imaginaire.utils.callback import Callback + +_TIMER_KEYS: tuple[str, ...] = ("forward", "backward", "optimizer_step", "dataloader_train") + + +class PerStreamTiming(Callback): + """Aggregate ``training_timer`` results by data stream and log to wandb. + + Args: + log_freq: Number of iterations between wandb logs. Each log emits + the per-stream mean time for every key in :data:`_TIMER_KEYS` and + the per-stream iteration count, then resets the accumulators. + """ + + def __init__(self, log_freq: int = 100) -> None: + super().__init__() + self.log_freq = log_freq + self._sums: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float)) + self._counts: dict[str, int] = defaultdict(int) + + @staticmethod + def _extract_stream_name(data_batch: dict) -> str | None: + """Return the ``dataset_name`` carried by the packed batch. + + ``IterativeJointDataLoader`` attaches a per-sample ``dataset_name`` + and the collation produces a list of identical names for one stream. 
+ """ + ds = data_batch.get("dataset_name") + if ds is None: + return None + if isinstance(ds, str): + return ds + if isinstance(ds, (list, tuple)) and ds: + first = ds[0] + return first if isinstance(first, str) else None + return None + + @torch.no_grad() + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + del model, output_batch, loss + stream = self._extract_stream_name(data_batch) + if stream is None: + return + + timer_results = self.trainer.training_timer.results + for key in _TIMER_KEYS: + values = timer_results.get(key) + if not values: + continue + self._sums[stream][key] += float(values[-1]) + self._counts[stream] += 1 + + if iteration % self.log_freq != 0 or iteration == 0: + return + if not distributed.is_rank0() or wandb.run is None: + self._reset() + return + + log_dict: dict[str, float] = {} + for stream_name, key_sums in self._sums.items(): + n = self._counts[stream_name] + if n == 0: + continue + log_dict[f"per_stream_iters/{stream_name}"] = float(n) + for key, total in key_sums.items(): + log_dict[f"per_stream_timer/{stream_name}/{key}"] = total / n + + wandb.log(log_dict, step=iteration) + self._reset() + + def _reset(self) -> None: + self._sums = defaultdict(lambda: defaultdict(float)) + self._counts = defaultdict(int) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/sequence_packing_padding.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/sequence_packing_padding.py new file mode 100644 index 00000000..f61ecdec --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/sequence_packing_padding.py @@ -0,0 +1,70 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import torch +import wandb + +import cosmos3._src.vfm.datasets.sequence_packing as sequence_packing +from cosmos3._src.imaginaire.callbacks.every_n import EveryN +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer + + +class SequencePackingPadding(EveryN): + """ + Callback that saves lengths to which und and gen sequences are padded. This information will be used + to compute FLOPs done during training. + + Args: + every_n (int): Frequency with which callback is run during training. + """ + + def __init__(self, every_n: int = 500): + super().__init__(every_n=every_n, step_size=1, barrier_after_run=False, run_at_start=True) + + def every_n_impl( + self, + trainer: ImaginaireTrainer, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int, + ) -> None: + if wandb.run: + log_dict = { + "SequencePackingPadding/max_causal_len_image_batch": sequence_packing.MAX_CAUSAL_LEN_IMAGE_BATCH, + "SequencePackingPadding/max_full_len_image_batch": sequence_packing.MAX_FULL_LEN_IMAGE_BATCH, + "SequencePackingPadding/max_causal_len_video_batch": sequence_packing.MAX_CAUSAL_LEN_VIDEO_BATCH, + "SequencePackingPadding/max_full_len_video_batch": sequence_packing.MAX_FULL_LEN_VIDEO_BATCH, + } + modality = "video" + if "is_image_batch" in output_batch: + modality = "image" if output_batch["is_image_batch"] else "video" + if "und_token_length" in output_batch: + 
log_dict[f"SequencePackingPadding/und_token_length_{modality}"] = output_batch["und_token_length"] + if "gen_token_length" in output_batch: + log_dict[f"SequencePackingPadding/gen_token_length_{modality}"] = output_batch["gen_token_length"] + if "action_token_length" in output_batch: + log_dict["SequencePackingPadding/action_token_length"] = output_batch["action_token_length"] + if "sound_token_length" in output_batch: + log_dict["SequencePackingPadding/sound_token_length"] = output_batch["sound_token_length"] + if "vision_token_length" in output_batch: + log_dict["SequencePackingPadding/vision_token_length"] = output_batch["vision_token_length"] + + wandb.log( + log_dict, + step=iteration, + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/sigma_loss_analysis.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/sigma_loss_analysis.py new file mode 100644 index 00000000..482282fe --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/sigma_loss_analysis.py @@ -0,0 +1,356 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from dataclasses import dataclass +from typing import Optional + +import matplotlib +import matplotlib.pyplot as plt +import numpy as np +import torch +import torch.distributed as dist +import wandb + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed, misc +from cosmos3._src.imaginaire.utils.callback import Callback +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +def _get_quantile_bins(n=10) -> np.ndarray: + """Return n + 1 linearly spaced bin edges on [0, 1].""" + points = torch.linspace(0, 1, n + 1) + return points.numpy() + + +@dataclass +class _SigmaLossCache: + """A fixed-size queue for caching sigma and loss tensors. + + Stores sigma/loss pairs on CPU. + When the total number of elements exceeds queue_size, the oldest entries + are automatically removed to maintain the size limit. + + Args: + queue_size: Maximum number of elements to store in the cache. + """ + + def __init__(self, queue_size: int = 2000): + self.queue_size = queue_size + self.reset() + + def reset(self): + self.sigma_list: list[torch.Tensor] = [] + self.loss_list: list[torch.Tensor] = [] + self._total_elements: int = 0 + + def add(self, sigma: torch.Tensor, loss: torch.Tensor): + # Convert to bf16 and store on CPU + sigma_cpu = sigma.detach().cpu().to(torch.bfloat16) + loss_cpu = loss.detach().cpu().to(torch.bfloat16) + + self.sigma_list.append(sigma_cpu) + self.loss_list.append(loss_cpu) + self._total_elements += sigma_cpu.numel() + + # Remove oldest elements if queue exceeds max size + while self._total_elements > self.queue_size and len(self.sigma_list) > 1: + removed_sigma = self.sigma_list.pop(0) + self.loss_list.pop(0) + self._total_elements -= removed_sigma.numel() + + def get_arrays(self) -> tuple[torch.Tensor, torch.Tensor]: + if not self.sigma_list: + return torch.tensor([], dtype=torch.bfloat16), torch.tensor([], dtype=torch.bfloat16) + + sigma_arr = torch.cat(self.sigma_list, dim=0) # 
[N_total] (concatenated across cached batches) + loss_arr = torch.cat(self.loss_list, dim=0) # [N_total] + + return sigma_arr, loss_arr + + +class SigmaLossAnalysis(Callback): + """Analyze the relationship between sigma (noise level) and flow matching loss. + + This callback tracks per-instance flow matching losses at different sigma values + during training. It maintains separate caches for image and video batches, + periodically aggregates statistics across all distributed ranks, and logs + the results to wandb. + + The analysis helps understand how well the model learns to denoise at different + noise levels, which is useful for diagnosing training dynamics in flow matching + models. + + Args: + every_n: Log statistics every N iterations. + every_n_viz: Create visualization plots every N iterations (must be a multiple of every_n). + save_s3: If True, save raw data to S3 for offline analysis. + """ + + def __init__( + self, + every_n: int = 1, + every_n_viz: int = 1, + save_s3: bool = False, + ) -> None: + super().__init__() + self.save_s3 = save_s3 + self.every_n = every_n + assert every_n_viz % every_n == 0, "every_n_viz must be a multiple of every_n in sigma_loss_analysis callback" + self.every_n_viz = every_n_viz + self.name = self.__class__.__name__ + + self.image_cache = _SigmaLossCache(queue_size=2000) + self.video_cache = _SigmaLossCache(queue_size=2000) + + def _create_analysis_plots( + self, + sigma_arr: torch.Tensor, # [N] + loss_arr: torch.Tensor, # [N] + ) -> Optional[wandb.Image]: + if len(sigma_arr) == 0: + return None + + # Convert to numpy for plotting + sigma_np = sigma_arr.cpu().float().numpy() + loss_np = loss_arr.cpu().float().numpy() + + fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5)) + + # Linearly spaced sigma bin edges on [0, 1] + sigma_bins = _get_quantile_bins(10) + + # Fixed loss range for the y-axis + y_tick_min, y_tick_max = 0, 1.0 + # 2D histogram with linear sigma bins and fixed [0, 1] loss range + 
loss_bins = np.linspace(y_tick_min, y_tick_max, 20) + + counts, xedges, yedges = np.histogram2d(sigma_np, loss_np, bins=(sigma_bins, loss_bins)) + if counts.max() < 0.1: + return None + + # Plot heatmap with a log-scaled colormap + im = ax1.imshow( + counts.T, + origin="lower", + aspect="auto", + extent=[sigma_bins[0], sigma_bins[-1], y_tick_min, y_tick_max], + norm=matplotlib.colors.LogNorm(vmin=1, vmax=counts.max()), + ) + plt.colorbar(im, ax=ax1) + + # Set fixed loss ticks from 0 to 1 + yticks = np.linspace(y_tick_min, y_tick_max, 6) + ax1.set_yticks(yticks) + ax1.set_yticklabels([f"{y:.1f}" for y in yticks]) + + ax1.set_xlabel("Sigma") + ax1.set_ylabel("Loss") + title = "Sigma vs Loss Distribution" + ax1.set_title(title) + + # Sigma histogram with loss statistics per bin + hist_counts, _ = np.histogram(sigma_np, bins=sigma_bins) + bin_indices = np.digitize(sigma_np, sigma_bins) - 1 + + # Calculate statistics per bin + n_bins = len(sigma_bins) - 1 + means = np.zeros(n_bins) + stds = np.zeros(n_bins) + for i in range(n_bins): + bin_mask = bin_indices == i + if bin_mask.any(): + means[i] = loss_np[bin_mask].mean() + stds[i] = loss_np[bin_mask].std() + else: + means[i] = np.nan + stds[i] = np.nan + + # Plot histogram + bin_centers = (sigma_bins[:-1] + sigma_bins[1:]) / 2 + ax2.bar(bin_centers, hist_counts, width=np.diff(sigma_bins), alpha=0.3, align="center") + + # Plot loss statistics on twin axis + ax2_twin = ax2.twinx() + valid_mask = ~np.isnan(means) + ax2_twin.errorbar( + bin_centers[valid_mask], means[valid_mask], yerr=stds[valid_mask], color="red", fmt="o-", alpha=0.5 + ) + + ax2.set_xlabel("Sigma (Log Scale)") + ax2.set_ylabel("Count") + ax2_twin.set_ylabel("Loss (mean ± std)") + title = "Sigma Distribution with Loss Statistics" + ax2.set_title(title) + + # Add grid for better readability + ax1.grid(True, alpha=0.3) + ax2.grid(True, alpha=0.3) + + # Format bin edges in scientific notation for the tick labels + sigma_labels = [f"{val:.1e}" for val in sigma_bins] + 
ax1.set_xticks(sigma_bins[1:-1]) # Skip boundary bins + ax1.set_xticklabels(sigma_labels[1:-1], rotation=45) + ax1.set_xscale("linear") + ax2.set_xticks(sigma_bins[1:-1]) + ax2.set_xticklabels(sigma_labels[1:-1], rotation=45) + ax2.set_xscale("linear") + + plt.tight_layout() + fig_img = wandb.Image(fig) + plt.close(fig) + + return fig_img + + def _process_stats(self, sigma: torch.Tensor, loss: torch.Tensor) -> dict: + """Calculate summary statistics for sigma and loss distributions. + + Args: + sigma: Tensor of sigma (noise level) values. + loss: Tensor of corresponding loss values. + + Returns: + Dictionary containing: + - sigma_log_mean: Mean of log(sigma). Log-space is used since sigma spans + multiple orders of magnitude, a standard practice on flow matching / EDM models. + - sigma_log_std: Standard deviation of log(sigma). + - loss_mean: Average loss across all samples. + - loss_std: Standard deviation of loss, measuring spread. + - loss_min: Minimum loss value observed. + - loss_max: Maximum loss value observed. + - loss_median: Median (50th percentile) loss, robust to outliers. + - loss_q1: First quartile (25th percentile) of loss. + - loss_q3: Third quartile (75th percentile) of loss. 
+ """ + return { + "sigma_log_mean": float(sigma.log().mean()), + "sigma_log_std": float(sigma.log().std()), + "loss_mean": float(loss.mean()), + "loss_std": float(loss.std()), + "loss_min": float(loss.min()), + "loss_max": float(loss.max()), + "loss_median": float(loss.median()), + "loss_q1": float(torch.quantile(loss.float(), 0.25)), + "loss_q3": float(torch.quantile(loss.float(), 0.75)), + } + + def _gather_and_save(self, cache: _SigmaLossCache, iteration: int, prefix: str, log_viz: bool = True) -> dict: + info = {} + + # Gather data from all ranks + local_sigma, local_loss = cache.get_arrays() + world_size = dist.get_world_size() + + if world_size > 1: + # Gather sizes first + local_size = torch.tensor([len(local_sigma)], dtype=torch.long, device="cuda") # [1] + sizes = [torch.zeros_like(local_size) for _ in range(world_size)] + dist.all_gather(sizes, local_size) + sizes = [s.item() for s in sizes] + + # Gather data + max_size = max(sizes) + if max_size > 0: + # Move to GPU for gathering + padded_sigma = torch.zeros(max_size, dtype=torch.bfloat16, device="cuda") # [max_size] + padded_loss = torch.zeros(max_size, dtype=torch.bfloat16, device="cuda") # [max_size] + + if len(local_sigma) > 0: + padded_sigma[: len(local_sigma)] = local_sigma.cuda() + padded_loss[: len(local_loss)] = local_loss.cuda() + + all_sigma = [torch.zeros_like(padded_sigma) for _ in range(world_size)] + all_loss = [torch.zeros_like(padded_loss) for _ in range(world_size)] + + dist.all_gather(all_sigma, padded_sigma) + dist.all_gather(all_loss, padded_loss) + + if distributed.is_rank0(): + # Combine data from all ranks + valid_sigma = [] + valid_loss = [] + for sigma, loss, size in zip(all_sigma, all_loss, sizes): + if size > 0: + valid_sigma.append(sigma[:size]) + valid_loss.append(loss[:size]) + + if valid_sigma: + sigma_arr = torch.cat(valid_sigma) # [N_total] (across all ranks) + loss_arr = torch.cat(valid_loss) # [N_total] + + # Overall statistics + info[f"{prefix}/total_samples"] = 
sigma_arr.shape[0] + + # Calculate statistics + stats = self._process_stats(sigma_arr, loss_arr) + info.update({f"{prefix}/{k}": v for k, v in stats.items()}) + + # Create visualization + if log_viz: + fig_img = self._create_analysis_plots(sigma_arr, loss_arr) + print(fig_img) + if fig_img is not None: + info[f"{prefix}/distribution_plot"] = fig_img + + if self.save_s3: + save_data = { + "sigma": sigma_arr.cpu(), + "loss": loss_arr.cpu(), + "stats": {k: v for k, v in info.items() if not isinstance(v, wandb.Image)}, + } + easy_io.dump( + save_data, + f"s3://rundir/{self.name}/{prefix}_Iter{iteration:09d}.pkl", + ) + + cache.reset() + return info + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ): + sigma = output_batch["sigma"] + fm_loss_vision_per_instance = output_batch["flow_matching_loss_vision_per_instance"] + + # sigma is [B] (base), [B,1] (TF), or [B,T_max] (DF); reduce to [B] for logging + assert sigma.ndim <= 2, f"Sigma should be [B] or [B,T_max], got shape {sigma.shape}" + if sigma.ndim == 2: + sigma = sigma.mean(dim=-1) # [B] (reduced from [B,T_max] or [B,1]) + + if model.is_image_batch(data_batch): + self.image_cache.add(sigma, fm_loss_vision_per_instance) + else: + self.video_cache.add(sigma, fm_loss_vision_per_instance) + + if iteration % self.every_n == 0: + info = {} + + with misc.timer("sigma_loss_analysis"): + log_viz = iteration % self.every_n_viz == 0 + # Process image data + if len(self.image_cache.sigma_list) > 0: + info.update(self._gather_and_save(self.image_cache, iteration, "sigma_loss_image", log_viz=log_viz)) + + # Process video data + if len(self.video_cache.sigma_list) > 0: + info.update(self._gather_and_save(self.video_cache, iteration, "sigma_loss_video", log_viz=log_viz)) + + if distributed.is_rank0() and info and wandb.run: + wandb.log(info, step=iteration) diff --git 
a/cosmos-inference/cosmos3/_src/vfm/callbacks/skip_nan_step.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/skip_nan_step.py new file mode 100644 index 00000000..799c181f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/skip_nan_step.py @@ -0,0 +1,96 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import torch +import torch.distributed as dist + +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.callback import Callback + + +class SkipNaNStep(Callback): + """Skip the optimizer step only when ALL ranks produce NaN/Inf loss. + + When only some ranks produce NaN, the existing GradClip callback's + nan_to_num handling is sufficient (NaN gradients become 0, valid + gradients from clean ranks are still used). This callback only + intervenes when every rank has NaN, meaning no useful gradient + signal exists. + + The all-reduce ensures all ranks agree on skip/no-skip, preventing + NCCL desync. + + Args: + max_consecutive_nan: Abort training after this many consecutive + all-rank-NaN optimizer steps. Set to 0 to disable the limit. 
+ """ + + def __init__(self, max_consecutive_nan: int = 100) -> None: + super().__init__() + self.max_consecutive_nan = max_consecutive_nan + self._nan_detected = False + self._consecutive_nan_count = 0 + + def on_before_backward( + self, + model_ddp: distributed.DistributedDataParallel, + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + if torch.isnan(loss).any() or torch.isinf(loss).any(): + self._nan_detected = True + + def on_before_optimizer_step( + self, + model_ddp: distributed.DistributedDataParallel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int = 0, + ) -> None: + nan_flag = torch.tensor([1.0 if self._nan_detected else 0.0], device="cuda") + dist.all_reduce(nan_flag, op=dist.ReduceOp.SUM) + nan_rank_count = int(nan_flag.item()) + world_size = dist.get_world_size() + + if nan_rank_count > 0 and nan_rank_count < world_size: + self._consecutive_nan_count = 0 + + elif nan_rank_count == world_size: + if isinstance(model_ddp, distributed.DistributedDataParallel): + model = model_ddp.module + else: + model = model_ddp + for param in model.parameters(): + if param.grad is not None: + param.grad.zero_() + + self._consecutive_nan_count += 1 + log.warning( + f"ALL ranks NaN/Inf at iteration {iteration}, skipping optimizer step " + f"(consecutive: {self._consecutive_nan_count})", + ) + + if self.max_consecutive_nan > 0 and self._consecutive_nan_count >= self.max_consecutive_nan: + raise RuntimeError( + f"Training unstable: all-rank NaN/Inf loss for {self._consecutive_nan_count} " + f"consecutive optimizer steps at iteration {iteration}. 
Aborting.", + ) + else: + self._consecutive_nan_count = 0 + + self._nan_detected = False diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/termination_signal_checkpoint.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/termination_signal_checkpoint.py new file mode 100644 index 00000000..cb3b4ce7 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/termination_signal_checkpoint.py @@ -0,0 +1,178 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Callback that saves an emergency checkpoint before a Slurm job is killed. + +Slurm signal timeline +--------------------- + +**Timeout** (job hits ``--time`` limit):: + + T SIGUSR1 → batch shell (N is 300 via ``--signal=B:SIGUSR1@300``) + T+N sec SIGTERM → all processes + T+N+KillWait SIGKILL → anything still alive (KillWait is 30s) + +**Preemption** (higher-priority job needs the nodes):: + + T SIGUSR1 → batch shell + T+Grace SIGTERM + SIGKILL (GraceTime is 300s in GCP-IAD) + +**User cancel** (``scancel ``):: + + T SIGTERM → all processes (no SIGUSR1) + T+KillWait SIGKILL → anything still alive (KillWait is 30s) + +Implementation +-------------- + +* **How the SIGUSR1 signal is handled:** + + - The Slurm batch script responds to SIGUSR1 by creating a sentinel file + (``$SLURM_LOG_DIR/SIGUSR1_RECEIVED``) on the shared filesystem. 
+ - This callback polls for the presence of this sentinel file at the end of + each training step. + - When detected, it triggers an emergency checkpoint save. + +* **Why the Python process can't receive SIGUSR1 directly:** + + - Pyxis/Enroot containers do not receive SIGUSR1 signals from Slurm (the + signal is sent to the batch shell, not the container). + - Forwarding SIGUSR1 from the batch script with ``srun`` and with + ``scancel --signal`` was attempted; neither worked. + +* **Why no SIGTERM handler is needed:** + + - SIGUSR1 is sent only on preemption and timeout, so it already + distinguishes them from a user cancel. + - Before SIGTERM arrives there are at least 300 s (``GraceTime=300`` for + preemption, ``--signal=B:SIGUSR1@N`` for timeout), which is sufficient + for the poll to detect the sentinel and save a checkpoint. + - SIGTERM and the subsequent SIGKILL are left to terminate the process + naturally after the checkpoint has been saved. + - SIGTERM and SIGUSR1 are logged so we can observe signal delivery into + Pyxis/Enroot containers for debugging purposes. + +To avoid redundant checkpoints, a save is only performed if at least +``save_iter * min_save_fraction`` iterations have elapsed since the last +checkpoint. +""" + +from __future__ import annotations + +import os +import signal +import sys + +import torch + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.callback import Callback + + +class TerminationSignalCheckpoint(Callback): + """Save a checkpoint in response to SIGUSR1 (preemption or timeout). + + Args: + min_save_fraction: Fraction of the regular checkpoint interval (between 0 + and 1) that must have elapsed since the last checkpoint before an + emergency save is allowed. Defaults to 1/3. 
+ """ + + def __init__(self, min_save_fraction: float = 1 / 3): + super().__init__() + self._min_save_fraction = min_save_fraction + self._current_iteration: int = 0 + self._last_checkpoint_iteration: int = 0 + # Captured from on_before_optimizer_step so we can call checkpointer.save(). + self._optimizer: torch.optim.Optimizer | None = None + self._scheduler: torch.optim.lr_scheduler.LRScheduler | None = None + self._grad_scaler: torch.amp.GradScaler | None = None + # Sentinel file created by the batch-shell trap when SIGUSR1 arrives. + # This is the sole detection mechanism because srun/Pyxis does not + # relay SIGUSR1 into the container. + slurm_log_dir = os.environ.get("SLURM_LOG_DIR", "") + self._sigusr1_sentinel = os.path.join(slurm_log_dir, "SIGUSR1_RECEIVED") if slurm_log_dir else "" + + # ------------------------------------------------------------------ + # Lifecycle hooks + # ------------------------------------------------------------------ + + def on_train_start(self, model: ImaginaireModel, iteration: int = 0) -> None: + self._current_iteration = iteration + self._last_checkpoint_iteration = iteration + self._install_termination_signal_handlers() + + def on_before_optimizer_step( + self, + model_ddp, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int = 0, + ) -> None: + self._optimizer = optimizer + self._scheduler = scheduler + self._grad_scaler = grad_scaler + + def on_save_checkpoint_success(self, iteration: int = 0, elapsed_time: float = 0) -> None: + self._last_checkpoint_iteration = iteration + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + self._current_iteration = iteration + + if not self._sigusr1_sentinel or not os.path.exists(self._sigusr1_sentinel): + return + + log.info("[TerminationSignalCheckpoint] 
Detected SIGUSR1 sentinel file. Will save checkpoint.") + + # Check if the minimum progress has been reached since the last checkpoint. + min_progress = int(self.config.checkpoint.save_iter * self._min_save_fraction) + if (iteration - self._last_checkpoint_iteration) < min_progress: + log.info( + f"[TerminationSignalCheckpoint] Only {iteration - self._last_checkpoint_iteration} iterations " + f"since last checkpoint (threshold {min_progress}). Skipping checkpoint save." + ) + sys.exit(0) + + assert self._optimizer is not None, ( + "[TerminationSignalCheckpoint] Optimizer reference not set — on_before_optimizer_step was never called" + ) + + log.info(f"[TerminationSignalCheckpoint] Saving checkpoint at iteration {iteration}.") + self.trainer.checkpointer.save(model, self._optimizer, self._scheduler, self._grad_scaler, iteration=iteration) + # Async DCP checkpointing queues the write to a background process. + # We must wait for it to finish before exiting. + self.trainer.checkpointer.finalize() + log.info(f"[TerminationSignalCheckpoint] Checkpoint saved at iteration {iteration}.") + sys.exit(0) + + # ------------------------------------------------------------------ + # Termination signal handlers + # ------------------------------------------------------------------ + + def _install_termination_signal_handlers(self) -> None: + signal.signal(signal.SIGTERM, self._log_sigterm) + log.info("[TerminationSignalCheckpoint] Installed SIGTERM handler.") + + def _log_sigterm(self, signum: int, frame: object) -> None: + log.info(f"[TerminationSignalCheckpoint] Received SIGTERM at iteration {self._current_iteration}.") diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/training_stats.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/training_stats.py new file mode 100644 index 00000000..ba54b861 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/training_stats.py @@ -0,0 +1,307 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. 
All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import torch +import torch.distributed as dist +import wandb + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.imaginaire.utils.callback import Callback +from cosmos3._src.vfm.callbacks.wandb_log import _LossRecord +from cosmos3._src.vfm.datasets.action.domain_utils import EMBODIMENT_TO_DOMAIN_ID + +# Build inverse mapping: domain_id -> embodiment_type. First occurrence wins when multiple embodiment names share the +# same domain id. 
+DOMAIN_ID_TO_EMBODIMENT: dict[int, str] = {} +for _k, _v in EMBODIMENT_TO_DOMAIN_ID.items(): + DOMAIN_ID_TO_EMBODIMENT.setdefault(_v, _k) + + +class TrainingStatsCallback(Callback): + """Callback for tracking and logging training mode and embodiment statistics to wandb.""" + + def __init__(self, log_freq: int = 100): + super().__init__() + self.log_freq = log_freq + self._mode_counts: dict[str, int] = {} + self._mode_total_count: int = 0 + self._embodiment_counts: dict[str, int] = {} + self._embodiment_total_count: int = 0 + self._per_embodiment_loss: dict[str, _LossRecord] = {} + self._per_embodiment_sub_loss: dict[str, dict[str, _LossRecord]] = {} + + def _accumulate_mode_counts(self, data_batch: dict[str, torch.Tensor]) -> None: + modes = data_batch.get("mode", None) + if modes is None: + return + + if isinstance(modes, str): + modes_list = [modes] + elif isinstance(modes, (list, tuple)): + modes_list = [str(m) for m in modes] + elif isinstance(modes, torch.Tensor): + # Defensive: support cases where mode might be encoded numerically. 
+ modes_list = [str(m) for m in modes.detach().cpu().tolist()] + else: + modes_list = [str(modes)] + + for mode in modes_list: + self._mode_total_count += 1 + self._mode_counts[mode] = self._mode_counts.get(mode, 0) + 1 + + def _accumulate_embodiment_counts(self, data_batch: dict[str, torch.Tensor]) -> None: + domain_ids = data_batch.get("domain_id", None) + if domain_ids is None: + return + + if isinstance(domain_ids, int): + domain_id_list = [domain_ids] + elif isinstance(domain_ids, (list, tuple)): + domain_id_list = [int(d) for d in domain_ids if d is not None] + elif isinstance(domain_ids, torch.Tensor): + # Flatten to handle any shape (scalar, 1D, or 2D with trailing dim) + domain_id_list = [int(d) for d in domain_ids.detach().cpu().flatten().tolist()] + else: + domain_id_list = [int(domain_ids)] + + for domain_id in domain_id_list: + embodiment = DOMAIN_ID_TO_EMBODIMENT.get(domain_id, f"unknown_{domain_id}") + self._embodiment_total_count += 1 + self._embodiment_counts[embodiment] = self._embodiment_counts.get(embodiment, 0) + 1 + + def _gather_global_mode_counts(self) -> tuple[int, dict[str, int]]: + """ + Returns (global_total, global_mode_counts) aggregated across all ranks. 
+ """ + local: dict[str, int] = dict(self._mode_counts) + local["__total__"] = int(self._mode_total_count) + + if dist.is_available() and dist.is_initialized(): + world_size = int(dist.get_world_size()) + gathered: list[dict[str, int] | None] = [None for _ in range(world_size)] + dist.all_gather_object(gathered, local) + else: + gathered = [local] + + global_total = 0 + global_counts: dict[str, int] = {} + for item in gathered: + if not item: + continue + global_total += int(item.get("__total__", 0)) + for k, v in item.items(): + if k == "__total__": + continue + global_counts[k] = global_counts.get(k, 0) + int(v) + return global_total, global_counts + + def _gather_global_embodiment_counts(self) -> tuple[int, dict[str, int]]: + """ + Returns (global_total, global_embodiment_counts) aggregated across all ranks. + """ + local: dict[str, int] = dict(self._embodiment_counts) + local["__total__"] = int(self._embodiment_total_count) + + if dist.is_available() and dist.is_initialized(): + world_size = int(dist.get_world_size()) + gathered: list[dict[str, int] | None] = [None for _ in range(world_size)] + dist.all_gather_object(gathered, local) + else: + gathered = [local] + + global_total = 0 + global_counts: dict[str, int] = {} + for item in gathered: + if not item: + continue + global_total += int(item.get("__total__", 0)) + for k, v in item.items(): + if k == "__total__": + continue + global_counts[k] = global_counts.get(k, 0) + int(v) + return global_total, global_counts + + def _build_mode_log_dict( + self, *, log_prefix: str, global_total: int, global_counts: dict[str, int] + ) -> dict[str, float]: + info: dict[str, float] = {} + + denom = float(global_total) if global_total > 0 else 0.0 + for mode in sorted(global_counts.keys()): + count = float(global_counts.get(mode, 0)) + pct = (100.0 * count / denom) if denom > 0 else 0.0 + info[f"{log_prefix}_stats_mode/{mode}"] = pct + + return info + + def _build_embodiment_log_dict( + self, *, log_prefix: str, 
global_total: int, global_counts: dict[str, int] + ) -> dict[str, float]: + info: dict[str, float] = {} + + denom = float(global_total) if global_total > 0 else 0.0 + for embodiment in sorted(global_counts.keys()): + count = float(global_counts.get(embodiment, 0)) + pct = (100.0 * count / denom) if denom > 0 else 0.0 + info[f"{log_prefix}_stats_embodiment/{embodiment}"] = pct + + return info + + def _get_batch_embodiment(self, data_batch: dict[str, torch.Tensor]) -> str | None: + """Extract the embodiment name from the first non-None sample's domain_id.""" + domain_ids = data_batch.get("domain_id", None) + if domain_ids is None: + return None + if isinstance(domain_ids, torch.Tensor): + if domain_ids.numel() == 0: + return None + domain_id = int(domain_ids.flatten()[0].item()) + elif isinstance(domain_ids, (list, tuple)): + first = next((d for d in domain_ids if d is not None), None) + if first is None: + return None + domain_id = int(first) + else: + domain_id = int(domain_ids) + return DOMAIN_ID_TO_EMBODIMENT.get(domain_id, f"unknown_{domain_id}") + + def _accumulate_per_embodiment_loss( + self, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + ) -> None: + embodiment = self._get_batch_embodiment(data_batch) + if embodiment is None: + return + + if embodiment not in self._per_embodiment_loss: + self._per_embodiment_loss[embodiment] = _LossRecord() + self._per_embodiment_loss[embodiment].loss += loss.detach().float() + self._per_embodiment_loss[embodiment].iter_count += 1 + + if embodiment not in self._per_embodiment_sub_loss: + self._per_embodiment_sub_loss[embodiment] = {} + for key in output_batch: + if "loss" in key and "per_instance" not in key: + if key not in self._per_embodiment_sub_loss[embodiment]: + self._per_embodiment_sub_loss[embodiment][key] = _LossRecord() + self._per_embodiment_sub_loss[embodiment][key].loss += output_batch[key].detach().float() + 
self._per_embodiment_sub_loss[embodiment][key].iter_count += 1 + + def _compute_per_embodiment_loss_stats(self, log_prefix: str) -> dict[str, float]: + """Compute per-embodiment loss averages across all ranks. + + All ranks must call this method (contains collective operations). + Returns the log dict (only meaningful on rank 0). + """ + dist_available = dist.is_available() and dist.is_initialized() + world_size = int(dist.get_world_size()) if dist_available else 1 + + # Step 1: gather union of embodiment names across ranks + local_embodiments = sorted(self._per_embodiment_loss.keys()) + if dist_available: + all_embodiments: list[list[str] | None] = [None for _ in range(world_size)] + dist.all_gather_object(all_embodiments, local_embodiments) + else: + all_embodiments = [local_embodiments] + union_embodiments = sorted({e for el in all_embodiments for e in el}) + + # Step 2: gather union of sub-loss keys across ranks + local_sub_keys = sorted({k for d in self._per_embodiment_sub_loss.values() for k in d}) + if dist_available: + all_sub_keys: list[list[str] | None] = [None for _ in range(world_size)] + dist.all_gather_object(all_sub_keys, local_sub_keys) + else: + all_sub_keys = [local_sub_keys] + union_sub_keys = sorted({k for kl in all_sub_keys for k in kl}) + + # Step 3: insert NaN dummy _LossRecord for missing embodiment/key combos + for emb in union_embodiments: + if emb not in self._per_embodiment_loss: + dummy = _LossRecord() + dummy.loss += torch.tensor([float("nan")], device="cuda") + dummy.iter_count += 1 + self._per_embodiment_loss[emb] = dummy + if emb not in self._per_embodiment_sub_loss: + self._per_embodiment_sub_loss[emb] = {} + for key in union_sub_keys: + if key not in self._per_embodiment_sub_loss[emb]: + dummy = _LossRecord() + dummy.loss += torch.tensor([float("nan")], device="cuda") + dummy.iter_count += 1 + self._per_embodiment_sub_loss[emb][key] = dummy + + # Step 4: compute distributed averages (all ranks participate in all_reduce) + 
log_dict: dict[str, float] = {} + for emb in union_embodiments: + avg, valid = self._per_embodiment_loss[emb].get_stat(return_valid_mask_sum=True) + if valid > 0: + log_dict[f"{log_prefix}_stats_loss/{emb}"] = avg + + for emb in union_embodiments: + for key in union_sub_keys: + avg, valid = self._per_embodiment_sub_loss[emb][key].get_stat(return_valid_mask_sum=True) + if valid > 0: + log_dict[f"{log_prefix}_stats_loss_detail/{emb}_{key}"] = avg + + # Step 5: reset accumulators + self._per_embodiment_loss = {} + self._per_embodiment_sub_loss = {} + + return log_dict + + @torch.no_grad() + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + self._accumulate_mode_counts(data_batch) + self._accumulate_embodiment_counts(data_batch) + self._accumulate_per_embodiment_loss(data_batch, output_batch, loss) + + if iteration % self.log_freq != 0: + return + + # All ranks must participate in collective operations below. 
+ mode_total, mode_counts = self._gather_global_mode_counts() + embodiment_total, embodiment_counts = self._gather_global_embodiment_counts() + per_embodiment_loss_dict = self._compute_per_embodiment_loss_stats(log_prefix="train") + + if not distributed.is_rank0(): + return + + if wandb.run is None: + return + + log_dict: dict[str, float] = {} + log_dict.update( + self._build_mode_log_dict(log_prefix="train", global_total=mode_total, global_counts=mode_counts) + ) + log_dict.update( + self._build_embodiment_log_dict( + log_prefix="train", global_total=embodiment_total, global_counts=embodiment_counts + ) + ) + log_dict.update(per_embodiment_loss_dict) + + wandb.log({k: float(v) for k, v in log_dict.items()}, step=iteration) diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/wandb_log.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/wandb_log.py new file mode 100644 index 00000000..1e6f62a7 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/wandb_log.py @@ -0,0 +1,219 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +""" +Copied from projects/cosmos/reason1/callbacks/wandb_log.py, with the loss_per_token-related code removed. +""" + +from __future__ import annotations + +from dataclasses import dataclass +from typing import Tuple + +import torch +import torch.distributed as dist +import torch.utils.data +import wandb + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.callback import Callback +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +@dataclass +class _LossRecord: + loss: torch.Tensor | float = 0 + iter_count: int = 0 + name: str | None = None + + def reset(self) -> None: + self.loss = 0 + self.iter_count = 0 + + def get_stat(self, return_valid_mask_sum: bool = False) -> float | Tuple[float, float]: + if self.iter_count == 0: + self.loss = torch.tensor([float("nan")], device="cuda") + self.iter_count = 1 + self.loss = self.loss.mean() + msg_str = f"{self.name}: sum_loss={self.loss.item()}/iter_count={self.iter_count}=" + avg_loss_tensor = self.loss / self.iter_count + # Create a mask (1 if valid, 0 if NaN or Inf) + valid_mask = torch.tensor([torch.isfinite(avg_loss_tensor).float()], device="cuda") + msg_str += f"avg_loss={avg_loss_tensor.item()}, valid_mask={valid_mask.item()}, " + + # Replace NaN/Inf with 0 to avoid affecting sum + avg_loss_tensor = torch.where( + torch.isfinite(avg_loss_tensor), avg_loss_tensor, torch.tensor([0.0], device="cuda") + ) + + # Reduce across all ranks + dist.all_reduce(avg_loss_tensor, op=dist.ReduceOp.SUM) # Sum of valid losses + dist.all_reduce(valid_mask, op=dist.ReduceOp.SUM) # Count of valid losses + msg_str += f" | all_reduce: avg_loss={avg_loss_tensor.item()}, valid_mask={valid_mask.item()}" + # Compute final average, avoiding division by zero + if valid_mask.item() > 0: + final_avg_loss = (avg_loss_tensor / valid_mask).item() + valid_mask_sum = valid_mask.item() + else: + final_avg_loss = 0.0 # Default to zero if all values were 
invalid + valid_mask_sum = 0 + + avg_loss = final_avg_loss + msg_str += f" | final: avg_loss={final_avg_loss}" + if self.name is not None: + log.debug(msg_str, rank0_only=False) + self.reset() + if return_valid_mask_sum: + return avg_loss, valid_mask_sum + else: + return avg_loss + + +class WandbCallback(Callback): + def __init__( + self, + logging_iter_multipler: int = 1, + save_logging_iter_multipler: int = 1, + save_s3: bool = False, + ) -> None: + super().__init__() + self.final_loss_log = _LossRecord() + self.final_loss_log_per_dataset = {} + self.final_all_loss_log = {} + self.logging_iter_multipler = logging_iter_multipler + self.save_logging_iter_multipler = save_logging_iter_multipler + assert self.logging_iter_multipler > 0, "logging_iter_multipler should be greater than 0" + self.save_s3 = save_s3 + self.wandb_extra_tag = f"@{logging_iter_multipler}" if logging_iter_multipler > 1 else "" + self.name = "wandb_loss_log" + self.wandb_extra_tag + self.unstable_count = torch.zeros(1, device="cuda") + + def on_training_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + if torch.isnan(loss) or torch.isinf(loss): + log.critical( + f"Unstable loss {loss} at iteration {iteration}", + rank0_only=False, + ) + self.unstable_count += 1 + + # All elements within a batch have the same dataset name (image or video) + dataset_name = data_batch.get("dataset_name", "default")[0] + + if dataset_name not in self.final_loss_log_per_dataset: + self.final_loss_log_per_dataset[dataset_name] = _LossRecord() + self.final_loss_log_per_dataset[dataset_name].name = dataset_name + self.final_loss_log_per_dataset[dataset_name].loss += loss.detach().float() + self.final_loss_log_per_dataset[dataset_name].iter_count += 1 + + self.final_loss_log.loss += loss.detach().float() + self.final_loss_log.iter_count += 1 + + for key in output_batch.keys(): + # Curve 
can be plotted only on aggregated loss, not per-instance loss + if "loss" in key and "per_instance" not in key: + if key not in self.final_all_loss_log: + self.final_all_loss_log[key] = _LossRecord() + self.final_all_loss_log[key].loss += output_batch[key].detach().float() + self.final_all_loss_log[key].iter_count += 1 + + if iteration % (self.config.trainer.logging_iter * self.logging_iter_multipler) == 0: + avg_final_loss = self.final_loss_log.get_stat() + + avg_final_all_loss = {} + for key in self.final_all_loss_log.keys(): + avg_final_all_loss[key] = self.final_all_loss_log[key].get_stat() + + # Step 1: Gather all dataset names across ranks + local_dataset_names = list(self.final_loss_log_per_dataset.keys()) + all_dataset_names = [None for _ in range(dist.get_world_size())] + dist.all_gather_object(all_dataset_names, local_dataset_names) + + # Step 2: Create the union of all dataset names + union_dataset_names = set() + for names in all_dataset_names: + union_dataset_names.update(names) + # Step 3: For any missing dataset name, add dummy _LossRecord with NaN loss + union_dataset_names = sorted(list(union_dataset_names)) # Same order on every rank keeps the all_reduce calls in get_stat() aligned 
+ for dataset_name in union_dataset_names: + if dataset_name not in self.final_loss_log_per_dataset: + dummy = _LossRecord() + dummy.loss += torch.tensor([float("nan")], device="cuda") # Will be masked out + dummy.iter_count += 1 + self.final_loss_log_per_dataset[dataset_name] = dummy + + avg_final_loss_per_dataset = {} + for dataset_name in union_dataset_names: + avg_loss, valid_mask_sum = self.final_loss_log_per_dataset[dataset_name].get_stat( + return_valid_mask_sum=True + ) + if valid_mask_sum > 0: + avg_final_loss_per_dataset[dataset_name] = avg_loss + + dist.all_reduce(self.unstable_count, op=dist.ReduceOp.SUM) + + if distributed.is_rank0() and wandb.run is not None: + info = {} + info.update( + { + f"train{self.wandb_extra_tag}/loss": avg_final_loss, + f"train{self.wandb_extra_tag}/unstable_count": self.unstable_count.item(), + "iteration": iteration, + } + ) + for key, loss in avg_final_all_loss.items(): + info.update( + { + f"train{self.wandb_extra_tag}_detail/{key}": loss, + } + ) + for dataset_name, loss in avg_final_loss_per_dataset.items(): + tag = "" + if "per_seq" in dataset_name: + tag = "_per_seq" + dataset_name = dataset_name.replace("per_seq/", "") + info.update( + { + f"train{self.wandb_extra_tag}_per_data{tag}/{dataset_name}": loss, + } + ) + if self.save_s3: + if ( + iteration + % ( + self.config.trainer.logging_iter + * self.logging_iter_multipler + * self.save_logging_iter_multipler + ) + == 0 + ): + easy_io.dump( + info, + f"s3://rundir/{self.name}/Train_Iter{iteration:09d}.json", + ) + + if wandb: + wandb.log(info, step=iteration) + + # reset unstable count + self.unstable_count.zero_() + self.final_loss_log_per_dataset = {} diff --git a/cosmos-inference/cosmos3/_src/vfm/callbacks/wandb_log_eval.py b/cosmos-inference/cosmos3/_src/vfm/callbacks/wandb_log_eval.py new file mode 100644 index 00000000..739b8a7f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/callbacks/wandb_log_eval.py @@ -0,0 +1,153 @@ +# SPDX-FileCopyrightText: 
Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +from dataclasses import dataclass +from typing import Tuple + +import torch +import torch.distributed as dist +import torch.utils.data +import wandb + +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import distributed, log +from cosmos3._src.imaginaire.utils.callback import Callback +from cosmos3._src.imaginaire.utils.easy_io import easy_io + + +@dataclass +class _LossRecord: + loss: float = 0 + iter_count: int = 0 + + def reset(self) -> None: + self.loss = 0 + self.iter_count = 0 + + def get_stat(self) -> Tuple[float, float]: + if self.iter_count > 0: + avg_loss_tensor = self.loss / self.iter_count + # Create a mask (1 if valid, 0 if NaN or Inf) + valid_mask = torch.tensor([torch.isfinite(avg_loss_tensor).float()], device="cuda") + + # Replace NaN/Inf with 0 to avoid affecting sum + avg_loss_tensor = torch.where( + torch.isfinite(avg_loss_tensor), avg_loss_tensor, torch.tensor([0.0], device="cuda") + ) + + # Reduce across all ranks + dist.all_reduce(avg_loss_tensor, op=dist.ReduceOp.SUM) # Sum of valid losses + dist.all_reduce(valid_mask, op=dist.ReduceOp.SUM) # Count of valid losses + + # Compute final average, avoiding division by zero + if valid_mask.item() > 0: + final_avg_loss = (avg_loss_tensor / valid_mask).item() + 
else: + final_avg_loss = 0.0 # Default to zero if all values were invalid + + avg_loss = final_avg_loss + else: + avg_loss = 0 + self.reset() + return avg_loss + + +class WandbCallback(Callback): + def __init__( + self, + save_s3: bool = False, + ) -> None: + super().__init__() + self.final_loss_log = _LossRecord() + self.final_loss_log_per_dataset = {} + + self.save_s3 = save_s3 + self.wandb_extra_tag = "" + self.name = "wandb_loss_val_log" + self.unstable_count = torch.zeros(1, device="cuda") + self.url_key_list = [] + + def on_validation_step_end( + self, + model: ImaginaireModel, + data_batch: dict[str, torch.Tensor], + output_batch: dict[str, torch.Tensor], + loss: torch.Tensor, + iteration: int = 0, + ) -> None: + if torch.isnan(loss) or torch.isinf(loss): + log.critical( + f"Unstable val loss {loss} at iteration {iteration}", + rank0_only=False, + ) + self.unstable_count += 1 + + dataset_name = data_batch.get("dataset_name", "default") + + # Handle case where dataset_name gets batched into a list + if isinstance(dataset_name, list): + + assert len(dataset_name) == 1, "dataset_name should be a list of 1" + dataset_name = dataset_name[0] + + if dataset_name not in self.final_loss_log_per_dataset: + self.final_loss_log_per_dataset[dataset_name] = _LossRecord() + + self.final_loss_log_per_dataset[dataset_name].loss += loss.detach().float() + self.final_loss_log_per_dataset[dataset_name].iter_count += 1 + self.final_loss_log.loss += loss.detach().float() + self.final_loss_log.iter_count += 1 + + self.url_key_list.append(f"{data_batch.get('__url__', [''])[0]}, {data_batch.get('__key__', [''])[0]}") + + def on_validation_end(self, model: ImaginaireModel, iteration: int = 0) -> None: + avg_final_loss = self.final_loss_log.get_stat() + + log.info(f"avg_final_loss: {avg_final_loss}") + dist.all_reduce(self.unstable_count, op=dist.ReduceOp.SUM) + # gather url and key list from all ranks + url_key_list = [None] * dist.get_world_size() + 
dist.all_gather_object(url_key_list, self.url_key_list) + url_key_list = [item for sublist in url_key_list for item in sublist] + + unique_url_key_list = list(set(url_key_list)) + if distributed.is_rank0(): + info = {} + log.info( + f"[val] number of unique url and key: {len(unique_url_key_list)} / {len(url_key_list)}; avg_final_loss: {avg_final_loss}" + ) + info.update( + { + f"val{self.wandb_extra_tag}/loss": avg_final_loss, + f"val{self.wandb_extra_tag}/unstable_count": self.unstable_count.item(), + "iteration": iteration, + f"val{self.wandb_extra_tag}/num_unique_url_key": len(unique_url_key_list), + f"val{self.wandb_extra_tag}/total_url_key": len(url_key_list), + } + ) + if self.save_s3: + easy_io.dump( + info, + f"s3://rundir/{self.name}/Val_Iter{iteration:09d}.json", + ) + + if wandb.run is not None: + wandb.log(info, step=iteration) + + # reset unstable count + self.unstable_count.zero_() + self.url_key_list = [] diff --git a/cosmos-inference/cosmos3/_src/vfm/checkpointer/__init__.py b/cosmos-inference/cosmos3/_src/vfm/checkpointer/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/checkpointer/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/vfm/checkpointer/dcp.py b/cosmos-inference/cosmos3/_src/vfm/checkpointer/dcp.py new file mode 100644 index 00000000..081db462 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/checkpointer/dcp.py @@ -0,0 +1,849 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Distributed checkpoint (DCP) directory structure and storage backends. + +The checkpointer saves model state in a sharded format across multiple processes: + +self.save_dirname/ +├── iter_000000005/ # Checkpoint at iteration 5 +│ ├── model/ # Model state shards +│ │ ├── __0_0.distcp # Shard 0 from rank 0 +│ │ └── __1_0.distcp # Shard 1 from rank 1 +│ ├── optim/ # Optimizer state shards +│ │ ├── __0_0.distcp # Shard 0 from rank 0 +│ │ └── __1_0.distcp # Shard 1 from rank 1 +│ ├── scheduler/ # Learning rate scheduler state +│ │ ├── __0_0.distcp # Shard 0 from rank 0 +│ │ └── __1_0.distcp # Shard 1 from rank 1 +│ ├── trainer/ # Additional training state +│ │ ├── __0_0.distcp # Shard 0 from rank 0 +│ │ └── __1_0.distcp # Shard 1 from rank 1 +│ └── dataloader/ # Optional per-rank dataloader state +│ ├── rank_0.pkl +│ └── rank_1.pkl +└── latest_checkpoint.txt # Points to most recent checkpoint folder, e.g. 
iter_000000005 + +Storage backends: +- Local filesystem: + self.save_dirname = "{config_job.path_local}/checkpoints" + +- S3 object store: + self.save_dirname = "s3://{bucket}/{config_job.path}/checkpoints" + where bucket = self.config_checkpoint.save_to_object_store.bucket + +The sharded format enables efficient distributed saving/loading by: +1. Parallelizing I/O across processes +2. Reducing memory usage per process +3. Supporting both local and cloud storage backends +""" + +import enum +import multiprocessing +import os +import re +import time +from multiprocessing import get_context +from typing import Any, Dict, List, Optional, Protocol, Tuple, Union, runtime_checkable + +import torch +import torch.distributed as dist +import torch.distributed.checkpoint as dcp +from torch import nn +from torch.distributed.checkpoint.filesystem import FileSystemReader, FileSystemWriter +from torch.distributed.checkpoint.metadata import ( + STATE_DICT_TYPE, + Metadata, + StorageMeta, +) +from torch.distributed.checkpoint.state_dict import ( + StateDictOptions, + get_model_state_dict, + set_model_state_dict, +) +from torch.distributed.checkpoint.stateful import Stateful + +from cosmos3._src.imaginaire.checkpointer.base import AbstractCheckpointer +from cosmos3._src.imaginaire.checkpointer.s3_filesystem import S3StorageReader, S3StorageWriter +from cosmos3._src.imaginaire.config import CheckpointConfig, JobConfig +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import callback, distributed, log, misc +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.vfm.utils.rand_state import get_rand_state_dict, set_rand_state_dict + + +class ModelWrapper(Stateful): + """ + Wrapper for model state dict handling. Strips away the _orig_mod. prefix + among other things from the state dict keys. 
+ """ + + def __init__(self, model: nn.Module) -> None: + self.model = model + + def state_dict(self) -> dict[str, Any]: + return get_model_state_dict(self.model) + + def load_state_dict(self, state_dict: dict[str, Any]) -> None: + set_model_state_dict( + self.model, + model_state_dict=state_dict, + options=StateDictOptions(strict=True), + ) + + +@runtime_checkable +class _DataloaderStateHandler(Protocol): + """Structural contract for callbacks that participate in dataloader-state checkpointing.""" + + checkpoint_component: str + + def has_checkpoint_state(self) -> bool: ... + def state_dict(self) -> dict[Any, Any]: ... + def load_state_dict(self, state_dict: dict[Any, Any]) -> None: ... + + +class _DataloaderWrapper: + """Adapter that surfaces a dataloader-state callback's checkpoint API. + + Walks the registered callbacks at construction time and binds to the + first callback that: + + 1. Declares ``checkpoint_component == "dataloader"``, AND + 2. Returns ``True`` from ``has_checkpoint_state()``. + + The bound callback's ``state_dict`` / ``load_state_dict`` methods are + re-exposed via :meth:`state_dict` / :meth:`load_state_dict`. Callers + must gate those on :meth:`has_state` — invoking them when nothing was + bound raises :class:`RuntimeError`. + + Note: only the first callback tagged ``checkpoint_component=="dataloader"`` + is considered; if it does not currently want its state checkpointed, + no further callbacks are searched. In practice there is at most one + such callback (see ``DataLoaderStateCallback``). 
+ """ + + def __init__(self, callbacks: callback.CallBackGroup | None) -> None: + self._callback: _DataloaderStateHandler | None = None + if callbacks is None: + return + for current_callback in callbacks._callbacks: + if getattr(current_callback, "checkpoint_component", None) != "dataloader": + continue + if current_callback.has_checkpoint_state(): + self._callback = current_callback + return + + def has_state(self) -> bool: + return self._callback is not None + + def state_dict(self) -> dict[Any, Any]: + if self._callback is None: + raise RuntimeError("No dataloader state handler is registered, cannot save dataloader state.") + return self._callback.state_dict() + + def load_state_dict(self, state_dict: dict[Any, Any]) -> None: + if self._callback is None: + raise RuntimeError("No dataloader state handler is registered, cannot load dataloader state.") + self._callback.load_state_dict(state_dict) + + +class AsyncMode(str, enum.Enum): + DISABLED = "disabled" + ASYNC_WITH_PINNED_MEM = "async_with_pinned_mem" + + +class Terminate: + pass + + +class SaveDone: + def __init__(self, iteration: int, elapsed_time: float, succeeded: bool): + self.iteration = iteration + self.elapsed_time = elapsed_time + self.succeeded = succeeded + + def __str__(self): + return f"SaveDone(iteration={self.iteration}, elapsed_time={self.elapsed_time}, succeeded={self.succeeded})" + + +def save_checkpoint_in_background( + receiver_queue: multiprocessing.Queue, + sender_queue: multiprocessing.Queue, + config_checkpoint: CheckpointConfig, + config_job: JobConfig, +) -> None: + """ + Handles model checkpoint saving in a separate background process using PyTorch's distributed functionality. + This function runs in a dedicated process to avoid blocking the main training loop. 
+ + Args: + receiver_queue: Queue to receive state dictionaries and commands from the main process + sender_queue: Queue to send completion signals back to the main process + config_checkpoint: Configuration settings for checkpoint saving behavior + config_job: Configuration settings for the training job + + Flow: + 1. Initializes distributed processing environment + 2. Continuously waits for state dictionaries to save + 3. Saves checkpoints asynchronously + 4. Signals completion back to main process + 5. Terminates when receiving a Terminate signal + + Raises: + AssertionError: If received object is neither Terminate signal nor valid state dict tuple + + Note: + - Uses a different port than the main process to avoid conflicts + - Disables TorchElastic agent store for checkpoint operations + - Automatically cleans up distributed process group on exit + """ + # Configure distributed environment + os.environ["MASTER_PORT"] = str(int(os.environ["MASTER_PORT"]) + 2) + os.environ["TORCHELASTIC_USE_AGENT_STORE"] = "False" + + # Set up GPU device and distributed processing + torch.cuda.set_device(int(os.environ["LOCAL_RANK"])) + if dist.is_initialized(): + dist.destroy_process_group() + dist.init_process_group(backend="gloo") + + # Initialize checkpointing mechanism + checkpoint_handler = DistributedCheckpointer( + config_checkpoint=config_checkpoint, + config_job=config_job, + callbacks=None, + disable_async=True, + ) + + while True: + log.info(f"Checkpoint background process is ready for next task, waiting for new state_dict") + received_data = receiver_queue.get() + log.info(f"Checkpoint background process received new state_dict") + + if isinstance(received_data, Terminate): + log.info(f"Checkpoint background process received termination signal, closing sender queue") + break + + assert isinstance(received_data, tuple), "Received data must be a tuple of (state_dict, checkpoint_path)" + state_dict, checkpoint_path = received_data + + # Save checkpoint and measure time 
taken. + start_time = time.monotonic() + iteration = state_dict["trainer"][0]["iteration"] + succeeded = False + + try: + log.info(f"Saving checkpoint to {checkpoint_path}") + checkpoint_handler.save_state_dict_worker(state_dict, checkpoint_path) + succeeded = True + except Exception as e: + log.error(f"Error saving checkpoint to {checkpoint_path}: {e}") + # Keep looping: if this background process exits, the main process keeps on adding to the queue + finally: + elapsed_time = time.monotonic() - start_time + log.info( + f"Checkpoint save completed in background process. " + f"Time taken: {elapsed_time:.2f} seconds, iteration: {iteration}, " + f"status: {'SUCCESS' if succeeded else 'FAILURE'}" + ) + sender_queue.put(SaveDone(iteration, elapsed_time, succeeded)) + + log.info("Cleaning up: destroying distributed process group") + dist.destroy_process_group() + + +def _replace_keys_with_ema_keys(state_dict: STATE_DICT_TYPE) -> STATE_DICT_TYPE: + """ + Renames the leading "net." prefix of every key to "net_ema.". + """ + if not all(k.startswith("net.") for k in state_dict.keys()): + raise ValueError("All state dict keys must start with 'net.' to be renamed to 'net_ema.'") + # Replace only the leading prefix (count=1) so that a later "net." substring in a key is left untouched. + return {k.replace("net.", "net_ema.", 1): v for k, v in state_dict.items()} + + +class CustomLoadPlanner(dcp.DefaultLoadPlanner): + """ + CustomLoadPlanner that supports ignoring keys during checkpoint load. + This is useful when the checkpoint is saved with a different component + architecture, e.g. different RoPE embeddings than the current model.
+ """ + + def __init__( + self, + flatten_state_dict: bool = True, + flatten_sharded_tensors: bool = True, + allow_partial_load: bool = False, + keys_to_skip_loading: List[str] = [], + load_ema_to_reg: bool = False, + ) -> None: + super().__init__( + flatten_state_dict=flatten_state_dict, + flatten_sharded_tensors=flatten_sharded_tensors, + allow_partial_load=allow_partial_load, + ) + self.keys_to_skip_loading = keys_to_skip_loading + self.load_ema_to_reg = load_ema_to_reg + if len(keys_to_skip_loading) > 0: + log.info(f"Skipping loading of keys that match the following patterns: {keys_to_skip_loading}") + + def set_up_planner( + self, + state_dict: STATE_DICT_TYPE, + metadata: Metadata | None = None, + is_coordinator: bool = False, + ) -> None: + state_dict = self._skip_keys_if_found(state_dict) + + if self.load_ema_to_reg: + state_dict = _replace_keys_with_ema_keys(state_dict) + + super().set_up_planner( + state_dict=state_dict, + metadata=metadata, + is_coordinator=is_coordinator, + ) + + def _skip_keys_if_found( + self, + state_dict: STATE_DICT_TYPE, + ) -> Dict[str, Any]: + """ + While loading the checkpoint, skip the weight loading for the keys + that contain any element of `self.keys_to_skip_loading` as a substring. + """ + if len(self.keys_to_skip_loading) == 0: + return state_dict + + new_state_dict = {} + for fqn, obj in state_dict.items(): + if any(skip_key in fqn for skip_key in self.keys_to_skip_loading): + log.warning(f"Skipping loading of key: {fqn}") + continue + new_state_dict[fqn] = obj + return new_state_dict + + +class CustomSavePlanner(dcp.DefaultSavePlanner): + """ + Custom save planner that enables an override for cache_plans_key when + caching of save plans is enabled. Caching of save plans reduces checkpointing + time by reusing the same save plan across checkpoints. This reduces the + checkpointing time by ~60% (benchmarked using the 235B-A22B Qwen3-VL model + on 64 GB200 nodes). 
+ """ + + def __init__( + self, + flatten_state_dict: bool = True, + flatten_sharded_tensors: bool = True, + dedup_save_to_lowest_rank: bool = False, + save_reg_to_ema: bool = False, + enable_plan_caching: bool = False, + cache_plans_key: str | None = None, + ) -> None: + super().__init__( + flatten_state_dict=flatten_state_dict, + flatten_sharded_tensors=flatten_sharded_tensors, + dedup_save_to_lowest_rank=dedup_save_to_lowest_rank, + enable_plan_caching=enable_plan_caching, + ) + if cache_plans_key is not None: + self._cached_plans_key = cache_plans_key + + self.save_reg_to_ema = save_reg_to_ema + + def set_up_planner( + self, + state_dict: STATE_DICT_TYPE, + storage_meta: StorageMeta | None = None, + is_coordinator: bool = False, + ) -> None: + if self.save_reg_to_ema: + state_dict = _replace_keys_with_ema_keys(state_dict) + + super().set_up_planner( + state_dict=state_dict, + storage_meta=storage_meta, + is_coordinator=is_coordinator, + ) + + +class DistributedCheckpointer(AbstractCheckpointer): + CHECKPOINT_KEYS = ["model", "optim", "scheduler", "trainer", "dataloader"] + + def __init__( + self, + config_checkpoint: CheckpointConfig, + config_job: JobConfig, + callbacks: Optional[callback.CallBackGroup] = None, + disable_async: bool = False, + ): + super().__init__(config_checkpoint, config_job, callbacks) + self.config_checkpoint = config_checkpoint + if config_checkpoint.dcp_async_mode_enabled and not disable_async: + self.async_mode = AsyncMode.ASYNC_WITH_PINNED_MEM + else: + self.async_mode = AsyncMode.DISABLED + + if self.async_mode == AsyncMode.ASYNC_WITH_PINNED_MEM: + ctx = get_context("spawn") + self.mp_queue_send = ctx.Queue() + self.mp_queue_recv = ctx.Queue() + self.mp = ctx.Process( + target=save_checkpoint_in_background, + args=( + self.mp_queue_send, + self.mp_queue_recv, + config_checkpoint, + config_job, + ), + daemon=True, + ) + self.mp.start() + self.cpu_offload_state_dict = None + self.staging_ckpt_file = None + self.staging_stream = 
torch.cuda.Stream() + self.checkpoint_in_progress = False + + def keys_to_resume_during_load(self) -> tuple[set[str], str | None, bool | None]: + """ + Determines the keys to resume from the checkpoint and the checkpoint path. + If the checkpoint is the latest checkpoint of the same model, then it is a + normal resume. If the checkpoint is a different model's checkpoint, then it is + a warm start. + + Args: + None + + Returns: + resume_keys: The keys to resume from the checkpoint. + checkpoint_path: The path to the checkpoint. If the checkpoint is a different + warm_start: Whether to warm start the training from a different model's checkpoint. + If the checkpoint is a different model's checkpoint, then this is True. + If the checkpoint is the latest checkpoint of the same model, then this is False. + """ + latest_checkpoint_file = self._read_latest_checkpoint_file() + + resume_keys = [] + warm_start = None + + if latest_checkpoint_file is not None: + # 1. Resume training from the latest checkpoint of the same model. + warm_start = False + checkpoint_path = os.path.join(self.load_dirname, latest_checkpoint_file) + resume_keys.extend(self.CHECKPOINT_KEYS) + + else: + if self.load_path and not str(self.load_path).endswith(".pt"): + # 2. Warm Start: Resume training from a different model's checkpoint + # specified by `load_path`. + warm_start = True + checkpoint_path = self.load_path + + if self.load_s3_backend_key: + checkpoint_path = f"s3://{self.config_checkpoint.load_from_object_store.bucket}/{checkpoint_path}" + + # If the path doesn't end with specific checkpoint, read the latest + # checkpoint file to determine the most recent checkpoint iteration. + if not re.search(r"/checkpoints/iter_\d{9}/?$", checkpoint_path): + old_ckpt_path = checkpoint_path + latest_ckpt_path = os.path.join(checkpoint_path, "checkpoints/latest_checkpoint.txt") + + # If the latest checkpoint file exists, use it to determine the + # checkpoint path. Otherwise, use the original path. 
+ if easy_io.exists(latest_ckpt_path, backend_key=self.load_s3_backend_key): + checkpoint_file = easy_io.load( + latest_ckpt_path, backend_key=self.load_s3_backend_key + ).strip() + checkpoint_path = f"{checkpoint_path}/checkpoints/{checkpoint_file}" + else: + log.warning( + f"Latest checkpoint file {latest_ckpt_path} not found, load from {old_ckpt_path}" + ) + checkpoint_path = old_ckpt_path + + if self.load_training_state: + resume_keys.extend(self.CHECKPOINT_KEYS) + else: + resume_keys.append("model") + if self.only_load_scheduler_state: + resume_keys.append("scheduler") + else: + checkpoint_path = None + + if len(self.keys_not_to_resume) > 0: + for key in self.keys_not_to_resume: + assert key in self.CHECKPOINT_KEYS, f"Invalid key to resume: {key} not in {self.CHECKPOINT_KEYS}" + resume_keys = [key for key in resume_keys if key not in self.keys_not_to_resume] + + return set(resume_keys), checkpoint_path, warm_start + + @misc.timer("checkpoint loading") + def load( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer | None = None, + scheduler: torch.optim.lr_scheduler.LRScheduler | None = None, + grad_scaler: torch.amp.GradScaler | None = None, + ) -> int: + if self.callbacks is not None: + self.callbacks.on_load_checkpoint_start(model) + + resume_keys, checkpoint_path, warm_start = self.keys_to_resume_during_load() + resume_keys = sorted(resume_keys) + log.critical(f"Resuming ckpt {checkpoint_path} with keys: {resume_keys}") + + iteration = 0 + + if checkpoint_path is not None: + self._check_checkpoint_exists(checkpoint_path) + + for key in resume_keys: + dist.barrier() + + cur_key_ckpt_full_path = os.path.join(checkpoint_path, key) + log.critical(f"Start loading checkpoint from {cur_key_ckpt_full_path}") + + storage_reader = self.get_storage_reader(cur_key_ckpt_full_path) + strict_resume = self.config_checkpoint.strict_resume + + # Note that we only allow skipping loading of keys during warm start. 
If the checkpoint is + # the latest checkpoint of the same model, then we don't need to skip any keys. + keys_to_skip_loading = self.config_checkpoint.keys_to_skip_loading if warm_start else [] + + load_planner = CustomLoadPlanner( + allow_partial_load=not strict_resume, + keys_to_skip_loading=keys_to_skip_loading, + ) + + if key == "model": + log.info("- Loading the model...") + _model_wrapper = ModelWrapper(model) + _state_dict = _model_wrapper.state_dict() + dcp.load( + _state_dict, + storage_reader=storage_reader, + planner=load_planner, + ) + if self.config_checkpoint.load_ema_to_reg: + # The model has both net.* and net_ema.* submodules, so _state_dict + # contains both sets of keys after dcp.load(). Copy EMA weights into + # regular model weights so we can resume from EMA and reset EMA. + for sd_key in list(_state_dict.keys()): + if sd_key.startswith("net."): + key_ema = "net_ema." + sd_key.removeprefix("net.") + assert key_ema in _state_dict, ( + f"EMA key {key_ema} not found in state_dict. " + "Ensure the model has net_ema submodule." 
+ ) + _state_dict[sd_key] = _state_dict[key_ema] + results = _model_wrapper.load_state_dict(_state_dict) + if results is not None: + if len(results.missing_keys) > 0: + raise ValueError(f"Missing keys (not found in checkpoint): {results.missing_keys}") + if len(results.unexpected_keys) > 0: + raise ValueError( + f"Unexpected keys (found in checkpoint but not in model): {results.unexpected_keys}" + ) + + elif key == "optim": + log.info("- Loading the optimizer...") + _state_dict = optimizer.state_dict() + dcp.load( + _state_dict, + storage_reader=storage_reader, + planner=load_planner, + ) + optimizer.load_state_dict(_state_dict) + + elif key == "scheduler": + log.info("- Loading the scheduler...") + _state_dict = scheduler.state_dict() + dcp.load( + _state_dict, + storage_reader=storage_reader, + planner=load_planner, + ) + scheduler.load_state_dict(_state_dict) + + elif key == "trainer": + log.info("- Loading the trainer...") + + # Use rank-specific key for RNG state to support correct per-rank restoration + rng_key = f"rng_state_{dist.get_rank()}" + current_rng_state = get_rand_state_dict() + _state_dict = { + "grad_scaler": grad_scaler.state_dict(), + "iteration": iteration, + } + # Check if rng_key exists in checkpoint metadata to avoid failure with strict_resume=True + metadata = storage_reader.read_metadata() + rng_key_exists = any( + k.startswith(f"{rng_key}.") or k == rng_key for k in metadata.state_dict_metadata.keys() + ) + if rng_key_exists: + _state_dict[rng_key] = current_rng_state + + dcp.load( + _state_dict, + storage_reader=storage_reader, + planner=load_planner, + ) + grad_scaler.load_state_dict(_state_dict["grad_scaler"]) + iteration = _state_dict["iteration"] + set_rand_state_dict(_state_dict.get(rng_key, current_rng_state)) + + elif key == "dataloader": + if not easy_io.exists(cur_key_ckpt_full_path, backend_key=self.load_s3_backend_key): + log.info( + f"Checkpoint {cur_key_ckpt_full_path} does not exist, skip loading dataloader.", + 
rank0_only=False, + ) + continue + + rank = dist.get_rank() + dataloader_pkl_path = os.path.join(cur_key_ckpt_full_path, f"rank_{rank}.pkl") + if not easy_io.exists(dataloader_pkl_path, backend_key=self.load_s3_backend_key): + log.info(f"No dataloader checkpoint found at {dataloader_pkl_path}", rank0_only=False) + continue + + log.info(f"- Loading the dataloader {cur_key_ckpt_full_path}...", rank0_only=False) + _state_dict = easy_io.load( + dataloader_pkl_path, + file_format="pkl", + backend_key=self.load_s3_backend_key, + ) + dataloader_wrapper = _DataloaderWrapper(self.callbacks) + if dataloader_wrapper.has_state(): + dataloader_wrapper.load_state_dict(_state_dict) + + else: + raise ValueError(f"Invalid key: {key}. not support to resume.") + + if self.callbacks is not None and resume_keys: + # Note that this callback is never used in the codebase. + self.callbacks.on_load_checkpoint(model, state_dict={}) + log.info(f"Loaded checkpoint from {checkpoint_path} in iteration {iteration}") + + else: + log.info("Training from scratch.") + + torch.cuda.empty_cache() + + if self.callbacks is not None: + self.callbacks.on_load_checkpoint_end(model, iteration=iteration, checkpoint_path=checkpoint_path) + return iteration + + def _checkpoint_async_with_pinned_memory( + self, checkpoint_file: str, state_dict: Dict[str, Tuple[Any, str]] + ) -> None: + assert self.async_mode == AsyncMode.ASYNC_WITH_PINNED_MEM, "Async mode must be AsyncMode.ASYNC_WITH_PINNED_MEM" + + from torch.distributed._state_dict_utils import _copy_state_dict, _create_cpu_state_dict + + if self.cpu_offload_state_dict is None: + log.info(f"Preparing the CPU memory for staging") + self.cpu_offload_state_dict = _create_cpu_state_dict(state_dict, pin_memory=True, share_memory=True) + + log.info(f"Staging the state_dict in CPU memory") + with torch.cuda.stream(self.staging_stream): + self.cpu_offload_state_dict = _copy_state_dict( + state_dict, + self.cpu_offload_state_dict, + non_blocking=True, + ) + 
self.staging_ckpt_file = checkpoint_file + + self.staging_stream.synchronize() + log.info(f"Staging the state_dict in CPU memory completed") + + self.mp_queue_send.put_nowait((self.cpu_offload_state_dict, self.staging_ckpt_file)) + self.checkpoint_in_progress = True + log.info(f"Submitted checkpoint to background process") + + def _wait_for_previous_async_checkpoint(self) -> None: + """ + Gets the results of previously submitted checkpoints. + Pass them to callbacks if checkpoint succeeded. + """ + assert self.async_mode == AsyncMode.ASYNC_WITH_PINNED_MEM, "Async mode must be AsyncMode.ASYNC_WITH_PINNED_MEM" + + if not self.checkpoint_in_progress: + return + + success = False + try: + log.info(f"Waiting for checkpoint save result") + + # Note that we set a timeout of 1 hour to avoid blocking the main process + # indefinitely. Gloo and NCCL timeouts are ~30 minutes, so this timeout + # should typically be sufficient. + save_done: SaveDone = self.mp_queue_recv.get(timeout=3600) + + log.info(f"Received checkpoint save result: {save_done}") + + if self.callbacks is not None and save_done.succeeded: + self.callbacks.on_save_checkpoint_success( + iteration=save_done.iteration, elapsed_time=save_done.elapsed_time + ) + self.checkpoint_in_progress = False + success = save_done.succeeded + + except Exception as e: + log.error(f"Error waiting for checkpoint save result: {e}") + + if not success: + # Terminate training execution upon a failed checkpoint save attempt. + # A failure at this stage typically indicates a non-recoverable system error. + # Continuing execution would result in subsequent persistent failures and + # unnecessary waste of GPU resources. + raise RuntimeError("Previous checkpoint save failed. 
Exiting...") + + def get_storage_writer(self, checkpoint_path: str) -> Union[S3StorageWriter, FileSystemWriter]: + if self.save_to_object_store: + return S3StorageWriter( + credential_path=self.config_checkpoint.save_to_object_store.credentials, + path=checkpoint_path, + enable_gcs_patch_in_boto3=self.config_checkpoint.enable_gcs_patch_in_boto3, + ) + return FileSystemWriter(path=checkpoint_path) + + def get_storage_reader(self, checkpoint_path: str) -> Union[S3StorageReader, FileSystemReader]: + if self.load_from_object_store: + return S3StorageReader( + credential_path=self.config_checkpoint.load_from_object_store.credentials, + path=checkpoint_path, + enable_gcs_patch_in_boto3=self.config_checkpoint.enable_gcs_patch_in_boto3, + ) + return FileSystemReader(checkpoint_path) + + def _save_as_pkl(self, obj: Any, output_dir: str) -> None: + """Save per-rank Python checkpoint state such as no-replace dataloader progress.""" + rank = dist.get_rank() + path = os.path.join(output_dir, f"rank_{rank}.pkl") + easy_io.dump( + obj, + path, + file_format="pkl", + backend_key=self.save_s3_backend_key, + ) + log.info(f"Saved state to {path}") + + def save_state_dict_worker(self, to_save_dict: Dict[str, Tuple[Any, str]], checkpoint_file: str) -> None: + for key, (v, full_checkpoint_path) in to_save_dict.items(): + if key == "dataloader": + self._save_as_pkl(v, full_checkpoint_path) + else: + storage_writer = self.get_storage_writer(full_checkpoint_path) + # Note that it is ok to create a new CustomSavePlanner object + # for each checkpoint save since the save plans are cached in a + # class dictionary. 
+ save_planner = CustomSavePlanner( + dedup_save_to_lowest_rank=True, + enable_plan_caching=True, + cache_plans_key=f"custom_planner_{key}", + ) + dcp.save( + v, + storage_writer=storage_writer, + planner=save_planner, + ) + + if distributed.is_rank0(): + log.info(f"Saving last checkpoint file {checkpoint_file}") + self._write_latest_checkpoint_file(checkpoint_file) + + log.info(f"Saved checkpoint to {os.path.join(self.save_dirname, checkpoint_file)}") + + def save( + self, + model: ImaginaireModel, + optimizer: torch.optim.Optimizer, + scheduler: torch.optim.lr_scheduler.LRScheduler, + grad_scaler: torch.amp.GradScaler, + iteration: int, + ) -> None: + """Save network weights, optimizer parameters, scheduler parameters to a checkpoint. + + Args: + model (ImaginaireModel): The PyTorch model. + optimizer (torch.optim.Optimizer): The model optimizer. + scheduler (torch.optim.lr_scheduler.LRScheduler): The optimization scheduler. + grad_scaler (torch.amp.GradScaler): The gradient scaler (for mixed precision training). + iteration (int): Current iteration number. 
+ """ + if self.async_mode == AsyncMode.ASYNC_WITH_PINNED_MEM: + self._wait_for_previous_async_checkpoint() + + if self.callbacks is not None: + self.callbacks.on_save_checkpoint_start(model, iteration) + + checkpoint_file = f"iter_{iteration:09}" + + # Use rank-specific key for RNG state to ensure each rank saves its own state + rng_key = f"rng_state_{dist.get_rank()}" + + to_save_dict = { + "model": ModelWrapper(model).state_dict(), + "optim": optimizer.state_dict(), + "scheduler": scheduler.state_dict(), + "trainer": { + "grad_scaler": grad_scaler.state_dict(), + "iteration": iteration, + rng_key: get_rand_state_dict(), + }, + } + dataloader_wrapper = _DataloaderWrapper(self.callbacks) + if dataloader_wrapper.has_state(): + to_save_dict["dataloader"] = dataloader_wrapper.state_dict() + + if self.callbacks is not None: + self.callbacks.on_save_checkpoint(model, state_dict=to_save_dict) + + for k in to_save_dict.keys(): + output_dirname = os.path.join(self.save_dirname, f"iter_{iteration:09}/{k}") + to_save_dict[k] = (to_save_dict[k], output_dirname) + + if self.async_mode == AsyncMode.ASYNC_WITH_PINNED_MEM: + dataloader_entry = to_save_dict.pop("dataloader", None) + if dataloader_entry is not None: + dataloader_state, dataloader_save_dir = dataloader_entry + self._save_as_pkl(dataloader_state, dataloader_save_dir) + self._checkpoint_async_with_pinned_memory(checkpoint_file, to_save_dict) + else: + start_time = time.monotonic() + self.save_state_dict_worker(to_save_dict, checkpoint_file) + elapsed_time = time.monotonic() - start_time + log.info(f"Checkpoint save completed: Time taken: {elapsed_time:.2f} seconds") + + if self.callbacks is not None: + self.callbacks.on_save_checkpoint_success(iteration=iteration, elapsed_time=elapsed_time) + + # This measures exposed (synchronous) checkpoint time, on_save_checkpoint_success() + # is instead called to measure the entire duration for asynchronous checkpoint for the async case too. 
+ if self.callbacks is not None: + self.callbacks.on_save_checkpoint_end(model=None, iteration=iteration) + + def finalize(self) -> None: + super().finalize() + if self.async_mode == AsyncMode.ASYNC_WITH_PINNED_MEM: + if self.mp and self.mp.is_alive(): + # Wait for the previous checkpoint to complete. + self._wait_for_previous_async_checkpoint() + + self.mp_queue_send.put(Terminate()) + self.mp.join() diff --git a/cosmos-inference/cosmos3/_src/vfm/conditioner.py b/cosmos-inference/cosmos3/_src/vfm/conditioner.py new file mode 100644 index 00000000..f812b723 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/conditioner.py @@ -0,0 +1,578 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +import copy +from abc import ABC, abstractmethod +from collections import defaultdict +from contextlib import nullcontext +from dataclasses import dataclass, fields +from enum import Enum +from typing import Any, Dict, List, Optional, Tuple, TypeVar, Union + +import omegaconf +import torch +import torch.nn as nn +from torch.distributed import ProcessGroup + +from cosmos3._src.imaginaire.functional.batch_ops import batch_mul +from cosmos3._src.imaginaire.lazy_config import instantiate +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.count_params import count_params +from cosmos3._src.imaginaire.utils.disabled_train import disabled_train +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.vfm.utils.context_parallel import broadcast + +T = TypeVar("T", bound="BaseCondition") + + +class DataType(str, Enum): + IMAGE = "image" + VIDEO = "video" + MIX = "mix" + + def __str__(self) -> str: + return self.value + + +def broadcast_condition(condition: BaseCondition, process_group: Optional[ProcessGroup] = None) -> BaseCondition: + """ + Broadcast the condition from the minimum rank in the specified group(s). + """ + if condition.is_broadcasted: + return condition + + kwargs = condition.to_dict(skip_underscore=False) + for key, value in kwargs.items(): + if value is not None: + kwargs[key] = broadcast(value, process_group) + kwargs["_is_broadcasted"] = True + return type(condition)(**kwargs) + + +@dataclass(frozen=True) +class BaseCondition(ABC): + """ + Attributes: + _is_broadcasted: Flag indicating if parallel broadcast splitting + has been performed. This is an internal implementation detail. + """ + + _is_broadcasted: bool = False + + def to_dict(self, skip_underscore: bool = True) -> Dict[str, Any]: + """Converts the condition to a dictionary. + + Returns: + Dictionary containing the condition's fields and values. 
+ """ + # return {f.name: getattr(self, f.name) for f in fields(self) if not f.name.startswith("_")} + return {f.name: getattr(self, f.name) for f in fields(self) if not (f.name.startswith("_") and skip_underscore)} + + @property + def is_broadcasted(self) -> bool: + return self._is_broadcasted + + def broadcast(self, process_group: torch.distributed.ProcessGroup) -> BaseCondition: + """Broadcasts and splits the condition across the context parallelism group. + For most conditions, such as Text2WorldCondition, no split is needed. + + Args: + process_group: The process group for broadcast and split + + Returns: + A new BaseCondition instance with the broadcasted and split condition. + """ + if self.is_broadcasted: + return self + return broadcast_condition(self, process_group) + + +@dataclass(frozen=True) +class Text2WorldCondition(BaseCondition): + crossattn_emb: Optional[torch.Tensor] = None + data_type: DataType = DataType.VIDEO + fps: Optional[torch.Tensor] = None + + def edit_data_type(self, data_type: DataType) -> Text2WorldCondition: + """Edit the data type of the condition. + + Args: + data_type: The new data type. + + Returns: + A new Text2WorldCondition instance with the new data type. + """ + kwargs = self.to_dict(skip_underscore=False) + kwargs["data_type"] = data_type + return type(self)(**kwargs) + + @property + def is_video(self) -> bool: + return self.data_type == DataType.VIDEO + + +@dataclass(frozen=True) +class GR00TV1Img2VidCondition(Text2WorldCondition): + gt_first_frame: Optional[torch.Tensor] = None + use_image_condition: bool = False + condition_video_input_mask: Optional[torch.Tensor] = None + + def edit_video_condition( + self, + x0, + process_group: Optional[ProcessGroup] = None, # x0: [B,C,T,H,W] + ) -> GR00TV1Img2VidCondition: + """Edit the video condition to include the video mask information. + + Args: + x0: The video tensor of shape [B,C,T,H,W]; its first frame is used as the condition. + process_group: Optional process group; when given, only its rank 0 sets the first-frame condition. + + Returns: + A new GR00TV1Img2VidCondition instance with the video mask information.
+ """ + pg_size = 1 if process_group is None else process_group.size() + kwargs = self.to_dict(skip_underscore=False) + B, _, T, H, W = x0.shape + condition_video_input_mask = torch.zeros((B, 1, T, H, W), dtype=x0.dtype, device=x0.device) # [B,1,T,H,W] + if pg_size == 1 or process_group.rank() == 0: + kwargs["gt_first_frame"] = x0[:, :, 0].detach() # [B,C,H,W] + condition_video_input_mask[:, :, 0] += 1 + kwargs["condition_video_input_mask"] = condition_video_input_mask + return type(self)(**kwargs) + + +class AbstractEmbModel(nn.Module): + def __init__(self): + super().__init__() + + self._is_trainable = None + self._dropout_rate = None + self._input_key = None + + self._return_dict = False + + @property + def is_trainable(self) -> bool: + return self._is_trainable + + @property + def dropout_rate(self) -> Union[float, torch.Tensor]: + return self._dropout_rate + + @property + def input_key(self) -> str: + return self._input_key + + @property + def is_return_dict(self) -> bool: + return self._return_dict + + @is_trainable.setter + def is_trainable(self, value: bool): + self._is_trainable = value + + @dropout_rate.setter + def dropout_rate(self, value: Union[float, torch.Tensor]): + self._dropout_rate = value + + @input_key.setter + def input_key(self, value: str): + self._input_key = value + + @is_return_dict.setter + def is_return_dict(self, value: bool): + self._return_dict = value + + @is_trainable.deleter + def is_trainable(self): + del self._is_trainable + + @dropout_rate.deleter + def dropout_rate(self): + del self._dropout_rate + + @input_key.deleter + def input_key(self): + del self._input_key + + @is_return_dict.deleter + def is_return_dict(self): + del self._return_dict + + def random_dropout_input( + self, in_tensor: torch.Tensor, dropout_rate: Optional[float] = None, key: Optional[str] = None + ) -> torch.Tensor: + del key + dropout_rate = dropout_rate if dropout_rate is not None else self.dropout_rate + return batch_mul( + torch.bernoulli((1.0 - 
dropout_rate) * torch.ones(in_tensor.shape[0])).type_as(in_tensor), # [B] + in_tensor, + ) # [B,N_text,hidden_size] + + def details(self) -> str: + return "" + + def summary(self) -> str: + input_key = self.input_key if self.input_key is not None else getattr(self, "input_keys", None) + return ( + f"{self.__class__.__name__} \n\tinput key: {input_key}" + f"\n\tParam count: {count_params(self, False)} \n\tTrainable: {self.is_trainable}" + f"\n\tDropout rate: {self.dropout_rate}" + f"\n\t{self.details()}" + ) + + +class TextAttr(AbstractEmbModel): + def __init__( + self, + input_key: List[str], + dropout_rate: Optional[float] = 0.0, + use_empty_string: bool = False, + empty_string_embeddings_path: str = "s3://bucket/predict2_assets/reason1_empty_string_embeddings.pt", + credential_path: str = "credentials/s3_training.secret", + ): + super().__init__() + self._input_key = input_key + self._dropout_rate = dropout_rate + # if True, will use empty string embeddings + # otherwise use zero tensor embeddings + self.use_empty_string = use_empty_string + self._empty_string_embeddings_cache = None + self.empty_string_embeddings_path = empty_string_embeddings_path + self.credential_path = credential_path + + def forward(self, token: torch.Tensor): + return {"crossattn_emb": token} + + def _get_empty_string_embeddings(self) -> torch.Tensor: + """Lazy load and cache empty string embeddings.""" + if self._empty_string_embeddings_cache is None: + self._empty_string_embeddings_cache = easy_io.load( + self.empty_string_embeddings_path, + backend_args={"backend": "s3", "s3_credential_path": self.credential_path}, + ) + return self._empty_string_embeddings_cache + + def random_dropout_input( + self, in_tensor: torch.Tensor, dropout_rate: Optional[float] = None, key: Optional[str] = None + ) -> torch.Tensor: + if key is not None and "mask" in key: + return in_tensor + if not self.use_empty_string: + return super().random_dropout_input(in_tensor, dropout_rate, key) + B = 
in_tensor.shape[0] + dropout_rate = dropout_rate if dropout_rate is not None else self.dropout_rate + empty_string_embeddings = self._get_empty_string_embeddings() + empty_string_embeddings = empty_string_embeddings.expand(in_tensor.shape).to( + dtype=in_tensor.dtype, device=in_tensor.device + ) # [B,N_text,hidden_size] + + keep_mask = torch.bernoulli((1.0 - dropout_rate) * torch.ones(B, device=in_tensor.device)).type_as( + in_tensor + ) # [B] + keep_mask = keep_mask.view(B, *[1] * (in_tensor.dim() - 1)) # [B,1,...] broadcastable shape + return keep_mask * in_tensor + (1.0 - keep_mask) * empty_string_embeddings # [B,N_text,hidden_size] + + def details(self) -> str: + return "Output key: [crossattn_emb]" + + +class TextAttrEmptyStringDrop(AbstractEmbModel): + def __init__(self, input_key: List[str], dropout_rate: Optional[float] = 0.0): + super().__init__() + self._input_key = input_key + self._dropout_rate = dropout_rate + self.empty_prompt_data = None + + def forward(self, token: torch.Tensor): + return {"crossattn_emb": token} + + def random_dropout_input( + self, in_tensor: torch.Tensor, dropout_rate: Optional[float] = None, key: Optional[str] = None + ) -> torch.Tensor: + if key is not None and "mask" in key: + return in_tensor + del key + if self.empty_prompt_data is None: + self.empty_prompt_data = easy_io.load( + "s3://bucket/edify_video/v4/validation/item_dataset/negative_prompt/empty_string_umt5.pt", + backend_args={"backend": "s3", "s3_credential_path": "credentials/s3_training.secret"}, + ) + dropout_rate = dropout_rate if dropout_rate is not None else self.dropout_rate + + B = in_tensor.shape[0] # batch size + # Create dropout mask: 1 -> keep in_tensor, 0 -> use empty_prompt_data + keep_mask = torch.bernoulli((1.0 - dropout_rate) * torch.ones(B, device=in_tensor.device)).type_as( + in_tensor + ) # [B] + keep_mask = keep_mask.view(B, *[1] * (in_tensor.dim() - 1)) # [B,1,...] 
broadcastable shape + # Prepare empty_prompt_data with correct shape, dtype, and device + empty_prompt = self.empty_prompt_data.to(dtype=in_tensor.dtype, device=in_tensor.device) + # Repeat empty_prompt along batch dimension if needed + if empty_prompt.shape[0] != B: + if empty_prompt.shape[0] == 1: + empty_prompt = empty_prompt.expand(B, *empty_prompt.shape[1:]) # [B,N_text,hidden_size] + else: + raise ValueError( + f"empty_prompt_data batch size {empty_prompt.shape[0]} does not match in_tensor batch size {B}" + ) + + # Mix using the dropout mask + return keep_mask * in_tensor + (1.0 - keep_mask) * empty_prompt # [B,N_text,hidden_size] + + def details(self) -> str: + return "Output key: [crossattn_emb]" + + +class ReMapkey(AbstractEmbModel): + def __init__( + self, + input_key: str, + output_key: Optional[str] = None, + dropout_rate: Optional[float] = 0.0, + dtype: Optional[str] = None, + ): + super().__init__() + self.output_key = output_key + self.dtype = { + None: None, + "float": torch.float32, + "bfloat16": torch.bfloat16, + "half": torch.float16, + "float16": torch.float16, + "int": torch.int32, + "long": torch.int64, + }[dtype] + self._input_key = input_key + self._output_key = output_key + self._dropout_rate = dropout_rate + + def forward(self, element: torch.Tensor) -> Dict[str, torch.Tensor]: + key = self.output_key if self.output_key else self.input_key + if isinstance(element, torch.Tensor): + element = element.to(dtype=self.dtype) + return {key: element} + + def details(self) -> str: + key = self.output_key if self.output_key else self.input_key + return f"Output key: {key} \n\tDtype: {self.dtype}" + + +class BooleanFlag(AbstractEmbModel): + def __init__(self, input_key: str, output_key: Optional[str] = None, dropout_rate: Optional[float] = 0.0): + super().__init__() + self._input_key = input_key + self._dropout_rate = dropout_rate + self.output_key = output_key + + def forward(self, *args, **kwargs) -> Dict[str, torch.Tensor]: + del args, kwargs + 
key = self.output_key if self.output_key else self.input_key + return {key: self.flag} + + def random_dropout_input( + self, in_tensor: torch.Tensor, dropout_rate: Optional[float] = None, key: Optional[str] = None + ) -> torch.Tensor: + del key + dropout_rate = dropout_rate if dropout_rate is not None else self.dropout_rate + self.flag = torch.bernoulli((1.0 - dropout_rate) * torch.ones(1)).bool().to(device=in_tensor.device) # [1] + return in_tensor + + def details(self) -> str: + key = self.output_key if self.output_key else self.input_key + return f"Output key: {key} \n\t This is a boolean flag" + + +class GeneralConditioner(nn.Module, ABC): + """ + An abstract module designed to handle various embedding models with conditional and unconditional configurations. + This abstract base class initializes and manages a collection of embedders that can dynamically adjust + their dropout rates based on conditioning. + + Attributes: + KEY2DIM (dict): A mapping from output keys to dimensions used for concatenation. + embedders (nn.ModuleDict): A dictionary containing all embedded models initialized and configured + based on the provided configurations. + + Parameters: + emb_models (Union[List, Any]): A dictionary where keys are embedder names and values are configurations + for initializing the embedders. 
+ + Example: + See Edify4ConditionerConfig + """ + + KEY2DIM = {"crossattn_emb": 1} + + def __init__(self, **emb_models: Union[List, Any]): + super().__init__() + self.embedders = nn.ModuleDict() + for n, (emb_name, emb_config) in enumerate(emb_models.items()): + embedder = instantiate(emb_config) + assert isinstance(embedder, AbstractEmbModel), ( + f"embedder model {embedder.__class__.__name__} has to inherit from AbstractEmbModel" + ) + embedder.is_trainable = getattr(emb_config, "is_trainable", True) + embedder.dropout_rate = getattr(emb_config, "dropout_rate", 0.0) + if not embedder.is_trainable: + embedder.train = disabled_train + for param in embedder.parameters(): + param.requires_grad = False + embedder.eval() + + log.info(f"Initialized embedder #{n}-{emb_name}: \n {embedder.summary()}") + self.embedders[emb_name] = embedder + + @abstractmethod + def forward( + self, + batch: Dict, + override_dropout_rate: Optional[Dict[str, float]] = None, + ) -> Any: + """Should be implemented in subclasses to return the subclass-specific condition type.""" + raise NotImplementedError + + def _forward( + self, + batch: Dict, + override_dropout_rate: Optional[Dict[str, float]] = None, + ) -> Dict: + """ + Processes the input batch through all configured embedders, applying conditional dropout rates if specified. + Output tensors for each key are concatenated along the dimensions specified in KEY2DIM. + + Parameters: + batch (Dict): The input data batch to process. + override_dropout_rate (Optional[Dict[str, float]]): Optional dictionary to override default dropout rates + per embedder name. + + Returns: + Dict: A dictionary of output tensors concatenated by specified dimensions. + + Note: + In case the network code is sensitive to the order of concatenation, you can either control the order via \ + config file or make sure the embedders return a unique key for each output. 
+ """ + output = defaultdict(list) + if override_dropout_rate is None: + override_dropout_rate = {} + + # make sure emb_name in override_dropout_rate is valid + for emb_name in override_dropout_rate.keys(): + assert emb_name in self.embedders, f"invalid name found {emb_name}" + + for emb_name, embedder in self.embedders.items(): + embedding_context = nullcontext if embedder.is_trainable else torch.no_grad + with embedding_context(): + if isinstance(embedder.input_key, str): + emb_out = embedder( + embedder.random_dropout_input( + batch[embedder.input_key], override_dropout_rate.get(emb_name, None) + ) + ) + elif isinstance(embedder.input_key, (list, omegaconf.listconfig.ListConfig)): + emb_out = embedder( + *[ + embedder.random_dropout_input(batch.get(k), override_dropout_rate.get(emb_name, None), k) + for k in embedder.input_key + ] + ) + else: + raise KeyError( + f"Embedder '{embedder.__class__.__name__}' requires an 'input_key' attribute to be defined as either a string or list of strings" + ) + for k, v in emb_out.items(): + output[k].append(v) + # Concatenate the outputs; crossattn_emb is concatenated along dim=1: [B,N_text,hidden_size] + return {k: torch.cat(v, dim=self.KEY2DIM.get(k, -1)) for k, v in output.items()} + + def get_condition_uncondition( + self, + data_batch: Dict, + ) -> Tuple[Any, Any]: + """ + Processes the provided data batch to generate two sets of outputs: conditioned and unconditioned. This method + manipulates the dropout rates of embedders to simulate two scenarios — one where all conditions are applied + (conditioned), and one where they are removed or reduced to the minimum (unconditioned). + + This method first sets the dropout rates to zero for the conditioned scenario to fully apply the embedders' effects. + For the unconditioned scenario, it sets the dropout rates to 1 (or to 0 if the initial unconditional dropout rate + is insignificant) to minimize the embedders' influences, simulating an unconditioned generation. 
+ + Parameters: + data_batch (Dict): The input data batch that contains all necessary information for embedding processing. The + data must match the format and keys required by the embedders. + + Returns: + Tuple[Any, Any]: A tuple containing two conditions: + - The first one contains the outputs with all embedders fully applied (conditioned outputs). + - The second one contains the outputs with embedders minimized or not applied (unconditioned outputs). + """ + cond_dropout_rates, dropout_rates = {}, {} + for emb_name, embedder in self.embedders.items(): + cond_dropout_rates[emb_name] = 0.0 + dropout_rates[emb_name] = 1.0 if embedder.dropout_rate > 1e-4 else 0.0 + + condition: Any = self(data_batch, override_dropout_rate=cond_dropout_rates) + un_condition: Any = self(data_batch, override_dropout_rate=dropout_rates) + return condition, un_condition + + def get_condition_with_negative_prompt( + self, + data_batch: Dict, + ) -> Tuple[Any, Any]: + """ + Similar to get_condition_uncondition, + but uses negative prompts for the unconditioned output. + """ + cond_dropout_rates, uncond_dropout_rates = {}, {} + for emb_name, embedder in self.embedders.items(): + cond_dropout_rates[emb_name] = 0.0 + if isinstance(embedder, TextAttr): + uncond_dropout_rates[emb_name] = 0.0 + else: + uncond_dropout_rates[emb_name] = 1.0 if embedder.dropout_rate > 1e-4 else 0.0 + + data_batch_neg_prompt = copy.deepcopy(data_batch) + if "neg_t5_text_embeddings" in data_batch_neg_prompt: + if isinstance(data_batch_neg_prompt["neg_t5_text_embeddings"], torch.Tensor): + data_batch_neg_prompt["t5_text_embeddings"] = data_batch_neg_prompt["neg_t5_text_embeddings"] + + condition: Any = self(data_batch, override_dropout_rate=cond_dropout_rates) + un_condition: Any = self(data_batch_neg_prompt, override_dropout_rate=uncond_dropout_rates) + + return condition, un_condition + + +class VideoConditioner(GeneralConditioner): + def forward( + self, + batch: Dict, + override_dropout_rate: 
Optional[Dict[str, float]] = None, + ) -> Text2WorldCondition: + output = super()._forward(batch, override_dropout_rate) + return Text2WorldCondition(**output) + + +class GR00TV1Img2VidConditioner(GeneralConditioner): + def forward( + self, + batch: Dict, + override_dropout_rate: Optional[Dict[str, float]] = None, + ) -> GR00TV1Img2VidCondition: + output = super()._forward(batch, override_dropout_rate) + return GR00TV1Img2VidCondition(**output) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/__init__.py b/cosmos-inference/cosmos3/_src/vfm/configs/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/__init__.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/config.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/config.py new file mode 100644 index 00000000..456d7c02 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/config.py @@ -0,0 +1,133 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, List + +import attrs +from omegaconf import OmegaConf + +from cosmos3._src.imaginaire import config +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer as Trainer +from cosmos3._src.imaginaire.utils.config_helper import import_all_modules_from_package +from cosmos3._src.vfm.configs.base.defaults.model_config import ModelConfig + + +@attrs.define(slots=False) +class DataSetting: + """Configuration for data. + + Attributes: + qwen_max_video_token_length: Maximum video token length. + qwen_target_fps: Target fps for video sampling. 
+ text_chat_order: Order of text items in user messages. + """ + + qwen_max_video_token_length: int = 8192 + + +@attrs.define(slots=False) +class Config(config.Config): + data_setting: DataSetting = attrs.field(factory=DataSetting) + defaults: List[Any] = attrs.field( + factory=lambda: [ + "_self_", + {"model": "mot_fsdp"}, + {"data_train": None}, + {"data_val": None}, + {"optimizer": "adamw"}, + {"scheduler": "warmup_cosine_lr"}, + {"checkpoint": "s3"}, + {"callbacks": ["basic", "optimization", "job_monitor", "generation"]}, + {"ema": "power"}, + {"tokenizer": "wan2pt2_tokenizer"}, + {"sound_tokenizer": None}, # Optional: for audio-video generation + {"cluster": "gcp_iad_gb200"}, + {"vlm_config": None}, + {"ckpt_type": "dcp"}, + {"experiment": None}, + ] + ) + + def validate(self) -> None: + super().validate() + self._dispatch_model_config_validate() + + def _dispatch_model_config_validate(self) -> None: + """Run model-family validation on the composed model.config. + + validate() runs before instantiate(), so self.model.config is a + DictConfig wrapping the structured schema rather than the attrs class. + DictConfig surfaces fields but not methods, so to drive the typed + isinstance dispatch the schema must first be materialized via + OmegaConf.to_object. 
+ """ + materialized = OmegaConf.to_object(self.model.config) + if isinstance(materialized, ModelConfig): + materialized.validate(self) + + +def make_config() -> Config: + c = Config( + model=None, + optimizer=None, + scheduler=None, + dataloader_train=None, + dataloader_val=None, + ) + + # Specifying values through instances of attrs + c.job.project = "cosmos3_vfm" + c.job.group = "debug" + c.job.name = "delete_${now:%Y-%m-%d}_${now:%H-%M-%S}" + + c.trainer.type = Trainer + c.trainer.straggler_detection.enabled = False + c.trainer.max_iter = 400_000 + c.trainer.logging_iter = 20 + c.trainer.validation_iter = 100 + c.trainer.run_validation = False + c.trainer.callbacks = None + + c.upload_reproducible_setup = False + + from cosmos3._src.vfm.configs.base.defaults.callbacks import register_callbacks + from cosmos3._src.vfm.configs.base.defaults.checkpointer import register_checkpoint, register_ckpt_type + from cosmos3._src.vfm.configs.base.defaults.cluster import register_cluster + from cosmos3._src.vfm.configs.base.defaults.ema import register_ema + + # from cosmos3._src.vfm.configs.base.defaults.data import register_data + from cosmos3._src.vfm.configs.base.defaults.model import register_model + from cosmos3._src.vfm.configs.base.defaults.optimizer import register_optimizer, register_scheduler + from cosmos3._src.vfm.configs.base.defaults.tokenizer import register_sound_tokenizer, register_tokenizer + from cosmos3._src.vfm.configs.base.defaults.vlm import register_vlm + + # Call this function to register config groups for advanced overriding. 
the order follows the default config groups + # register_data() + register_model() + register_checkpoint() + register_ckpt_type() + register_optimizer() + register_scheduler() + register_callbacks() + register_tokenizer() + register_sound_tokenizer() + register_ema() + register_cluster() + register_vlm() + + # experiment config are defined in the experiment folder + # call import_all_modules_from_package to register them + import_all_modules_from_package("cosmos3._src.vfm.configs.base.experiment", reload=True) + return c diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/__init__.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/callbacks.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/callbacks.py new file mode 100644 index 00000000..5691474a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/callbacks.py @@ -0,0 +1,147 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Callback config options.""" + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.callbacks.manual_gc import ManualGarbageCollection +from cosmos3._src.imaginaire.lazy_config import PLACEHOLDER +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.imaginaire.utils.callback import WandBCallback +from cosmos3._src.vfm.callbacks.compile_tokenizer import CompileTokenizer +from cosmos3._src.vfm.callbacks.dataloading_monitor import DetailedDataLoadingSpeedMonitor +from cosmos3._src.vfm.callbacks.device_monitor import DeviceMonitor +from cosmos3._src.vfm.callbacks.every_n_draw_sample import EveryNDrawSample +from cosmos3._src.vfm.callbacks.expert_heatmap import ExpertHeatmap +from cosmos3._src.vfm.callbacks.grad_clip import GradClip +from cosmos3._src.vfm.callbacks.heart_beat import HeartBeat +from cosmos3._src.vfm.callbacks.iter_speed import IterSpeed +from cosmos3._src.vfm.callbacks.load_pretrained import LoadPretrained +from cosmos3._src.vfm.callbacks.low_precision import LowPrecisionCallback +from cosmos3._src.vfm.callbacks.mfu import MFUCallback +from cosmos3._src.vfm.callbacks.moe_specialization_callback import MoESpecializationCallback +from cosmos3._src.vfm.callbacks.moe_stability_callback import MoEStabilityCallback +from cosmos3._src.vfm.callbacks.norm_monitor import NormMonitor +from cosmos3._src.vfm.callbacks.ofu 
import OFUCallback +from cosmos3._src.vfm.callbacks.param_count import ParamCount +from cosmos3._src.vfm.callbacks.sequence_packing_padding import SequencePackingPadding +from cosmos3._src.vfm.callbacks.sigma_loss_analysis import SigmaLossAnalysis +from cosmos3._src.vfm.callbacks.skip_nan_step import SkipNaNStep +from cosmos3._src.vfm.callbacks.termination_signal_checkpoint import TerminationSignalCheckpoint +from cosmos3._src.vfm.callbacks.training_stats import TrainingStatsCallback +from cosmos3._src.vfm.callbacks.wandb_log import WandbCallback as WandBCallbackMultiplier +from cosmos3._src.vfm.callbacks.wandb_log_eval import WandbCallback as WandBCallbackEval + +BASIC_CALLBACKS = dict( + iter_speed=L(IterSpeed)( # does not use model or optimizer + every_n="${trainer.logging_iter}", + save_s3="${upload_reproducible_setup}", + save_s3_every_log_n=500, + hit_thres=50, + ), + manual_gc=L(ManualGarbageCollection)(every_n=5), # does not use model or optimizer + wandb=L(WandBCallback)(), + wandb_2x=L(WandBCallbackMultiplier)( + logging_iter_multipler=2, + save_logging_iter_multipler=1, + save_s3="${upload_reproducible_setup}", + ), + param_count=L(ParamCount)( # use model + save_s3="${upload_reproducible_setup}", + ), + dataloader_speed=L(DetailedDataLoadingSpeedMonitor)( + every_n=100, + save_s3="${upload_reproducible_setup}", + ), + wandb_val=L(WandBCallbackEval)( + save_s3="${upload_reproducible_setup}", + ), + moe_stability=L(MoEStabilityCallback)(every_n=250), + moe_specialization=L(MoESpecializationCallback)(every_n=250), + expert_heatmap=L(ExpertHeatmap)(), + load_pretrained=L(LoadPretrained)(), + compile_tokenizer=L(CompileTokenizer)(enabled=False, compile_after_iterations=3), + norm_monitor=L(NormMonitor)( + every_n=5000, + log_stat_wandb=True, + save_s3="${upload_reproducible_setup}", + track_activations=True, + ), + sigma_loss_analysis=L(SigmaLossAnalysis)( + every_n=5000, + every_n_viz=5000, + save_s3="${upload_reproducible_setup}", + ), + 
sequence_packing_padding=L(SequencePackingPadding)(every_n="${trainer.logging_iter}"), + mfu=L(MFUCallback)(every_n="${trainer.logging_iter}", grad_accum_iter="${trainer.grad_accum_iter}"), + ofu=L(OFUCallback)(every_n="${trainer.logging_iter}"), +) + +JOB_MONITOR_CALLBACKS = dict( + heart_beat=L(HeartBeat)( + every_n=200, + update_interval_in_minute=20, + save_s3="${upload_reproducible_setup}", + ), + device_monitor=L(DeviceMonitor)( + every_n=200, + save_s3="${upload_reproducible_setup}", + upload_every_n_mul=5, + ), + termination_signal_checkpoint=L(TerminationSignalCheckpoint)( + min_save_fraction=1 / 3, + ), +) + +OPTIMIZATION_CALLBACKS = dict( + skip_nan_step=L(SkipNaNStep)(max_consecutive_nan=100), + grad_clip=L(GradClip)(clip_norm=1.0, track_per_modality=True), # image/video grad-norm split + low_precision=L(LowPrecisionCallback)(update_iter=1, config=PLACEHOLDER, trainer=PLACEHOLDER), # use model +) + +VIZ_ONLINE_SAMPLING_CALLBACKS = dict( + every_n_sample_reg=L(EveryNDrawSample)( + every_n=5000, + save_s3=True, + do_x0_prediction=False, + ), + every_n_sample_ema=L(EveryNDrawSample)( + every_n=5000, + is_ema=True, + save_s3=True, + do_x0_prediction=False, + ), +) + + +def register_callbacks(): + cs = ConfigStore.instance() + cs.store(group="callbacks", package="trainer.callbacks", name="basic", node=BASIC_CALLBACKS) + cs.store(group="callbacks", package="trainer.callbacks", name="job_monitor", node=JOB_MONITOR_CALLBACKS) + cs.store(group="callbacks", package="trainer.callbacks", name="optimization", node=OPTIMIZATION_CALLBACKS) + # Online sampling generation callback + cs.store( + group="callbacks", package="trainer.callbacks", name="viz_online_sampling", node=VIZ_ONLINE_SAMPLING_CALLBACKS + ) + # Register "generation" as alias for "viz_online_sampling" (expected by base config.py defaults) + cs.store(group="callbacks", package="trainer.callbacks", name="generation", node=VIZ_ONLINE_SAMPLING_CALLBACKS) + + TRAINING_STATS_CALLBACKS = dict( + 
training_stats=L(TrainingStatsCallback)( + log_freq=100, + ) + ) + cs.store(group="callbacks", package="trainer.callbacks", name="training_stats", node=TRAINING_STATS_CALLBACKS) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/checkpointer.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/checkpointer.py new file mode 100644 index 00000000..f4e77296 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/checkpointer.py @@ -0,0 +1,132 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +""" +Copied from https://gitlab-master.nvidia.com/dir/imaginaire4/-/blob/d0921eb675d1251e73c4b19acdd78e6ad936ae3b/projects/cosmos/reason2/configs/base/defaults/checkpointer.py without changes +""" + +from typing import Dict + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire import config +from cosmos3._src.imaginaire.checkpointer.dummy import Checkpointer as DummyCheckpointer +from cosmos3._src.imaginaire.config import CheckpointConfig +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.vfm.checkpointer.dcp import DistributedCheckpointer + +local_object_store = config.ObjectStoreConfig( + enabled=False, +) + +pdx_object_store = config.ObjectStoreConfig( + enabled=True, + credentials="credentials/pdx_vfm_checkpoint.secret", + bucket="checkpoints", +) + +s3_object_store = config.ObjectStoreConfig( + enabled=True, + credentials="credentials/s3_training.secret", + bucket="bucket", +) + +s3_eu_object_store = config.ObjectStoreConfig( + enabled=True, + credentials="credentials/s3_training_eu.secret", + bucket="bucket", +) + +gcp_object_store = config.ObjectStoreConfig( + enabled=True, + credentials="credentials/gcp_checkpoint.secret", + bucket="bucket", +) + +neb_eu_object_store = config.ObjectStoreConfig( + enabled=True, + credentials="credentials/neb_eu.secret", + bucket="nv-01-10206-checkpoint-experiments", +) + +CHECKPOINT_LOCAL = CheckpointConfig( + save_to_object_store=local_object_store, + load_from_object_store=local_object_store, + save_iter=5000, + broadcast_via_filesystem=True, + dcp_async_mode_enabled=True, +) + +CHECKPOINT_PDX = CheckpointConfig( + save_to_object_store=pdx_object_store, + load_from_object_store=pdx_object_store, + save_iter=5000, + broadcast_via_filesystem=True, + dcp_async_mode_enabled=True, +) + +CHECKPOINT_S3 = CheckpointConfig( + save_to_object_store=s3_object_store, + load_from_object_store=s3_object_store, + save_iter=5000, + broadcast_via_filesystem=True, + 
dcp_async_mode_enabled=True, +) + +CHECKPOINT_S3_EU = CheckpointConfig( + save_to_object_store=s3_eu_object_store, + load_from_object_store=s3_eu_object_store, + save_iter=5000, + broadcast_via_filesystem=True, + dcp_async_mode_enabled=True, +) + +CHECKPOINT_GCP = CheckpointConfig( + save_to_object_store=gcp_object_store, + save_iter=1000, + load_from_object_store=gcp_object_store, + load_path="", + load_training_state=False, + strict_resume=True, + enable_gcs_patch_in_boto3=True, + dcp_async_mode_enabled=True, +) + +CHECKPOINT_NEB_EU = CheckpointConfig( + save_to_object_store=neb_eu_object_store, + load_from_object_store=neb_eu_object_store, + save_iter=2000, + broadcast_via_filesystem=True, +) + + +def register_checkpoint(): + cs = ConfigStore.instance() + cs.store(group="checkpoint", package="checkpoint", name="local", node=CHECKPOINT_LOCAL) + cs.store(group="checkpoint", package="checkpoint", name="pdx", node=CHECKPOINT_PDX) + cs.store(group="checkpoint", package="checkpoint", name="s3", node=CHECKPOINT_S3) + cs.store(group="checkpoint", package="checkpoint", name="s3_eu", node=CHECKPOINT_S3_EU) + cs.store(group="checkpoint", package="checkpoint", name="gcp", node=CHECKPOINT_GCP) + cs.store(group="checkpoint", package="checkpoint", name="neb_eu", node=CHECKPOINT_NEB_EU) + + +DUMMY_CHECKPOINTER: Dict[str, str] = L(DummyCheckpointer)() +DISTRIBUTED_CHECKPOINTER: Dict[str, str] = L(DistributedCheckpointer)() + + +def register_ckpt_type(): + cs = ConfigStore.instance() + cs.store(group="ckpt_type", package="checkpoint.type", name="dummy", node=DUMMY_CHECKPOINTER) + cs.store(group="ckpt_type", package="checkpoint.type", name="dcp", node=DISTRIBUTED_CHECKPOINTER) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/cluster.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/cluster.py new file mode 100644 index 00000000..bb017174 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/cluster.py @@ -0,0 +1,58 @@ +# 
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import attrs +from hydra.core.config_store import ConfigStore + + +@attrs.define(slots=False) +class ClusterConfig: + """ + Config for the cluster specific information. + Everything cluster specific should be here. + """ + + object_store_bucket_data: str + object_store_bucket_checkpoint: str + object_store_bucket_pretrained: str + + object_store_credential_data: str + object_store_credential_checkpoint: str + object_store_credential_pretrained: str + + +AWSIADH100Config: ClusterConfig = ClusterConfig( + object_store_bucket_data="", + object_store_bucket_checkpoint="bucket", + object_store_bucket_pretrained="bucket", + object_store_credential_data="credentials/s3_training.secret", + object_store_credential_checkpoint="credentials/s3_checkpoint.secret", + object_store_credential_pretrained="credentials/s3_checkpoint.secret", +) + +GCPIADGB200Config: ClusterConfig = ClusterConfig( + object_store_bucket_data="", + object_store_bucket_checkpoint="bucket", + object_store_bucket_pretrained="bucket", + object_store_credential_data="credentials/gcp_checkpoint.secret", + object_store_credential_checkpoint="credentials/gcp_training.secret", + object_store_credential_pretrained="credentials/gcp_training.secret", +) + + +def register_cluster(): + cs = ConfigStore.instance() + 
cs.store(group="cluster", package="job.cluster", name="aws_iad_h100", node=AWSIADH100Config) + cs.store(group="cluster", package="job.cluster", name="gcp_iad_gb200", node=GCPIADGB200Config) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/conditioner.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/conditioner.py new file mode 100644 index 00000000..69f70e71 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/conditioner.py @@ -0,0 +1,194 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import random +from dataclasses import dataclass +from typing import Dict, Optional + +import torch +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.imaginaire.lazy_config import LazyDict +from cosmos3._src.vfm.conditioner import BooleanFlag, GeneralConditioner, ReMapkey, Text2WorldCondition +from cosmos3._src.vfm.utils.context_parallel import broadcast_split_tensor + + +@dataclass(frozen=True) +class Video2WorldCondition(Text2WorldCondition): + use_video_condition: bool = False + # the following two attributes are used to set the video condition during training and inference + gt_frames: Optional[torch.Tensor] = None + condition_video_input_mask: Optional[torch.Tensor] = None + + def set_video_condition( + self, + gt_frames: torch.Tensor, + random_min_num_conditional_frames: int, + random_max_num_conditional_frames: int, + num_conditional_frames: Optional[int] = None, + conditional_frames_probs: Optional[Dict[int, float]] = None, + ) -> "Video2WorldCondition": + """ + Sets the video conditioning frames for video-to-video generation. + + This method creates a conditioning mask for the input video frames that determines + which frames will be used as context frames for generating new frames. The method + handles image batches (T=1) and video batches (T>1) differently. + + Args: + gt_frames: A tensor of ground truth frames with shape [B, C, T, H, W], where: + B = batch size + C = number of channels + T = number of frames + H = height + W = width + + random_min_num_conditional_frames: Minimum number of frames to use for conditioning + when randomly selecting a number of conditioning frames. + + random_max_num_conditional_frames: Maximum number of frames to use for conditioning + when randomly selecting a number of conditioning frames. + + num_conditional_frames: Optional; If provided, all examples in the batch will use + exactly this many frames for conditioning.
If None, a random number of frames + between random_min_num_conditional_frames and random_max_num_conditional_frames + will be selected for each example in the batch. + + conditional_frames_probs: Optional; Dictionary mapping number of frames to probabilities. + If provided, overrides the random_min/max_num_conditional_frames with weighted sampling. + Example: {0: 0.5, 1: 0.25, 2: 0.25} for 50% chance of 0 frames, 25% for 1, 25% for 2. + + Returns: + A new Video2WorldCondition object with the gt_frames and conditioning mask set. + The conditioning mask (condition_video_input_mask) is a binary tensor + of shape [B, 1, T, H, W] where 1 indicates frames used for conditioning and 0 + indicates frames to be generated. + + Notes: + - For image batches (T=1), no conditioning frames are used (num_conditional_frames_B = 0). + - For video batches: + - If num_conditional_frames is provided, all examples use that fixed number of frames. + - Otherwise, each example randomly uses between random_min_num_conditional_frames and + random_max_num_conditional_frames frames. + - The mask marks the first N frames as conditioning frames (set to 1) for each example. 
+ """ + kwargs = self.to_dict(skip_underscore=False) + kwargs["gt_frames"] = gt_frames + + # condition_video_input_mask + B, _, T, H, W = gt_frames.shape + condition_video_input_mask = torch.zeros(B, 1, T, H, W, dtype=gt_frames.dtype, device=gt_frames.device) + if T == 1: # handle image batch + num_conditional_frames_B = torch.zeros(B, dtype=torch.int32) + else: # handle video batch + if num_conditional_frames is not None: + num_conditional_frames_B = torch.ones(B, dtype=torch.int32) * num_conditional_frames + elif conditional_frames_probs is not None: + # Use weighted sampling based on provided probabilities + frames_options = list(conditional_frames_probs.keys()) + weights = list(conditional_frames_probs.values()) + num_conditional_frames_B = torch.tensor( + random.choices(frames_options, weights=weights, k=B), dtype=torch.int32 + ) + else: + num_conditional_frames_B = torch.randint( + random_min_num_conditional_frames, random_max_num_conditional_frames + 1, size=(B,) + ) + for idx in range(B): + condition_video_input_mask[idx, :, : num_conditional_frames_B[idx], :, :] += 1 + + kwargs["condition_video_input_mask"] = condition_video_input_mask + return type(self)(**kwargs) + + def edit_for_inference( + self, is_cfg_conditional: bool = True, num_conditional_frames: int = 1 + ) -> "Video2WorldCondition": + _condition = self.set_video_condition( + gt_frames=self.gt_frames, + random_min_num_conditional_frames=0, + random_max_num_conditional_frames=0, + num_conditional_frames=num_conditional_frames, + ) + if not is_cfg_conditional: + # Do not use classifier free guidance on conditional frames. + # YB found that it leads to worse results. 
+ _condition.use_video_condition.fill_(True) + return _condition + + def broadcast(self, process_group: torch.distributed.ProcessGroup) -> "Video2WorldCondition": + if self.is_broadcasted: + return self + # extra efforts + gt_frames = self.gt_frames + condition_video_input_mask = self.condition_video_input_mask + kwargs = self.to_dict(skip_underscore=False) + kwargs["gt_frames"] = None + kwargs["condition_video_input_mask"] = None + new_condition = Text2WorldCondition.broadcast( + type(self)(**kwargs), + process_group, + ) + + kwargs = new_condition.to_dict(skip_underscore=False) + _, _, T, _, _ = gt_frames.shape + if process_group is not None: + if T > 1 and process_group.size() > 1: + gt_frames = broadcast_split_tensor(gt_frames, seq_dim=2, process_group=process_group) + condition_video_input_mask = broadcast_split_tensor( + condition_video_input_mask, seq_dim=2, process_group=process_group + ) + kwargs["gt_frames"] = gt_frames + kwargs["condition_video_input_mask"] = condition_video_input_mask + return type(self)(**kwargs) + + +class Video2WorldConditioner(GeneralConditioner): + def forward( + self, + batch: Dict, + override_dropout_rate: Optional[Dict[str, float]] = None, + ) -> Video2WorldCondition: + output = super()._forward(batch, override_dropout_rate) + return Video2WorldCondition(**output) + + +_SHARED_CONFIG = dict( + fps=L(ReMapkey)( + input_key="fps", + output_key="fps", + dropout_rate=0.0, + dtype=None, + ), + use_video_condition=L(BooleanFlag)( + input_key="fps", + output_key="use_video_condition", + dropout_rate=0.2, + ), +) + +VideoPredictionConditioner: LazyDict = L(Video2WorldConditioner)( + **_SHARED_CONFIG, +) + + +def register_conditioner(): + cs = ConfigStore.instance() + cs.store( + group="conditioner", + package="model.config.conditioner", + name="video_prediction_conditioner", + node=VideoPredictionConditioner, + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/ema.py 
b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/ema.py new file mode 100644 index 00000000..2ea0b65d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/ema.py @@ -0,0 +1,40 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import attrs +from hydra.core.config_store import ConfigStore + + +@attrs.define(slots=False) +class EMAConfig: + """ + Config for the EMA. + """ + + enabled: bool = True + rate: float = 0.1 + iteration_shift: int = 0 + + +PowerEMAConfig: EMAConfig = EMAConfig( + enabled=True, + rate=0.10, + iteration_shift=0, +) + + +def register_ema(): + cs = ConfigStore.instance() + cs.store(group="ema", package="model.config.ema", name="power", node=PowerEMAConfig) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/model.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/model.py new file mode 100644 index 00000000..3d2ed22f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/model.py @@ -0,0 +1,54 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.vfm.configs.base.defaults.model_config import ( + OmniMoTModelConfig, + ParallelismConfig, +) +from cosmos3._src.vfm.models.omni_mot_model import OmniMoTModel + +MOT_DDP_CONFIG = dict( + trainer=dict( + distributed_parallelism="ddp", + ), + model=L(OmniMoTModel)( + config=OmniMoTModelConfig(), + _recursive_=False, + ), +) + + +MOT_FSDP_CONFIG = dict( + trainer=dict( + distributed_parallelism="fsdp", + ), + model=L(OmniMoTModel)( + config=OmniMoTModelConfig( + parallelism=ParallelismConfig( + data_parallel_shard_degree=8, + ), + ), + _recursive_=False, + ), +) + + +def register_model(): + cs = ConfigStore.instance() + cs.store(group="model", package="_global_", name="mot_ddp", node=MOT_DDP_CONFIG) + cs.store(group="model", package="_global_", name="mot_fsdp", node=MOT_FSDP_CONFIG) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/model_config.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/model_config.py new file mode 100644 index 00000000..ffd14c4c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/model_config.py @@ -0,0 +1,360 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +from typing import Any + +import attrs + +from cosmos3._src.imaginaire.config import Config +from cosmos3._src.imaginaire.lazy_config import LazyDict +from cosmos3._src.vfm.configs.base.defaults.ema import EMAConfig +from cosmos3._src.vfm.configs.base.defaults.parallelism import ParallelismConfig +from cosmos3._src.vfm.configs.base.defaults.vlm import VLMConfig +from cosmos3._src.vfm.configs.base.vlm.defaults.training import PolicyConfig, TrainConfig + + +@attrs.define(slots=False) +class ModelConfig: + """Typed base for project model configs. + + Subclasses override validate to add family-specific checks. The receiver is + a fresh attrs copy from OmegaConf.to_object, so mutations to self are + discarded; write through root_config for any propagating side effects. + + The ema field is required (disabled by default) so trainer/callback reads + stay as model.config.ema.enabled across every family; subclasses that opt in + (e.g. OmniMoTModelConfig) override with their own EMAConfig. + """ + + ema: EMAConfig = EMAConfig(enabled=False) + + def validate(self, root_config: Config) -> None: + return + + +@attrs.define(slots=False) +class DiffusionExpertConfig: + # This determines the range of timesteps before the fourier feature embedding is applied. + timestep_range: float = 1.0 + # Whether to load the generation pathway weights from pretrained LLM/VLM weights. 
+ load_weights_from_pretrained: bool = True + + patch_spatial: int = 2 + max_vae_latent_side_after_patchify: int = ( + 20 # Max dimension (h or w) of the VAE latent after patchification (320/(8*2)) + ) + # Position embedding type for vision tokens: + # - "3d_rope": Additive 3D RoPE embeddings (VideoRopePosition3DEmb) + 1D position IDs for attention + # - "flattened_sin_cos": Additive flattened sin/cos embeddings + 1D position IDs for attention + # - "unified_3d_mrope": No additive embedding + 3D position IDs for Qwen3VL-style mRoPE attention + position_embedding_type: str = "3d_rope" + # When finetuning from lower resolution to higher resolution, the spatial resolution of videos increases. + # So, we need to adjust the position embedding. + # We use NTK-based RoPE extrapolation to adjust the position embedding. + # Reference: (https://www.reddit.com/r/LocalLLaMA/comments/14lz7j5/ntkaware_scaled_rope_allows_llama_models_to_have/) + # Design adapted from Cosmos2.5 (https://arxiv.org/pdf/2511.00062) + # extrapolation_ratio here is how the base of the RoPE is scaled + # b' = b * extrapolation_ratio^(dim / (dim - 2)) + rope_h_extrapolation_ratio: float = 1.0 + rope_w_extrapolation_ratio: float = 1.0 + rope_t_extrapolation_ratio: float = 1.0 + enable_fps_modulation: bool = False + base_fps: int = 24 + # For unified_3d_mrope: whether spatial (H, W) indices reset to 0 for each vision segment + unified_3d_mrope_reset_spatial_ids: bool = True + # Temporal gap at the boundary between different modalities (default 0). A value greater than 0 adds an extra offset to the accumulated temporal offset. + unified_3d_mrope_temporal_modality_margin: int = 0 + + +@attrs.define(slots=False) +class LBLConfig: + # For load balancing loss computation. + # - "local": Use the fraction of tokens routed to each expert only for the local rank. + # - "global": Use the fraction of tokens routed to each expert across all ranks.
+ method: str = "local" + + # Coefficients for the load balancing loss. + # - "und": Coefficient for the load balancing loss for the "und" pathway. + # - "gen": Coefficient for the load balancing loss for the "gen" pathway. + coeff_und: float | None = None + coeff_gen: float | None = None + + +@attrs.define(slots=False) +class RectifiedFlowTrainingConfig: + shift: Any = 5 # Training time shift. If dict, maps resolution (str) to shift value (int) + use_dynamic_shift: bool = False # Whether to use dynamic shifting + train_time_image_distribution: str = "logitnormal" # Training time distribution for images + train_time_video_distribution: str = "logitnormal" # Training time distribution for videos + train_time_action_distribution: str = "logitnormal" # Training time distribution for actions + train_time_sound_distribution: str = "logitnormal" # Training time distribution for sound + train_time_weight: str = "uniform" # Training time weight + loss_scale: float = 1.0 # Loss scale + image_loss_scale: float | None = None # If set, overrides loss_scale for images + sound_loss_scale: float | None = None # If set, overrides loss_scale for sound + use_high_sigma_strategy: bool = False # Whether to use high sigma strategy + high_sigma_ratio: float = 0.05 # Ratio of using high sigmas + high_sigma_timesteps_min: int = 995 # Minimum timestep for high sigma + high_sigma_timesteps_max: int = 1000 # Maximum timestep for high sigma + use_discrete_rf: bool = False # Whether to use discrete formulation of rectified flow + + # user: please adjust this value according to loss_scale to balance the action loss with the video loss. + # default is 10.0 to align with previous training settings. + action_loss_weight: float = 10.0 + + # Independent noise schedule for action. When False (default), action shares the sigma + # sampled from the vision RF on every step — legacy behavior. 
When True, action samples + # its own sigma from `rectified_flow_action` using `shift_action` and + # `use_high_sigma_strategy_action`. Action always uses a shared scalar sigma per sample + # ([B,1]), independent of vision's DF mode. If action opts in to the high-sigma strategy, + # it reuses the global ratio / min / max. + independent_action_schedule: bool = False + shift_action: int | None = None # must be int; None → inherit `shift` (which must also be int) + use_high_sigma_strategy_action: bool = False + + # Independent noise schedule for sound. When False (default), sound shares the vision + # sigma schedule, reindexed to the dense audio-bearing subset. When True, sound samples + # its own scalar sigma per sample ([B,1]) from `rectified_flow_sound` using `shift_sound` + # and `use_high_sigma_strategy_sound`. + independent_sound_schedule: bool = False + shift_sound: int | None = None # must be int; None → inherit `shift` (which must also be int) + use_high_sigma_strategy_sound: bool = False + + # When True, per-instance flow-matching loss is normalized by the count of + # active (noisy) elements rather than all elements — preserves sum/active_count + # semantics so conditioning-heavy samples (e.g. I2V, forward_dynamics, diffusion + # forcing, AR rollout teacher-forcing) contribute gradient on par with K=0 + # samples. With .mean() the gradient of a K-conditioned sample is scaled by + # (T-K)/T, which undertrains the attend-to-clean-history dynamics. Kept + # False by default to preserve legacy loss magnitudes; enable for AR/DF training. + normalize_loss_by_active: bool = False + + +@attrs.define(slots=False) +class RectifiedFlowInferenceConfig: + scheduler_type: str = "unipc" # Scheduler type + num_train_timesteps: int = 1000 + shift: int = 1 + use_dynamic_shifting: bool = False + + +@attrs.define(slots=False) +class FixedStepSamplerConfig: + """Config for the fixed-step sampler used by distilled models. 
+ + Uses a fixed sigma schedule instead of a smooth multi-step solver. + + Mirrors the constructor args of ``FixedStepSampler``. + """ + + # Discrete noise-level schedule (descending, excluding the final 0.0 step). + # Convention: exclude the final 0.0 step — FixedStepSampler appends it automatically. + # Values must be descending. Using 0.999 instead of 1.0 avoids numeric edge cases at sigma=1. + t_list: list[float] = [0.999, 0.75, 0.5, 0.25] + # Integrator type: "ode" (deterministic Euler) or "sde" (stochastic re-noising at each step). + sample_type: str = "ode" + + +# Don't have any defaults and init only in config file. +@attrs.define(slots=False) +class OmniMoTModelConfig(ModelConfig): + """ + Config for Omni MoT model. + """ + + tokenizer: LazyDict = None + net: LazyDict = None + ema: EMAConfig = EMAConfig() + parallelism: ParallelismConfig = ParallelismConfig() + + # LoRA (parameter-efficient fine-tuning). When `lora_enabled=True`, + # `OmniMoTModel.build_net` injects custom LoRA adapters BEFORE FSDP wrap on + # the meta-device network, then re-initializes lora_A/lora_B after + # to_empty + init_weights. Pair with `optimizer.keys_to_select=["lora_"]` + # and `checkpoint.keys_to_skip_loading=[..., "lora_"]`. + lora_enabled: bool = False + lora_rank: int = 16 + lora_alpha: int = 32 + lora_target_modules: str = "q_proj_moe_gen,k_proj_moe_gen,v_proj_moe_gen,o_proj_moe_gen" + + # Rectified flow configs + rectified_flow_training_config: RectifiedFlowTrainingConfig = RectifiedFlowTrainingConfig() + rectified_flow_inference_config: RectifiedFlowInferenceConfig = RectifiedFlowInferenceConfig() + + # Optional fixed-step sampler for distilled models (None for base models). 
+ fixed_step_sampler_config: FixedStepSamplerConfig | None = None + + # Model configs + vlm_config: VLMConfig = VLMConfig() + diffusion_expert_config: DiffusionExpertConfig = DiffusionExpertConfig() + # Training data keys + input_video_key: str = "video" + input_image_key: str = "images" # key to fetch input image from data_batch + input_caption_key: str = "ai_caption" # Key used to fetch input captions + + # State and sequence shapes + state_ch: int = 16 # for latent model, ref to the latent channel number + state_t: int = 8 # for latent model, ref to the latent number of frames + latent_downsample_factor: int = 8 + resolution: str = "512" + max_num_tokens_after_packing: int = 13312 # Final num tokens after sequence packing + + # Attention implementation for joint understanding + generation + # Note "two_way" and "three_way" disallow and remove "End-of-Vision" or other text token in the generation tower. + # "three_way" must only be used when introducing sparsity + joint_attn_implementation: str = ( + "two_way" # "two_way", "three_way" or "flex" (NOTICE: We are planning to remove "flex" soon) + ) + + # Per-layer NATTEN parameters + # Must use "three_way" attention if used. + # If None, all attention layers remain dense. + # If not None, must be a list exactly the size of number of layers, and each layer can be either + # None (dense) or a dictionary, with at least 'kernel_size' or 'kernel_size_float' keys + # specifying sparsity. NATTEN parameters 'dilation' and 'stride' may also be specified either as + # static integers, or as floating point values that will be mapped to their domain during + # runtime. Integer parameters should never be mixed with floating point ones. + # + # Floating point parameters are highly recommended, unless the use case will have a fixed token + # layout (input resolution). 
+ # + # Examples: + # Interleaved sliding window layers, "GPT-OSS"-style, with static window size: + # natten_parameter_list = [None if layer_idx % 2 != 0 else {"kernel_size": (8, 8)}] + # Layers with odd indices ("None"s) will use dense attention, and layers with even indices + # will use a static sliding window size of 8x8. + # + # Interleaved sliding window layers, "GPT-OSS"-style, with input-dependent window size: + # natten_parameter_list = [None if layer_idx % 2 != 0 else {"kernel_size_float": (0.5, 0.5)}] + # Layers with odd indices ("None"s) will use dense attention, and layers with even indices + # will use a dynamic window size that is 50% of the input along each of the two dimensions. + # + # Interleaved sliding window and dilated layers, "DiNAT"-style: + # natten_parameter_list = [ + # { + # "kernel_size_float": (0.5, 0.5), + # "dilation_float": (1.0, 1.0), + # } if layer_idx % 2 != 0 else { + # "kernel_size_float": (0.5, 0.5), + # } + # ] + # All layers will use a dynamic window size that is 50% of the input along each of the two + # dimensions. Layers with odd indices will also dilate to the maximum level possible. + # + natten_parameter_list: list | None = None + + # Temporal causality for training autoregressive video generation models. + # When enabled, applies temporal causal attention to generation supertokens. + # Each supertoken is num_action_tokens_per_supertoken action tokens followed + # by H*W vision tokens; the value is stamped onto the packed sequence by the + # temporal-causal packer and read by attention/KV-cache code unchanged. + # Only supports image2video modes (with or without actions). + # Requires joint_attn_implementation="three_way".
+ video_temporal_causal: bool = False + # "none": standard joint denoising (shared σ, no clean context) + # "teacher_forcing": all frames noised with shared σ; clean history via cross-attention + # "diffusion_forcing": each latent frame gets independent σ ~ Uniform[0,1] + # "teacher_forcing_dcm": replayed teacher-forcing discrete-time consistency distillation + causal_training_strategy: str = attrs.field( + default="none", + validator=attrs.validators.in_({"none", "teacher_forcing", "diffusion_forcing", "teacher_forcing_dcm"}), + ) + + # Load balancing loss config. + lbl: LBLConfig = LBLConfig() + + # vision configs + vision_gen: bool = True # whether to use vision related parameters and condition/generate vision tokens + + # action configs + action_gen: bool = False # whether to use action related parameters and condition/generate action tokens + max_action_dim: int = 32 # maximum dimension of the action space, we need to pad the data to this dimension. + num_embodiment_domains: int = 32 # number of domains for the domain-aware linear layer + + # sound configs + sound_gen: bool = False # whether to use sound related parameters and condition/generate sound tokens + sound_tokenizer: LazyDict | None = None # Sound tokenizer config (e.g., AVAE) + sound_dim: int | None = None # Sound latent channel size (e.g., 64 for AVAE 48kHz) + sound_latent_fps: int = 25 # Sound tokenizer's latent rate (e.g., 48kHz / 1920 hop = 25 Hz) + + log_enc_time_every_n: int = 100 # Frequency of logging encoding time to W&B + + def validate(self, root_config: Config) -> None: + """Skip pretrained loading if a training checkpoint exists. + + Mutates root_config.model.config.* directly because the receiver self + is a fresh attrs copy from OmegaConf.to_object and its writes would be + dropped. + """ + from cosmos3._src.imaginaire.utils import log + from cosmos3._src.vfm.checkpointer.dcp import DistributedCheckpointer + + # There are three cases to consider: + # 1. 
Model is being trained from scratch (using weights from Hugging Face). + # (both _read_latest_checkpoint_file() and load_path are None). + # In this case, we should load the understanding pathway weights from HF weights. + # Additionally, we must copy the understanding pathway weights to the generation + # pathway. + # + # 2. Model is being trained from a previous checkpoint. + # (_read_latest_checkpoint_file() is not None and load_path can be None or not). + # In this case, the model weights have already been loaded from the DCP checkpoint + # (checkpointer/dcp.py). We must skip both loading understanding pathway weights, + # and copying the understanding pathway weights to the generation pathway. + + # 3. Model is being warm-started from a load_path (but no previous checkpoint exists). + # (_read_latest_checkpoint_file() is None and load_path is not None). + # In this case, the model weights have already been loaded from the DCP checkpoint + # due to load_path being specified (checkpointer/dcp.py). However, we must still + # load the understanding weights from HF weights (since the understanding model + # may be moved from Qwen3-VL to Cosmos-Reason2 for example). We should not copy + # the understanding pathway weights to the generation pathway (since the generation + # pathway has already been pretrained using the previous model weights, for example, + # the Qwen3-VL weights). But the understanding weights are always kept unchanged. + + if not self.vlm_config.load_pretrained and not self.diffusion_expert_config.load_weights_from_pretrained: + # Neither `if` branch below is taken; no need to create checkpointer. + return + + checkpointer = DistributedCheckpointer( + root_config.checkpoint, root_config.job, callbacks=None, disable_async=True + ) + + if self.vlm_config.load_pretrained: + if checkpointer._read_latest_checkpoint_file() is not None: + log.info( + "Checkpoint found: disabling pretrained model loading to avoid double loading.
" + "Model weights will be loaded from checkpoint instead of safetensors." + ) + root_config.model.config.vlm_config.load_pretrained = False + + if self.diffusion_expert_config.load_weights_from_pretrained: + if checkpointer.load_path is not None: + log.info( + "Load path found: disabling pretrained model loading for generation pathway. " + "Generation pathway weights will be loaded from load_path instead of safetensors." + ) + root_config.model.config.diffusion_expert_config.load_weights_from_pretrained = False + + +@attrs.define(slots=False) +class VLMModelConfig(ModelConfig): + """ + Config for VLM model. + """ + + policy: PolicyConfig = PolicyConfig() + train: TrainConfig = TrainConfig() diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/multiview_dataloader.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/multiview_dataloader.py new file mode 100644 index 00000000..76475082 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/multiview_dataloader.py @@ -0,0 +1,162 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Hydra ConfigStore registration for multiview dataloaders. + +Registers named dataloader configs that can be referenced via Hydra overrides +(e.g. 
``{override /data_train: video_control_mads_multiview_0823_gcs_720p_10fps_93frames_7views}``) +or used as templates for inline ``L(get_multiview_video_loader)(...)`` in +experiment configs. + +Two naming conventions: + + **Transfer** (with control signal): + ``video_control_{dataset}_{store}_{res}_{fps}_{frames}_{views}`` + + **Predict** (no control signal): + ``video_{dataset}_{store}_{res}_{fps}_{frames}_{views}`` +""" + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.vfm.datasets.multiview.multiview_data_source import ( + DEFAULT_CAMERAS, + INDEX_TO_CAMERA_MAPPING, + TRANSFER_CAPTION_KEY_MAPPING, + TRANSFER_CONTROL_KEY_MAPPING, + TRANSFER_VIDEO_KEY_MAPPING, +) +from cosmos3._src.vfm.datasets.multiview.multiview_dataset import ( + MultiviewAugmentationConfig, + get_multiview_video_loader, +) + +# --------------------------------------------------------------------------- +# Camera view subsets +# --------------------------------------------------------------------------- + +CAMERA_VIEW_CONFIGS: dict[str, tuple[str, ...]] = { + "7views": DEFAULT_CAMERAS, + "1view_front": ("camera_front_wide_120fov",), + "4views": ( + "camera_front_wide_120fov", + "camera_cross_right_120fov", + "camera_rear_tele_30fov", + "camera_cross_left_120fov", + ), +} + +# --------------------------------------------------------------------------- +# Grid dimensions +# --------------------------------------------------------------------------- + +_TRANSFER_DATASETS = ["mads_multiview_0823"] +_OBJECT_STORES = ["gcs"] + +_RESOLUTIONS: list[tuple[str, tuple[int, int]]] = [ + ("720p", (720, 1280)), +] + +_FPS: list[tuple[str, int]] = [ + ("10fps", 1), # MADS transfer data is already at 10 fps +] + +_NUM_VIDEO_FRAMES: list[tuple[str, int]] = [ + ("29frames", 29), + ("61frames", 61), + ("93frames", 93), +] + + +def register_multiview_dataloaders() -> None: + """Register all multiview dataloader configs with Hydra 
ConfigStore.""" + + cs = ConfigStore.instance() + + # ----- Transfer dataloaders (with control signals) ----- + for dataset in _TRANSFER_DATASETS: + for object_store in _OBJECT_STORES: + for resolution_str, resolution_hw in _RESOLUTIONS: + for fps_str, downsample_factor in _FPS: + for num_frames_str, num_frames in _NUM_VIDEO_FRAMES: + for views_str, camera_keys in CAMERA_VIEW_CONFIGS.items(): + name = ( + f"video_control_{dataset}_{object_store}_{resolution_str}_" + f"{fps_str}_{num_frames_str}_{views_str}" + ) + cs.store( + group="data_train", + package="dataloader_train", + name=name, + node=L(get_multiview_video_loader)( + dataset_name=dataset, + is_train=True, + augmentation_config=L(MultiviewAugmentationConfig)( + resolution_hw=resolution_hw, + fps_downsample_factor=downsample_factor, + num_video_frames=num_frames, + camera_keys=camera_keys, + camera_video_key_mapping=TRANSFER_VIDEO_KEY_MAPPING, + camera_caption_key_mapping=TRANSFER_CAPTION_KEY_MAPPING, + camera_control_key_mapping=TRANSFER_CONTROL_KEY_MAPPING, + position_to_camera_mapping=INDEX_TO_CAMERA_MAPPING, + single_caption_camera_name="camera_front_wide_120fov", + ), + ), + ) + + # ----- Predict dataloaders (no control signals, for future use) ----- + # These use named keys (video_camera_front_wide_120fov, etc.) and need + # different datasets (e.g. alpamayo_dec2024) with 30 fps native data. + # Uncomment and add predict datasets to the catalog when needed. 
+ # + # _PREDICT_DATASETS = ["alpamayo_dec2024"] + # _PREDICT_FPS = [("10fps", 3), ("15fps", 2)] # 30 fps native → downsample + # for dataset in _PREDICT_DATASETS: + # for object_store in _OBJECT_STORES: + # for resolution_str, resolution_hw in _RESOLUTIONS: + # for fps_str, downsample_factor in _PREDICT_FPS: + # for num_frames_str, num_frames in _NUM_VIDEO_FRAMES: + # for views_str, camera_keys in CAMERA_VIEW_CONFIGS.items(): + # name = ( + # f"video_{dataset}_{object_store}_{resolution_str}_" + # f"{fps_str}_{num_frames_str}_{views_str}" + # ) + # cs.store( + # group="data_train", + # package="dataloader_train", + # name=name, + # node=L(get_multiview_video_loader)( + # dataset_name=dataset, + # is_train=True, + # augmentation_config=L(MultiviewAugmentationConfig)( + # resolution_hw=resolution_hw, + # fps_downsample_factor=downsample_factor, + # num_video_frames=num_frames, + # camera_keys=camera_keys, + # camera_video_key_mapping=PREDICT_VIDEO_KEY_MAPPING, + # camera_caption_key_mapping=PREDICT_CAPTION_KEY_MAPPING, + # camera_control_key_mapping=None, + # position_to_camera_mapping=None, + # single_caption_camera_name=None, + # ), + # ), + # ) + + +# Auto-register on import +register_multiview_dataloaders() diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/optimizer.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/optimizer.py new file mode 100644 index 00000000..0ae008f8 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/optimizer.py @@ -0,0 +1,112 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Copied from https://gitlab-master.nvidia.com/dir/imaginaire4/-/blob/d0921eb675d1251e73c4b19acdd78e6ad936ae3b/projects/cosmos/reason2/configs/base/defaults/optimizer.py without changes +""" + +from cosmos3._src.imaginaire.lazy_config import PLACEHOLDER +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.imaginaire.utils.config_helper import ConfigStore +from cosmos3._src.vfm.utils.optimizer import build_lr_scheduler, build_optimizer + +optimizer_kwargs = dict( + # Learning rate for the optimizer. + lr=1e-4, + # Weight decay for the optimizer. + weight_decay=0.1, + # Beta1 and beta2 for the optimizer. + betas=[0.9, 0.99], + # Epsilon for the optimizer. + eps=1e-8, + # Whether to use fused updates for all parameters. + fused=True, + # Keys to select for the optimizer. + keys_to_select=[], + # Per-key LR multipliers. Maps parameter name patterns to LR multipliers. + # E.g. {"sound2llm": 5.0, "llm2sound": 5.0} gives those params 5x the base LR. + lr_multipliers={}, + # Whether to disable weight decay for one-dimensional params such as norm weights and biases. + # Default is False to preserve historical optimizer behavior. 
+ disable_weight_decay_for_1d_params=False, +) + +lr_scheduler_kwargs = dict( + warm_up_steps=[2000], + f_min=[0.0], + f_max=[1.0], + f_start=[0.0], + cycle_lengths=[100000], + verbosity_interval=0, +) + + +def register_optimizer(): + cs = ConfigStore.instance() + cs.store( + group="optimizer", + package="optimizer", + name="fusedadamw", + node=L(build_optimizer)( + model=PLACEHOLDER, + optimizer_type="FusedAdam", + **optimizer_kwargs, + ), + ) + cs.store( + group="optimizer", + package="optimizer", + name="adamw", + node=L(build_optimizer)( + model=PLACEHOLDER, + optimizer_type="AdamW", + **optimizer_kwargs, + ), + ) + + +def register_scheduler(): + cs = ConfigStore.instance() + cs.store( + group="scheduler", + package="scheduler", + name="lambdalinear", + node=L(build_lr_scheduler)( + optimizer=PLACEHOLDER, + lr_scheduler_type="LambdaLinear", + warm_up_steps=[1000], + cycle_lengths=[10000000000000], + f_start=[1.0e-6], + f_max=[1.0], + f_min=[1.0], + ), + ) + + # Cosine scheduler that works with any optimizer (including fusedadamw) + cs.store( + group="scheduler", + package="scheduler", + name="lambdacosine", + node=L(build_lr_scheduler)( + optimizer=PLACEHOLDER, + lr_scheduler_type="LambdaCosine", + warm_up_steps=[2000], + cycle_lengths=[100000], + f_start=[0.0], + f_max=[1.0], + f_min=[0.0], + verbosity_interval=0, + ), + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/parallelism.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/parallelism.py new file mode 100644 index 00000000..7403646b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/parallelism.py @@ -0,0 +1,85 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Shared user-facing parallelism schema for VFM and VLM. + +Both project trees (vfm/, vfm/configs/base/vlm/) instantiate the same +ParallelDims runtime at vfm/utils/parallelism.py. They now also share this +single user-facing config schema. Trainer-side translation from the long +descriptive field names here to the short ParallelDims constructor kwargs +happens at the read site (see vfm/models/{omni_mot_model,vlm_model}.py). +""" + +import attrs + + +@attrs.define(slots=False) +class ParallelismConfig: + # Activation checkpointing is used to reduce the memory usage of the model. + # The outputs of each layer are checkpointed; the intermediate results are not saved. + use_activation_checkpointing: bool = True + + # Torch compile is used to compile the model for faster training. + use_torch_compile: bool = False + + # Whether to use CUDA graphs for faster inference. This option does not work during training. + use_cuda_graphs: bool = False + + # Whether the entire Cosmos3 VFM network is compiled, or only a specific region is compiled. + # Use "language" to compile only individual layers in the MOT model. + # Use "all" to compile the MOT model, as well as encode/decode functions. + compiled_region: str = attrs.field( + default="language", + validator=attrs.validators.in_({"all", "language"}), + ) + + # Whether torch.compile should generate symbolic-shape (dynamic) kernels + # (maps to ``torch.compile(dynamic=...)``). 
Defaults to True for training, + # which sees varying shapes across batches (sequence length, CP sharding, ...); + # specializing would recompile continuously. See ParallelismOverrides in + # packages/cosmos3/cosmos3/common/args.py for the inference-side rationale + # (where dynamic=False is preferred for stable AR shapes). + compile_dynamic: bool = True + + # Enable autotuning for pointwise/reduction Triton kernels (e.g. RMSNorm). + # Explores 6 candidate configs instead of the default 1, improving kernel performance + # at the cost of longer first-iteration compilation time. + max_autotune_pointwise: bool = False + + # Enable coordinate descent tuning after autotuning. Starts from the best autotuned + # config and explores nearby configs by adjusting one parameter at a time. + # Requires max_autotune_pointwise=True to have effect on reduction kernels. + coordinate_descent_tuning: bool = False + + # Whether to enable inference mode. + enable_inference_mode: bool = False + + # Number of ranks for sharding the model weights (FSDP). The default -1 + # auto-infers to world_size at runtime via ParallelDims. + data_parallel_shard_degree: int = -1 + + # Number of ranks for replicating the model weights (HSDP outer dim). + # data_parallel_replicate_degree x data_parallel_shard_degree must divide + # world_size when both are explicitly set. + data_parallel_replicate_degree: int = 1 + + # Number of ranks for context parallelism. + context_parallel_shard_degree: int = 1 + + # Number of ranks for CFG parallelism. + cfg_parallel_shard_degree: int = 1 + + # Precision for the model. 
+ precision: str = "bfloat16" diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/tokenizer.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/tokenizer.py new file mode 100644 index 00000000..a0f1bffe --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/tokenizer.py @@ -0,0 +1,198 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.lazy_config import PLACEHOLDER, LazyDict +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.vfm.tokenizers.audio.avae import AVAEInterface +from cosmos3._src.vfm.tokenizers.dc_ae.dc_ae_4x32x32 import DCAE4x32x32Interface +from cosmos3._src.vfm.tokenizers.flux_vae_8x8 import FluxVAEInterface +from cosmos3._src.vfm.tokenizers.uniae.noncausal_4x16x16 import UniAEVAEInterface +from cosmos3._src.vfm.tokenizers.wan2pt1_vae_4x8x8 import Wan2pt1VAEInterface +from cosmos3._src.vfm.tokenizers.wan2pt2_vae_4x16x16 import Wan2pt2VAEInterface + +PRETRAINED_TOKENIZER_WAN2PT1_VAE_PTH = "pretrained/tokenizers/video/wan2pt1/Wan2.1_VAE.pth" +PRETRAINED_TOKENIZER_WAN2PT2_VAE_PTH = "pretrained/tokenizers/video/wan2pt2/Wan2.2_VAE.pth" +PRETRAINED_TOKENIZER_FLUX_VAE_PTH = "pretrained/tokenizers/image/flux/ae.safetensors" + +# UniAE checkpoint paths +PRETRAINED_TOKENIZER_UNIAE_4X16X16_C48_T8TO24_64TO512P_FPS_ALL_ENCODER_NONCAUSAL_DECODER_NONCAUSAL_NOGAN_BEST_S1_VAE_PTH = "pretrained/tokenizers/video/cosmos/uniae4x16x16_c48_t8to24_64to512p_fps_all_encoder_noncausal_decoder_noncausal_nogan_best_s1.pt" + +# DCAE checkpoint paths +PRETRAINED_TOKENIZER_DCAE_PTH = "pretrained/tokenizers/video/cosmos/dc-ae-v-1.0-f32t4c64-cosmos-encoder-causal-decoder-chunk-causal-4-frame-120-pad-7-no-gan.pt" +PRETRAINED_TOKENIZER_DCAE_4X32X32_C64_T120_256P_FPS_ALL_ENCODER_CAUSAL_DECODER_CHUNKCAUSAL4_NOGAN_COSMOS_PAD_7_V0PT2_PTH = "pretrained/tokenizers/video/cosmos/dcae4x32x32_c64_t120_256p_fps_all_encoder_causal_decoder_chunk_causal_4_nogan_cosmos_pad_7_v0.2.pt" + +# AVAE (Audio VAE) checkpoint paths +PRETRAINED_TOKENIZER_AVAE_PTH = "pretrained/tokenizers/audio/avae/model_unwrap.ckpt" +PRETRAINED_TOKENIZER_AVAE_44K_NONCAUSAL = "pretrained/tokenizers/audio/avae/avae_44k_noncausal_21hz_64ch.ckpt" +PRETRAINED_TOKENIZER_AVAE_44K_CAUSAL = "pretrained/tokenizers/audio/avae/avae_44k_causal_21hz_64ch.ckpt" 
+PRETRAINED_TOKENIZER_AVAE_48K_25HZ = "pretrained/tokenizers/audio/avae/avae_48k_noncausal_25hz_64ch.ckpt" +PRETRAINED_TOKENIZER_AVAE_48K_6HZ = "pretrained/tokenizers/audio/avae/avae_48k_noncausal_6hz_64ch.ckpt" + + +# Flux tokenizer config +FluxVAEConfig: LazyDict = L(FluxVAEInterface)( + # This is the flux image tokenizer. + # We use it for bagel inference. + # We do not use it for Cosmos3. + bucket_name=PLACEHOLDER, + object_store_credential_path_pretrained=PLACEHOLDER, + vae_path=PRETRAINED_TOKENIZER_FLUX_VAE_PTH, + chunk_duration=1, + spatial_compression_factor=8, + temporal_compression_factor=1, +) + +Wan2pt1VAEConfig: LazyDict = L(Wan2pt1VAEInterface)( + # 4x8x8 tokenizer + bucket_name=PLACEHOLDER, + object_store_credential_path_pretrained=PLACEHOLDER, + vae_path=PRETRAINED_TOKENIZER_WAN2PT1_VAE_PTH, + spatial_compression_factor=8, + temporal_compression_factor=4, +) + +Wan2pt2VAEConfig: LazyDict = L(Wan2pt2VAEInterface)( + bucket_name=PLACEHOLDER, + object_store_credential_path_pretrained=PLACEHOLDER, + vae_path=PRETRAINED_TOKENIZER_WAN2PT2_VAE_PTH, + spatial_compression_factor=16, + temporal_compression_factor=4, +) + +DCAE4x32x32Config: LazyDict = L(DCAE4x32x32Interface)( + bucket_name=PLACEHOLDER, + object_store_credential_path_pretrained=PLACEHOLDER, + vae_path=PRETRAINED_TOKENIZER_DCAE_PTH, + spatial_compression_factor=32, + temporal_compression_factor=4, +) + +DCAE4x32x32C64T120_256pFpsAllEncoderCausalDecoderChunkCausal4NoganCosmosPad7V0pt2Config: LazyDict = L( + DCAE4x32x32Interface +)( + bucket_name=PLACEHOLDER, + object_store_credential_path_pretrained=PLACEHOLDER, + vae_path=PRETRAINED_TOKENIZER_DCAE_4X32X32_C64_T120_256P_FPS_ALL_ENCODER_CAUSAL_DECODER_CHUNKCAUSAL4_NOGAN_COSMOS_PAD_7_V0PT2_PTH, + model_name="dcae4x32x32_c64_t120_256p_fps_all_encoder_causal_decoder_chunk_causal_4_nogan_cosmos_pad_7_v0.2", + spatial_compression_factor=32, + temporal_compression_factor=4, +) + 
+UniAE4x16x16C48T8to24_64to512pFpsAllEncoderNoncausalDecoderNoncausalNoganBestS1Config: LazyDict = L(UniAEVAEInterface)( + bucket_name=PLACEHOLDER, + object_store_credential_path_pretrained=PLACEHOLDER, + vae_path=PRETRAINED_TOKENIZER_UNIAE_4X16X16_C48_T8TO24_64TO512P_FPS_ALL_ENCODER_NONCAUSAL_DECODER_NONCAUSAL_NOGAN_BEST_S1_VAE_PTH, + spatial_compression_factor=16, + temporal_compression_factor=4, +) + +# ============================================================================= +# AVAE (Audio VAE) Tokenizer Configs +# ============================================================================= + +# Legacy config with tanh companding (non-commercial use only) +# Latent rate: 44100 / 2048 = 21.53Hz +AVAETokenizerConfig: LazyDict = L(AVAEInterface)( + bucket_name=PLACEHOLDER, + object_store_credential_path_pretrained=PLACEHOLDER, + avae_path=PRETRAINED_TOKENIZER_AVAE_PTH, + sample_rate=44100, + audio_channels=2, + io_channels=64, + hop_size=2048, + normalization_type="tanh", + tanh_input_scale=1.0, + tanh_output_scale=3.0, +) + + +# 44.1kHz Non-causal (PRIMARY - used for V2A/T2A training) +# Latent rate: 44100 / 2048 = 21.53Hz +AVAE_44k_NoncausalConfig: LazyDict = L(AVAEInterface)( + bucket_name=PLACEHOLDER, + object_store_credential_path_pretrained=PLACEHOLDER, + avae_path=PRETRAINED_TOKENIZER_AVAE_44K_NONCAUSAL, + sample_rate=44100, + audio_channels=2, + io_channels=64, + hop_size=2048, + normalize_latents=True, + tanh_input_scale=1.5, + tanh_output_scale=3.5, +) + +# 48kHz 25Hz (higher quality audio) +# Latent rate: 48000 / 1920 = 25Hz +AVAE_48k_25hzConfig: LazyDict = L(AVAEInterface)( + bucket_name=PLACEHOLDER, + object_store_credential_path_pretrained=PLACEHOLDER, + avae_path=PRETRAINED_TOKENIZER_AVAE_48K_25HZ, + sample_rate=48000, + audio_channels=2, + io_channels=64, + hop_size=1920, + normalize_latents=True, + tanh_input_scale=1.5, + tanh_output_scale=3.5, +) + + +def register_tokenizer(): + cs = ConfigStore.instance() + + # Wan2pt1 and Wan2pt2 
tokenizers + cs.store(group="tokenizer", package="model.config.tokenizer", name="wan2pt1_tokenizer", node=Wan2pt1VAEConfig) + cs.store(group="tokenizer", package="model.config.tokenizer", name="wan2pt2_tokenizer", node=Wan2pt2VAEConfig) + # UniAE tokenizer + cs.store( + group="tokenizer", + package="model.config.tokenizer", + name="uniae_4x16x16_c48_t8to24_64to512p_fps_all_encoder_noncausal_decoder_noncausal_nogan_best_s1_tokenizer", + node=UniAE4x16x16C48T8to24_64to512pFpsAllEncoderNoncausalDecoderNoncausalNoganBestS1Config, + ) + # Flux tokenizer + cs.store(group="tokenizer", package="model.config.tokenizer", name="flux_tokenizer", node=FluxVAEConfig) + # DC AE 4x32x32 tokenizer + cs.store( + group="tokenizer", + package="model.config.tokenizer", + name="dc_ae_4x32x32_tokenizer", + node=DCAE4x32x32Config, + ) + cs.store( + group="tokenizer", + package="model.config.tokenizer", + name="dc_ae_4x32x32_c64_t120_256p_fps_all_encoder_causal_decoder_chunk_causal_4_nogan_cosmos_pad_7_v0.2_tokenizer", + node=DCAE4x32x32C64T120_256pFpsAllEncoderCausalDecoderChunkCausal4NoganCosmosPad7V0pt2Config, + ) + + +def register_sound_tokenizer(): + """Register sound tokenizers in Hydra ConfigStore under model.config.sound_tokenizer.""" + cs = ConfigStore.instance() + cs.store( + group="sound_tokenizer", package="model.config.sound_tokenizer", name="avae_48k_25hz", node=AVAE_48k_25hzConfig + ) + cs.store( + group="sound_tokenizer", + package="model.config.sound_tokenizer", + name="avae_44k_noncausal", + node=AVAE_44k_NoncausalConfig, + ) + cs.store( + group="sound_tokenizer", package="model.config.sound_tokenizer", name="avae_tokenizer", node=AVAETokenizerConfig + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/unittest.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/unittest.py new file mode 100644 index 00000000..ed2b5333 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/unittest.py @@ -0,0 +1,43 @@ +# 
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import attrs + +# from cosmos3._src.vfm.configs.base.defaults.cluster import GCPIADGB200Config + +# We are hardcoding the unittest assets in this file. + +# CLUSTER_CONFIG = GCPIADGB200Config + +# add codeowner for cosmos3/_src/vfm/tokenizers + + +@attrs.define(slots=False) +class SwfitStackPDXrConfig: + """ + Config for the cluster specific information. + Everything cluster specific should be here. + """ + + object_store_bucket_data: str + object_store_credential_data: str + + +UNITTEST_CONFIG = SwfitStackPDXrConfig( + object_store_bucket_data="unittest", + object_store_credential_data="credentials/pdx_dir.secret", +) + +TOKENIZER_RECONSTRUCTION_VIDEO_PATH = "tokenizer/video/panda70m_test_0000039_00000.mp4" +AVAE_RECONSTRUCTION_AUDIO_PATH = "tokenizer/audio/test_audio.wav" diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/vlm.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/vlm.py new file mode 100644 index 00000000..a86c1730 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/defaults/vlm.py @@ -0,0 +1,982 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Configs for VLM / LLM models + +import os + +import attrs +import torch.distributed as dist + +from cosmos3._src.imaginaire.flags import INTERNAL +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.imaginaire.lazy_config import LazyDict +from cosmos3._src.imaginaire.lazy_config import instantiate as lazy_instantiate +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.config_helper import ConfigStore +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.vfm.models.mot.unified_mot import ( + Nemotron3DenseVLTextConfig, + Nemotron3DenseVLTextForCausalLM, + Qwen3VLMoeTextConfig, + Qwen3VLMoeTextForCausalLM, + Qwen3VLTextConfig, + Qwen3VLTextForCausalLM, +) +from cosmos3._src.vfm.processors import LLMTokenizerProcessor, build_processor_lazy +from cosmos3._src.vfm.tokenizers.tokenization_qwen2 import Qwen2Tokenizer + + +def create_vlm_config(base_config: LazyDict, **overrides): + vlm_config = lazy_instantiate(base_config) + for key, value in overrides.items(): + setattr(vlm_config, key, value) + return vlm_config + + +def get_rank_safe() -> int: + if dist.is_available() and dist.is_initialized(): + return dist.get_rank() + return 0 # default to rank 0 when not in distributed mode + + +################################################################################ +# Download tokenizer files from s3 
+# Download to ~/.cache/imaginaire4/tokenizer_files/{model_name} and then load from there. +def download_tokenizer_files(model_name: str, config_variant: str) -> str: + if config_variant == "hf": + return model_name + + if config_variant == "s3": + ckpt_bucket = "bucket" + credentials = "credentials/s3_checkpoint.secret" + elif config_variant == "gcp": + ckpt_bucket = "bucket" + credentials = "credentials/gcp_checkpoint.secret" + else: + raise ValueError(f"Invalid config variant: {config_variant}") + + model_path = f"s3://{ckpt_bucket}/cosmos3/pretrained/huggingface/{model_name}" + if not INTERNAL: + from cosmos3._src.imaginaire.utils.checkpoint_db import download_checkpoint_v2 + + model_path = download_checkpoint_v2(model_path) + if "://" not in model_path: + return model_path + + imaginaire_cache_dir = os.environ.get("IMAGINAIRE_CACHE_DIR", os.path.expanduser("~/.cache/imaginaire4")) + destination_dir = os.path.join(imaginaire_cache_dir, f"tokenizer_files/{model_name}/rank_{get_rank_safe()}") + s3_backend_args = { + "backend": "s3", + "s3_credential_path": credentials, + } + + extensions = ["json", "txt", "jinja"] + for extension in extensions: + for file_path in easy_io.list_dir_or_file( + model_path, + list_dir=False, + list_file=True, + suffix=extension, + recursive=False, + backend_args=s3_backend_args, + ): + full_path = easy_io.join_path(model_path, file_path, backend_args=s3_backend_args) + local_path = f"{destination_dir}/{file_path}" + if os.path.exists(local_path): + log.debug(f"Skipping already downloaded tokenizer file: {local_path}") + continue + log.info(f"Downloading tokenizer file: {full_path} to {local_path}, cwd: {os.getcwd()}") + # Download the file + file_data = easy_io.get(full_path, backend_args=s3_backend_args) + easy_io.put(file_data, local_path) + return destination_dir + + +def create_qwen2_tokenizer_with_download(pretrained_model_name: str, config_variant: str): + destination_dir = download_tokenizer_files(pretrained_model_name, 
config_variant) + return LLMTokenizerProcessor(Qwen2Tokenizer.from_pretrained(destination_dir)) + + +@attrs.define(slots=False) +class VLMConfig: + # Name of the huggingface model + model_name: str = "" + + # Language model class to instantiate + model_instance: LazyDict | None = None + + # Tokenizer / processor used by data augmentors. Always a BaseVLMProcessor + # subclass: VLM configs use `build_processor`, LLM-only configs wrap the raw + # tokenizer in `LLMTokenizerProcessor`. Either way, `_proc.tokenizer` exposes + # the underlying HuggingFace tokenizer. + tokenizer: LazyDict | None = None + + # Path to the checkpoint + checkpoint_path: str = "" + + # Path to the credential file + credential_path: str = "" + + # Whether to enable GCS patch in boto3 for DCP loading from GCS + enable_gcs_patch_in_boto3: bool = False + + # Whether to load the pretrained LLM / VLM + load_pretrained: bool = True + + # Layer module to use. We override the decoder layer in huggingface model with this class. + # This is needed as we need to initialize MoT layers. + layer_module: str = "Qwen2MoTDecoderLayer" + + # Whether to use QK normalization for text expert + qk_norm_for_text: bool = False + + # Whether to use QK normalization for diffusion expert + qk_norm_for_diffusion: bool = True + + # If True, use the same word embedding matrix for the input and output embedding layers. + tie_word_embeddings: bool = False + + # Whether to prepend a system prompt during text tokenization. + # Checkpoints trained with system prompt enabled require this to be True at inference time. + use_system_prompt: bool = False + + # If set, forces safetensors weight remapping ("qwen3" vs "nemotron_3_dense_vl"/"nemotron_3_llm"). None = auto-detect. 
+ vlm_checkpoint_format: str | None = None + + +# Configs for LLM models +Qwen3MoT_LLM_0p6b_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-0.6B", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/llm/qwen3/configs/Qwen3-0.6B.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-0.6B", + config_variant="hf", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-0.6B/", + credential_path="credentials/s3_training.secret", + load_pretrained=True, +) + +Qwen3MoT_LLM_0p6b_GCP_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-0.6B", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/llm/qwen3/configs/Qwen3-0.6B.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-0.6B", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-0.6B/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, +) + +Nemotron3_LLM_2b_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/NVIDIA-Nemotron-3-2B-BF16", + model_instance=L(Nemotron3DenseVLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Nemotron3DenseVLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/configs/Nemotron-2B-Dense-VL.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=False, + qk_norm_for_diffusion=True, + tie_word_embeddings=False, + freeze_und=False, + ), + ), + 
tokenizer=L(build_processor_lazy)( + tokenizer_type="Nemotron/NVIDIA-Nemotron-3-2B-BF16", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Nemotron/NVIDIA-Nemotron-3-2B-BF16/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, + vlm_checkpoint_format="nemotron_3_llm", +) + +# Configs for VL instruct models + +# Config for Qwen3VL 30B A3B Instruct model +# Qwen3VLMoE uses Qwen2Tokenizer +Qwen3VLMoT_VLM_30b_a3b_Instruct_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-30B-A3B-Instruct", + model_instance=L(Qwen3VLMoeTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLMoeTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-30B-A3B-Instruct.json" + ), + layer_module="Qwen3VLMoeTextMoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-30B-A3B-Instruct", + config_variant="s3", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-30B-A3B-Instruct/", + credential_path="credentials/s3_training.secret", + load_pretrained=True, +) + + +Qwen3VLMoT_VLM_30b_a3b_Instruct_GCP_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-30B-A3B-Instruct", + model_instance=L(Qwen3VLMoeTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLMoeTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-30B-A3B-Instruct.json" + ), + layer_module="Qwen3VLMoeTextMoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-30B-A3B-Instruct", + config_variant="gcp", + ), + 
checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-30B-A3B-Instruct/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +CosmosReason2_VLM_30b_a3b_Private_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos-Reason2-30B-A3B-Private", + model_instance=L(Qwen3VLMoeTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLMoeTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-30B-A3B-Instruct.json" + ), + layer_module="Qwen3VLMoeTextMoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-30B-A3B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos-Reason2-30B-A3B-Private/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +# Config for Qwen3VL 235B A22B Instruct model +# Qwen3VLMoE uses Qwen2Tokenizer +Qwen3VLMoT_VLM_235b_a22b_Instruct_GCP_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-235B-A22B-Instruct", + model_instance=L(Qwen3VLMoeTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLMoeTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-235B-A22B-Instruct.json" + ), + layer_module="Qwen3VLMoeTextMoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-235B-A22B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-235B-A22B-Instruct/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) 
+ + +# Config for Qwen3VL 2B Instruct model +# Qwen3VL uses Qwen2Tokenizer +Qwen3VLMoT_VLM_2b_Instruct_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-2B-Instruct", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-2B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-2B-Instruct", + config_variant="s3", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-2B-Instruct/", + credential_path="credentials/s3_training.secret", + load_pretrained=True, +) + +Qwen3VLMoT_VLM_2b_Instruct_GCP_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-2B-Instruct", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-2B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-2B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-2B-Instruct/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +Qwen3VLMoT_VLM_2b_Instruct_HF_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-2B-Instruct", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-2B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + 
qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-2B-Instruct", + config_variant="hf", + ), + load_pretrained=True, +) + +Nemotron3DenseVL_VLM_2b_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Nemotron-3-Dense-VL-2B-BF16-Alignment", + model_instance=L(Nemotron3DenseVLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Nemotron3DenseVLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/configs/Nemotron-2B-Dense-VL.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=False, + qk_norm_for_diffusion=True, + tie_word_embeddings=False, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Nemotron/NVIDIA-Nemotron-3-Dense-VL-2B-BF16-Alignment", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Nemotron/NVIDIA-Nemotron-3-Dense-VL-2B-BF16-Alignment/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, + vlm_checkpoint_format="nemotron_3_dense_vl", +) + +Cosmos3Reasoner_Nemotron_VLM_2b_Private_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos3-Reasoner-2B-Private", + model_instance=L(Nemotron3DenseVLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Nemotron3DenseVLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/configs/Nemotron-2B-Dense-VL.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=False, + qk_norm_for_diffusion=True, + tie_word_embeddings=False, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Nemotron/NVIDIA-Nemotron-3-Dense-VL-2B-BF16-Alignment", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/nvidia/Cosmos3-Reasoner-2B-Private/", + credential_path="credentials/gcp_checkpoint.secret", + 
load_pretrained=True, + enable_gcs_patch_in_boto3=True, + vlm_checkpoint_format="nemotron_3_dense_vl", +) + +CosmosReason2_VLM_2b_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos-Reason2-2B", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-2B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-2B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos-Reason2-2B/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +CosmosReason2_VLM_2b_Private_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos-Reason2-2B-Private", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-2B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-2B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos-Reason2-2B-Private/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +Cosmos3Reasoner_VLM_2b_Private_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos3-Reasoner-2B-Private", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + 
json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-2B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-2B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos3-Reasoner-2B-Private/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +# Config for Qwen3VL 4B Instruct model +# Qwen3VL uses Qwen2Tokenizer +Qwen3VLMoT_VLM_4b_Instruct_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-4B-Instruct", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-4B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-4B-Instruct", + config_variant="s3", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-4B-Instruct/", + credential_path="credentials/s3_training.secret", + load_pretrained=True, +) + +Qwen3VLMoT_VLM_4b_Instruct_GCP_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-4B-Instruct", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-4B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-4B-Instruct", + config_variant="gcp", + ), + 
checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-4B-Instruct/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +# Config for Qwen3VL 8B Instruct model +# Qwen3VL uses Qwen2Tokenizer +Qwen3VLMoT_VLM_8b_Instruct_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-8B-Instruct", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-8B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-8B-Instruct", + config_variant="s3", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-8B-Instruct/", + credential_path="credentials/s3_training.secret", + load_pretrained=True, +) + +Qwen3VLMoT_VLM_8b_Instruct_GCP_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-8B-Instruct", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-8B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-8B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-8B-Instruct/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +CosmosReason2_VLM_8b_Private_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos-Reason2-8B-Private", + model_instance=L(Qwen3VLTextForCausalLM)( + 
config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-8B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-8B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos-Reason2-8B-Private/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +Cosmos3Reasoner_VLM_8b_Private_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos3-Reasoner-8B-Private", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-8B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-8B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos3-Reasoner-8B-Private/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +Cosmos3NanoReasoner_VLM_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos3-Nano-Reasoner", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-8B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(create_qwen2_tokenizer_with_download)( + 
pretrained_model_name="Qwen/Qwen3-VL-8B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos3-Nano-Reasoner/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +# Config for Qwen3VL 32B Instruct model +# Qwen3VL uses Qwen2Tokenizer +Qwen3VLMoT_VLM_32b_Instruct_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-32B-Instruct", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-32B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-32B-Instruct", + config_variant="s3", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-32B-Instruct/", + credential_path="credentials/s3_training.secret", + load_pretrained=True, +) + +Qwen3VLMoT_VLM_32b_Instruct_GCP_Config: VLMConfig = VLMConfig( + model_name="Qwen/Qwen3-VL-32B-Instruct", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-32B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-32B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Qwen/Qwen3-VL-32B-Instruct/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +CosmosReason2_VLM_32b_Private_GCP_Config: VLMConfig = VLMConfig( + 
model_name="nvidia/Cosmos-Reason2-32B-Private", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-32B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-32B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos-Reason2-32B-Private/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +Cosmos3Reasoner_VLM_32b_Private_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos3-Reasoner-32B-Private", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-32B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(build_processor_lazy)( + tokenizer_type="Qwen/Qwen3-VL-32B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos3-Reasoner-32B-Private/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + +Cosmos3SuperReasoner_VLM_GCP_Config: VLMConfig = VLMConfig( + model_name="nvidia/Cosmos3-Super-Reasoner", + model_instance=L(Qwen3VLTextForCausalLM)( + config=L(create_vlm_config)( + base_config=L(Qwen3VLTextConfig.from_json_file)( + json_file="cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-32B-Instruct.json" + ), + layer_module="MoTDecoderLayer", + qk_norm_for_text=True, + qk_norm_for_diffusion=True, + 
tie_word_embeddings=True, + freeze_und=False, + ), + ), + tokenizer=L(create_qwen2_tokenizer_with_download)( + pretrained_model_name="Qwen/Qwen3-VL-32B-Instruct", + config_variant="gcp", + ), + checkpoint_path="s3://bucket/cosmos3/pretrained/huggingface/Cosmos-Reason/Cosmos3-Super-Reasoner/", + credential_path="credentials/gcp_checkpoint.secret", + load_pretrained=True, + enable_gcs_patch_in_boto3=True, +) + + +def register_vlm(): + cs = ConfigStore.instance() + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_mot_0p6b", + node=Qwen3MoT_LLM_0p6b_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_mot_0p6b_gcp", + node=Qwen3MoT_LLM_0p6b_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="nemotron_3_llm_2b_gcp", + node=Nemotron3_LLM_2b_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_30b_a3b_instruct", + node=Qwen3VLMoT_VLM_30b_a3b_Instruct_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_30b_a3b_instruct_gcp", + node=Qwen3VLMoT_VLM_30b_a3b_Instruct_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_235b_a22b_instruct_gcp", + node=Qwen3VLMoT_VLM_235b_a22b_Instruct_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_2b_instruct", + node=Qwen3VLMoT_VLM_2b_Instruct_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_2b_instruct_gcp", + node=Qwen3VLMoT_VLM_2b_Instruct_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_2b_instruct_hf", + node=Qwen3VLMoT_VLM_2b_Instruct_HF_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="nemotron_3_dense_vl_2b_gcp", + 
node=Nemotron3DenseVL_VLM_2b_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos3_reasoner_nemotron_vlm_2b_private_gcp", + node=Cosmos3Reasoner_Nemotron_VLM_2b_Private_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos_reason2_vlm_2b_gcp", + node=CosmosReason2_VLM_2b_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos_reason2_vlm_2b_private_gcp", + node=CosmosReason2_VLM_2b_Private_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos3_reasoner_vlm_2b_private_gcp", + node=Cosmos3Reasoner_VLM_2b_Private_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos_reason2_vlm_8b_private_gcp", + node=CosmosReason2_VLM_8b_Private_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos3_reasoner_vlm_8b_private_gcp", + node=Cosmos3Reasoner_VLM_8b_Private_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos3_nano_reasoner_vlm_gcp", + node=Cosmos3NanoReasoner_VLM_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos_reason2_vlm_32b_private_gcp", + node=CosmosReason2_VLM_32b_Private_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos3_reasoner_vlm_32b_private_gcp", + node=Cosmos3Reasoner_VLM_32b_Private_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos3_super_reasoner_vlm_gcp", + node=Cosmos3SuperReasoner_VLM_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="cosmos_reason2_vlm_30b_a3b_private_gcp", + node=CosmosReason2_VLM_30b_a3b_Private_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + 
name="qwen3_vl_mot_vlm_4b_instruct", + node=Qwen3VLMoT_VLM_4b_Instruct_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_4b_instruct_gcp", + node=Qwen3VLMoT_VLM_4b_Instruct_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_8b_instruct", + node=Qwen3VLMoT_VLM_8b_Instruct_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_8b_instruct_gcp", + node=Qwen3VLMoT_VLM_8b_Instruct_GCP_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_32b_instruct", + node=Qwen3VLMoT_VLM_32b_Instruct_Config, + ) + cs.store( + group="vlm_config", + package="model.config.vlm_config", + name="qwen3_vl_mot_vlm_32b_instruct_gcp", + node=Qwen3VLMoT_VLM_32b_Instruct_GCP_Config, + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/__init__.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/config.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/config.py new file mode 100644 index 00000000..2aa451f9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/config.py @@ -0,0 +1,73 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from cosmos3._src.imaginaire.trainer import ImaginaireTrainer +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.config_helper import import_all_modules_from_package +from cosmos3._src.vfm.configs.base.vlm.defaults.callbacks import register_callbacks +from cosmos3._src.vfm.configs.base.vlm.defaults.checkpointer import register_checkpoint, register_ckpt_type +from cosmos3._src.vfm.configs.base.vlm.defaults.config import Config +from cosmos3._src.vfm.configs.base.vlm.defaults.dataloader import register_data_debug +from cosmos3._src.vfm.configs.base.vlm.defaults.dataloader_weighted_url import ( + register_data_recipe, + register_data_weighted_url, + register_data_weighted_url_with_text, +) +from cosmos3._src.vfm.configs.base.vlm.defaults.model import register_model +from cosmos3._src.vfm.configs.base.vlm.defaults.optimizer import register_optimizer, register_scheduler +from cosmos3._src.vfm.configs.base.vlm.defaults.vlm_policy import register_vlm_policy + + +def make_config() 
-> Config: + c = Config( + model=None, + optimizer=None, + scheduler=None, + dataloader_train=None, + dataloader_val=None, + ) + + # Specifying values through instances of attrs + c.job.project = "cosmos_reason2" + c.job.group = "debug" + c.job.name = "delete_${now:%Y-%m-%d}_${now:%H-%M-%S}" + + # Unified path: ImaginaireTrainer drives both VLM and VFM. + c.trainer.type = ImaginaireTrainer + c.trainer.straggler_detection.enabled = False + c.trainer.max_iter = 400_000 + c.trainer.logging_iter = 20 + c.trainer.validation_iter = 100 + c.trainer.run_validation = False + c.trainer.callbacks = None + c.trainer.cudnn.benchmark = False + c.upload_reproducible_setup = True + + # Call these functions to register config groups for advanced overriding. The order follows the default config groups. + register_model() + register_vlm_policy() + # Register dataloader configs + register_data_weighted_url() + register_data_recipe() + register_data_weighted_url_with_text() + register_data_debug() + log.info("Registering optimizer, scheduler, checkpoint, ckpt type, and callbacks") + register_optimizer() + register_scheduler() + register_checkpoint() + register_ckpt_type() + register_callbacks() + import_all_modules_from_package("cosmos3._src.vfm.configs.base.vlm.experiment", reload=True) + return c diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/__init__.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/callbacks.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/callbacks.py new file mode 100644 index 00000000..efef06c0 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/callbacks.py @@ -0,0 +1,127 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Callback config options. 
+Based on projects/cosmos/ar/v1/configs/registry.py +""" + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.callbacks.manual_gc import ManualGarbageCollection +from cosmos3._src.imaginaire.lazy_config import PLACEHOLDER +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.imaginaire.utils.callback import WandBCallback +from cosmos3._src.vfm.callbacks.dataloader_state import DataLoaderStateCallback +from cosmos3._src.vfm.callbacks.dataloading_monitor import DetailedDataLoadingSpeedMonitor +from cosmos3._src.vfm.callbacks.grad_clip import GradClip +from cosmos3._src.vfm.callbacks.hf_export import HFExportCallback +from cosmos3._src.vfm.callbacks.learning_rate_logger import LearningRateLogger +from cosmos3._src.vfm.callbacks.low_precision import LowPrecisionCallback +from cosmos3._src.vfm.configs.base.defaults.callbacks import JOB_MONITOR_CALLBACKS +from projects.cosmos3.vlm.callbacks.data_stats import DataStatsCallback +from projects.cosmos3.vlm.callbacks.iter_speed import IterSpeed +from projects.cosmos3.vlm.callbacks.log_tensor_shape import LogTensorShapeCallback +from projects.cosmos3.vlm.callbacks.param_count import ParamCount +from projects.cosmos3.vlm.callbacks.wandb_log import WandbCallback as WandBCallbackMultiplier +from projects.cosmos3.vlm.callbacks.wandb_log_eval import WandbCallback as WandBCallbackEval +from projects.cosmos3.vlm.callbacks.wandb_log_simgple import WandbCallback as WandBCallbackMultiplierSimple +from projects.cosmos3.vlm.callbacks.wandb_vis import VisualizationLoggingCallback + +# from cosmos3._src.imaginaire.utils.callback import NVTXCallback + + +def register_callbacks(): + cs = ConfigStore.instance() + BASIC_CALLBACKS = dict( + iter_speed=L(IterSpeed)( # does not use model or optimizer + every_n="${trainer.logging_iter}", + save_s3="${upload_reproducible_setup}", + save_s3_every_log_n=500, + hit_thres=50, + ), + manual_gc=L(ManualGarbageCollection)(every_n=5), # does not use 
model or optimizer + wandb=L(WandBCallback)(), + param_count=L(ParamCount)( # use model + save_s3="${upload_reproducible_setup}", + ), + dataloader_speed=L(DetailedDataLoadingSpeedMonitor)( + every_n=100, + save_s3="${upload_reproducible_setup}", + ), + grad_clip=L(GradClip)(clip_norm=1.0, force_finite=False), # use model + learning_rate_logger=L(LearningRateLogger)(every_n=10), + low_precision=L(LowPrecisionCallback)( + update_iter=1, + config=PLACEHOLDER, + trainer=PLACEHOLDER, + ), # reads model.precision; no extra kwarg needed + + # nvtx=L(NVTXCallback)(synchronize=True), + ) + + PER_DATASET_PERN_CALLBACKS = dict( + wandb_10x=L(WandBCallbackMultiplier)( + logging_iter_multipler=10, + save_logging_iter_multipler=1, + save_s3="${upload_reproducible_setup}", + ), + wandb_2x=L(WandBCallbackMultiplier)( + logging_iter_multipler=2, + save_logging_iter_multipler=1, + save_s3="${upload_reproducible_setup}", + ), + data_stats=L(DataStatsCallback)( + logging_iter_multipler=1, + save_s3="${upload_reproducible_setup}", + ), + wandb_val=L(WandBCallbackEval)( + save_s3="${upload_reproducible_setup}", + ), + ) + + SIMPLE_LOG_CALLBACKS = dict( + wandb_10x=L(WandBCallbackMultiplierSimple)( + logging_iter_multipler=10, + save_logging_iter_multipler=1, + save_s3="${upload_reproducible_setup}", + ), + wandb_2x=L(WandBCallbackMultiplierSimple)( + logging_iter_multipler=2, + save_logging_iter_multipler=1, + save_s3="${upload_reproducible_setup}", + ), + log_tensor_shape=L(LogTensorShapeCallback)(num_log=10), + dataloader_state=L(DataLoaderStateCallback)( + distributor_type="${data_setting.distributor_type}", + ), + ) + cs.store(group="callbacks", package="trainer.callbacks", name="basic_vlm", node=BASIC_CALLBACKS) + cs.store(group="callbacks", package="trainer.callbacks", name="per_dataset", node=PER_DATASET_PERN_CALLBACKS) + cs.store(group="callbacks", package="trainer.callbacks", name="simple_log", node=SIMPLE_LOG_CALLBACKS) + cs.store(group="callbacks", 
package="trainer.callbacks", name="job_monitor", node=JOB_MONITOR_CALLBACKS) + + DATA_VIS_CALLBACKS_QWEN = dict( + wandb_vis=L(VisualizationLoggingCallback)( + every_n=500, + ), + ) + cs.store(group="callbacks", package="trainer.callbacks", name="data_vis_qwen", node=DATA_VIS_CALLBACKS_QWEN) + + HF_EXPORT_CALLBACKS = dict( + hf_export=L(HFExportCallback)( + dtype="${model.config.policy.parallelism.precision}", + ), + ) + cs.store(group="callbacks", package="trainer.callbacks", name="hf_export", node=HF_EXPORT_CALLBACKS) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/checkpointer.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/checkpointer.py new file mode 100644 index 00000000..a1ed350c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/checkpointer.py @@ -0,0 +1,105 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Dict + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire import config +from cosmos3._src.imaginaire.checkpointer.dummy import Checkpointer as DummyCheckpointer +from cosmos3._src.imaginaire.config import CheckpointConfig +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.vfm.checkpointer.dcp import DistributedCheckpointer + +local_object_store = config.ObjectStoreConfig( + enabled=False, +) + +pdx_object_store = config.ObjectStoreConfig( + enabled=True, + credentials="credentials/pdx_vfm_checkpoint.secret", + bucket="checkpoints", +) + +s3_object_store = config.ObjectStoreConfig( + enabled=True, + credentials="credentials/s3_training.secret", + bucket="bucket", +) + +s3_eu_object_store = config.ObjectStoreConfig( + enabled=True, + credentials="credentials/s3_training_eu.secret", + bucket="bucket", +) + +gcp_object_store = config.ObjectStoreConfig( + enabled=True, + credentials="credentials/gcp_checkpoint.secret", + bucket="bucket", +) + +CHECKPOINT_LOCAL = CheckpointConfig( + save_to_object_store=local_object_store, + load_from_object_store=local_object_store, + save_iter=5000, + broadcast_via_filesystem=True, + dcp_async_mode_enabled=True, +) + +CHECKPOINT_PDX = CheckpointConfig( + save_to_object_store=pdx_object_store, + load_from_object_store=pdx_object_store, + save_iter=5000, + broadcast_via_filesystem=True, + dcp_async_mode_enabled=True, +) + +CHECKPOINT_S3 = CheckpointConfig( + save_to_object_store=s3_object_store, + load_from_object_store=s3_object_store, + save_iter=5000, + broadcast_via_filesystem=True, + dcp_async_mode_enabled=True, +) + +CHECKPOINT_GCP = CheckpointConfig( + save_to_object_store=gcp_object_store, + load_from_object_store=gcp_object_store, + save_iter=1000, + load_path="", + load_training_state=False, + strict_resume=True, + enable_gcs_patch_in_boto3=True, + dcp_async_mode_enabled=True, +) + + +def register_checkpoint() -> None: + cs = 
ConfigStore.instance() + cs.store(group="checkpoint", package="checkpoint", name="local", node=CHECKPOINT_LOCAL) + cs.store(group="checkpoint", package="checkpoint", name="pdx", node=CHECKPOINT_PDX) + cs.store(group="checkpoint", package="checkpoint", name="s3", node=CHECKPOINT_S3) + cs.store(group="checkpoint", package="checkpoint", name="gcp", node=CHECKPOINT_GCP) + + +DUMMY_CHECKPOINTER: Dict[str, str] = L(DummyCheckpointer)() +DISTRIBUTED_CHECKPOINTER: Dict[str, str] = L(DistributedCheckpointer)() + + +def register_ckpt_type() -> None: + cs = ConfigStore.instance() + cs.store(group="ckpt_type", package="checkpoint.type", name="dummy", node=DUMMY_CHECKPOINTER) + cs.store(group="ckpt_type", package="checkpoint.type", name="dcp", node=DISTRIBUTED_CHECKPOINTER) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/config.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/config.py new file mode 100644 index 00000000..26b5a8b1 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/config.py @@ -0,0 +1,82 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Any, List + +import attrs + +from cosmos3._src.imaginaire import config +from cosmos3._src.vfm.configs.base.vlm.defaults.training import PolicyConfig, TrainConfig + + +@attrs.define(slots=False) +class DataSetting: + """Configuration for data. + + Attributes: + qwen_max_video_token_length: Maximum video token length. + qwen_target_fps: Target fps for video sampling. + text_chat_order: Order of text items in user messages. + distributor_type: "with_replace" (WeightedShardlistBasic) or "no_replace" (NoReplaceShardlistBasic). + distributor_seed: Seed for the distributor. + """ + + qwen_max_video_token_length: int = 8192 + qwen_max_image_token_length: int = 8192 + qwen_target_fps: float = 4.0 + text_chat_order: str = attrs.field( + default="text_end", + validator=attrs.validators.in_({"text_end", "text_start", "random"}), + ) + temporal_localization_output_format: str = attrs.field( + default="random", + validator=attrs.validators.in_({"dense_video_caption", "temporal_localization", "temporal_caption", "random"}), + ) + temporal_localization_fps: float = 1.0 + # For packed dataset + max_batch_size: int = 1 + max_tokens: int = 16000 + # "with_replace" (WeightedShardlistBasic) or "no_replace" (NoReplaceShardlistBasic). 
+ distributor_type: str = attrs.field( + default="with_replace", + validator=attrs.validators.in_({"with_replace", "no_replace"}), + ) + distributor_seed: int = 1993 + webdataset_detshuffle: bool = False + num_data_workers: int = 8 + data_prefetch_factor: int = 1 + val_split_ratio: float = 0.0 + + +@attrs.define(slots=False) +class Config(config.Config): + train: TrainConfig = TrainConfig() + policy: PolicyConfig = PolicyConfig() + data_setting: DataSetting = DataSetting() + defaults: List[Any] = attrs.field( + factory=lambda: [ + "_self_", + {"model": "vlm_fsdp"}, + {"vlm_policy": None}, + {"data_train": None}, + {"data_val": None}, + {"optimizer": "fusedadamw"}, + {"scheduler": "warmup_cosine_lr"}, + {"checkpoint": "s3"}, + {"ckpt_type": "dcp"}, + {"callbacks": ["basic_vlm"]}, + {"experiment": None}, + ] + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/dataloader.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/dataloader.py new file mode 100644 index 00000000..562d54a2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/dataloader.py @@ -0,0 +1,92 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from torch.utils.data import DataLoader + +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.imaginaire.utils.config_helper import ConfigStore +from cosmos3._src.vfm.datasets.vlm.collate_fn import custom_collate +from cosmos3._src.vfm.datasets.vlm.debug_data_qwen import DebugQwenDataset +from cosmos3._src.vfm.datasets.vlm.dummy_data_qwen import DummyQwenDataset +from cosmos3._src.vfm.processors import build_processor_lazy + + +# Debug dataset +def create_debug_dataloader_config_qwen( + num_images, loss_on_completion_only: bool = True, use_dummy_image: bool = False +): + return L(DataLoader)( + dataset=L(DebugQwenDataset)( + tokenizer=L(build_processor_lazy)( + tokenizer_type="${model.config.policy.model_name_or_path}", + credentials="${checkpoint.load_from_object_store.credentials}", + bucket="${checkpoint.load_from_object_store.bucket}", + ), + num_images=num_images, + seq_len="${model.config.policy.model_max_length}", + image_token_len="${model.config.policy.qwen_max_video_token_length}", + # use_dummy_image=use_dummy_image, + ), + num_workers=8, + prefetch_factor=4, + batch_size=1, + sampler=None, + persistent_workers=False, + pin_memory=True, + collate_fn=custom_collate, + ) + + +def create_dummy_dataloader_config_qwen(): + return L(DataLoader)( + dataset=L(DummyQwenDataset)( + tokenizer=L(build_processor_lazy)( + tokenizer_type="${model.config.policy.model_name_or_path}", + credentials="${checkpoint.load_from_object_store.credentials}", + bucket="${checkpoint.load_from_object_store.bucket}", + ), + num_visual_tokens="${model.config.policy.qwen_max_video_token_length}", + total_tokens="${model.config.policy.model_max_length}", + batch_size="${dataloader_train.batch_size}", + ), + num_workers=8, + prefetch_factor=4, + batch_size=1, + sampler=None, + persistent_workers=False, + pin_memory=True, + collate_fn=custom_collate, + ) + + +def register_data_debug(): + cs = ConfigStore.instance() + for split in ["train", "val"]: + 
cs.store( + group=f"data_{split}", + package=f"dataloader_{split}", + name="debug_image_data_qwen", # This data is from pixtral model output, expected to have low loss ~1.4 + node=create_debug_dataloader_config_qwen(1), + ) + cs.store( + group=f"data_{split}", + package=f"dataloader_{split}", + name="dummy_image_data_qwen", + node=create_dummy_dataloader_config_qwen(), + ) + + +def register_data(): + register_data_debug() diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/dataloader_weighted_url.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/dataloader_weighted_url.py new file mode 100644 index 00000000..6b2d6781 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/dataloader_weighted_url.py @@ -0,0 +1,572 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import importlib + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.vfm.datasets.augmentors.vlm.bytes_to_media import BytesToMedia +from cosmos3._src.vfm.datasets.augmentors.vlm.filter_output_key import FilterOutputKey +from cosmos3._src.vfm.datasets.augmentors.vlm.filter_seq_length import FilterSeqLength +from cosmos3._src.vfm.datasets.augmentors.vlm.floating_number_format import FloatingNumberFormat +from cosmos3._src.vfm.datasets.augmentors.vlm.format_describe_anything import FormatDescribeAnything +from cosmos3._src.vfm.datasets.augmentors.vlm.prompt_format import PromptFormat +from cosmos3._src.vfm.datasets.augmentors.vlm.shuffle_text_media_order import ShuffleTextMediaOrder +from cosmos3._src.vfm.datasets.augmentors.vlm.timestamp import TimeStamp +from cosmos3._src.vfm.datasets.augmentors.vlm.timestamp_with_subject_tracking import ( + TimeStampWithSubjectTracking, +) +from cosmos3._src.vfm.datasets.augmentors.vlm.timestamp_without_augment_message import ( + TimeStampWithoutAugmentMessage, +) +from cosmos3._src.vfm.datasets.augmentors.vlm.timestamp_without_end_time import TimeStampWithoutEndTime +from cosmos3._src.vfm.datasets.augmentors.vlm.tokenize_data import TokenizeData +from cosmos3._src.vfm.datasets.vlm.collate_fn import custom_collate +from cosmos3._src.vfm.datasets.vlm.dataset_provider_sft import get_vlm_dataset +from cosmos3._src.vfm.datasets.vlm.distributor_with_weight import ( + NoReplaceShardlistBasic, + WeightedShardlistBasic, +) +from cosmos3._src.vfm.datasets.vlm.joint_dataloader import IterativeJointDataLoader +from cosmos3._src.vfm.datasets.vlm.joint_dataset_dynamic_batch_webloader import ( + JointDatasetDynamicBatchingWebLoader, +) +from cosmos3._src.vfm.processors import build_processor_lazy + + +def create_distributor_config( + distributor_type: str, + data_weight_dict: dict, + url_to_category_fn, + shuffle: bool = True, + split_by_node: bool = False, + 
split_by_worker: bool = False, + resume_flag: bool = False, + verbose: bool = True, + is_infinite_loader: bool = True, + seed: int = 1993, + subsample_config: dict | None = None, + split: str = "train", +): + """ + Return a LazyCall to the distributor class based on distributor_type. + + Args: + distributor_type: "with_replace" -> WeightedShardlistBasic, "no_replace" -> NoReplaceShardlistBasic + data_weight_dict: category -> weight (or repetitions) mapping + url_to_category_fn: maps URL path to category key + split: "train" or "val" (used by NoReplaceShardlistBasic) + Other args: passed to the distributor constructor. + + Returns: + L(WeightedShardlistBasic)(...) or L(NoReplaceShardlistBasic)(...) — a LazyCall to the distributor class. + """ + common = dict( + data_weight_dict=data_weight_dict, + url_to_category_fn=url_to_category_fn, + shuffle=shuffle, + split_by_node=split_by_node, + split_by_worker=split_by_worker, + resume_flag=resume_flag, + verbose=verbose, + is_infinite_loader=is_infinite_loader, + subsample_config=subsample_config, + split=split, + ) + if distributor_type == "with_replace": + return L(WeightedShardlistBasic)(**common) + if distributor_type == "no_replace": + return L(NoReplaceShardlistBasic)(seed=seed, **common) + + raise ValueError(f"distributor_type must be in ['with_replace', 'no_replace'], got {distributor_type!r}") + + +def create_data_augmentor_config(): + return { + "bytes_to_media": L(BytesToMedia)( + input_key="media", + output_key="media", + min_fps_thres=2, + max_fps_thres=60, + target_fps="${data_setting.qwen_target_fps}", # type: ignore + max_video_token_length="${data_setting.qwen_max_video_token_length}", # type: ignore + processor=processor, + is_input_pickle_byptes=False, # If True, it means the input "media" is pickled bytes that needs to be unpickled first; if False, it means the input "media" is raw bytes that can be directly decoded to image/video. 
Set to False for most cases, and only set to True for some special datasets where media is stored as pickled bytes. + ), # takes "videos" and outputs "videos" + "prompt_format": L(PromptFormat)( # takes text_keys and outputs "conversation" + input_keys=["texts"], + text_chat_order="${data_setting.text_chat_order}", + ), + "shuffle_text_media_order": L(ShuffleTextMediaOrder)(), + # ============================ + # TL data augmentation + # ============================ + "timestamp": L(TimeStamp)( + input_key="media", + # output_format="${data_setting.temporal_localization_output_format}", + output_format="temporal_localization", # Only use temporal_localization tasks to keep the caption style of base model + urls_needs_timestamp=[ + "av_reasoning_localization_20250627", + "tl_activitynet_20250630", + "tl_agibot_fisheye_20250630", + "tl_2dvlm_20250627", + "tl_2dvlm_20251121", + "tl_youcook2_20250716", + "tl_yt_cctv_warehouse_20250724", + ], + processor=processor, + ), + "TL_recaption": L(TimeStamp)( + input_key="media", + # output_format="${data_setting.temporal_localization_output_format}", + output_format="caption", # Only use temporal_localization tasks to keep the caption style of base model + urls_needs_timestamp=[ + "tl_2dvlm_recaption_20251121", + "tl_2dvlm_recaption_20250627", + ], + processor=processor, + ), + # Special augmentors: + # timestamp_without_end_time: nexar data does not contain end time + # timestamp_with_subject_trackig: plm data has subject id + mask, and it's video data + # format_describe_anything: dam data has subject id + mask + category label, and it's image data (does not need timestamp) + # timestamp_without_augment_message: rft tl data requires timestamp augmentation on the video, but keeps the original text + "timestamp_without_end_time": L(TimeStampWithoutEndTime)( + input_key="media", + # output_format="${data_setting.temporal_localization_output_format}", + output_format="temporal_localization", # Only use temporal_localization tasks to keep
the caption style of base model + urls_needs_timestamp=[ + "tl_nexar_20250708", + "mimicgen_temporal_localization", + ], + processor=processor, + ), + "timestamp_with_subject_trackig": L(TimeStampWithSubjectTracking)( + input_key="media", + output_format="temporal_location_subject", # Only use temporal_localization tasks to keep the caption style of base model + urls_needs_timestamp=[ + "tl_plm_sav_20250714", + ], + processor=processor, + ), + "floating_number_format": L(FloatingNumberFormat)( + input_key="conversation", + decimal_places=2, + urls_needs_format=[ + "3d_grounding_av", + ], + ), + "format_describe_anything": L(FormatDescribeAnything)( + input_key="media", + urls_needs_timestamp=[ + "describe-anything-dataset", + ], + ), + "timestamp_without_augment_message": L(TimeStampWithoutAugmentMessage)( + input_key="media", + output_format="${data_setting.temporal_localization_output_format}", + urls_needs_timestamp=[ + "rl_distill_tl_0729", + ], + processor=processor, + ), + # ============================ + # End of TL data augmentation + # ============================ + "tokenize_data": L(TokenizeData)( + processor=processor, + max_video_token_length="${data_setting.qwen_max_video_token_length}", + max_image_token_length="${data_setting.qwen_max_image_token_length}", + add_system_prompt_if_missing=True, + text_only=False, + ), + "filter_output_keys": L(FilterOutputKey)( + text_only=False, + ), + "filter_seq_length": L(FilterSeqLength)( + max_token_length="${data_setting.max_tokens}", + processor=processor, + ), + } + + +processor = L(build_processor_lazy)( + tokenizer_type="${model.config.policy.model_name_or_path}", + credentials="${checkpoint.load_from_object_store.credentials}", + bucket="${checkpoint.load_from_object_store.bucket}", +) + + +def get_vlm_dataset_from_module( + data_module: str, + split: str = "train", + distributor_split: str = "train", + object_store: str = "s3", + augmentor_config: dict | None = None, + distributor_type: str = 
"with_replace", + distributor_seed: int = 1993, + buffer_size: int = 2, + detshuffle: bool = False, +): + """Resolve data module at instantiation time instead of config registration time. + + This defers importlib.import_module to when the config is actually used (training time), + avoiding the ~10+ minute startup penalty from eagerly importing all registered dataset + modules during Hydra config store population. + """ + data_weight_attr = data_module.split(".")[-1] + module_path = ".".join(data_module.split(".")[:-1]) + data_weight_module = importlib.import_module(module_path) + + full_datainfo = data_weight_module.DATAINFO + data_weight_dict = getattr(data_weight_module, data_weight_attr) + url_to_category = data_weight_module.url_to_category + subsample_config = getattr(data_weight_module, "subsample_config", None) + + distributor_config = create_distributor_config( + distributor_type=distributor_type, + data_weight_dict=data_weight_dict, + url_to_category_fn=url_to_category, + split=distributor_split, + seed=distributor_seed, + subsample_config=subsample_config, + ) + + return get_vlm_dataset( + full_datainfo=full_datainfo, + url_to_category_fn=url_to_category, + buffer_size=buffer_size, + object_store=object_store, + data_weight_dict=data_weight_dict, + split=split, + augmentor_config=augmentor_config, + distributor_config=distributor_config, + detshuffle=detshuffle, + ) + + +def create_dataloader_config( + data_module: str, + split: str = "train", + distributor_split: str = "train", + object_store: str = "s3", +): + """Create a lazy dataloader config that defers dataset module import to instantiation time. + + Args: + data_module: Full dotted path to the data weight dict, e.g. + "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_nanov2.data_weight.stage_1.data_weight_repeat". + The module should export DATAINFO, url_to_category, and the named weight dict. + split: Dataset split ("train" or "val"). 
+ distributor_split: Distributor split for train/val sharding. + object_store: Object store backend ("s3", "s3_vlmdb", "pdx", "neb_eu"). + + Returns: + L(get_vlm_dataset_from_module): a LazyCall that resolves the module at instantiation time. + """ + return L(get_vlm_dataset_from_module)( + data_module=data_module, + split=split, + distributor_split=distributor_split, + object_store=object_store, + augmentor_config=create_data_augmentor_config(), + distributor_type="${data_setting.distributor_type}", + distributor_seed="${data_setting.distributor_seed}", + detshuffle="${data_setting.webdataset_detshuffle}", + ) + + +def register_data_weighted_url(): + cs = ConfigStore.instance() + # This will register dataset: + # reason1_v01_understanding_only_pdx + # reason1_v01_understanding_only_s3 + # reason1_v01_understanding_only_neb_eu + # eagle_v01_sft_no_text_only_pdx + # eagle_v01_sft_no_text_only_s3 + # eagle_v01_sft_no_text_only_neb_eu + # eagle_v02_grounding_2d_pdx + # eagle_v02_grounding_2d_s3 + # eagle_v02_grounding_2d_neb_eu + # eagle_v03_grounding_2d_v1_2_pdx + # eagle_v03_grounding_2d_v1_2_s3 + # eagle_v03_grounding_2d_v1_2_neb_eu + # eagle_v04_sft_no_text_only_no_grounding_2d_pdx + # eagle_v04_sft_no_text_only_no_grounding_2d_s3 + # eagle_v04_sft_no_text_only_no_grounding_2d_neb_eu + # joint_v01_cr1_understanding_eagle_sft_pdx + # joint_v01_cr1_understanding_eagle_sft_s3 + # joint_v01_cr1_understanding_eagle_sft_neb_eu + for dataset_id, data_module in { + "01": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_reason1.data_weight.understanding_only.data_weight_default", # 01_reason1_understanding_only_default_s3 + "02": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.sft_full.data_weight_default", # 02_eagle_sft_full_default_s3 + "03": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.grounding_2d_v1_1.data_weight_default", # 03_eagle_grounding_2d_v1_1_default_s3 + "04": 
"cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.grounding_2d_v1_2.data_weight_default", # 04_eagle_grounding_2d_v1_2_default_s3 + "05": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.cr1_eagle_sft.data_weight_full_5v5", # 05_joint_cr1_eagle_sft_full_5v5_s3 + "06": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.cr1_eagle_sft.data_weight_2d_5v5", # 06_joint_cr1_eagle_sft_2d_5v5_s3 + "07": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.pretrain.data_weight_default", # 07_eagle_pretrain_default_s3 + "08": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.sft_full_mul_repeat.data_weight_default", # 08_eagle_sft_full_mul_repeat_default_s3 + "09": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.sft_full_mul_repeat.data_weight_debug", # 09_eagle_sft_full_mul_repeat_debug_s3 + "10": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.cr1_eagle_sft.data_weight_full_5v5_mul_repeat", # 10_joint_cr1_eagle_sft_full_5v5_mul_repeat_s3 + "11": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.sft_full_mul_repeat.data_weight_single", # 11_eagle_sft_full_mul_repeat_single_s3 + "12": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_reason1.data_weight.understanding_only.data_weight_default", # 12_reason1_understanding_only_default_s3 + "13": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_reason1.data_weight.reason1p0_1p1.data_weight_mix_5v5", # 13_reason1p0_1p1_mix_5v5_s3 + "14": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason1_2.data_weight_mix_5v5v5", # 14_joint_reason1_2_mix_5v5v5_s3 + "15": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_reason1.data_weight.reason1p0_1p1.data_weight_debug_tl", # 15_reason1_reason1p0_1p1_debug_tl_s3 + "16": 
"cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason1_2.data_weight_mix_all_zero", # 16_joint_reason1_2_mix_all_zero_s3 + "17": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.sft_full_mul_repeat.data_weight_only_vatex_subset", # 17_eagle_sft_full_mul_repeat_only_vatex_subset_s3 + "18": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason1_2.data_weight_understand", # 18_joint_reason1_2_data_weight_understand + "19": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_reason1.data_weight.reason1p0_1p1_721.data_weight_mix_5v5", # 19_reason1_reason1p0_1p1_721_mix_5v5_s3 + "20": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_reason1.data_weight.reason1p0_1p1_721.data_weight_debug_2dvlm", # 20_reason1_reason1p0_1p1_721_debug_2dvlm_s3 + "21": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_reason1.data_weight.reason1p0_1p1_721.data_weight_debug_av", # 21_reason1_reason1p0_1p1_721_debug_av_s3 + "22": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_reason1.data_weight.reason1p0_1p1_721.data_weight_no_rft", # 22_reason1_reason1p0_1p1_721_no_rft_s3 + "23": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_reason2.data_weight.reason2.data_weight_all_zero", # 23_reason2_reason2_all_zero_s3 + # 24: Reason 2 data count 5% of total + "24": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason1p0_1p1_2_721.data_weight_mix_475v475v005", # 24_joint_reason1p0_1p1_2_721_mix_475v475v005_s3 + "25": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.grounding_2d_v1_1.data_weight_no_robospatial", # 25_eagle_grounding_2d_v1_1_no_robospatial_s3 + # via exploration + "26": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_via.data_weight.default.data_weight_spatial_suc_only", # 26_via_default_spatial_suc_only_s3 + "27": 
"cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_via.data_weight.default.data_weight_spatial_suc_only_round4", # 27_via_default_spatial_suc_only_round4_s3 + "28": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_via.data_weight.default.data_weight_90_suc_only_round2", # + "29": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_via.data_weight.default.data_weight_all_zeros", # 29_via_default_all_zeros_s3 + # reason2 release + "54": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason2_release.data_weight_joint", # 54_joint_reason2_release_joint_s3 + "55": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason2_release.data_weight_with_recaption", # 55_joint_reason2_release_joint_with_recaption_s3 + "56": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason2_release.data_weight_with_recaption_wo_human", # 56_joint_reason2_release_joint_with_recaption_wo_human_s3 + "57": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason2p0_2p1.data_weight_joint", # 57_joint_reason2p0_2p1_joint_s3 + "101": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason2_release.data_weight_debug_recaption", # 101_joint_reason2_release_debug_recaption_s3 + # # taxonomy distillation + # "100": "cosmos3._src.vfm.datasets.data_sources.vlm_taxonomy_distill.data_weight.taxonomy_distill.data_weight_default", # 100_taxonomy_distill_taxonomy_distill_default_s3 + # # interleave document scoring distillation + # "102": "cosmos3._src.vfm.datasets.data_sources.vlm_interleave_scoring.data_weight.interleave_scoring.data_weight_default", # 102_interleave_scoring_interleave_scoring_default_s3 + # video taxonomy distillation + "103": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_video_taxonomy.data_weight.video_taxonomy.data_weight_default", # 103_video_taxonomy_video_taxonomy_default_s3 + # nanov2 pre/post-training + "200": 
"cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_nanov2.data_weight.stage_1_0218_34m_uniform_pretrain.data_weight_repeat", # 200_nanov2_stage_1_0218_34m_uniform_pretrain_repeat_s3_vlmdb + "201": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_nanov2.data_weight.stage_1_0218_34m_uniform_posttrain.data_weight_repeat", # 201_nanov2_stage_1_0218_34m_uniform_posttrain_repeat_s3_vlmdb + # Data ablation configs (below is a dummy example, do not uncomment) + "202": "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_nanov2.data_weight.new_category_data_mixture.data_weight_repeat", # 202_nanov2_new_category_data_mixture_repeat_s3_vlmdb + }.items(): + data_source_name = data_module.split("data_sources_")[-1].split(".")[0] + dataset_file_name = data_module.split(".")[-2] + data_weight_name = data_module.split("data_weight_")[-1] + for distributor_split in ["train", "val"]: + for object_store in ["pdx", "s3", "s3_vlmdb", "neb_eu"]: + dataset_name = f"{dataset_id}_{data_source_name}_{dataset_file_name}_{data_weight_name}_{object_store}" + cs.store( + group=f"data_{distributor_split}", + package=f"dataloader_{distributor_split}", + name=dataset_name, + node=L(JointDatasetDynamicBatchingWebLoader)( + datasets_cfg={ + "default": { + "dataset": create_dataloader_config( + data_module=data_module, + split="train", + distributor_split=distributor_split, + object_store=object_store, + ), + "ratio": 1, + } + }, + # Arguments for the joint dataset + pool_size=16, + max_batch_size="${data_setting.max_batch_size}", + max_tokens="${data_setting.max_tokens}", + model_name_or_path="${model.config.policy.model_name_or_path}", # "Qwen/Qwen3-VL-2B-Init" + long_threshold=6400, + length_key="input_ids", + batching_strategy="prefer_closest", + # Arguments for the webloader + batch_size=1, # This is not the real batch size, it wont be used + num_workers="${data_setting.num_data_workers}" if distributor_split == "train" else 0, + sampler=None, + 
prefetch_factor="${data_setting.data_prefetch_factor}" + if distributor_split == "train" + else None, + persistent_workers=False, + pin_memory=True, + collate_fn=custom_collate, + ), + ) + + +def register_data_weighted_url_with_text(): + cs = ConfigStore.instance() + + # This will register dataset: + + for dataset_id, data_modules in { + "m01": { + "with_visual": ( + 5, + "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.sft_full_mul_repeat.data_weight_default", + ), + "text_only": ( + 1, + "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.sft_full_mul_repeat.data_weight_text_only", + ), + }, # m01_visual_5_mix_text_1__eagle_sft_full_mul_repeat_default_s3 + "m02": { + "with_visual": ( + 5, + "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_joint.data_weight.reason2p0_2p1.data_weight_joint", + ), + "text_only": ( + 1, + "cosmos3._src.vfm.datasets.data_sources.vlm.data_sources_eagle.data_weight.sft_full_mul_repeat.data_weight_text_only", + ), + }, # m02_visual_5_mix_text_1__joint_reason2p0_2p1_joint_s3 + }.items(): + ratio_with_visual, data_module_with_visual = data_modules["with_visual"] + ratio_text_only, data_module_text_only = data_modules["text_only"] + data_source_name = data_module_with_visual.split("data_sources_")[-1].split(".")[0] + dataset_file_name = data_module_with_visual.split(".")[-2] + data_weight_name = data_module_with_visual.split("data_weight_")[-1] + + for distributor_split in ["train", "val"]: + for object_store in ["pdx", "s3", "neb_eu"]: + dataset_name = f"{dataset_id}_visual_{ratio_with_visual}_mix_text_{ratio_text_only}__{data_source_name}_{dataset_file_name}_{data_weight_name}_{object_store}" + cs.store( + group=f"data_{distributor_split}", + package=f"dataloader_{distributor_split}", + name=dataset_name, + node=L(IterativeJointDataLoader)( + dataloaders={ + "with_visual": { + "ratio": ratio_with_visual, + "dataloader": L(JointDatasetDynamicBatchingWebLoader)( + datasets_cfg={ + 
"default": { + "dataset": create_dataloader_config( + data_module=data_module_with_visual, + split="train", + distributor_split=distributor_split, + object_store=object_store, + ), + "ratio": 1, + } + }, + # Arguments for the joint dataset + pool_size=16, + max_batch_size="${data_setting.max_batch_size}", + max_tokens="${data_setting.max_tokens}", + model_name_or_path="${model.config.policy.model_name_or_path}", # "Qwen/Qwen3-VL-2B-Init" + long_threshold=6400, + length_key="input_ids", + batching_strategy="prefer_closest", + # Arguments for the webloader + batch_size=1, # This is not the real batch size, it wont be used + num_workers="${data_setting.num_data_workers}" + if distributor_split == "train" + else 0, + sampler=None, + prefetch_factor="${data_setting.data_prefetch_factor}" + if distributor_split == "train" + else None, + persistent_workers=False, + pin_memory=True, + collate_fn=custom_collate, + ), + }, + "text_only": { + "ratio": ratio_text_only, + "dataloader": L(JointDatasetDynamicBatchingWebLoader)( + datasets_cfg={ + "default": { + "dataset": create_dataloader_config( + data_module=data_module_text_only, + split="train", + distributor_split=distributor_split, + object_store=object_store, + ), + "ratio": 1, + } + }, + # Arguments for the joint dataset + pool_size=16, + max_batch_size="${data_setting.max_batch_size}", + max_tokens="${data_setting.max_tokens}", + model_name_or_path="${model.config.policy.model_name_or_path}", # "Qwen/Qwen3-VL-2B-Init" + long_threshold=6400, + length_key="input_ids", + batching_strategy="prefer_closest", + # Arguments for the webloader + batch_size=1, # This is not the real batch size, it wont be used + num_workers=2, + sampler=None, + prefetch_factor=1, + persistent_workers=False, + pin_memory=True, + collate_fn=custom_collate, + ), + }, + } + ), + ) + + +def register_data_recipe(): + from cosmos3._src.vfm.datasets.vlm.recipe_dataloader import VLMRecipeDataLoader + + cs = ConfigStore.instance() + # This will register 
recipe-based dataloaders using VLMRecipeDataLoader. + # Recipe names and storage types are stored in the PostgreSQL recipe database. + # Registered configs: + # cosmos_reason2_s3_vlmdb + for recipe_name, storage_type, config_name in [ + ("cosmos_reason2_s3_vlmdb", "s3_vlmdb", "cosmos_reason2_s3_vlmdb_recipe"), + ( + "nemotron_nanov2_stage_1_0218_34m_uniform_pretrain_s3_vlmdb", + "s3_vlmdb", + "nemotron_nanov2_stage_1_0218_34m_uniform_pretrain_s3_vlmdb_recipe", + ), + ( + "nemotron_nanov2_stage_1_0218_34m_uniform_posttrain_s3_vlmdb", + "s3_vlmdb", + "nemotron_nanov2_stage_1_0218_34m_uniform_posttrain_s3_vlmdb_recipe", + ), + ("cosmos3_pai_postrain_3M_v6", "gcp", "cosmos3_pai_postrain_3M_v6_gcp_recipe"), + ]: + for distributor_split in ["train", "val"]: + cs.store( + group=f"data_{distributor_split}", + package=f"dataloader_{distributor_split}", + name=config_name, + node=L(VLMRecipeDataLoader)( + recipe_name=recipe_name, + storage_type=storage_type, + model_name_or_path="${model.config.policy.model_name_or_path}", + max_tokens="${data_setting.max_tokens}", + max_batch_size="${data_setting.max_batch_size}", + pool_size=16, + long_threshold=6400, + length_key="input_ids", + batching_strategy="prefer_closest", + augmentor_config=create_data_augmentor_config(), + distributor_type="${data_setting.distributor_type}", + detshuffle="${data_setting.webdataset_detshuffle}", + split="train", # use train split of the dataset + distributor_split=distributor_split, # split training dataset into train and val splits + val_split_ratio="${data_setting.val_split_ratio}", + distributor_seed="${data_setting.distributor_seed}", + num_workers="${data_setting.num_data_workers}" if distributor_split == "train" else 0, + prefetch_factor="${data_setting.data_prefetch_factor}" if distributor_split == "train" else None, + persistent_workers=False, + pin_memory=True, + collate_fn=custom_collate, + ), + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/model.py 
b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/model.py new file mode 100644 index 00000000..47b1d1b4 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/model.py @@ -0,0 +1,35 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.vfm.configs.base.defaults.model_config import VLMModelConfig +from cosmos3._src.vfm.models.vlm_model import VLMModel + +VLM_FSDP_CONFIG = dict( + trainer=dict( + distributed_parallelism="fsdp", + ), + model=L(VLMModel)( + config=VLMModelConfig(), + checkpoint="${checkpoint}", + ), +) + + +def register_model(): + cs = ConfigStore.instance() + cs.store(group="model", package="_global_", name="vlm_fsdp", node=VLM_FSDP_CONFIG) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/optimizer.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/optimizer.py new file mode 100644 index 00000000..4f77bec6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/optimizer.py @@ -0,0 +1,65 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from cosmos3._src.imaginaire.lazy_config import PLACEHOLDER +from cosmos3._src.imaginaire.lazy_config import LazyCall as L +from cosmos3._src.imaginaire.utils.config_helper import ConfigStore +from cosmos3._src.vfm.utils.vlm.optimizer import OptimizerConfig, build_lr_schedulers, build_optimizers + + +def register_optimizer(): + cs = ConfigStore.instance() + cs.store( + group="optimizer", + package="optimizer", + name="fusedadamw", + node=L(build_optimizers)( + model_parts=PLACEHOLDER, + model_part_names=PLACEHOLDER, + config=L(OptimizerConfig)( + name="FusedAdam", + ), + ), + ) + cs.store( + group="optimizer", + package="optimizer", + name="adamw", + node=L(build_optimizers)( + model_parts=PLACEHOLDER, + model_part_names=PLACEHOLDER, + config=L(OptimizerConfig)( + name="AdamW", + ), + ), + ) + + +def register_scheduler(): + cs = ConfigStore.instance() + cs.store( + group="scheduler", + package="scheduler", + name="warmup_cosine_lr", + node=L(build_lr_schedulers)( + optimizers=PLACEHOLDER, + name="warmup_cosine_lr", + warmup_iters=1000, + lr_decay_iters="${trainer.max_iter}", + lr="${optimizer.config.lr}", + init_lr="${optimizer.config.init_lr}", + end_lr="${optimizer.config.end_lr}", + ), + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/training.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/training.py new file mode 100644 index 
00000000..9946aff6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/training.py @@ -0,0 +1,117 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from dataclasses import MISSING, field +from typing import Union + +import attrs +import torch + +from cosmos3._src.imaginaire.config import make_freezable +from cosmos3._src.vfm.configs.base.defaults.parallelism import ParallelismConfig + + +def skip_ui_field(*, default=MISSING, default_factory=MISSING, **kwargs): + metadata = kwargs.pop("metadata", {}) + metadata["skip_ui"] = True + if default_factory is not MISSING: + return field(default_factory=default_factory, metadata=metadata, **kwargs) + elif default is not MISSING: + return field(default=default, metadata=metadata, **kwargs) + else: + raise ValueError("Must provide either default or default_factory.") + + +@make_freezable +@attrs.define(slots=False) +class TrainPolicyConfig: + mini_batch: int = 1 + type: str = "sft" + + +@make_freezable +@attrs.define(slots=False) +class FP8: + enable_fp8: bool = False + + +@make_freezable +@attrs.define(slots=False) +class TrainConfig: + # Master parameter dtype for FSDP. Activation / model dtype lives on the + # shared ParallelismConfig as policy.parallelism.precision. 
+ master_dtype: str = "float32" + + # The data type for reduction in FSDP + fsdp_reduce_dtype: str = "float32" + + # Whether to offload the model to CPU if using FSDP + fsdp_offload: bool = False + + # Reshard the param after forward pass in FSDP + fsdp_reshard_after_forward: str = "default" + + # The batch size for training per iteration in one replica, this is the local batch size for each gradient accumulation step + train_batch_per_replica: int = 1 + + # The interval of train step for synchronizing weights between replicas. + sync_weight_interval: int = 1 + + # Train policy + train_policy: TrainPolicyConfig = TrainPolicyConfig() + fp8: FP8 = FP8() + deterministic: bool = True + + def key_values(self): + return {k: v for k, v in self.__dict__.items() if not k.startswith("_")} + + @property + def master_torch_dtype(self): + return { + "bfloat16": torch.bfloat16, + "float16": torch.float16, + "float32": torch.float32, + }[self.master_dtype] + + @property + def fsdp_reduce_torch_dtype(self): + return {"float32": torch.float32}[self.fsdp_reduce_dtype] + + +# Why is this class not freezable? +# Because we need to pass the cached model dir as model_name_or_path to the cosmos-rl model to use the +# model weights downloaded from s3. If cosmos-rl supports reading models from s3 directly, we can make it freezable. 
+@attrs.define(slots=False) +class PolicyConfig: + # Parallelism configuration + parallelism: ParallelismConfig = ParallelismConfig() + # The model name or path, compatible with huggingface model name or local path + model_name_or_path: str = "Qwen/Qwen2.5-VL-3B-Instruct" + # The maximum length for training, longer than this will be ignored for training stability + model_max_length: int = 16000 + # The maximum length for video tokens, only applied to qwen model + qwen_max_video_token_length: int = 8000 + + # Pretrain weights (Optional) + pretrain_weights_path_vlm: str = "" + pretrain_weights_path_llm: str = "" + pretrain_weights_path_vit: str = "" + pretrain_weights_cred: str = "credentials/s3_training.secret" + + # Extra model config + lora: Union[str, None] = None + enable_liger_kernel: bool = False + trainable_map: Union[str, None] = None + monkey_patch_for_text_only_data: bool = False diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/vlm_policy.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/vlm_policy.py new file mode 100644 index 00000000..1944c9f5 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/defaults/vlm_policy.py @@ -0,0 +1,170 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from hydra.core.config_store import ConfigStore + +from cosmos3._src.vfm.configs.base.vlm.defaults.training import PolicyConfig + +# Each entry replaces cfg.model.config.policy via package="model.config.policy". +# Sibling to the VFM vlm_config group at +# cosmos3/_src/vfm/configs/base/defaults/vlm.py: that group binds +# VLMConfig SKUs onto OmniMoTModelConfig.vlm_config; this group binds +# PolicyConfig SKUs onto VLMModelConfig.policy. The two schemas are kept +# separate today because the loader contracts diverge (VFM uses a +# registry-label + LazyDict model_instance with MoTDecoderLayer +# substitution; VLM uses a literal HF cache path fed to from_pretrained). +# Convergence onto a single SKU schema is tracked as L6 in +# config_unification_plan.v10.md. + +qwen2_5_vl_7b = PolicyConfig(model_name_or_path="Qwen/Qwen2.5-VL-7B-Instruct") + +eagle_er_1p7b = PolicyConfig( + model_name_or_path="eagle_er_qwen3_1p7b_siglip_400m", + model_max_length=16000, +) + +internvl3_5_1b = PolicyConfig( + model_name_or_path="OpenGVLab/InternVL3_5-1B-HF", + model_max_length=16000, # 40960 is the max length by default. +) + +internvl3_5_2b = PolicyConfig( + model_name_or_path="OpenGVLab/InternVL3_5-2B-HF", + model_max_length=16000, # 40960 is the max length by default. 
+) + +qwen3_vl_2b = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-2B-Init") + +qwen3_vl_30b_a3b_instruct = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-30B-A3B-Instruct") + +qwen3_vl_30b_a3b_thinking = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-30B-A3B-Thinking") + +qwen3_vl_235b_a22b_thinking = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-235B-A22B-Thinking") + +qwen3_vl_8b_thinking = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-8B-Thinking") + +qwen3_vl_8b_instruct = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-8B-Instruct") + +qwen3_vl_2b_instruct = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-2B-Instruct") + +qwen3_vl_2b_thinking = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-2B-Thinking") + +qwen3_vl_4b_instruct = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-4B-Instruct") + +qwen3_vl_4b_thinking = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-4B-Thinking") + +qwen3_vl_32b_instruct = PolicyConfig(model_name_or_path="Qwen/Qwen3-VL-32B-Instruct") + +nemotron_nano_12b_v2_vl_bf16 = PolicyConfig(model_name_or_path="nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16") + + +def register_vlm_policy(): + cs = ConfigStore.instance() + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen2_5_vl_7b", + node=qwen2_5_vl_7b, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="eagle_er_1p7b", + node=eagle_er_1p7b, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="internvl3_5_1b", + node=internvl3_5_1b, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="internvl3_5_2b", + node=internvl3_5_2b, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_2b", + node=qwen3_vl_2b, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_30b_a3b_instruct", + node=qwen3_vl_30b_a3b_instruct, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_30b_a3b_thinking", + 
node=qwen3_vl_30b_a3b_thinking, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_235b_a22b_thinking", + node=qwen3_vl_235b_a22b_thinking, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_8b_thinking", + node=qwen3_vl_8b_thinking, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_8b_instruct", + node=qwen3_vl_8b_instruct, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_2b_instruct", + node=qwen3_vl_2b_instruct, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_2b_thinking", + node=qwen3_vl_2b_thinking, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_4b_instruct", + node=qwen3_vl_4b_instruct, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_4b_thinking", + node=qwen3_vl_4b_thinking, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="qwen3_vl_32b_instruct", + node=qwen3_vl_32b_instruct, + ) + cs.store( + group="vlm_policy", + package="model.config.policy", + name="nemotron_nano_12b_v2_vl_bf16", + node=nemotron_nano_12b_v2_vl_bf16, + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/__init__.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/pre_exp012_phase2_vlm_smoke.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/pre_exp012_phase2_vlm_smoke.py new file mode 100644 index 00000000..f42a25d6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/pre_exp012_phase2_vlm_smoke.py @@ -0,0 +1,105 @@ +# ----------------------------------------------------------------------------- +# Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. +# All rights reserved. +# +# This codebase constitutes NVIDIA proprietary technology and is strictly +# confidential. Any unauthorized reproduction, distribution, or disclosure +# of this code, in whole or in part, outside NVIDIA is strictly prohibited +# without prior written consent. +# +# For inquiries regarding the use of this code in other NVIDIA CORPORATION +# projects, please contact the Deep Imagination Research Team at +# dir@exchange.nvidia.com. +# ----------------------------------------------------------------------------- +"""Phase 2 VFM VLMModel smoke test experiments. + +These configs exercise the Phase 2 HFModel/FSDP2 path end-to-end on 4 GPUs. +They are NOT training runs — max_iter=10 is intentionally minimal. 
+ +Usage: + torchrun --nproc_per_node=4 --master_port=12341 -m scripts.train \\ + --config=cosmos3/_src/vfm/configs/base/vlm/config.py \\ + -- experiment=pre_exp012_000_phase2_vlm_smoke_4gpu_8b +""" + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.lazy_config import LazyDict + +cs = ConfigStore.instance() + +# --------------------------------------------------------------------------- +# Smoke test: 4-GPU FSDP2 with qwen3_vl_8b_instruct, nanov2 pretrain recipe data, 10 iterations. +# Exercises: HFModel meta-init → parallelize() → forward → CE loss → backward +# Requires: trainable_params (Phase 2 mandatory), data_parallel_shard_degree=4 +# --------------------------------------------------------------------------- +pre_exp012_000_phase2_vlm_smoke_4gpu_8b = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "nemotron_nanov2_stage_1_0218_34m_uniform_pretrain_s3_vlmdb_recipe"}, + {"override /data_val": "nemotron_nanov2_stage_1_0218_34m_uniform_pretrain_s3_vlmdb_recipe"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_8b_instruct"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="phase2_smoke", + ), + trainer=dict( + max_iter=10, + run_validation=False, + logging_iter=1, + ), + optimizer=dict( + config=dict( + # Phase 2 REQUIRED: trainable_params regex list. + # ".*" trains all parameters, which is appropriate for a smoke test. 
+ trainable_params=[".*"], + lr=1e-5, + fused=True, + ), + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=4, + data_parallel_replicate_degree=-1, + ), + ), + ), + ), + # dataloader_train=dict( + # enable_flop_based_batching=True, + # target_runtime_seconds=3.0, # Used when max_batch_size > 1 + # ), + checkpoint=dict( + # Don't save checkpoints during smoke test + save_iter=100000, + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + max_tokens=16000, + max_batch_size=1, + distributor_type="no_replace", + distributor_seed=1993, + val_split_ratio=0.1, + webdataset_detshuffle=True, + ), + ) +) + + +for _item in [ + pre_exp012_000_phase2_vlm_smoke_4gpu_8b, +]: + experiment_name = [name.lower() for name, value in globals().items() if value is _item][0] + if "job" not in _item: + _item["job"] = dict(name=experiment_name + "_${now:%Y-%m-%d}_${now:%H-%M-%S}") + else: + _item["job"]["name"] = experiment_name + "_${now:%Y-%m-%d}_${now:%H-%M-%S}" + + cs.store(group="experiment", package="_global_", name=experiment_name, node=_item) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/pre_exp01x.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/pre_exp01x.py new file mode 100644 index 00000000..57f012d8 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/pre_exp01x.py @@ -0,0 +1,852 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from hydra.core.config_store import ConfigStore + +from cosmos3._src.imaginaire.lazy_config import LazyDict + +cs = ConfigStore.instance() + +""" +Bundled VLM training experiments registered under the ``experiment`` group. + +Usage: +torchrun --nproc_per_node=8 --master_port=12341 -m scripts.train \ + --config=cosmos3/_src/vfm/configs/base/vlm/config.py \ + -- experiment=pre_exp010_000_eagle_er_1p7b_joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader +""" + + +pre_exp010_000_eagle_er_1p7b_joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "eagle_er_1p7b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=100)), + max_iter=32000, + ), + checkpoint=dict(save_iter=5000), + ) +) + +pre_exp010_010_internvl3_5_1b_joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "internvl3_5_1b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + 
callbacks=dict(log_tensor_shape=dict(num_log=100)), + ), + checkpoint=dict(save_iter=2000), + ) +) + +pre_exp010_020_internvl3_5_2b_joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "internvl3_5_2b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=100)), + max_iter=32000, + ), + checkpoint=dict(save_iter=2000), + ) +) + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp011_000_qwen3_vl_2b +""" +pre_exp011_000_qwen3_vl_2b = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "08_eagle_sft_full_mul_repeat_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_2b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=8000, + ), + checkpoint=dict(save_iter=2000), + ) +) + + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp011_020_qwen3_vl_2b_vit2k8k +""" +pre_exp011_020_qwen3_vl_2b_vit2k8k = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "08_eagle_sft_full_mul_repeat_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_2b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + 
job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=8000, + ), + optimizer=dict( + config=dict( + lr=5e-5, + fused=True, + ), + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + ), + ), + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + ), + checkpoint=dict(save_iter=2000), + ) +) + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp011_030_qwen3_vl_2b_vit2k8k_mbs8 +""" +pre_exp011_030_qwen3_vl_2b_vit2k8k_mbs8 = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "08_eagle_sft_full_mul_repeat_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_2b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=200_000, + ), + optimizer=dict( + config=dict( + lr=5e-5, + fused=True, + ), + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + ), + ), + ), + dataloader_train=dict( + max_batch_size=8, + max_tokens=16000, + ), + checkpoint=dict(save_iter=2000), + ) +) + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp011_040_qwen3_vl_2b_vit2k8k_mbs8_flop3s +""" +pre_exp011_040_qwen3_vl_2b_vit2k8k_mbs8_flop3s = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": 
"08_eagle_sft_full_mul_repeat_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_2b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=200_000, + ), + optimizer=dict( + config=dict( + lr=5e-5, + fused=True, + ), + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + max_tokens=16000, + max_batch_size=8, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + ), + ), + ), + dataloader_train=dict( + enable_flop_based_batching=True, + target_runtime_seconds=3.0, + ), + checkpoint=dict(save_iter=2000), + ) +) + +pre_exp011_041_qwen3_vl_2b_vit2k8k_mbs8_flop3s_mix_text_only = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "08_eagle_sft_full_mul_repeat_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_2b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=200_000, + ), + optimizer=dict( + config=dict( + lr=5e-5, + fused=True, + ), + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + max_tokens=16000, + max_batch_size=8, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + ), + ), + ), + dataloader_train=dict( + dataloaders=dict( + with_visual=dict( + dataloader=dict( + enable_flop_based_batching=True, + target_runtime_seconds=3.0, + ), + ), + text_only=dict( + dataloader=dict( + enable_flop_based_batching=True, + 
target_runtime_seconds=3.0, + ), + ), + ) + ), + checkpoint=dict(save_iter=2000), + ) +) + + +pre_exp011_100_qwen3_vl_2b_align = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "07_eagle_pretrain_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_2b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + optimizer=dict( + config=dict( + freeze_vision_encoder=True, + freeze_llm=True, + lr=1e-5, + init_lr=1e-7, + end_lr=5e-6, + ), + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=8000, + ), + checkpoint=dict(save_iter=2000), + ) +) + +# reinit the llm and/or projector weights for internvl3_5_2b +pre_exp011_300_internvl3_5_2b_reinit_align = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "07_eagle_pretrain_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "internvl3_5_2b"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + optimizer=dict( + config=dict( + freeze_vision_encoder=True, + freeze_llm=True, + lr=1e-5, + init_lr=1e-7, + end_lr=5e-6, + ), + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=8000, + ), + model=dict( + config=dict( + policy=dict( + model_name_or_path="OpenGVLab/InternVL3_5-2B-HF-ReinitLLMProj", + ), + ), + ), + checkpoint=dict(save_iter=2000), + ) +) + + +# reinit the llm and/or projector weights for internvl3_5_2b +pre_exp011_400_internvl3_5_2b_reinit_e2e = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "07_eagle_pretrain_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "internvl3_5_2b"}, + 
{"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=8000, + ), + model=dict( + config=dict( + policy=dict( + model_name_or_path="OpenGVLab/InternVL3_5-2B-HF-ReinitLLMProj", + ), + ), + ), + checkpoint=dict(save_iter=2000), + ) +) + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp015_000_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s +- reduce lr to 2e-6 because this is SFT +""" +pre_exp015_000_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "08_eagle_sft_full_mul_repeat_default_s3"}, + {"override /data_val": "dummy_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_8b_instruct"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=8000, + ), + optimizer=dict( + config=dict( + lr=1e-5, + fused=True, + ), + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + max_tokens=16000, + max_batch_size=1, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + ), + ), + ), + dataloader_train=dict( + enable_flop_based_batching=True, + target_runtime_seconds=3.0, # Used when max_batch_size > 1 + ), + checkpoint=dict(save_iter=2000), + ) +) + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp015_001_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s_mix_text_only data_train=m02_visual_5_mix_text_1__joint_reason2p0_2p1_joint_s3 +""" 
+pre_exp015_001_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s_mix_text_only = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "08_eagle_sft_full_mul_repeat_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_8b_instruct"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=10000, + ), + optimizer=dict( + config=dict( + lr=1e-5, + fused=True, + ), + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + max_tokens=16000, + max_batch_size=1, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + ), + ), + ), + dataloader_train=dict( + dataloaders=dict( + with_visual=dict( + dataloader=dict( + enable_flop_based_batching=True, + target_runtime_seconds=3.0, + ), + ), + text_only=dict( + dataloader=dict( + enable_flop_based_batching=True, + target_runtime_seconds=3.0, + ), + ), + ) + ), + checkpoint=dict(save_iter=2000), + ) +) + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp016_000_qwen3_vl_8b_thinking_vit2k8k_mbs1_flop3s +- reduce lr to 2e-6 because this is SFT +""" +pre_exp016_000_qwen3_vl_8b_thinking_vit2k8k_mbs1_flop3s = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "08_eagle_sft_full_mul_repeat_default_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_8b_thinking"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + 
max_iter=8000, + ), + optimizer=dict( + config=dict( + lr=1e-5, + fused=True, + ), + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + max_tokens=16000, + max_batch_size=1, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + ), + ), + ), + dataloader_train=dict( + enable_flop_based_batching=True, + target_runtime_seconds=3.0, # Used when max_batch_size > 1 + ), + checkpoint=dict(save_iter=2000), + ) +) + + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp017_000_nemotron_vl_12b_mbs1 +""" +pre_exp017_000_nemotron_vl_12b_mbs1 = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "51_joint_reason1p0_1p1_2_1126_1126_s3"}, + {"override /data_val": "debug_image_data_qwen"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "nemotron_nano_12b_v2_vl_bf16"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=8000, + ), + optimizer=dict( + config=dict( + lr=1e-5, + fused=True, + ), + ), + data_setting=dict( + max_tokens=16000, + max_batch_size=1, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + ), + ), + ), + checkpoint=dict(save_iter=2000), + ) +) + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp018_000_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s_pretrain +""" +pre_exp018_000_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s_pretrain = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": 
"200_nanov2_stage_1_0218_34m_uniform_pretrain_repeat_s3_vlmdb"}, + {"override /data_val": "200_nanov2_stage_1_0218_34m_uniform_pretrain_repeat_s3_vlmdb"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_8b_instruct"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=60000, + run_validation=True, + validation_iter=40, + max_val_iter=1, + ), + optimizer=dict( + config=dict( + lr=2e-5, + fused=True, + weight_decay=0.05, + betas=[0.9, 0.999], + freeze_vision_encoder=True, + freeze_mm_projector=True, + ), + ), + scheduler=dict( + warmup_iters=6000, + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + max_tokens=16000, + max_batch_size=1, + distributor_type="no_replace", + distributor_seed=1993, + val_split_ratio=0.1, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + monkey_patch_for_text_only_data=True, + pretrain_weights_path_llm="Qwen/Qwen3-8B", + ), + ), + ), + dataloader_train=dict( + enable_flop_based_batching=True, + target_runtime_seconds=3.0, # Used when max_batch_size > 1 + ), + checkpoint=dict( + save_iter=2000, + ), + ) +) + +""" +torchrun --nproc_per_node=8 --master_port=12341 -m projects.cosmos3.vlm.train --config=projects/cosmos3/vlm/configs/base/config.py -- experiment=pre_exp019_000_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s_posttrain +""" +pre_exp019_000_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s_posttrain = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "201_nanov2_stage_1_0218_34m_uniform_posttrain_repeat_s3_vlmdb"}, + {"override /data_val": "201_nanov2_stage_1_0218_34m_uniform_posttrain_repeat_s3_vlmdb"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_8b_instruct"}, + {"override 
/callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict( + group="debug", + ), + trainer=dict( + callbacks=dict(log_tensor_shape=dict(num_log=2)), + max_iter=8000, + run_validation=True, + validation_iter=40, + max_val_iter=1, + ), + optimizer=dict( + config=dict( + lr=1e-5, + fused=True, + weight_decay=0.05, + betas=[0.9, 0.999], + freeze_vision_encoder=True, + freeze_mm_projector=True, + ), + ), + scheduler=dict( + warmup_iters=800, + ), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + max_tokens=16000, + max_batch_size=1, + distributor_type="no_replace", + distributor_seed=1996, + val_split_ratio=0.05, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=8, + data_parallel_replicate_degree=-1, + ), + monkey_patch_for_text_only_data=True, + ), + ), + ), + dataloader_train=dict( + enable_flop_based_batching=True, + target_runtime_seconds=3.0, # Used when max_batch_size > 1 + ), + checkpoint=dict( + save_iter=1000, + load_path="cosmos_reason2/pre_exp015/pre_exp015_288_34m_v1352_repeat_uniform_1_epoch_n64/checkpoints/iter_000060000/", + ), + ) +) + + +pre_exp020_001_qwen3_vl_30b_a3b_instruct_ep = LazyDict( + dict( + defaults=[ + {"override /checkpoint": "s3"}, + {"override /data_train": "nemotron_nanov2_stage_1_0218_34m_uniform_pretrain_s3_vlmdb_recipe"}, + {"override /data_val": "nemotron_nanov2_stage_1_0218_34m_uniform_pretrain_s3_vlmdb_recipe"}, + {"override /model": "vlm_fsdp"}, + {"override /vlm_policy": "qwen3_vl_30b_a3b_instruct"}, + {"override /callbacks": ["basic_vlm", "simple_log"]}, + "_self_", + ], + job=dict(group="debug"), + trainer=dict( + max_iter=8000, + logging_iter=1, + ), + optimizer=dict(config=dict(lr=5e-5, fused=True)), + data_setting=dict( + qwen_max_video_token_length=8192, + qwen_max_image_token_length=2048, + max_tokens=16000, + max_batch_size=1, + distributor_type="no_replace", + distributor_seed=1993, + val_split_ratio=0.1, + 
webdataset_detshuffle=True, + ), + model=dict( + config=dict( + policy=dict( + parallelism=dict( + data_parallel_shard_degree=2, + data_parallel_replicate_degree=1, + ), + ), + ), + ), + dataloader_train=dict( + enable_flop_based_batching=True, + target_runtime_seconds=3.0, # Used when max_batch_size > 1 + ), + checkpoint=dict(save_iter=2000), + ) +) + + +for _item in [ + pre_exp010_000_eagle_er_1p7b_joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader, + pre_exp010_010_internvl3_5_1b_joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader, + pre_exp010_020_internvl3_5_2b_joint_reasoner_tl_722_5vs5_no_predict2_s3_webloader, + pre_exp011_000_qwen3_vl_2b, + pre_exp011_020_qwen3_vl_2b_vit2k8k, + pre_exp011_030_qwen3_vl_2b_vit2k8k_mbs8, + pre_exp011_040_qwen3_vl_2b_vit2k8k_mbs8_flop3s, + pre_exp011_041_qwen3_vl_2b_vit2k8k_mbs8_flop3s_mix_text_only, + pre_exp011_100_qwen3_vl_2b_align, + pre_exp011_300_internvl3_5_2b_reinit_align, + pre_exp011_400_internvl3_5_2b_reinit_e2e, + pre_exp015_000_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s, + pre_exp015_001_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s_mix_text_only, + pre_exp016_000_qwen3_vl_8b_thinking_vit2k8k_mbs1_flop3s, + pre_exp017_000_nemotron_vl_12b_mbs1, + pre_exp018_000_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s_pretrain, + pre_exp019_000_qwen3_vl_8b_instruct_vit2k8k_mbs1_flop3s_posttrain, + pre_exp020_001_qwen3_vl_30b_a3b_instruct_ep, +]: + experiment_name = [name.lower() for name, value in globals().items() if value is _item][0] + if "job" not in _item: + _item["job"] = dict(name=experiment_name + "_${now:%Y-%m-%d}_${now:%H-%M-%S}") + else: + _item["job"]["name"] = experiment_name + "_${now:%Y-%m-%d}_${now:%H-%M-%S}" + + cs.store(group="experiment", package="_global_", name=experiment_name, node=_item) diff --git a/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/utils.py b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/utils.py new file mode 100644 index 00000000..676fd475 --- /dev/null +++ 
b/cosmos-inference/cosmos3/_src/vfm/configs/base/vlm/experiment/utils.py @@ -0,0 +1,28 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from dataclasses import dataclass, field +from typing import Dict, List, Optional + + +@dataclass +class Experiment: + job_exp: str + nnode: int + command_args: List[str] + job_name: Optional[str] = None + init_command: str = "" + job_group: Optional[str] = None + extra_env_vars: Dict[str, str] = field(default_factory=dict) diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/__init__.py b/cosmos-inference/cosmos3/_src/vfm/datasets/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/__init__.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/action_normalization.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/action_normalization.py new file mode 100644 index 00000000..1d8defed --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/action_normalization.py @@ -0,0 +1,61 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +"""Action normalization helpers.""" + +import json +from pathlib import Path + +import numpy as np +import torch + +from cosmos3._src.imaginaire.utils import log + + +def load_action_stats(stats_path: str, stats_key: str = "global") -> dict[str, np.ndarray]: + """Load pre-computed action normalization stats from a JSON file.""" + path = Path(stats_path) + if not path.exists(): + raise FileNotFoundError(f"Action normalization stats not found at {stats_path}.") + log.info(f"Loading action normalization stats from {stats_path}") + with path.open("r") as f: + raw = json.load(f) + if stats_key in raw: + raw = raw[stats_key] + if not isinstance(raw, dict): + raise TypeError(f"Action normalization stats block {stats_key!r} in {stats_path} must be a dict.") + elif stats_key != "global": + raise KeyError(f"Action normalization stats block {stats_key!r} not found in {stats_path}.") + stat_keys = {"mean", "std", "min", "max", "q01", "q99"} + return {k: np.array(v, dtype=np.float32) for k, v in raw.items() if k in stat_keys} + + +def normalize_action( + action: torch.Tensor, + method: str, + stats: dict[str, torch.Tensor], +) -> torch.Tensor: + """Normalize action tensor (all dimensions including gripper).""" + if method == "quantile": + q01, q99 = stats["q01"], stats["q99"] + denom = (q99 - q01).clamp(min=1e-8) + return (2.0 * (action - q01) / denom - 1.0).clamp(-1.0, 1.0) + if method == "meanstd": + return (action - stats["mean"]) / stats["std"].clamp(min=1e-8) + if method == "minmax": + lo, hi = stats["min"], stats["max"] + denom = (hi - lo).clamp(min=1e-8) + return (2.0 * (action - lo) / denom - 1.0).clamp(-1.0, 1.0) + raise ValueError(f"Unknown normalization method: {method!r}") diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/action_spec.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/action_spec.py new file mode 100644 index 00000000..dfc6811f 
--- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/action_spec.py @@ -0,0 +1,247 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Action-vector specification: per-dim type label + idle thresholds. + +Single concept: every column of an action vector has a :class:`DimType` label. +Idle detection iterates by type and applies the matching algorithm: + + POS → ‖action[pos_idx]‖ per arm < eps_t + ROT → distance(rot, identity) per group < eps_r + GRIPPER → max |Δgripper| < eps_g (frame 0 idle by convention) + JOINT → max |Δjoint| < joint_threshold (frame 0 idle) + RESERVED → ignored + +An :class:`ActionSpec` is just ``names`` + ``types`` + ``rotation_format``. +Build one declaratively via :func:`build_action_spec` from DSL components:: + + build_action_spec(Pos(), Rot("rot6d"), Gripper()) # 10D single arm + build_action_spec(Pos(), Rot("rot6d")) # 9D no gripper + build_action_spec(Joint(n=14, label="arm"), # 30D joint-space + Joint(n=14, label="end"), + Joint(n=2, label="gripper")) + build_action_spec(Pos(prefix="left"), Rot("rot6d", "left"), Gripper(prefix="left"), + Pos(prefix="right"), Rot("rot6d", "right"), Gripper(prefix="right")) + +Naming convention: + Default ``pos_x``, ``rot_0``, ``gripper``, ``arm_0`` ... + With ``prefix="left"`` (idempotent on trailing ``_``): ``left_pos_x`` ... 
+""" + +from __future__ import annotations + +from dataclasses import dataclass +from enum import Enum +from typing import ClassVar + +from cosmos3._src.vfm.datasets.action.pose_utils import ( + RotationConvention, + _identity_rotation_vector, +) + + +class DimType(str, Enum): + """Per-column action-dim category (drives idle detection).""" + + POS = "pos" + ROT = "rot" + GRIPPER = "gripper" + JOINT = "joint" + RESERVED = "reserved" + + +@dataclass(frozen=True, slots=True) +class ActionSpec: + """Structural description of an action vector: names + per-dim types. + + All ROT dims share a single ``rotation_format``; mixed formats in one spec + are not supported (raise at build time). + + This struct contains no detection thresholds — those are passed at call + time to :func:`compute_idle_frames` so each dataset can tune them + independently of layout. + """ + + names: list[str] + types: list[DimType] + rotation_format: RotationConvention = "rot6d" + + @property + def dim(self) -> int: + return len(self.names) + + +# --------------------------------------------------------------------------- +# DSL components +# --------------------------------------------------------------------------- + + +def _join_prefix(prefix: str, name: str) -> str: + """Join ``prefix`` and ``name`` with a single ``_``; idempotent on trailing ``_``.""" + return name if not prefix else f"{prefix.rstrip('_')}_{name}" + + +@dataclass(frozen=True) +class Pos: + """Translation block. + + Default 3D (``pos_x``, ``pos_y``, ``pos_z``). For planar tasks (e.g. PushT) + use ``Pos(dim=2)`` → ``pos_x``, ``pos_y``. ``dim >= 4`` falls back to + indexed names ``pos_0``, ``pos_1``, ... 
+ """ + + dim: int = 3 + prefix: str = "" + type: ClassVar[DimType] = DimType.POS + + def names(self) -> list[str]: + if self.dim <= 3: + return [_join_prefix(self.prefix, f"pos_{c}") for c in "xyz"[: self.dim]] + return [_join_prefix(self.prefix, f"pos_{i}") for i in range(self.dim)] + + +@dataclass(frozen=True) +class Rot: + """Rotation block; ``format`` selects the encoding. + + Supported formats and per-dim names: + + - ``rot6d`` → 6 dims, ``rot_0`` ... ``rot_5`` (identity ``[1,0,0,0,1,0]``) + - ``rot9d`` → 9 dims, ``rot_0`` ... ``rot_8`` (identity ``[1,0,0,0,1,0,0,0,1]``) + - ``euler_xyz`` → 3 dims, ``roll``, ``pitch``, ``yaw`` (identity ``[0,0,0]``) + - ``axisangle`` → 3 dims, ``axang_x/y/z`` (identity ``[0,0,0]``) + - ``quat_xyzw`` / ``quat_wxyz`` → 4 dims, ``quat_x/y/z/w`` in declared order + """ + + format: RotationConvention = "rot6d" + prefix: str = "" + type: ClassVar[DimType] = DimType.ROT + + @property + def rotation_format(self) -> RotationConvention: + return self.format + + @property + def dim(self) -> int: + return _identity_rotation_vector(self.format).shape[0] + + def names(self) -> list[str]: + if self.format == "euler_xyz": + return [_join_prefix(self.prefix, c) for c in ("roll", "pitch", "yaw")] + if self.format == "axisangle": + return [_join_prefix(self.prefix, f"axang_{c}") for c in "xyz"] + if self.format.startswith("quat_"): + order = self.format.split("_", 1)[1] # "xyzw" or "wxyz" + return [_join_prefix(self.prefix, f"quat_{c}") for c in order] + return [_join_prefix(self.prefix, f"rot_{i}") for i in range(self.dim)] + + +@dataclass(frozen=True) +class Gripper: + """1D gripper command (binary 0/1 or continuous). Detected by frame-diff.""" + + prefix: str = "" + type: ClassVar[DimType] = DimType.GRIPPER + + @property + def dim(self) -> int: + return 1 + + def names(self) -> list[str]: + return [_join_prefix(self.prefix, "gripper")] + + +@dataclass(frozen=True) +class Joint: + """``n`` joint commands. 
Detected by frame-diff against ``joint_threshold``.""" + + n: int = 0 + label: str = "joint" + prefix: str = "" + type: ClassVar[DimType] = DimType.JOINT + + @property + def dim(self) -> int: + return self.n + + def names(self) -> list[str]: + return [_join_prefix(self.prefix, f"{self.label}_{i}") for i in range(self.n)] + + +@dataclass(frozen=True) +class Reserved: + """``n`` dims counted in ``action_dim`` but ignored by idle detection.""" + + n: int = 0 + label: str = "reserved" + prefix: str = "" + type: ClassVar[DimType] = DimType.RESERVED + + @property + def dim(self) -> int: + return self.n + + def names(self) -> list[str]: + return [_join_prefix(self.prefix, f"{self.label}_{i}") for i in range(self.n)] + + +# --------------------------------------------------------------------------- +# Builder +# --------------------------------------------------------------------------- + + +# Type alias for any DSL component. Not a runtime check — only annotation hint. +Component = Pos | Rot | Gripper | Joint | Reserved + + +def build_action_spec(*components: Component) -> ActionSpec: + """Compose ``components`` into an :class:`ActionSpec`. + + Each component contributes its ``names()`` and replicates its ``type`` for + every column it occupies. The first ROT component's ``rotation_format`` + is captured for the whole spec; mixing formats raises ``ValueError``. 
+ """ + names: list[str] = [] + types: list[DimType] = [] + rotation_format: RotationConvention | None = None + + for c in components: + names.extend(c.names()) + types.extend([c.type] * c.dim) + if c.type == DimType.ROT: + fmt = c.rotation_format # type: ignore[union-attr] + if rotation_format is None: + rotation_format = fmt + elif rotation_format != fmt: + raise ValueError(f"Mixed rotation_format in one ActionSpec: {rotation_format!r} vs {fmt!r}") + + return ActionSpec( + names=names, + types=types, + rotation_format=rotation_format or "rot6d", + ) + + +__all__ = [ + "ActionSpec", + "Component", + "DimType", + "Gripper", + "Joint", + "Pos", + "Reserved", + "Rot", + "build_action_spec", +] diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/bridge_orig_lerobot_dataset.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/bridge_orig_lerobot_dataset.py new file mode 100644 index 00000000..95af8636 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/bridge_orig_lerobot_dataset.py @@ -0,0 +1,284 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +# * https://github.com/2toinf/X-VLA/blob/30090f81cf91b15da73af234ce2b098fe20590f8/datasets/domain_handler/simulations.py#L70-L93 +# * https://github.com/2toinf/X-VLA/issues/11 +# * https://github.com/2toinf/X-VLA/issues/33 +# * https://github.com/2toinf/X-VLA/issues/67 +# + +# uses identity stats (q01=-1, q99=1) on the 6D rotation dims 3..8, while +# ``"quantile_rot"`` uses the raw stats and normalizes those columns too. + +from typing import Any + +import numpy as np +import torch +from lerobot.datasets.lerobot_dataset import LeRobotDatasetMetadata + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.action.cosmos3_action_lerobot import ( + ActionNormalization, + ActionSpec, + BaseActionLeRobotDataset, + Gripper, + Pos, + Rot, + build_action_spec, +) +from cosmos3._src.vfm.datasets.action.pose_utils import ( + PoseConvention, + build_abs_pose_from_components, + convert_rotation, + pose_abs_to_rel, +) +from cosmos3._src.vfm.datasets.action.viewpoint_utils import Viewpoint + +# Bridge rotation decomposition: +# 1) _DEFAULT_ROTATION: raw bridge state → kinematics (MJCF/URDF) frame. +# The WidowX controller records ``R_state = R_fk @ DEFAULT_ROTATION.T``, +# so ``R_fk = R_state @ DEFAULT_ROTATION``. +# 2) _TCP_TO_FLANGE: re-reference from ee_gripper_link to gripper_link +# (pure translation in kinematics frame). See block below. +# 3) _BRIDGE_TO_OPENCV: kinematics → OpenCV convention (for training/vis). +# The viewer undoes this before IK to recover the kinematic frame. 
+_DEFAULT_ROTATION = np.array( + [[0, 0, 1], [0, 1, 0], [-1, 0, 0]], + dtype=np.float32, +) +_BRIDGE_TO_OPENCV = np.array( + [[0, 0, 1], [-1, 0, 0], [0, -1, 0]], + dtype=np.float32, +) + +# --------------------------------------------------------------------------- +# TCP → flange (gripper body) offset +# --------------------------------------------------------------------------- +# The bridge dataset records EE poses at ``ee_gripper_link`` — the Interbotix +# SDK's end-effector reference, 93.6 mm past the wrist rotate body +# (``gripper_link``), roughly at the grasp center between the finger pads. +# For action learning we re-reference poses to the *wrist rotate body* +# (``gripper_link``) because: +# 1. It is the last actuated link — its pose is fully determined by joint +# angles, with no dependence on finger opening. +# 2. The ~10 cm offset reduces the lever-arm effect of small rotation +# errors on position accuracy. +# 3. Consistent with Google Robot, where we also target the gripper body. +# +# The constant below is the SE(3) transform from ``ee_gripper_link`` to +# ``gripper_link``, computed from the SimplerEnv URDF via pinocchio FK at the +# neutral configuration: +# T = oMf[ee_gripper_link]⁻¹ · oMf[gripper_link] +# It is pure translation (identity rotation) — the two frames share the +# same orientation by construction (connected via fixed joints with no +# rotational offset). +# +# Source URDF: https://github.com/simpler-env/ManiSkill2_real2sim +# → mani_skill2_real2sim/assets/descriptions/widowx_description/ +# + +# so the translation is expressed in the kinematic (MJCF) frame. 
+# fmt: off +_TCP_TO_FLANGE = np.array([ + [+1.0000000000, +0.0000000000, +0.0000000000, -0.0935750000], + [+0.0000000000, +1.0000000000, +0.0000000000, +0.0000000000], + [+0.0000000000, +0.0000000000, +1.0000000000, +0.0000000000], + [+0.0000000000, +0.0000000000, +0.0000000000, +1.0000000000], +], dtype=np.float32) +# fmt: on + + +class BridgeOrigLeRobotDataset(BaseActionLeRobotDataset): + """LeRobot dataset for the Bridge (WidowX) data; EE poses are re-referenced to ``gripper_link`` and encoded as rot6d.""" + + def __init__( + self, + root: str = "/lustre/fsw/portfolios/cosmos/projects/cosmos_base_training/cosmos3_action_datasets/bridge_raw", + fps: float = 5.0, + chunk_length: int = 16, + split_seed: int = 42, + split_val_ratio: float = 0.05, + split: str = "train", + mode: str = "policy", + pose_convention: PoseConvention = "backward_framewise", + action_normalization: ActionNormalization | None = None, + viewpoint: Viewpoint = "ego_view", + enable_fast_init: bool = False, + ) -> None: + """Configure the dataset; every argument except ``root`` is forwarded to the base class.""" + super().__init__( + fps=fps, + chunk_length=chunk_length, + split_seed=split_seed, + split_val_ratio=split_val_ratio, + split=split, + mode=mode, + embodiment_type="bridge_orig_lerobot", + viewpoint=viewpoint, + pose_convention=pose_convention, + rotation_format="rot6d", + action_normalization=action_normalization, + tolerance_s=1e-4, + enable_fast_init=enable_fast_init, + ) + # _to_opencv is the kinematics→OpenCV part only. + # The viewer undoes this before IK → recovers kinematic frame directly. 
+ self._to_opencv = _BRIDGE_TO_OPENCV + + self._all_shard_roots = [root] + + self._delta_timestamps = { + "observation.images.image_0": [i * self._dt for i in range(0, self._chunk_length + 1)], + "observation.state": [i * self._dt for i in range(0, self._chunk_length + 1)], + "action": [i * self._dt for i in range(0, self._chunk_length)], + } + + # ------------------------------------------------------------------ + # Action computation + # ------------------------------------------------------------------ + + def _compute_absolute_action(self, sample: dict[str, Any]) -> tuple[torch.Tensor, torch.Tensor]: + """Absolute action from state + gripper from action. + + EEF xyz+rotation come from observation.state[1:]; gripper from action[:, 6]. + + Returns: + (action_tensor, initial_pose) — initial_pose is the first-frame + absolute EE pose (4×4, in the corrected OpenCV frame). + """ + state = sample["observation.state"][1:] # [T, 8] + poses_abs = build_abs_pose_from_components( + state[:, 0:3], + state[:, 3:6], + "euler_xyz", + ) + + # 1. Raw → kinematics: apply DEFAULT_ROTATION + # 2. TCP → flange: shift from ee_gripper_link to gripper_link + # 3. 
Kinematics → OpenCV convention (rotation only) + poses_abs[:, :3, :3] = poses_abs[:, :3, :3] @ _DEFAULT_ROTATION.astype(poses_abs.dtype) + poses_abs = poses_abs @ _TCP_TO_FLANGE.astype(poses_abs.dtype) + poses_abs[:, :3, :3] = poses_abs[:, :3, :3] @ self._to_opencv.astype(poses_abs.dtype) + + initial_pose = torch.from_numpy(poses_abs[0].copy()).float() + + translation = torch.from_numpy(poses_abs[:, :3, 3]).float() + rotation_matrix = torch.from_numpy(poses_abs[:, :3, :3]).float() + rotation = convert_rotation(rotation_matrix, input_format="matrix", output_format="rot6d").float() + + pose = torch.cat([translation, rotation], dim=-1) # [T, 9] + return torch.cat([pose, sample["action"][:, [6]]], dim=-1), initial_pose # [T, 10] + + def _compute_backward_framewise_action(self, sample: dict[str, Any]) -> tuple[torch.Tensor, torch.Tensor]: + """Body-frame (ego-frame) delta: ``T_curr^{-1} @ T_next``. + + Matches Camera/AV ``backward_framewise`` convention. Translation is in + the current frame's local coordinate system; rotation is + ``R_curr^{-1} @ R_next``. + + Returns: + (action_tensor, initial_pose) — initial_pose is the first-frame + absolute EE pose (4×4, in the corrected OpenCV frame). + """ + states = sample["observation.state"] # (chunk_length + 1, 8) + poses_abs = build_abs_pose_from_components( + states[:, 0:3], + states[:, 3:6], + "euler_xyz", + ) + + # 1. Raw → kinematics: apply DEFAULT_ROTATION + # 2. TCP → flange: shift from ee_gripper_link to gripper_link + # 3. 
Kinematics → OpenCV convention (rotation only) + poses_abs[:, :3, :3] = poses_abs[:, :3, :3] @ _DEFAULT_ROTATION.astype(poses_abs.dtype) + poses_abs = poses_abs @ _TCP_TO_FLANGE.astype(poses_abs.dtype) + poses_abs[:, :3, :3] = poses_abs[:, :3, :3] @ self._to_opencv.astype(poses_abs.dtype) + + initial_pose = torch.from_numpy(poses_abs[0].copy()).float() + + poses_rel = pose_abs_to_rel( + poses_abs=poses_abs, + rotation_format="rot6d", + pose_convention="backward_framewise", + ) + poses_rel_tensor = torch.from_numpy(poses_rel).float() + + return torch.cat([poses_rel_tensor, sample["action"][:, [6]]], dim=-1), initial_pose + + # ------------------------------------------------------------------ + # Normalization is handled by BaseActionLeRobotDataset. + # Stats are loaded from: + # cosmos3/_src/vfm/datasets/action/normalizers/ + # bridge_orig_lerobot__.json + # Regenerate via ``compute_action_stats.py`` + ``debug/stats_all.sh``. + # ------------------------------------------------------------------ + + # ------------------------------------------------------------------ + # Episode filtering + # ------------------------------------------------------------------ + def _filter_valid_episodes(self, meta: LeRobotDatasetMetadata, episode_ids: list[int]) -> list[int]: + """Drop episodes whose ``tasks`` metadata is empty/whitespace. + + Narrower than the offline + ``projects/cosmos3/vfm/datasets/action/filter_bridge_dataset.py`` + (which also flags gibberish/question/non-English/patterns via + ``classify_task``). 
+ """ + kept: list[int] = [] + dropped = 0 + for ep_id in episode_ids: + ep = meta.episodes[ep_id] + tasks = ep.get("tasks", []) + if isinstance(tasks, str): + tasks = [tasks] + has_prompt = any(t and str(t).strip() for t in (tasks or [])) + if has_prompt: + kept.append(ep_id) + else: + dropped += 1 + if dropped: + log.info(f"BridgeOrigLeRobotDataset: dropped {dropped} / {len(episode_ids)} episodes with empty prompt") + return kept + + # ------------------------------------------------------------------ + # __getitem__ + # ------------------------------------------------------------------ + + def _build_action_spec(self) -> ActionSpec: + """Bridge: 10D = ``[Pos, Rot6d, Gripper]``.""" + return build_action_spec(Pos(), Rot("rot6d"), Gripper()) + + def __getitem__(self, idx: int) -> dict[str, Any]: + """ """ + mode, _, _, sample = self._fetch_sample(idx) + + ai_caption = sample["task"] + + video = sample["observation.images.image_0"] # [T,C,H,W] + if self._pose_convention == "absolute": + action, initial_pose = self._compute_absolute_action(sample) + elif self._pose_convention == "backward_framewise": + action, initial_pose = self._compute_backward_framewise_action(sample) + else: + raise ValueError(f"Unknown pose_convention: {self._pose_convention}") + + return self._build_result( + mode=mode, video=video, action=action, ai_caption=ai_caption, initial_pose=initial_pose + ) + + @property + def action_dim(self) -> int: + """Action dimensionality: position(3) + 6D rotation(6) + gripper(1) = 10.""" + return 10 diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/cosmos3_action_lerobot.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/cosmos3_action_lerobot.py new file mode 100644 index 00000000..3e62c4b9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/cosmos3_action_lerobot.py @@ -0,0 +1,1015 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Shared LeRobot adapter utilities for Action datasets. + +These helpers centralize common behavior across Action wrappers: +- deterministic train/val episode splitting +- valid per-episode index range construction +- a reusable BaseActionLeRobotDataset class with lazy init, video formatting, + and common result building +""" + +from __future__ import annotations + +import importlib.util +import logging as _logging +import math +import os as _os +import random +from bisect import bisect_right +from collections import OrderedDict, defaultdict +from collections.abc import Callable, Sequence +from concurrent.futures import ThreadPoolExecutor +from pathlib import Path +from threading import Lock +from typing import Any, ClassVar, Literal + +import huggingface_hub.constants as _hf_const +import numpy as np +import torch +from lerobot.datasets.lerobot_dataset import LeRobotDataset, LeRobotDatasetMetadata +from torch.utils.data import Dataset + +_hf_offline_applied = False + + +def _ensure_hf_hub_offline() -> None: + """Force HF Hub into offline mode for local-only datasets (repo_id="local"). + + Sets the ``HF_HUB_OFFLINE`` env var (for any future imports in worker + processes), patches the already-imported constant, and suppresses the + expected "Returning existing local_dir" fallback warning. + + Safe to call multiple times; only applies once per process. 
+ """ + global _hf_offline_applied + if _hf_offline_applied: + return + if "HF_HUB_OFFLINE" not in _os.environ: + _os.environ["HF_HUB_OFFLINE"] = "1" + if not _hf_const.HF_HUB_OFFLINE: + _hf_const.HF_HUB_OFFLINE = True + _logging.getLogger("huggingface_hub._snapshot_download").setLevel(_logging.ERROR) + _hf_offline_applied = True + + +from functools import cached_property + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.action.action_normalization import ( + load_action_stats, + normalize_action, +) + +# Re-export the action_spec DSL from this module so that subclass datasets +# only need a single import block (alongside ``BaseActionLeRobotDataset``). +from cosmos3._src.vfm.datasets.action.action_spec import ( # noqa: F401 (re-export) + ActionSpec, + DimType, + Gripper, + Joint, + Pos, + Reserved, + Rot, + build_action_spec, +) +from cosmos3._src.vfm.datasets.action.domain_utils import get_domain_id +from cosmos3._src.vfm.datasets.action.pose_utils import compute_idle_frames +from cosmos3._src.vfm.datasets.action.viewpoint_utils import Viewpoint +from cosmos3._src.vfm.scripts.action.memprofile import ( + deep_size as _deep_size, +) +from cosmos3._src.vfm.scripts.action.memprofile import ( + fmt_mb as _fmt_mb, +) +from cosmos3._src.vfm.scripts.action.memprofile import ( + log_worker_memory_breakdown, + rss_tracker, +) +from cosmos3._src.vfm.scripts.action.memprofile import ( + memprofile_enabled as _memprofile_enabled, +) + +# --------------------------------------------------------------------------- +# LRU-capped VideoDecoderCache +# --------------------------------------------------------------------------- +_LRU_VIDEO_CACHE_MAX_SIZE: int = 64 +_LRU_DATASET_MAX_LOADED: int = 32 +ActionNormalization = Literal["quantile", "quantile_rot", "meanstd", "minmax"] +_ACTION_NORMALIZATION_CHOICES: tuple[str, ...] 
= ("quantile", "quantile_rot", "meanstd", "minmax") + +_decoder_cache_patched = False + + +class _LRUVideoDecoderCache: + """Drop-in replacement for ``lerobot.datasets.video_utils.VideoDecoderCache`` + with LRU eviction. When the cache exceeds *max_size* entries the + least-recently-used decoder (and its file handle) is evicted. + """ + + def __init__(self, max_size: int = _LRU_VIDEO_CACHE_MAX_SIZE) -> None: + self._max_size = max_size + self._cache: OrderedDict[str, tuple[Any, Any]] = OrderedDict() + self._lock = Lock() + self._hits = 0 + self._misses = 0 + self._evictions = 0 + + def get_decoder(self, video_path: str) -> Any: + if importlib.util.find_spec("torchcodec"): # type: ignore[attr-defined] + from torchcodec.decoders import VideoDecoder + else: + raise ImportError("torchcodec is required but not available.") + + import fsspec + + video_path = str(video_path) + + with self._lock: + if video_path in self._cache: + self._cache.move_to_end(video_path) + self._hits += 1 + return self._cache[video_path][0] + + self._misses += 1 + file_handle = fsspec.open(video_path).__enter__() + decoder = VideoDecoder(file_handle, seek_mode="approximate") # type: ignore[arg-type] + self._cache[video_path] = (decoder, file_handle) + + evicted = 0 + while len(self._cache) > self._max_size: + _, (_, old_fh) = self._cache.popitem(last=False) + try: + old_fh.close() + except Exception: + pass + evicted += 1 + self._evictions += evicted + + if evicted and self._evictions % 50 <= evicted: + log.debug( + f"[VideoDecoderCache pid={_os.getpid()}] " + f"evicted={self._evictions} total, size={len(self._cache)}/{self._max_size}, " + f"hits={self._hits}, misses={self._misses}, " + f"hit_rate={100 * self._hits / max(1, self._hits + self._misses):.1f}%" + ) + + return decoder + + def clear(self) -> None: + with self._lock: + for _, file_handle in self._cache.values(): + try: + file_handle.close() + except Exception: + pass + self._cache.clear() + + def size(self) -> int: + with self._lock: + 
return len(self._cache) + + +def _patch_decoder_cache(max_size: int = _LRU_VIDEO_CACHE_MAX_SIZE) -> None: + """Replace the module-level ``_default_decoder_cache`` in LeRobot with an + LRU-capped version to prevent unbounded memory growth in workers.""" + global _decoder_cache_patched + if _decoder_cache_patched: + return + + import lerobot.datasets.video_utils as _vu + + lru_cache = _LRUVideoDecoderCache(max_size=max_size) + _vu._default_decoder_cache = lru_cache + _decoder_cache_patched = True + log.debug(f"Patched LeRobot VideoDecoderCache with LRU max_size={max_size}") + + +def _parallel_map( + fn: Callable[[Any], Any], + items: list[Any], + *, + max_workers: int, + label: str, +) -> list[Any]: + """Thread-pool ``map`` — returns results in input order. + + Intended for IO-bound prefetch (``LeRobotDatasetMetadata`` loads, + parquet column reads). Preserves item-order so callers can ``zip`` + with their ``indices`` / ``roots`` list. Skips the thread pool + entirely when there is 0 or 1 task — avoids per-worker + ``ThreadPoolExecutor`` setup cost and log spam under + ``shard_across_workers=True`` where each worker typically gets + only 1-2 shards. 
+ """ + if not items: + return [] + if len(items) == 1 or max_workers <= 1: + return [fn(x) for x in items] + log.info(f"{label}: {len(items)} tasks (workers={max_workers})") + with ThreadPoolExecutor(max_workers=max_workers) as ex: + return list(ex.map(fn, items)) + + +def split_episode_ids(total_episodes: int, seed: int, val_ratio: float, split: str) -> list[int]: + """Create deterministic random episode ids for train/val/full splits.""" + num_val = int(round(total_episodes * val_ratio)) + g = torch.Generator().manual_seed(seed) + episode_ids = torch.randperm(total_episodes, generator=g).tolist() + + if split == "train": + return episode_ids[num_val:] + if split == "val": + return episode_ids[:num_val] + return episode_ids + + +def build_episode_spans( + episodes: Any, + episode_ids: Sequence[int], + chunk_length: int, + sample_stride: int = 1, +) -> tuple[list[tuple[int, int, int]], int, int]: + """Build valid episode spans for LeRobot frame queries. 
+ + Returns: + - episode spans as ``(episode_id, sample_start, valid_len)`` + - total valid sample count across selected episodes + - total raw frame count across selected episodes + """ + assert sample_stride >= 1, f"sample_stride must be >= 1, got {sample_stride}" + + dataset_from_index = list(episodes["dataset_from_index"]) + dataset_to_index = list(episodes["dataset_to_index"]) + length = list(episodes["length"]) + + spans: list[tuple[int, int, int]] = [] + valid_count = 0 + sample_count = 0 + for episode_id in episode_ids: + start = dataset_from_index[episode_id] + stop = dataset_to_index[episode_id] + raw_valid_len = stop - start - chunk_length + if raw_valid_len > 0: + valid_len = (raw_valid_len + sample_stride - 1) // sample_stride + spans.append((episode_id, start, valid_len)) + valid_count += valid_len + sample_count += int(length[episode_id]) + + return spans, valid_count, sample_count + + +def _normalize_split(split: str) -> str: + """Normalize split name to one of ``'train'``, ``'val'``, ``'full'``.""" + s = split.lower().strip() + if s in {"val", "valid", "validation", "eval", "test"}: + return "val" + if s in {"train", "full"}: + return s + raise ValueError(f"Unsupported {split=}. Use train/val/full.") + + +class BaseActionLeRobotDataset(Dataset): + """Reusable base class for Action LeRobot-backed map-style datasets. + + Subclasses typically: + 1) call ``_register_source`` to register one or more LeRobot sources + 2) implement ``__getitem__`` for dataset-specific sample parsing + 3) call ``_build_result`` to assemble the return dict + """ + + # ``self._to_opencv`` is applied as: R_opencv = R_native @ _to_opencv + # Subclasses override it in __init__; default is identity (no correction). + + # Bundled normalization stats directory. Stats are committed at + # ``<_NORMALIZERS_DIR>/<embodiment_type>_<pose_convention>_<rotation_format>.json`` (flat + # layout matching the existing UMI files) and produced by + # ``projects/cosmos3/vfm/datasets/action/compute_action_stats.py``. 
+ # Subclasses that need a different filename scheme can override + # :meth:`_normalizer_filename`. + _NORMALIZERS_DIR: ClassVar[Path] = Path(__file__).parent / "normalizers" + + def __init__( + self, + *, + fps: float, + chunk_length: int, + split_seed: int, + split_val_ratio: float, + split: str, + mode: str, + embodiment_type: str, + viewpoint: Viewpoint, + pose_convention: str | None = None, + rotation_format: str | None = None, + action_normalization: ActionNormalization | None = None, + tolerance_s: float = 1e-4, + max_loaded_datasets: int = _LRU_DATASET_MAX_LOADED, + skip_video_loading: bool = False, + sample_stride: int = 1, + enable_fast_init: bool = False, + fast_init_max_workers: int = 64, + ) -> None: + super().__init__() + _ensure_hf_hub_offline() + _patch_decoder_cache() + self._memprofile = _memprofile_enabled() + + assert sample_stride >= 1, f"sample_stride must be >= 1, got {sample_stride}" + assert fast_init_max_workers >= 1, f"fast_init_max_workers must be >= 1, got {fast_init_max_workers}" + assert action_normalization is None or action_normalization in _ACTION_NORMALIZATION_CHOICES, ( + f"action_normalization must be None or one of {_ACTION_NORMALIZATION_CHOICES}, got {action_normalization!r}" + ) + + with rss_tracker(f"{self.__class__.__name__}.__init__", enabled=self._memprofile): + self._fps = fps + self._dt = 1.0 / fps + self._chunk_length = chunk_length + self._split_seed = split_seed + self._split_val_ratio = split_val_ratio + self._split = _normalize_split(split) + self._mode = mode + self._embodiment_type = embodiment_type + self._viewpoint: Viewpoint = viewpoint + self._pose_convention = pose_convention + self._rotation_format = rotation_format + self._action_normalization = action_normalization + # Lazy-loaded stats cache, populated on first call to + # :meth:`_normalize_action`. Per-process (workers get their own). 
+ self._norm_stats: dict[str, torch.Tensor] | None = None + self._tolerance_s = tolerance_s + self._max_loaded_datasets = max_loaded_datasets + self._skip_video_loading = skip_video_loading + self._sample_stride = sample_stride + self._enable_fast_init = enable_fast_init + self._fast_init_max_workers = fast_init_max_workers + self._delta_timestamps: dict[str, list[float]] = {} + self._to_opencv: np.ndarray | dict[str, np.ndarray] = np.eye(3, dtype=np.float32) + + if pose_convention is None: + log.warning( + f"{self.__class__.__name__}: pose_convention is not set. " + "Consider specifying 'backward_framewise' or 'backward_anchored'." + ) + + self._datasets: list[LeRobotDataset | None] = [] + self._dataset_build_args: list[dict[str, Any] | None] = [] + self._loaded_lru: OrderedDict[int, None] = OrderedDict() + + # -- Flat index structures (populated by _append_index_records) -- + # Together these two lists form a searchable map from a flat + # global index to (dataset, row, episode, frame). One entry per + # episode span across *all* registered sources. + # + # _episode_records[i] = (ds_idx, sample_start, valid_len, episode_id) + # ds_idx – which source dataset (index into _datasets) + # sample_start – first row of this span in that dataset's table + # valid_len – number of usable frames in this span + # episode_id – the episode this span belongs to + # + # _episode_cum_ends[i] = running total of valid_len through span i + # Used for O(log N) lookup via bisect_right in _resolve_index. + self._episode_records: list[tuple[int, int, int, int]] = [] + self._episode_cum_ends: list[int] = [] + self._num_valid_indices = 0 + self._domain_id = get_domain_id(self._embodiment_type) + + # Deferred-init shard roots — a list of root paths. + # Subclasses populate this in __init__; _register_sources() + # reads _delta_timestamps and _tolerance_s from self (both + # initialised above, with _delta_timestamps overridden by + # each subclass). 
+ # ActionUnifiedIterableDataset.assign_worker uses len() for + # round-robin shard distribution and _register_sources(indices) + # for deferred loading. When empty, shard distribution is + # skipped (every worker iterates the full dataset). + self._all_shard_roots: list[str] = [] + + # -- public properties --------------------------------------------------- + + @property + def fps(self) -> float: + return self._fps + + @property + def chunk_length(self) -> int: + return self._chunk_length + + @property + def split(self) -> str: + return self._split + + @property + def mode(self) -> str: + return self._mode + + @mode.setter + def mode(self, value: str) -> None: + self._mode = value + + @property + def domain_id(self) -> int: + return self._domain_id + + # -- source registration ------------------------------------------------- + + def _register_source( + self, + *, + delta_timestamps: dict[str, list[float]], + tolerance_s: float, + root: str | None = None, + repo_id: str = "local", + force_cache_sync: bool = False, + download_videos: bool = False, + video_backend: str | None = None, + revision: str | None = None, + dataset_label: str | None = None, + prefetched_meta: LeRobotDatasetMetadata | None = None, + ) -> LeRobotDatasetMetadata: + """Register a LeRobot dataset source lazily (metadata-only at init). + + ``prefetched_meta`` lets subclasses load metadata in a thread pool + (``LeRobotDatasetMetadata`` reads are pure I/O — ``info.json`` + + ``episodes.parquet`` + ``tasks.parquet``) and then hand the ready + object to the serial append-path below, which still manages the + order-sensitive shared state (``_datasets`` / ``_dataset_build_args`` + / ``_episode_records`` / ``_episode_cum_ends``). When ``None`` the + caller gets the original single-threaded behavior. 
+ """ + label_str = f" [{dataset_label}]" if dataset_label else "" + cls = self.__class__.__name__ + # "local" is not a valid PEP 440 version, so LeRobot's + # is_valid_version() check skips the get_safe_version() HF API call. + if repo_id == "local" and revision is None: + revision = "local" + + with rss_tracker(f"{cls}{label_str} — metadata load", enabled=self._memprofile): + if prefetched_meta is not None: + meta = prefetched_meta + else: + meta = LeRobotDatasetMetadata( + repo_id=repo_id, + root=root, + revision=revision, + force_cache_sync=force_cache_sync, + ) + ds_idx = len(self._datasets) + self._datasets.append(None) + self._dataset_build_args.append( + { + "repo_id": repo_id, + "root": root, + "delta_timestamps": delta_timestamps, + "tolerance_s": tolerance_s, + "force_cache_sync": force_cache_sync, + "download_videos": download_videos, + "video_backend": video_backend, + "revision": revision, + } + ) + + with rss_tracker( + f"{cls}{label_str} — index records", + enabled=self._memprofile, + extras_fn=lambda: [ + f"episode_records so far: {len(self._episode_records)} entries, " + f"~{_fmt_mb(_deep_size(self._episode_records) / (1024 * 1024))}", + f"episode_cum_ends so far: {len(self._episode_cum_ends)} entries, " + f"~{_fmt_mb(_deep_size(self._episode_cum_ends) / (1024 * 1024))}", + ], + ): + self._append_index_records(meta=meta, ds_idx=ds_idx, dataset_label=dataset_label) + + return meta + + def _append_index_records( + self, + *, + meta: LeRobotDatasetMetadata, + ds_idx: int, + dataset_label: str | None = None, + ) -> None: + """Populate episode split / index records from dataset metadata.""" + episode_ids = split_episode_ids( + total_episodes=meta.total_episodes, + seed=self._split_seed, + val_ratio=self._split_val_ratio, + split=self._split, + ) + + if hasattr(self, "_filter_valid_episodes"): + episode_ids = self._filter_valid_episodes(meta, episode_ids) + episode_spans, valid_count, sample_count = build_episode_spans( + episodes=meta.episodes, + 
episode_ids=episode_ids, + chunk_length=self._chunk_length, + sample_stride=self._sample_stride, + ) + + class_name = self.__class__.__name__ + label = f" [{dataset_label}]" if dataset_label else "" + log.info(f"{class_name}{label}: split={self._split}, num episodes={len(episode_ids)}") + if sample_count > 0: + log.info( + f"{class_name}{label}: kept {valid_count} / {sample_count} " + f"({100 * valid_count / sample_count:.2f} %) samples" + ) + + for episode_id, sample_start, valid_len in episode_spans: + self._episode_records.append((ds_idx, sample_start, valid_len, episode_id)) + self._num_valid_indices += valid_len + self._episode_cum_ends.append(self._num_valid_indices) + + # -- deferred shard registration ----------------------------------------- + + def _register_sources(self, indices: list[int] | None = None) -> None: + """Register a subset (or all) of the shard roots in ``_all_shard_roots``. + + Called by ``ActionUnifiedIterableDataset.assign_worker`` during training, + or explicitly by eval/visualization scripts after construction. + + ``_all_shard_roots`` is a list of root paths. Per-shard args that are + shared across all shards (``delta_timestamps``, ``tolerance_s``) are + taken from ``self``. Subclasses may override this for extra per-shard + setup (e.g. loading instruction segments). + + When ``enable_fast_init=True``, ``LeRobotDatasetMetadata`` (a pure-IO + read of ``info.json`` + ``episodes.parquet`` + ``tasks.parquet``) is + prefetched in a thread pool and handed to the order-sensitive + serial register loop via ``prefetched_meta=``. Shard count scales + the speedup; for single-shard datasets the two paths are + equivalent. + + Args: + indices: Which entries of ``_all_shard_roots`` to register. + ``None`` means all. 
+ """ + if indices is None: + indices = list(range(len(self._all_shard_roots))) + if not indices: + return + + roots = [self._all_shard_roots[i] for i in indices] + + if self._enable_fast_init: + # ``_ensure_hf_hub_offline`` already ran in ``__init__`` and is + # idempotent; no need to re-invoke here. + workers = max(1, min(self._fast_init_max_workers, len(roots))) + metas: list[LeRobotDatasetMetadata | None] = _parallel_map( + lambda root: LeRobotDatasetMetadata(repo_id="local", root=root, revision="local"), + roots, + max_workers=workers, + label=f"{type(self).__name__}: LeRobotDatasetMetadata prefetch", + ) + else: + metas = [None] * len(roots) + + for root, meta in zip(roots, metas): + label = root.rsplit("/", 1)[-1] if "/" in root else root + self._register_source( + root=root, + delta_timestamps=self._delta_timestamps, + tolerance_s=self._tolerance_s, + dataset_label=label, + prefetched_meta=meta, + ) + + # -- lazy dataset access ------------------------------------------------- + + def _get_dataset(self, ds_idx: int) -> LeRobotDataset: + """Get or lazily construct the LeRobot dataset for the given source index. + + Loaded datasets are tracked with LRU ordering. When the number of + loaded datasets exceeds ``_max_loaded_datasets`` the least-recently-used + dataset is evicted (set back to ``None``) so the GC can reclaim it. + """ + ds = self._datasets[ds_idx] + if ds is not None: + self._loaded_lru.move_to_end(ds_idx) + return ds + + _ensure_hf_hub_offline() + + build_args = self._dataset_build_args[ds_idx] + if build_args is None: + raise RuntimeError(f"Missing dataset build args for dataset index {ds_idx}") + + # Evict least-recently-used datasets before loading a new one. 
+ while len(self._loaded_lru) >= self._max_loaded_datasets: + evict_idx, _ = self._loaded_lru.popitem(last=False) + self._datasets[evict_idx] = None + + with rss_tracker( + f"[WORKER {_os.getpid()}] Lazy-loaded ds[{ds_idx}]", + enabled=self._memprofile, + extras_fn=lambda: [f"total loaded={len(self._loaded_lru)}/{len(self._datasets)}"], + ): + delta_ts = build_args["delta_timestamps"] + if self._skip_video_loading: + # Covers both LeRobot v2 (``observation.images.``) and + # v3 (``observation.image.``) video-column conventions. + delta_ts = {k: v for k, v in delta_ts.items() if not k.startswith("observation.image")} + + log.info(f"Loading shard root={build_args['root']}") + ds = LeRobotDataset( + repo_id=build_args["repo_id"], + root=build_args["root"], + delta_timestamps=delta_ts, + tolerance_s=build_args["tolerance_s"], + force_cache_sync=build_args["force_cache_sync"], + download_videos=build_args["download_videos"], + video_backend=build_args["video_backend"], + revision=build_args["revision"], + episodes=None, + ) + if self._skip_video_loading: + ds.meta.info["features"] = { + k: v for k, v in ds.meta.info["features"].items() if v.get("dtype") != "video" + } + self._datasets[ds_idx] = ds + self._loaded_lru[ds_idx] = None + + return ds + + # -- index resolution ---------------------------------------------------- + + def _resolve_index(self, idx: int) -> tuple[int, int, int, int]: + """Map a flat global index to the source dataset, row, episode, and frame. + + Multiple datasets are concatenated into a single virtual sequence. + Each episode contributes a contiguous *span* of valid frames, and + ``_episode_cum_ends[i]`` stores the running total of valid frames + through the *i*-th span. For example, with two episodes of lengths + 5 and 3 the cum-ends are ``[5, 8]``, so global index 6 falls in the + second span at offset 1. + + The lookup is O(log N) via :func:`bisect_right`. + + Returns: + dataset_idx: Which source dataset this sample belongs to. 
+ row_idx: Row index *within* that dataset's LeRobot table. + episode_id: The episode ID for this sample. + frame_offset: Frame offset from the start of the episode span + (0-based). + + Pure index math -- no I/O or dataset access. Higher-level helpers + like :meth:`_fetch_sample` build on this. + """ + # Support negative indexing (e.g. -1 → last sample). + if idx < 0: + idx += self._num_valid_indices + if idx < 0 or idx >= self._num_valid_indices: + raise IndexError(f"{self.__class__.__name__} index {idx} out of range for size {self._num_valid_indices}") + + # _episode_cum_ends is a monotonically increasing list where entry i + # holds the cumulative number of valid frames up to and including the + # i-th episode span. bisect_right finds the first span whose + # cumulative end is strictly greater than idx, i.e. the span that + # contains idx. + # + # Example: cum_ends = [5, 8, 20] + # idx=0 -> span_idx=0 (first span, frames 0..4) + # idx=4 -> span_idx=0 + # idx=5 -> span_idx=1 (second span, frames 5..7) + # idx=8 -> span_idx=2 (third span, frames 8..19) + span_idx = bisect_right(self._episode_cum_ends, idx) + + # The global index where this span begins is the previous span's + # cumulative end (or 0 for the very first span). The frame_offset + # is how far idx is into this particular episode. + span_start = 0 if span_idx == 0 else self._episode_cum_ends[span_idx - 1] + frame_offset = idx - span_start + + # _episode_records[span_idx] stores (dataset_idx, row_start, valid_len, + # episode_id). row_start is the absolute row in the LeRobot table + # where this episode begins. With sample_stride=k, consecutive + # valid indices map to rows k apart inside the episode, so the + # effective row is row_start + frame_offset * sample_stride. 
+ dataset_idx, row_start, _, episode_id = self._episode_records[span_idx] + row_idx = row_start + frame_offset * self._sample_stride + return dataset_idx, row_idx, episode_id, frame_offset + + def _choose_mode(self) -> str: + """Resolve the active mode for one sample request.""" + if self._mode == "joint": + return random.choice(("forward_dynamics", "inverse_dynamics", "policy")) + return self._mode + + def _fetch_sample(self, idx: int) -> tuple[str, int, int, dict[str, Any]]: + """Resolve index, pick a mode, and load the sample from the dataset. + + Returns ``(mode, dataset_idx, row_idx, sample_dict)``. + """ + mode = self._choose_mode() + dataset_idx, row_idx, _, _ = self._resolve_index(idx) + + self._getitem_count = getattr(self, "_getitem_count", 0) + 1 + profile = self._memprofile and self._getitem_count % 50 == 1 + + with rss_tracker( + f"[WORKER {_os.getpid()}] __getitem__ transient (dataset_idx={dataset_idx})", + enabled=profile, + after_fn=lambda: log_worker_memory_breakdown(self), + ): + sample = self._get_dataset(dataset_idx)[row_idx] + + if self._skip_video_loading: + sample = defaultdict(lambda: None, sample) + + return mode, dataset_idx, row_idx, sample + + # -- action normalization ------------------------------------------------ + + def _normalizer_filename(self) -> str: + """Bundled stats filename for this dataset instance. + + Default convention (matches ``compute_action_stats.py`` output): + ``<embodiment_type>[_<pose_convention>][_<rotation_format>].json``. + + Pose/rotation suffixes are appended only when the instance actually + has them (SE(3) pose datasets like Bridge / DROID). Joint-space + datasets, where both are ``None``, resolve to just + ``<embodiment_type>.json``. + + Subclasses may override when the bundled filename uses a different + scheme (e.g. UMI's ``uva_umi_single_task_normalizer.json``). + """ + if not self._embodiment_type: + raise RuntimeError( + f"{self.__class__.__name__}: embodiment_type is not set; cannot resolve normalizer filename." 
+ ) + parts = [self._embodiment_type] + if self._pose_convention: + parts.append(self._pose_convention) + if self._rotation_format: + parts.append(self._rotation_format) + return "_".join(parts) + ".json" + + def _normalizer_path(self) -> Path: + """Full path to the bundled stats JSON for this dataset.""" + return self._NORMALIZERS_DIR / self._normalizer_filename() + + def _load_norm_stats(self) -> dict[str, torch.Tensor]: + """Lazy-load action normalization stats (once per worker process). + + Raises :class:`FileNotFoundError` if the stats file is missing. This + is intentional — silently falling back to identity normalization when + the user asked for ``quantile`` / ``quantile_rot`` / ``meanstd`` / + ``minmax`` would be a training bug. + """ + if self._norm_stats is not None: + return self._norm_stats + stats_key = "global_raw" if self._action_normalization == "quantile_rot" else "global" + raw = load_action_stats(str(self._normalizer_path()), stats_key=stats_key) + self._norm_stats = {} + for key, value in raw.items(): + self._norm_stats[key] = torch.from_numpy(value).float() # [D] + return self._norm_stats + + def _normalize_action(self, action: torch.Tensor) -> torch.Tensor: + """Apply the configured normalization, or return the raw action. + + - ``action_normalization=None`` → pass-through (used by viewer / debug) + - ``"quantile"`` → ``2·(x − q01) / (q99 − q01) − 1`` clamped to [-1, 1] + - ``"quantile_rot"`` → same as ``"quantile"``, but using ``global_raw`` + stats so rotation dimensions are normalized too. 
+ - ``"meanstd"`` → ``(x − mean) / std`` + - ``"minmax"`` → ``2·(x − min) / (max − min) − 1`` clamped to [-1, 1] + """ + if self._action_normalization is None: + return action + method = "quantile" if self._action_normalization == "quantile_rot" else self._action_normalization + normalized_action = normalize_action( + action, + method, + self._load_norm_stats(), + ) # [T,D] + return normalized_action + + # -- video formatting ---------------------------------------------------- + + def _convert_video(self, video_tchw: torch.Tensor | None) -> torch.Tensor | None: + """Convert LeRobot ``(T,C,H,W)`` float video to Action ``(C,T,H,W)`` uint8. + + Args: + video_tchw: Raw floating-point video tensor in ``[0, 1]`` with + LeRobot layout, or ``None``. # [T,C,H,W] | None + + Returns: + Action-formatted video tensor, or ``None``. # [C,T,H,W] | None + """ + if self._skip_video_loading or video_tchw is None: + return None + if video_tchw.ndim != 4: + raise ValueError( + f"{self.__class__.__name__}._convert_video expected video with shape [T,C,H,W], " + f"got ndim={video_tchw.ndim}" + ) + if not torch.is_floating_point(video_tchw): + raise TypeError( + f"{self.__class__.__name__}._convert_video expected floating-point video in [0, 1], " + f"got dtype={video_tchw.dtype}" + ) + video_min = video_tchw.amin() # [] + video_max = video_tchw.amax() # [] + if video_min.item() < 0.0 or video_max.item() > 1.0: + raise ValueError( + f"{self.__class__.__name__}._convert_video expected floating-point video in [0, 1], " + f"got range=[{video_min.item():.6f}, {video_max.item():.6f}]" + ) + formatted_video = (video_tchw * 255.0).clamp(0.0, 255.0).to(torch.uint8).permute(1, 0, 2, 3) # [C,T,H,W] + return formatted_video + + # -- result building ----------------------------------------------------- + + def _build_action_spec(self) -> ActionSpec | None: + """Subclass override: declare this dataset's action layout. + + Called once per instance — the result is cached by ``self.action_spec``. 
+ Return ``None`` to skip spec-driven idle detection; in that case + ``_compute_idle_frames`` will log a one-time warning and return + ``None`` for every sample. + """ + return None + + @cached_property + def action_spec(self) -> ActionSpec | None: + """Cached :class:`ActionSpec` from ``_build_action_spec``. + + Returns ``None`` when the subclass did not declare one; idle detection + is then skipped (with a one-time warning) until the subclass overrides + ``_build_action_spec``. + """ + return self._build_action_spec() + + @cached_property + def action_names(self) -> list[str] | None: + spec = self.action_spec + return spec.names if spec is not None else None + + # Idle-detection thresholds. Defined as **velocities** (per second) so the + # same numeric value means the same physical motion across datasets with + # different sampling rates; converted to per-frame at call time using + # ``self._fps`` via :meth:`_resolve_idle_thresholds`. + # + # Defaults: + # - ``idle_eps_t_per_sec`` = 5 mm/s (≈ 1 mm/frame at 5 Hz) + # - ``idle_eps_r_per_sec`` = 1.5°/s (geodesic, rotation-format aware) + # - ``idle_eps_g`` = 1e-2 unit gripper Δ (no fps) + # - ``idle_joint_threshold_per_sec`` = 5e-3 rad/s + # - ``idle_min_streak`` = 3 require ≥ 3 consecutive + # + # Subclasses can either override the ``*_per_sec`` attributes (preferred — + # keeps the velocity semantics) or set the corresponding ``idle_eps_*`` / + # ``idle_joint_threshold`` attribute to a non-``None`` value to bypass the + # per-fps conversion entirely (raw per-frame override). + idle_eps_t_per_sec: float = 5e-3 + idle_eps_r_per_sec: float = math.radians(1.5) + idle_eps_g: float = 1e-2 + idle_joint_threshold_per_sec: float = 5e-3 + idle_min_streak: int = 3 + + # Optional per-frame overrides. ``None`` (default) → use the ``*_per_sec`` + # attribute / fps conversion above. 
+ idle_eps_t: float | None = None + idle_eps_r: float | None = None + idle_joint_threshold: float | None = None + + def _resolve_idle_thresholds(self) -> tuple[float, float, float, float]: + """Resolve per-frame idle thresholds for this dataset instance. + + Returns ``(eps_t, eps_r, eps_g, joint_threshold)`` in raw per-frame + units. Honours direct per-frame overrides if the subclass sets the + non-``_per_sec`` attribute; otherwise divides the ``_per_sec`` values + by ``self._fps`` (treated as 1.0 when fps is unset or zero). + """ + fps = float(self._fps) if self._fps else 1.0 + eps_t = self.idle_eps_t if self.idle_eps_t is not None else self.idle_eps_t_per_sec / fps + eps_r = self.idle_eps_r if self.idle_eps_r is not None else self.idle_eps_r_per_sec / fps + joint_thr = ( + self.idle_joint_threshold + if self.idle_joint_threshold is not None + else self.idle_joint_threshold_per_sec / fps + ) + return float(eps_t), float(eps_r), float(self.idle_eps_g), float(joint_thr) + + def _compute_idle_frames(self, raw_action: torch.Tensor) -> torch.Tensor | None: + """Count idle frames in the *raw* (un-normalized) action chunk. + + Requires ``self.action_spec`` to be declared via ``_build_action_spec``. + Returns ``None`` when: + - ``pose_convention`` is not ``"backward_framewise"`` (TODO: extend), + - the subclass has not declared an ``ActionSpec`` (logs a one-time warning), + - the action layout does not match the declared spec. + + Detection thresholds come from the ``idle_eps_*`` class attributes + (overridable per dataset). Subclasses can also override this method + outright, or pass an explicit ``idle_frames`` integer via + ``**extras`` to :meth:`_build_result`. + """ + + # Other pose conventions (anchored / absolute) need different idle semantics. + if self._pose_convention != "backward_framewise": + if not getattr(self, "_warned_pose_convention", False): + log.warning( + f"Dataset {self.__class__.__name__}: pose_convention=" + f"{self._pose_convention!r} is not 'backward_framewise'; " + "skipping idle-frames detection.
Centralize the dataset " + "to backward_framewise to enable IdleFrames captioning." + ) + self._warned_pose_convention = True + return None + + spec = self.action_spec + if spec is None: + if not getattr(self, "_warned_no_action_spec", False): + log.warning( + f"Dataset {self.__class__.__name__} has no action spec defined; " + "skipping idle-frames detection. Override _build_action_spec() to enable it." + ) + self._warned_no_action_spec = True + return None + + eps_t, eps_r, eps_g, joint_thr = self._resolve_idle_thresholds() + try: + n = compute_idle_frames( + raw_action, + spec, + eps_t=eps_t, + eps_r=eps_r, + eps_g=eps_g, + joint_threshold=joint_thr, + min_streak=self.idle_min_streak, + ) + except (ValueError, TypeError) as e: + if not getattr(self, "_warned_action_layout", False): + log.warning( + f"Dataset {self.__class__.__name__}: action layout does " + f"not match the declared ActionSpec " + f"(action_dim={int(raw_action.shape[-1])}, " + f"spec.dim={spec.dim}); skipping idle-frames detection. " + f"Underlying error: {e}" + ) + self._warned_action_layout = True + return None + return torch.tensor(n, dtype=torch.long) + + def _build_result( + self, + *, + mode: str, + video: torch.Tensor | None, + action: torch.Tensor, + ai_caption: str, + **extras: Any, + ) -> dict[str, Any]: + """Assemble the common return dict for ``__getitem__``. + + ``video`` is expected in raw LeRobot layout before final formatting. + Subclasses may pass extra keys (e.g. ``initial_pose``) via ``**extras``. + ``idle_frames`` is auto-computed from the raw (un-normalized) ``action`` + whenever the dataset's pose/rotation conventions allow it; subclasses + can override by passing ``idle_frames`` (int or scalar tensor) via + ``**extras``. + """ + # Compute idle_frames from the raw action before normalization, unless + # the subclass has provided one explicitly via ``**extras``. 
+ if "idle_frames" not in extras: + idle_frames = self._compute_idle_frames(action) + if idle_frames is not None: + extras = {"idle_frames": idle_frames, **extras} + + normalized_action = self._normalize_action(action) # [T,D] + if self._skip_video_loading: + result: dict[str, Any] = {"action": normalized_action} + if "idle_frames" in extras: + result["idle_frames"] = extras["idle_frames"] + return result + formatted_video = self._convert_video(video) # [C,T,H,W] | None + return { + "ai_caption": ai_caption, + "video": formatted_video, + "action": normalized_action, + "conditioning_fps": torch.tensor(self._fps, dtype=torch.long), + "mode": mode, + "domain_id": torch.tensor(self._domain_id, dtype=torch.long), + "viewpoint": self._viewpoint, + **extras, + } + + def __len__(self) -> int: + return self._num_valid_indices diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/dataloaders.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/dataloaders.py new file mode 100644 index 00000000..0daa4a53 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/dataloaders.py @@ -0,0 +1,160 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +import functools +import random +import time +from typing import Callable, Iterator + +import numpy as np +import torch +import torch.utils.data + +from cosmos3._src.imaginaire.utils import distributed +from cosmos3._src.vfm.datasets.action.unified_dataset import ActionUnifiedIterableDataset +from cosmos3._src.vfm.datasets.joint_dataloader import custom_collate_fn + + +def _action_worker_init_fn( + worker_id: int, seed: int = 42, use_deterministic_seed: bool = True, rank: int = 0, world_size: int = 1 +) -> None: + if use_deterministic_seed: + worker_seed = seed + rank * 9999 + worker_id + else: + worker_seed = int(time.time() * 1000) % (2**32) + rank * 9999 + worker_id + random.seed(worker_seed) + np.random.seed(worker_seed % (2**32)) + torch.manual_seed(worker_seed) + + info = torch.utils.data.get_worker_info() + assert info is not None + dataset = info.dataset + if isinstance(dataset, ActionUnifiedIterableDataset): + dataset.assign_worker(worker_id, info.num_workers, rank, world_size) + + +def create_action_worker_init_fn(seed: int = 42, use_deterministic_seed: bool = True) -> Callable[[int], None]: + """Create a worker_init_fn for Action training with ``ActionUnifiedIterableDataset``. + + Seeds RNGs first, then calls ``dataset.assign_worker()`` to set up + rank-level dataset assignment and worker-level shard distribution. + + Passed to ``DataLoader`` (or ``InfiniteDataLoader``) as the + ``worker_init_fn`` parameter. Only called when ``num_workers > 0``. + + Args: + seed: Base seed for deterministic worker seeding. Ignored when + ``use_deterministic_seed=False`` (time-based seed used instead). + use_deterministic_seed: If True, use the provided seed for reproducible + RNG initialization. If False, derive a time-based seed so that + each resume sees different data. This is preferred for large-scale + runs that resume frequently, and when ``in_order=False`` already + makes iteration order non-deterministic. 
+ + Returns: + A ``worker_init_fn`` suitable for ``torch.utils.data.DataLoader``. + """ + try: + rank = distributed.get_rank() + world_size = distributed.get_world_size() + except RuntimeError: + rank = 0 + world_size = 1 + + return functools.partial( + _action_worker_init_fn, + seed=seed, + use_deterministic_seed=use_deterministic_seed, + rank=rank, + world_size=world_size, + ) + + +class InfiniteDataLoader(torch.utils.data.DataLoader): + """A dataloader that yields forever with proper seeding for reproducibility. + + All Action datasets are ``IterableDataset`` instances (map-style datasets + are automatically wrapped by :class:`~.transforms.MapToIterableAdapter`). + The loader catches ``StopIteration`` and restarts the iterator so that + iteration never ends. + """ + + def __init__( + self, + *args, + seed: int = 42, + use_deterministic_seed: bool = True, + **kwargs, + ) -> None: + """Initialize InfiniteDataLoader. + + Args: + *args: Positional arguments passed to parent DataLoader. + seed: Random seed for reproducible worker initialization. + Default is 42 for reproducibility. + use_deterministic_seed: If True, use the provided seed for reproducible + RNG initialization. If False, derive a time-based seed so that + each resume sees different data. This is preferred for large-scale + runs that resume frequently, and when ``in_order=False`` already + makes iteration order non-deterministic. + **kwargs: Keyword arguments passed to parent DataLoader. + """ + kwargs.pop("shuffle", None) + kwargs["shuffle"] = False + + # Default to ``custom_collate_fn`` so that variable-length per-sample + # tensors (e.g. ``text_token_ids``) and multi-item keys (``video``, + # ``action``, ...) are returned as lists rather than stacked by + # PyTorch's ``default_collate``. 
+ if kwargs.get("collate_fn") is None: + kwargs["collate_fn"] = custom_collate_fn + + if "worker_init_fn" not in kwargs or kwargs["worker_init_fn"] is None: + kwargs["worker_init_fn"] = create_action_worker_init_fn(seed, use_deterministic_seed=use_deterministic_seed) + + num_workers = kwargs.get("num_workers", 0) + if num_workers == 0: + try: + rank = distributed.get_rank() + except RuntimeError: + rank = 0 + if use_deterministic_seed: + rank_seed = seed + rank * 9999 + else: + rank_seed = int(time.time() * 1000) % (2**32) + rank * 9999 + random.seed(rank_seed) + np.random.seed(rank_seed % (2**32)) + torch.manual_seed(rank_seed) + + super().__init__(*args, **kwargs) + self._stream_iterator: Iterator | None = None + + def __len__(self) -> int: + # Delegate to DataLoader which calls len(self.dataset). + # Raises TypeError if the underlying dataset has no __len__. + return super().__len__() + + def __iter__(self) -> Iterator: + """Yield batches forever.""" + while True: + if self._stream_iterator is None: + self._stream_iterator = super().__iter__() + try: + yield next(self._stream_iterator) # type: ignore[arg-type] + except StopIteration: + self._stream_iterator = super().__iter__() + yield next(self._stream_iterator) # type: ignore[arg-type] diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/domain_utils.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/domain_utils.py new file mode 100644 index 00000000..85dda5e7 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/domain_utils.py @@ -0,0 +1,47 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Domain ID helpers for cross-embodiment action datasets.""" + +EMBODIMENT_TO_DOMAIN_ID: dict[str, int] = { + "no_action": 0, + "av": 1, + "camera_pose": 2, + "hand_pose": 3, + "pusht": 4, + "libero": 5, + "umi": 6, + "bridge_orig_lerobot": 7, + "droid_lerobot": 8, + "robomind-franka": 8, # Both Droid and RoboMIND-Franka are using robotiq and franka + "embodiment_b": 9, + "robomind-franka-dual": 12, + "robomind-ur": 13, + "agibotworld": 15, + "embodiment_c_gripper": 15, + "embodiment_c_gripper_ext": 15, + "fractal": 20, +} + + +def get_domain_id(embodiment_type: str) -> int: + """Get the domain ID for a given embodiment type.""" + key = embodiment_type.lower().strip() + if key not in EMBODIMENT_TO_DOMAIN_ID: + raise KeyError( + f"Unknown embodiment type: {embodiment_type!r}. " + f"Available embodiments: {sorted(EMBODIMENT_TO_DOMAIN_ID.keys())}" + ) + return EMBODIMENT_TO_DOMAIN_ID[key] diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/json_formatter.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/json_formatter.py new file mode 100644 index 00000000..847c3f81 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/json_formatter.py @@ -0,0 +1,306 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import math + +import torch + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.action.viewpoint_utils import DEFAULT_VIEWPOINT_TEMPLATES +from cosmos3._src.vfm.datasets.utils import VIDEO_RES_SIZE_INFO + + +def _should_append_idle_frame_info(mode: object) -> bool: + """Return whether idle-frame prompt metadata should be surfaced.""" + return mode != "inverse_dynamics" + + +class ActionPromptJsonFormatter: + """Format action prompts into a structured JSON-compatible dictionary. + + JSON fields are emitted in this order: ``cinematography``, ``actions``, + ``duration``, ``fps``, ``resolution``, then ``aspect_ratio``. Like video JSON + prompts, ``cinematography`` is a dictionary, duration is truncated to an + integer-second string such as ``"2s"``, and aspect ratio is stored as a + comma-separated string such as ``"16,9"``. If ``data_dict["mode"]`` is + ``"inverse_dynamics"``, idle-frame metadata is omitted from the prompt. 
+ """ + + def __init__( + self, + caption_key: str = "ai_caption", + viewpoint_key: str = "viewpoint", + video_key: str = "video", + fps_key: str = "conditioning_fps", + image_size_key: str = "image_size", + idle_frames_key: str = "idle_frames", + total_frames_key: str = "idle_frames_total", + action_key: str = "action", + viewpoint_templates: dict[str, str] | None = None, + ) -> None: + self.caption_key: str = caption_key + self.viewpoint_key: str = viewpoint_key + self.video_key: str = video_key + self.fps_key: str = fps_key + self.image_size_key: str = image_size_key + self.idle_frames_key: str = idle_frames_key + self.total_frames_key: str = total_frames_key + self.action_key: str = action_key + self.viewpoint_templates: dict[str, str] = ( + viewpoint_templates if viewpoint_templates is not None else DEFAULT_VIEWPOINT_TEMPLATES + ) + + def __call__(self, data_dict: dict) -> dict: + """Replace the caption with the action JSON prompt structure.""" + additional_view_description = data_dict.pop("additional_view_description", None) + caption = data_dict.get(self.caption_key) + if not isinstance(caption, str) or caption == "": + return data_dict + + height, width = self._get_resolution(data_dict) + fps = self._get_scalar_float(data_dict.get(self.fps_key), self.fps_key) + if fps <= 0: + raise ValueError(f"ActionPromptJsonFormatter: '{self.fps_key}' must be positive, got {fps}") + + video = data_dict.get(self.video_key) + if not isinstance(video, torch.Tensor) or video.ndim < 2: + raise ValueError( + f"ActionPromptJsonFormatter: expected '{self.video_key}' to be a video tensor with shape " + f"(C, T, H, W), got {type(video).__name__}" + ) + duration_seconds = video.shape[1] / fps + duration = self._truncate_seconds(duration_seconds) + action_end_time = self._round_time_seconds(duration_seconds) + + prompt = { + "cinematography": { + "framing": self._get_viewpoint_caption(data_dict, additional_view_description), + }, + "actions": [ + { + "time": 
f"0:00-{self._format_time_mss(action_end_time)}", + "description": self._ensure_sentence(caption), + "idle_frame": self._get_idle_frame_info(data_dict), + } + ], + "duration": f"{duration}s", + "fps": float(fps), + "resolution": {"H": height, "W": width}, + "aspect_ratio": self._get_aspect_ratio(width, height), + } + cleaned_prompt = self._drop_empty_fields(prompt) + self._raise_if_empty_fields(cleaned_prompt) + data_dict[self.caption_key] = cleaned_prompt + return data_dict + + def _truncate_seconds(self, seconds: float) -> int: + """Truncate duration to integer seconds, matching video JSON-caption augmentors.""" + if seconds < 0 or not math.isfinite(seconds): + return 0 + return int(seconds) + + def _round_time_seconds(self, seconds: float) -> int: + """Round an action timestamp to integer seconds, matching video captioning.""" + if seconds < 0 or not math.isfinite(seconds): + return 0 + return round(seconds) + + def _format_time_mss(self, seconds: int) -> str: + """Format integer seconds as M:SS for JSON prompt time ranges.""" + minutes, remaining_seconds = divmod(seconds, 60) + return f"{minutes}:{remaining_seconds:02d}" + + def _get_aspect_ratio(self, width: int, height: int) -> str: + """Return the canonical width,height aspect ratio string when known.""" + for aspect_ratio_sizes in VIDEO_RES_SIZE_INFO.values(): + for aspect_ratio, (candidate_w, candidate_h) in aspect_ratio_sizes.items(): + if width == candidate_w and height == candidate_h: + return aspect_ratio + + divisor = math.gcd(width, height) + if divisor == 0: + raise ValueError( + f"ActionPromptJsonFormatter: width and height must be non-zero, got width={width}, height={height}." 
+ ) + return f"{width // divisor},{height // divisor}" + + def _get_viewpoint_caption(self, data_dict: dict, additional_view_description: object | None) -> str | None: + """Resolve the viewpoint text used in the ``cinematography`` field.""" + viewpoint = data_dict.get(self.viewpoint_key) + template = self.viewpoint_templates.get(viewpoint) if isinstance(viewpoint, str) else None + + if template is None: + if viewpoint is not None: + log.warning( + f"ActionPromptJsonFormatter: unrecognized viewpoint {viewpoint!r}. " + f"Known viewpoints: {sorted(self.viewpoint_templates.keys())}. " + f"Using additional view description when available.", + rank0_only=False, + ) + return self._get_optional_text(additional_view_description) + + if additional_view_description: + separator = " " if template.endswith(".") else ". " + template = template + separator + str(additional_view_description).rstrip() + return template + + def _get_resolution(self, data_dict: dict) -> tuple[int, int]: + """Resolve ``(height, width)`` from the post-padding image size.""" + image_size = data_dict.get(self.image_size_key) + if image_size is None: + raise ValueError(f"ActionPromptJsonFormatter: missing '{self.image_size_key}' in data_dict.") + + if isinstance(image_size, torch.Tensor): + if image_size.numel() < 2: + raise ValueError( + f"ActionPromptJsonFormatter: expected '{self.image_size_key}' to contain at least " + f"height and width, got shape {tuple(image_size.shape)}" + ) + return int(image_size[0].item()), int(image_size[1].item()) + + try: + return int(image_size[0]), int(image_size[1]) + except (TypeError, ValueError, IndexError) as e: + raise ValueError( + f"ActionPromptJsonFormatter: expected '{self.image_size_key}' to contain height and width." 
+ ) from e + + def _get_scalar_float(self, value: object, key: str) -> float: + """Parse a required scalar float from a tensor or Python value.""" + if value is None: + raise ValueError(f"ActionPromptJsonFormatter: missing '{key}' in data_dict.") + + if isinstance(value, torch.Tensor): + if value.numel() != 1: + raise ValueError( + f"ActionPromptJsonFormatter: expected scalar tensor at '{key}', got shape {tuple(value.shape)}" + ) + return float(value.item()) + + if isinstance(value, (str, int, float)): + try: + return float(value) + except ValueError as e: + raise ValueError( + f"ActionPromptJsonFormatter: expected scalar float-compatible value at '{key}'." + ) from e + raise ValueError(f"ActionPromptJsonFormatter: expected scalar float-compatible value at '{key}'.") + + def _get_optional_scalar_int(self, value: object, key: str) -> int | None: + """Parse an optional scalar integer metadata value.""" + if value is None: + return None + + if isinstance(value, torch.Tensor): + if value.numel() != 1: + log.warning( + f"ActionPromptJsonFormatter: expected scalar tensor at '{key}', got shape " + f"{tuple(value.shape)}. Skipping.", + rank0_only=False, + ) + return None + return int(value.item()) + + if isinstance(value, (str, int, float)): + try: + return int(value) + except ValueError: + pass + log.warning( + f"ActionPromptJsonFormatter: expected integer-compatible value at " + f"'{key}', got {type(value).__name__}. 
Skipping.", + rank0_only=False, + ) + return None + + def _get_total_frames(self, data_dict: dict) -> int | None: + """Resolve the total action-frame count for idle-frame text.""" + total_frames = self._get_optional_scalar_int(data_dict.get(self.total_frames_key), self.total_frames_key) + if total_frames is not None: + return total_frames + + action = data_dict.get(self.action_key) + if isinstance(action, torch.Tensor): + if action.ndim == 0: + log.warning( + f"ActionPromptJsonFormatter: expected action tensor at " + f"'{self.action_key}' to have a frame dimension. Skipping total frames.", + rank0_only=False, + ) + return None + return int(action.shape[0]) + + try: + return len(action) if action is not None else None + except TypeError: + return None + + def _get_idle_frame_info(self, data_dict: dict) -> str | None: + """Build the idle-frame string for the action object.""" + if not _should_append_idle_frame_info(data_dict.get("mode")): + return None + + idle_frames = self._get_optional_scalar_int(data_dict.get(self.idle_frames_key), self.idle_frames_key) + total_frames = self._get_total_frames(data_dict) + + if idle_frames is not None and total_frames is not None: + return f"{idle_frames} out of {total_frames}." + if idle_frames is not None: + return f"{idle_frames}." + return None + + def _ensure_sentence(self, text: str) -> str: + """Return text with terminal sentence punctuation.""" + text = text.strip() + if text.endswith((".", "!", "?")): + return text + return f"{text}." 
+ + def _get_optional_text(self, value: object) -> str | None: + """Return stripped text, leaving empty optional text for the final prune pass.""" + if value is None: + return None + text = str(value).rstrip() + return text if text else None + + def _drop_empty_fields(self, value: object) -> object: + """Recursively remove empty strings, dictionaries, lists, and ``None`` values.""" + if isinstance(value, dict): + return { + key: cleaned + for key, item in value.items() + if not self._is_empty(cleaned := self._drop_empty_fields(item)) + } + if isinstance(value, list): + return [cleaned for item in value if not self._is_empty(cleaned := self._drop_empty_fields(item))] + return value + + def _is_empty(self, value: object) -> bool: + """Return whether a JSON field should be dropped.""" + return value is None or value == "" or value == [] or value == {} + + def _raise_if_empty_fields(self, value: object, path: str = "prompt") -> None: + """Validate that no empty JSON fields remain after pruning.""" + if self._is_empty(value): + raise ValueError(f"ActionPromptJsonFormatter: empty field remains at {path}.") + + if isinstance(value, dict): + for key, item in value.items(): + self._raise_if_empty_fields(item, f"{path}.{key}") + elif isinstance(value, list): + for index, item in enumerate(value): + self._raise_if_empty_fields(item, f"{path}[{index}]") diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/libero_dataset.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/libero_dataset.py new file mode 100644 index 00000000..58c294fb --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/libero_dataset.py @@ -0,0 +1,623 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""LIBERO dataset for training from local storage, supporting multiple dataset roots.""" + +import random +from pathlib import Path +from typing import Literal + +import torch +import torchvision.transforms.functional as F +from lerobot.datasets.lerobot_dataset import LeRobotDataset +from torch.utils.data import Dataset + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.action.action_normalization import ( + load_action_stats, + normalize_action, +) +from cosmos3._src.vfm.datasets.action.action_spec import ( + Gripper, + Pos, + Rot, + build_action_spec, +) +from cosmos3._src.vfm.datasets.action.domain_utils import get_domain_id +from cosmos3._src.vfm.datasets.action.libero_pose_utils import ( + libero_action_dim, + libero_rotation_format, +) +from cosmos3._src.vfm.datasets.action.pose_utils import ( + compute_idle_frames, + convert_rotation, +) + +LIBERO_ROOTS: list[str] = [ + "/lustre/fsw/portfolios/dir/projects/dir_cosmos_base_lustre/maxzhaoshuol/dataset/libero_10_no_noops_1.0.0_lerobot_aligned", + "/lustre/fsw/portfolios/dir/projects/dir_cosmos_base_lustre/maxzhaoshuol/dataset/libero_90_no_noops_lerobot_shuffled", + "/lustre/fsw/portfolios/dir/projects/dir_cosmos_base_lustre/maxzhaoshuol/dataset/libero_object_no_noops_1.0.0_lerobot_aligned", + "/lustre/fsw/portfolios/dir/projects/dir_cosmos_base_lustre/maxzhaoshuol/dataset/libero_spatial_no_noops_1.0.0_lerobot", + "/lustre/fsw/portfolios/dir/projects/dir_cosmos_base_lustre/maxzhaoshuol/dataset/libero_goal_no_noops_1.0.0_lerobot", +] + + +class LIBERODataset(Dataset): + 
""" + A Dataset wrapper for LeRobot LIBERO dataset(s) designed for training from local storage. + + This dataset: + - Loads data from local storage using LeRobotDataset + - Supports multiple dataset roots that are concatenated into one dataset + - Supports configurable camera modes (image, wrist_image, or concat_view) + - Filters episodes for train/val split + - Filters frames at episode boundaries (to avoid padding issues with delta timestamps) + - Uses task descriptions from meta/tasks.parquet for ai_caption + """ + + _NORMALIZERS_DIR = Path(__file__).parent / "normalizers" + + def __init__( + self, + repo_id: str | list[str] = "lerobot/libero_90", + root: str | list[str] | None = LIBERO_ROOTS, + image_size: int = 256, + chunk_length: int = 16, # must be divisible by 4 + fps: int = 10, # IMPORTANT! LIBERO is at 20fps. If using frame_wise_relative in policy mode, we have to match the fps. + mode: str = "policy", + video_backend: str | None = "torchcodec", + download_videos: bool = False, + force_cache_sync: bool = False, + tolerance_s: float = 1e-4, + split: str = "train", + val_ratio: float = 0.01, + seed: int = 0, + # Camera configuration + camera_mode: str = "image", # 'image', 'wrist_image', or 'concat_view' + # Action configuration + action_space: str = "frame_wise_relative", # "absolute" or "relative" or "frame_wise_relative" + # rotation_space + rotation_space: Literal["9d", "6d", "3d"] = "3d", + # Native simulator frame or shared OpenCV-style EE frame used by midtraining. 
+ pose_coordinate_frame: Literal["native", "opencv"] = "native", + # domain-aware configuration + embodiment_type: str = "libero", + action_normalization: Literal["quantile", "quantile_rot", "meanstd", "minmax"] | None = None, + action_stats_path: str | None = None, + skip_video_loading: bool = False, + ): + super().__init__() + self._embodiment_type = embodiment_type + self.domain_id = get_domain_id(embodiment_type) + self.image_size = image_size + self.chunk_length = chunk_length + assert self.chunk_length % 4 == 0, "chunk_length must be divisible by 4" + self.fps = fps + self.mode = mode + self.split = split.lower().strip() + self.val_ratio = val_ratio + self.seed = seed + self.camera_mode = camera_mode.lower().strip() + self.action_space = action_space + self.action_normalization = action_normalization + self.rotation_space = rotation_space.lower().strip() + self.pose_coordinate_frame = pose_coordinate_frame + self._pose_convention = self.action_space + self._rotation_format = libero_rotation_format(self.rotation_space) + # When True, skip video decoding entirely: drop image keys from + # delta_timestamps so LeRobot never touches the mp4, and return + # ``video=None`` in __getitem__. Must be set at construction time + # because LeRobotDataset is eagerly built in __init__. + self._skip_video_loading = bool(skip_video_loading) + + # Load action normalization stats. ``action_min`` / ``action_range`` are + # retained for older LIBERO eval code that knows how to invert a + # range-style [-1, 1] normalization. 
+ self._norm_stats: dict[str, torch.Tensor] | None = None + self.action_min: torch.Tensor | None = None + self.action_max: torch.Tensor | None = None + self.action_range: torch.Tensor | None = None + if self.action_normalization is not None: + stats_path = self._resolve_action_stats_path(action_stats_path) + stats_key = "global_raw" if self.action_normalization == "quantile_rot" else "global" + raw_stats = load_action_stats(str(stats_path), stats_key=stats_key) + self._norm_stats = {} + for key, value in raw_stats.items(): + self._norm_stats[key] = torch.from_numpy(value).float() # [D] + self._set_range_denormalization_stats() + log.info( + f"Loaded LIBERO action stats from {stats_path} with action_normalization={self.action_normalization}" + ) + + # Validate camera mode + if self.camera_mode not in {"image", "wrist_image", "concat_view"}: + raise ValueError(f"Unsupported camera_mode={camera_mode!r}. Use 'image', 'wrist_image', or 'concat_view'.") + + # Validate split + if self.split not in {"train", "val", "valid", "validation", "eval", "test", "full"}: + raise ValueError(f"Unsupported {split=}. Use train/val/full.") + + # Build delta timestamps based on camera mode + dt = 1.0 / self.fps + + if self.fps != 20: + log.warning( + f"LIBERO is at 20fps. If using frame_wise_relative for policy mode training, we have to match the fps. 
fps={self.fps}" + ) + + # Determine which image keys to use + if self.camera_mode == "image": + self.image_keys = ["observation.images.image"] + elif self.camera_mode == "wrist_image": + self.image_keys = ["observation.images.wrist_image"] + else: # concat_view + self.image_keys = ["observation.images.image", "observation.images.wrist_image"] + + # Build delta_timestamps for all keys (same convention as PushT: 0 to chunk_length) + self.delta_timestamps: dict[str, list[float]] = {} + if not self._skip_video_loading: + for key in self.image_keys: + self.delta_timestamps[key] = [i * dt for i in range(0, chunk_length + 1)] + self.delta_timestamps["observation.state"] = [i * dt for i in range(0, chunk_length + 1)] + self.delta_timestamps["action"] = [i * dt for i in range(0, chunk_length + 1)] + + # Normalize repo_id and root to lists + repo_id_list: list[str] = [repo_id] if isinstance(repo_id, str) else list(repo_id) + root_list: list[str | None] + if root is None: + root_list = [None for _ in repo_id_list] + elif isinstance(root, str): + root_list = [root] + else: + root_list = [r for r in root] + + if len(repo_id_list) != len(root_list): + raise ValueError( + f"Length mismatch: repo_id has {len(repo_id_list)} items, root has {len(root_list)} items." 
+ ) + + # Load all datasets + self.datasets: list[LeRobotDataset] = [] + self.tasks_dfs: list = [] # Store tasks DataFrames for each dataset + for rid, r in zip(repo_id_list, root_list): + dataset = LeRobotDataset( + repo_id=rid, + root=r, + delta_timestamps=self.delta_timestamps, # type: ignore + tolerance_s=tolerance_s, + force_cache_sync=force_cache_sync, + download_videos=download_videos, + video_backend=video_backend, + episodes=None, # Load full dataset, filter later + ) + self.datasets.append(dataset) + self.tasks_dfs.append(dataset.meta.tasks) + + # Build index mapping: list of (dataset_idx, local_idx, episode_idx) for valid frames + self.index_map: list[tuple[int, int, int]] = [] # (dataset_idx, local_idx, episode_idx) + self._episode_boundaries: list[dict[int, tuple[int, int]]] = [] + self._episode_splits: list[tuple[set[int], set[int]]] = [] + + total_episodes = 0 + total_frames = 0 + for ds_idx, dataset in enumerate(self.datasets): + # Compute episode splits for this dataset + train_eps, val_eps = self._compute_episode_splits_for_dataset(dataset) + self._episode_splits.append((train_eps, val_eps)) + + # Get episodes for current split + split_episodes = self._get_split_episodes_for_dataset(ds_idx) + + # Build episode boundaries + boundaries = self._build_episode_boundaries_for_dataset(dataset) + self._episode_boundaries.append(boundaries) + + # Filter indices + indices = self._filter_indices_for_dataset(ds_idx, dataset, split_episodes, boundaries) + self.index_map.extend(indices) + + total_episodes += dataset.num_episodes + total_frames += len(dataset) + + log.info( + f"Loaded LIBERO dataset with {len(repo_id_list)} source(s) split={self.split!r} " + f"camera_mode={self.camera_mode!r} " + f"total_episodes={total_episodes} " + f"total_frames={total_frames} " + f"valid_indices={len(self.index_map)}" + ) + + def _compute_episode_splits_for_dataset(self, dataset: LeRobotDataset) -> tuple[set[int], set[int]]: + """Compute train/val episode splits deterministically for a 
single dataset.""" + total_episodes = int(dataset.meta.total_episodes) + + if not (0.0 < self.val_ratio < 1.0): + raise ValueError(f"{self.val_ratio=} must be in (0, 1).") + + n_val = max(1, int(round(total_episodes * self.val_ratio))) + # val_eps = set(range(n_val)) + # train_eps = set(range(n_val, total_episodes)) + + # Yihuai: Randomly select validation episodes instead of the first n_val episodes (otherwise task will be repeated) + rng = random.Random(self.seed) # To ensure validation episodes are the same on all ranks + val_eps = set(rng.sample(range(total_episodes), n_val)) + train_eps = set(range(total_episodes)) - val_eps + + log.info(f"train_eps={train_eps}, val_eps={val_eps}") + + return train_eps, val_eps + + def _get_split_episodes_for_dataset(self, ds_idx: int) -> set[int]: + """Get the episode set for the current split for a specific dataset.""" + train_eps, val_eps = self._episode_splits[ds_idx] + if self.split in {"val", "valid", "validation", "eval", "test"}: + return val_eps + elif self.split == "train": + return train_eps + else: # full + return train_eps | val_eps + + def _build_episode_boundaries_for_dataset(self, dataset: LeRobotDataset) -> dict[int, tuple[int, int]]: + """Build a dict of episode_index -> (start_frame, end_frame) for a single dataset.""" + boundaries: dict[int, tuple[int, int]] = {} + for ep in dataset.meta.episodes: + ep_idx = int(ep["episode_index"]) # type: ignore[index] + start = int(ep["dataset_from_index"]) # type: ignore[index] + end = int(ep["dataset_to_index"]) # type: ignore[index] + boundaries[ep_idx] = (start, end) + return boundaries + + def _filter_indices_for_dataset( + self, + ds_idx: int, + dataset: LeRobotDataset, + split_episodes: set[int], + boundaries: dict[int, tuple[int, int]], + ) -> list[tuple[int, int, int]]: + """Filter valid indices for a single dataset, returning (dataset_idx, local_idx, episode_idx).""" + index_map: list[tuple[int, int, int]] = [] + all_meta = list(dataset.meta.episodes) + + for 
ep_idx in split_episodes: + if ep_idx >= len(all_meta): + continue + ep = all_meta[ep_idx] + + ep_start = int(ep["dataset_from_index"]) # type: ignore[index] + ep_end = int(ep["dataset_to_index"]) # type: ignore[index] + + # Valid range: [start, end - chunk_length - 1] inclusive + # We drop chunk_length frames at end to ensure we can query up to delta=chunk_length. + start = ep_start + end = ep_end - self.chunk_length - 1 + + if end >= start: + for local_idx in range(start, end + 1): + index_map.append((ds_idx, local_idx, ep_idx)) + + return index_map + + def __len__(self) -> int: + return len(self.index_map) + + def _get_task_description(self, ds_idx: int, item: dict) -> str: + """Get task description for the current item from meta/tasks.parquet. + + The tasks.parquet has task descriptions as the DataFrame index (row labels) + and task_index as an integer column. We look up by task_index and return + the corresponding index name (the actual task description string). + """ + task_idx = item.get("task_index") + if task_idx is not None: + if isinstance(task_idx, torch.Tensor): + task_idx = task_idx.item() + task_idx = int(task_idx) + tasks_df = self.tasks_dfs[ds_idx] + if task_idx in tasks_df["task_index"].values: + row = tasks_df[tasks_df["task_index"] == task_idx].iloc[0] + # The task description is the index name (row label), not a column value + return str(row.name) + raise ValueError(f"Task index {task_idx} not found in tasks.parquet for dataset {ds_idx}") + + def _compute_anchored_actions( + self, + state_raw: torch.Tensor, + action_raw: torch.Tensor, + ) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]: + """Compute anchored relative actions (batched). + + Converts frame-wise relative actions to anchored relative actions where each + action[t] represents the target pose (after applying action[t] to state[t]) + expressed in state 0's local coordinate frame. + + Mathematical formulation: + 1. 
Compute target in world frame (LIBERO convention): + - p_{t+1} = p_t + delta_p[t] (position addition in world frame) + - R_{t+1} = R_delta[t] @ R_t (rotation composition, delta first) + 2. Compute anchored (left-multiply by T_0^{-1}): + - anchored_pos[t] = R_0^T @ (p_{t+1} - p_0) + - anchored_rot[t] = R_0^T @ R_{t+1} + + Args: + state_raw: State tensor of shape (T+1, 8): [x, y, z, ax, ay, az, grip1, grip2] + where (ax, ay, az) is axis-angle rotation. + action_raw: Action tensor of shape (T+1, 7): [dx, dy, dz, dax, day, daz, grip] + where (dax, day, daz) is axis-angle rotation delta. + + Returns: + anchored_translation: (T, 3) - position in state_0's local frame + anchored_rotation: (T, 3, 3) - rotation relative to state_0 as a rotation matrix + gripper: (T, 1) - original gripper commands (unchanged) + """ + # Extract positions and rotations from states + p_states = state_raw[:, :3] # [T+1,3] + rotvec_states = state_raw[:, 3:6] # [T+1,3] - axis-angle + + # Extract deltas from actions (use first T actions) + delta_p = action_raw[:-1, :3] # [T,3] + delta_rotvec = action_raw[:-1, 3:6] # [T,3] - axis-angle delta + gripper = action_raw[:-1, 6:7] # [T,1] + + # Convert all axis-angle to rotation matrices (batched) + R_states = convert_rotation(rotvec_states, input_format="axisangle", output_format="matrix") # [T+1,3,3] + R_deltas = convert_rotation(delta_rotvec, input_format="axisangle", output_format="matrix") # [T,3,3] + + # Initial pose (state 0) + p_0 = p_states[0] # [3] + R_0 = R_states[0] # [3,3] + R_0_T = R_0.T # [3,3] - transpose for inverse rotation + + # Current states for t = 0..T-1 + p_t = p_states[:-1] # [T,3] + R_t = R_states[:-1] # [T,3,3] + + # Step 1: Compute target poses in world frame (LIBERO convention) + # p_target = p_t + delta_p + p_target = p_t + delta_p # [T,3] + + # R_target = R_delta @ R_t (batched matrix multiply) + R_target = torch.bmm(R_deltas, R_t) # [T,3,3] + + # Step 2: Compute anchored (in state_0's local frame) + # anchored_p = R_0^T @ 
(p_target - p_0) + displacement = p_target - p_0 # [T,3] + anchored_p = (R_0_T @ displacement.T).T # [T,3] + + # anchored_R = R_0^T @ R_target (batched) + R_0_T_expanded = R_0_T.unsqueeze(0).expand(R_target.shape[0], -1, -1) # [T,3,3] + anchored_R = torch.bmm(R_0_T_expanded, R_target) # [T,3,3] + + return anchored_p, anchored_R, gripper + + def _convert_rotation_to_repr(self, rotation_matrix: torch.Tensor) -> torch.Tensor: + """Convert rotation matrix to the desired representation. + + Args: + rotation_matrix: Rotation matrices of shape (T, 3, 3). + + Returns: + Rotation in the configured ``rotation_space`` format. + """ + return convert_rotation(rotation_matrix, "matrix", libero_rotation_format(self.rotation_space)) + + def _normalizer_filename(self) -> str: + rotation_suffix = { + "3d": "3d", + "6d": "rot6d", + "9d": "rot9d", + }.get(self.rotation_space) + if rotation_suffix is None: + raise ValueError(f"Unsupported rotation_space={self.rotation_space!r}.") + action_space = self.action_space.replace("-", "_") + return f"{self._embodiment_type}_{action_space}_{rotation_suffix}.json" + + def _resolve_action_stats_path(self, action_stats_path: str | None) -> Path: + if action_stats_path is None: + stats_path = self._NORMALIZERS_DIR / self._normalizer_filename() + if stats_path.exists(): + return stats_path + raise FileNotFoundError( + f"Could not find bundled LIBERO action stats at {stats_path}. " + "Pass action_stats_path explicitly or regenerate stats with compute_action_stats.py." 
+ ) + + stats_path = Path(action_stats_path) + if stats_path.is_absolute(): + if stats_path.exists(): + return stats_path + raise FileNotFoundError(f"Could not find action_stats_path={action_stats_path!r}.") + + module_dir = Path(__file__).resolve().parent + candidates: list[Path] = [] + for parent in module_dir.parents: + candidates.append(parent / stats_path) + candidates.append(self._NORMALIZERS_DIR / stats_path.name) + candidates.append(module_dir / stats_path.name) + for candidate in candidates: + if candidate.exists(): + return candidate + raise FileNotFoundError( + f"Could not resolve action_stats_path={action_stats_path!r}; tried: {[str(c) for c in candidates]}" + ) + + def _set_range_denormalization_stats(self) -> None: + if self._norm_stats is None: + return + + if self.action_normalization == "minmax": + lo_key, hi_key = "min", "max" + elif self.action_normalization in ("quantile", "quantile_rot"): + lo_key, hi_key = "q01", "q99" + else: + return + + if lo_key not in self._norm_stats or hi_key not in self._norm_stats: + raise ValueError( + f"Action stats for {self.action_normalization!r} normalization require " + f"{lo_key!r} and {hi_key!r} entries." 
+ ) + self.action_min = self._norm_stats[lo_key] # [D] + self.action_max = self._norm_stats[hi_key] # [D] + action_range = self.action_max - self.action_min # [D] + self.action_range = torch.clamp(action_range, min=1e-6) # [D] + + def __getitem__(self, idx: int, _retry_count: int = 0) -> dict[str, torch.Tensor | str]: + """Get a single item from the dataset.""" + max_retries = 10 + ds_idx, local_idx, ep_idx = self.index_map[idx] + dataset = self.datasets[ds_idx] + try: + item = dataset[local_idx] + except Exception as e: + log.warning( + f"Error loading item (retry {_retry_count}/{max_retries}): idx={idx}, ds_idx={ds_idx}, " + f"local_idx={local_idx}, ep_idx={ep_idx}, repo_id={dataset.meta.repo_id}, error={e}" + ) + if _retry_count >= max_retries: + raise RuntimeError(f"Failed to load data after {max_retries} retries") from e + new_idx = random.randint(0, len(self) - 1) + return self.__getitem__(new_idx, _retry_count + 1) + + if self.mode == "joint": + mode = random.choice(["forward_dynamics", "inverse_dynamics", "policy", "image2video"]) + else: + mode = self.mode + + # Get task description for ai_caption + task_description = self._get_task_description(ds_idx, item) + + # Process video based on camera mode (skipped entirely when + # skip_video_loading=True; image keys are also absent from + # delta_timestamps so LeRobot never decoded them). 
+ video: torch.Tensor | None + if self._skip_video_loading: + video = None + else: + if self.camera_mode == "concat_view": + # Load both cameras and concatenate horizontally + video_1: torch.Tensor = item["observation.images.image"] + video_2: torch.Tensor = item["observation.images.wrist_image"] + + # Resize each if needed + if video_1.shape[-1] != self.image_size or video_1.shape[-2] != self.image_size: + video_1 = F.resize(video_1, [self.image_size, self.image_size]) + if video_2.shape[-1] != self.image_size or video_2.shape[-2] != self.image_size: + video_2 = F.resize(video_2, [self.image_size, self.image_size]) + + # Concatenate along width dimension (last dim for TCHW) + video_tchw = torch.cat([video_1, video_2], dim=-1) # (T, C, H, W*2) + else: + # Single camera mode + image_key = self.image_keys[0] + video_tchw = item[image_key] + + # Resize if needed + if video_tchw.shape[-1] != self.image_size or video_tchw.shape[-2] != self.image_size: + video_tchw = F.resize(video_tchw, [self.image_size, self.image_size]) + + # Convert to uint8 and transpose to (C, T, H, W) + video = (video_tchw * 255).clamp(0, 255).to(torch.uint8).permute(1, 0, 2, 3) + + # Action (raw): LIBERO actions are 7D (6 DoF + gripper) + action_raw: torch.Tensor = item["action"] + # State (raw): LIBERO state is 8D (6 DoF + 2 gripper states) + state_raw: torch.Tensor = item["observation.state"] + + # Action: (T+1, D) -> (T, D) + # Take all but last action + # LIBERO action format: [x, y, z, ax, ay, az, gripper] (7D) where (ax,ay,az) is axis-angle + + if self.action_space == "relative": + # Compute anchored relative actions + # Returns: translation (T, 3), rotation_matrix (T, 3, 3), gripper (T, 1) + translation, rotation_matrix, gripper = self._compute_anchored_actions(state_raw, action_raw.clone()) + elif self.action_space == "frame_wise_relative": + action = action_raw[:-1].clone() # [T,7] + translation = action[:, :3] # [T,3] + rotation_rotvec = action[:, 3:6] # [T,3] + gripper = action[:, 6:] 
# [T,1] + rotation_matrix = convert_rotation( + rotation_rotvec, input_format="axisangle", output_format="matrix" + ) # [T,3,3] + else: + raise ValueError(f"Unsupported action space: {self.action_space}") + + rotation = self._convert_rotation_to_repr(rotation_matrix) # [T,rot_dim] + action = torch.cat([translation, rotation, gripper], dim=-1) # [T,action_dim] + + # Compute idle_frames from the raw (un-normalized) action, only when the + # action layout has correct per-frame idle semantics (frame_wise_relative + # ⇔ backward_framewise). The other action_spaces ("relative", + # "absolute") encode per-frame motion differently and would not give + # meaningful idle counts under the same threshold check. + idle_frames: torch.Tensor | None = None + if self.action_space == "frame_wise_relative": + try: + spec = build_action_spec(Pos(), Rot(libero_rotation_format(self.rotation_space)), Gripper()) + n = compute_idle_frames(action, spec) + idle_frames = torch.tensor(n, dtype=torch.long) + except (ValueError, TypeError): + idle_frames = None + + if self.action_normalization is not None and self._norm_stats is not None and self.action_min is not None: + if action.shape[-1] != self.action_min.shape[0]: + raise ValueError( + f"Action dimension {action.shape[-1]} does not match stats dimension " + f"{self.action_min.shape[0]}. Recompute stats for the current " + f"rotation_space={self.rotation_space!r} and action_space={self.action_space!r}." 
+ ) + method = "quantile" if self.action_normalization == "quantile_rot" else self.action_normalization + action = normalize_action(action, method, self._norm_stats) # [T,D] + + # Index + key = torch.tensor([local_idx], dtype=torch.long) + + if self.camera_mode == "image": + viewpoint = "third_person_view" + elif self.camera_mode == "wrist_image": + viewpoint = "wrist_view" + else: + viewpoint = "concat_view" + + result: dict[str, torch.Tensor | str] = { + "source_repo_id": dataset.meta.repo_id, + "video": video, + "action": action, + "action_raw": action_raw, + "conditioning_fps": torch.tensor(self.fps, dtype=torch.long), + "prompt": task_description, + "ai_caption": task_description, + "mode": mode, + "state": state_raw, + "action_space": self.action_space, + "rotation_space": self.rotation_space, + "pose_coordinate_frame": self.pose_coordinate_frame, + "__key__": key, + "domain_id": torch.tensor(self.domain_id, dtype=torch.long), + "viewpoint": viewpoint, + } + if idle_frames is not None: + result["idle_frames"] = idle_frames + + if self.camera_mode == "concat_view" and not self._skip_video_loading: + result["additional_view_description"] = ( + "The left half shows the third-person view; the right half shows the wrist-mounted camera." + ) + + return result + + @property + def action_dim(self) -> int: + return libero_action_dim(self.rotation_space) diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/libero_pose_utils.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/libero_pose_utils.py new file mode 100644 index 00000000..a4207755 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/libero_pose_utils.py @@ -0,0 +1,81 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Small LIBERO pose helpers shared by training and closed-loop eval.""" + +from __future__ import annotations + +import numpy as np +import torch + +from cosmos3._src.vfm.datasets.action.pose_utils import ( + RotationConvention, + build_abs_pose_from_components, +) + +# Same local-frame post-rotation pattern used by DROID/Bridge/Fractal: +# R_opencv = R_native @ *_TO_OPENCV. +LIBERO_TO_OPENCV: np.ndarray = np.array( + [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], + dtype=np.float32, +) + +LIBERO_ROTATION_FORMATS: dict[str, RotationConvention] = { + "3d": "axisangle", + "6d": "rot6d", + "9d": "rot9d", +} +LIBERO_ACTION_DIMS: dict[str, int] = {"3d": 7, "6d": 10, "9d": 13} + + +def libero_rotation_format(rotation_space: str) -> RotationConvention: + """Return the shared ``pose_utils`` rotation format for a LIBERO setting.""" + rotation_format = LIBERO_ROTATION_FORMATS.get(rotation_space) + if rotation_format is None: + raise ValueError(f"Unsupported rotation_space={rotation_space!r}. Use 3d/6d/9d.") + return rotation_format + + +def libero_action_dim(rotation_space: str) -> int: + """Return ``[xyz, rotation, gripper]`` action width for LIBERO.""" + action_dim = LIBERO_ACTION_DIMS.get(rotation_space) + if action_dim is None: + raise ValueError(f"Unsupported rotation_space={rotation_space!r}. 
Use 3d/6d/9d.") + return action_dim + + +def libero_rotation_space_from_action_dim(action_dim: int) -> str: + """Infer LIBERO rotation space from unpadded action width.""" + for rotation_space, dim in LIBERO_ACTION_DIMS.items(): + if dim == action_dim: + return rotation_space + raise ValueError(f"Unable to infer rotation_space from action_dim={action_dim}.") + + +def build_libero_abs_pose(state_raw: torch.Tensor | np.ndarray, *, to_opencv: bool) -> np.ndarray: + """Build absolute LIBERO EE poses from state rows. + + ``state_raw`` is ``[x,y,z,axisangle(3),gripper(2)]``. When requested, the + local EE frame is post-rotated into the shared OpenCV-style action frame. + """ + if isinstance(state_raw, torch.Tensor): + state_np = state_raw.detach().cpu().numpy().astype(np.float32, copy=False) + else: + state_np = np.asarray(state_raw, dtype=np.float32) + + poses_abs = build_abs_pose_from_components(state_np[:, :3], state_np[:, 3:6], "axisangle") + if to_opencv: + poses_abs[:, :3, :3] = poses_abs[:, :3, :3] @ LIBERO_TO_OPENCV + return poses_abs diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/normalizers/bridge_orig_lerobot_backward_framewise_rot6d.json b/cosmos-inference/cosmos3/_src/vfm/datasets/action/normalizers/bridge_orig_lerobot_backward_framewise_rot6d.json new file mode 100644 index 00000000..6a074f5d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/normalizers/bridge_orig_lerobot_backward_framewise_rot6d.json @@ -0,0 +1,33 @@ +{ + "metadata": { + "embodiment_type": "bridge_orig_lerobot", + "pose_convention": "backward_framewise", + "rotation_format": "rot6d", + "action_dim": 10, + "skip_rotation_dims": [3, 4, 5, 6, 7, 8], + "chunk_length": 16, + "sample_stride": 16, + "dataset_name": "bridge_20260416", + "dataset_class": "BridgeOrigLeRobotDataset", + "dataset_root": "/lustre/fsw/portfolios/cosmos/projects/cosmos_base_training/cosmos3_action_datasets/bridge_raw", + "split": "train", + "num_samples_stats": 83036, + 
"reservoir_size": 5000000 + }, + "global": { + "mean": [-0.000094, -0.000394, 0.001623, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.582683], + "std": [ 0.013297, 0.009985, 0.012079, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 0.489959], + "min": [-0.309451, -0.074740, -0.082767, -1.000000, -1.000000, -1.000000, -1.000000, -1.000000, -1.000000, 0.000000], + "max": [ 0.127018, 0.414660, 0.493186, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000], + "q01": [-0.038884, -0.028667, -0.037840, -1.000000, -1.000000, -1.000000, -1.000000, -1.000000, -1.000000, 0.000000], + "q99": [ 0.039722, 0.029068, 0.026702, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000] + }, + "global_raw": { + "mean": [-0.000094, -0.000394, 0.001623, 0.998307, -0.001371, 0.000061, 0.001414, 0.998226, -0.000154, 0.582683], + "std": [ 0.013297, 0.009985, 0.012079, 0.004630, 0.050168, 0.029018, 0.050165, 0.004328, 0.031742, 0.489959], + "min": [-0.309451, -0.074740, -0.082767, -0.845782, -0.636628, -0.401535, -0.590214, -0.217448, -0.979635, 0.000000], + "max": [ 0.127018, 0.414660, 0.493186, 1.000000, 0.362611, 0.601211, 0.619479, 1.000000, 0.365993, 1.000000], + "q01": [-0.038884, -0.028667, -0.037840, 0.976292, -0.163098, -0.081545, -0.160193, 0.976322, -0.078872, 0.000000], + "q99": [ 0.039722, 0.029068, 0.026702, 1.000000, 0.160195, 0.081655, 0.163227, 1.000000, 0.095189, 1.000000] + } +} diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/normalizers/libero_native_frame_wise_relative_rot6d.json b/cosmos-inference/cosmos3/_src/vfm/datasets/action/normalizers/libero_native_frame_wise_relative_rot6d.json new file mode 100644 index 00000000..6cde6705 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/normalizers/libero_native_frame_wise_relative_rot6d.json @@ -0,0 +1,37 @@ +{ + "metadata": { + "embodiment_type": "libero", + "pose_convention": "frame_wise_relative", + "pose_coordinate_frame": 
"native", + "rotation_format": "6d", + "action_dim": 10, + "skip_rotation_dims": [3, 4, 5, 6, 7, 8], + "chunk_length": 16, + "sample_stride": null, + "dataset_name": "libero", + "dataset_class": "LIBERODataset", + "dataset_root": ["outputs/libero_datasets/libero_10", "outputs/libero_datasets/libero_object", "outputs/libero_datasets/libero_spatial", "outputs/libero_datasets/libero_goal"], + "_comment": "Dataset paths are placeholders; the statistics values are independent of local dataset location.", + "split": "train", + "num_samples_stats": 10000, + "reservoir_size": 50000, + "max_samples": 10000, + "sampling_seed": 42 + }, + "global": { + "mean": [ 0.050704, 0.097407, -0.094833, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.000000, 0.476725], + "std": [ 0.333621, 0.387175, 0.457140, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 0.499460], + "min": [-0.937500, -0.937500, -0.937500, -1.000000, -1.000000, -1.000000, -1.000000, -1.000000, -1.000000, 0.000000], + "max": [ 0.937500, 0.937500, 0.937500, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000], + "q01": [-0.723214, -0.808929, -0.937500, -1.000000, -1.000000, -1.000000, -1.000000, -1.000000, -1.000000, 0.000000], + "q99": [ 0.937500, 0.870536, 0.937500, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000, 1.000000] + }, + "global_raw": { + "mean": [ 0.050704, 0.097407, -0.094833, 0.994873, -0.004579, -0.004288, 0.004389, 0.996104, 0.001109, 0.476725], + "std": [ 0.333621, 0.387175, 0.457140, 0.010807, 0.077802, 0.063386, 0.078571, 0.009994, 0.038504, 0.499460], + "min": [-0.937500, -0.937500, -0.937500, 0.902028, -0.356085, -0.367416, -0.370434, 0.921907, -0.255000, 0.000000], + "max": [ 0.937500, 0.937500, 0.937500, 1.000000, 0.368853, 0.341214, 0.356395, 1.000000, 0.348251, 1.000000], + "q01": [-0.723214, -0.808929, -0.937500, 0.934955, -0.223431, -0.189878, -0.334735, 0.938516, -0.107736, 0.000000], + "q99": [ 0.937500, 0.870536, 0.937500, 1.000000, 0.331000, 
0.163153, 0.226216, 1.000000, 0.127158, 1.000000] + } +} diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/pose_utils.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/pose_utils.py new file mode 100644 index 00000000..491db4ce --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/pose_utils.py @@ -0,0 +1,759 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Rotation and pose utilities for action datasets. + +This module centralizes three related responsibilities used across the action +dataset stack: + +1. Converting rotations between the conventions used by the datasets and the + action model (`euler_xyz`, quaternion, axis-angle, rot6d, rot9d, matrix). +2. Building absolute homogeneous poses of shape ``(T, 4, 4)`` from per-frame + translation and rotation components. +3. Converting trajectories between absolute-pose form and the relative-pose + action vectors consumed by the datasets. + + The relative-pose action vectors always follow the shared layout + ``[translation(3), rotation(...)]``. The rotation block is encoded with the + requested rotation output convention, and `convert_rotation()` is the + canonical public entrypoint for representation conversion. 
+""" + +import math +from typing import Literal + +import numpy as np +import torch +from scipy.spatial.transform import Rotation as R + +PoseConvention = Literal["absolute", "backward_anchored", "backward_framewise"] +RotationConvention = Literal["matrix", "euler_xyz", "quat_xyzw", "quat_wxyz", "rot6d", "axisangle", "rot9d"] + + +def _to_numpy_float32(array: torch.Tensor | np.ndarray) -> np.ndarray: + """Convert an input array to a NumPy ``float32`` array. + + Args: + array: A torch tensor or NumPy array with arbitrary leading dimensions. + + Returns: + A NumPy array with dtype ``float32``. Torch tensors are moved to CPU + before conversion. NumPy inputs are converted with ``copy=False`` + semantics when possible. + + Raises: + ValueError: If a torch tensor with ``requires_grad=True`` is passed. + These utilities are non-differentiable; callers must explicitly + detach tensors before conversion. + """ + if isinstance(array, torch.Tensor): + if array.requires_grad: + raise ValueError( + "pose_utils conversion is non-differentiable; call `.detach()` " + "explicitly before passing tensors with requires_grad=True" + ) + return array.cpu().numpy().astype(np.float32, copy=False) + return np.asarray(array, dtype=np.float32) + + +def _normalize_rotation_matrices(rot_matrices: np.ndarray) -> np.ndarray: + """Project approximate matrices onto valid rotation matrices. + + This helper uses an SVD-based projection onto ``SO(3)``. It is mainly used + when decoding rotations from network-like representations such as rot6d or rot9d + where the input may not already be perfectly orthonormal. + + Args: + rot_matrices: Array of shape ``(..., 3, 3)`` containing one or more + approximate rotation matrices. + + Returns: + Array of shape ``(..., 3, 3)`` whose trailing matrices are proper + rotation matrices with determinant ``+1``. + + Raises: + ValueError: If the input does not have trailing shape ``(3, 3)``. 
+ """ + matrices = np.asarray(rot_matrices, dtype=np.float32) + if matrices.ndim < 2 or matrices.shape[-2:] != (3, 3): + raise ValueError(f"Rotation matrices must have shape (..., 3, 3), got {matrices.shape}") + + original_shape = matrices.shape[:-2] + matrices_flat = matrices.reshape(-1, 3, 3) + + # Batched SVD projection to SO(3). + U, _, Vt = np.linalg.svd(matrices_flat) + normalized = U @ Vt + + # Ensure determinant is +1 (proper rotations, no reflections). + det = np.linalg.det(normalized) + reflection_mask = det < 0 + if np.any(reflection_mask): + U_reflect = U.copy() + U_reflect[reflection_mask, :, -1] *= -1 + normalized[reflection_mask] = U_reflect[reflection_mask] @ Vt[reflection_mask] + + return normalized.astype(np.float32, copy=False).reshape(*original_shape, 3, 3) + + +def convert_rotation( + rotation: torch.Tensor | np.ndarray, + input_format: RotationConvention, + output_format: RotationConvention, + normalize_matrix: bool = False, +) -> torch.Tensor | np.ndarray: + """Convert rotations between the conventions used by action datasets. + + The function first maps the input representation to rotation matrices and + then emits the requested output convention. It is the single conversion seam + used by the public pose helpers so that all code paths share the same + convention handling. 
+ + Supported input conventions: + - ``matrix``: rotation matrices with shape ``(..., 3, 3)`` + - ``euler_xyz``: Euler xyz angles in radians with shape ``(..., 3)`` + - ``quat_xyzw``: quaternions in SciPy's xyzw order with shape ``(..., 4)`` + - ``quat_wxyz``: quaternions in wxyz order with shape ``(..., 4)`` + - ``rot6d``: column-based 6D representation with shape ``(..., 6)`` + - ``rot9d``: flattened rotation matrices with shape ``(..., 9)`` + - ``axisangle``: axis-angle vectors with shape ``(..., 3)`` + + Supported output conventions: + - ``matrix`` + - ``euler_xyz`` + - ``quat_xyzw`` + - ``quat_wxyz`` + - ``rot6d`` + - ``axisangle`` + - ``rot9d`` + + Args: + rotation: Input rotations in the representation specified by + ``input_format``. + input_format: Convention used by ``rotation``. + output_format: Convention to return. + normalize_matrix: Whether to project intermediate matrices to a valid + rotation before returning. This is most useful when decoding from + approximate ``rot6d``/``rot9d`` inputs or non-unit quaternions. + + Returns: + Rotations with the same leading shape as the input, expressed in the + requested output convention. Torch inputs return torch outputs on the + same device with the same dtype; NumPy inputs return NumPy arrays. + + Raises: + ValueError: If the input shape is incompatible with ``input_format`` or + if either format is unsupported. 
+ """ + input_is_tensor = isinstance(rotation, torch.Tensor) + input_dtype = rotation.dtype if input_is_tensor else None + input_device = rotation.device if input_is_tensor else None + rotation_np = _to_numpy_float32(rotation) + + if input_format == "matrix": + if rotation_np.ndim < 2 or rotation_np.shape[-2:] != (3, 3): + raise ValueError(f"matrix rotation must have shape (..., 3, 3), got {rotation_np.shape}") + original_shape = rotation_np.shape[:-2] + matrices_flat = rotation_np.reshape(-1, 3, 3) + if normalize_matrix: + matrices_flat = _normalize_rotation_matrices(matrices_flat).reshape(-1, 3, 3) + elif input_format == "euler_xyz": + if rotation_np.ndim < 1 or rotation_np.shape[-1] != 3: + raise ValueError(f"{input_format} rotation must have shape (..., 3), got {rotation_np.shape}") + original_shape = rotation_np.shape[:-1] + matrices_flat = R.from_euler("xyz", rotation_np.reshape(-1, 3), degrees=False).as_matrix().astype(np.float32) + elif input_format in ("quat_xyzw", "quat_wxyz"): + if rotation_np.ndim < 1 or rotation_np.shape[-1] != 4: + raise ValueError(f"{input_format} rotation must have shape (..., 4), got {rotation_np.shape}") + original_shape = rotation_np.shape[:-1] + quaternions = rotation_np.reshape(-1, 4) + if input_format == "quat_wxyz": + quaternions = quaternions[:, [1, 2, 3, 0]] + norms = np.linalg.norm(quaternions, axis=-1) + if np.any(norms < 1e-8): + raise ValueError(f"Found zero-norm quaternion(s) (min norm={norms.min():.2e}).") + if normalize_matrix: + quaternions = quaternions / norms[:, None] + matrices_flat = R.from_quat(quaternions).as_matrix().astype(np.float32) + elif input_format == "rot6d": + if rotation_np.ndim < 1 or rotation_np.shape[-1] != 6: + raise ValueError(f"{input_format} rotation must have shape (..., 6), got {rotation_np.shape}") + original_shape = rotation_np.shape[:-1] + rot6d_flat = rotation_np.reshape(-1, 6) + col0 = rot6d_flat[:, :3] + col1 = rot6d_flat[:, 3:] + col2 = np.cross(col0, col1, axis=-1) + matrices_flat 
= np.stack((col0, col1, col2), axis=-1).astype(np.float32) + if normalize_matrix: + matrices_flat = _normalize_rotation_matrices(matrices_flat).reshape(-1, 3, 3) + elif input_format == "rot9d": + if rotation_np.ndim < 1 or rotation_np.shape[-1] != 9: + raise ValueError(f"rot9d rotation must have shape (..., 9), got {rotation_np.shape}") + original_shape = rotation_np.shape[:-1] + matrices_flat = rotation_np.reshape(-1, 3, 3) + if normalize_matrix: + matrices_flat = _normalize_rotation_matrices(matrices_flat).reshape(-1, 3, 3) + elif input_format == "axisangle": + if rotation_np.ndim < 1 or rotation_np.shape[-1] != 3: + raise ValueError(f"axisangle rotation must have shape (..., 3), got {rotation_np.shape}") + original_shape = rotation_np.shape[:-1] + matrices_flat = R.from_rotvec(rotation_np.reshape(-1, 3)).as_matrix().astype(np.float32) + else: + raise ValueError(f"Unsupported input_format: {input_format!r}") + + if output_format == "matrix": + converted = matrices_flat.reshape(*original_shape, 3, 3).astype(np.float32) + elif output_format == "rot9d": + converted = matrices_flat.reshape(-1, 9) + elif output_format == "rot6d": + converted = matrices_flat[:, :, :2].transpose(0, 2, 1).reshape(-1, 6) + elif output_format == "quat_xyzw": + converted = R.from_matrix(matrices_flat).as_quat().astype(np.float32) + elif output_format == "quat_wxyz": + converted = R.from_matrix(matrices_flat).as_quat().astype(np.float32) + converted = converted[:, [3, 0, 1, 2]] + elif output_format == "euler_xyz": + converted = R.from_matrix(matrices_flat).as_euler("xyz", degrees=False).astype(np.float32) + elif output_format == "axisangle": + converted = R.from_matrix(matrices_flat).as_rotvec().astype(np.float32) + else: + raise ValueError(f"Unsupported output_format: {output_format!r}") + + if output_format != "matrix": + converted = converted.reshape(*original_shape, converted.shape[-1]) + + if input_is_tensor: + return 
torch.from_numpy(np.ascontiguousarray(converted)).to(dtype=input_dtype, device=input_device) + return converted + + +# ----------------------------------------------------------------------------- +# Absolute pose construction +# ----------------------------------------------------------------------------- + + +def build_abs_pose_from_components( + xyz: torch.Tensor | np.ndarray, + rotation: torch.Tensor | np.ndarray, + rotation_input_format: Literal["euler_xyz", "quat_xyzw", "quat_wxyz", "axisangle"], + translation_scale: float | None = None, +) -> np.ndarray: + """Build absolute homogeneous poses from per-frame translation and rotation. + + This is the canonical helper for turning dataset-provided pose components + into a sequence of rigid transforms. Each output pose is a homogeneous + transform whose top-left ``3 x 3`` block stores rotation and whose last + column stores translation. + + Args: + xyz: Per-frame translations with shape ``(T, 3)``. + rotation: Per-frame rotations with shape ``(T, 3)`` for ``euler_xyz`` + and ``axisangle``, or ``(T, 4)`` for quaternion conventions. + rotation_input_format: Convention used by ``rotation``. Supported values + are ``euler_xyz``, ``quat_xyzw``, ``quat_wxyz``, and ``axisangle``. + translation_scale: Optional factor used to divide translations before + inserting them into the output poses. This is useful when upstream + data stores translations in a scaled unit. + + Returns: + Absolute poses with shape ``(T, 4, 4)`` and dtype ``float32``. + + Raises: + ValueError: If the translation and rotation arrays have incompatible + lengths or unsupported shapes, or if ``translation_scale`` is zero. 
+ """ + xyz_np = _to_numpy_float32(xyz) + rotation_np = _to_numpy_float32(rotation) + + if xyz_np.ndim != 2 or xyz_np.shape[1] != 3: + raise ValueError(f"xyz must have shape (T, 3), got {xyz_np.shape}") + if rotation_np.ndim != 2: + raise ValueError(f"rotation must be 2D, got {rotation_np.shape}") + if rotation_np.shape[0] != xyz_np.shape[0]: + raise ValueError( + f"xyz and rotation must have the same length, got {xyz_np.shape[0]} and {rotation_np.shape[0]}" + ) + + rot_mats = np.asarray( + convert_rotation(rotation_np, input_format=rotation_input_format, output_format="matrix"), + dtype=np.float32, + ) + + if translation_scale is not None: + if translation_scale == 0: + raise ValueError("translation_scale must be non-zero") + xyz_np = xyz_np / float(translation_scale) + + poses_abs = np.eye(4, dtype=np.float32)[None].repeat(xyz_np.shape[0], axis=0) + poses_abs[:, :3, :3] = rot_mats.astype(np.float32) + poses_abs[:, :3, 3] = xyz_np + return poses_abs + + +# ----------------------------------------------------------------------------- +# Relative pose conversions +# ----------------------------------------------------------------------------- + + +def _delta_transform_to_pose_vector( + delta_T: np.ndarray, + rotation_output_format: RotationConvention, + translation_scale: float = 1.0, + rotation_scale: float = 1.0, +) -> np.ndarray: + """Encode a relative transform as an action vector. + + The shared action-vector layout is always ``[translation(3), rotation(...)]``. + The translation block is multiplied by ``translation_scale`` before concatenation, + and the rotation block is multiplied by ``rotation_scale``. + + Args: + delta_T: Relative transform of shape ``(4, 4)``. + rotation_output_format: Concrete convention used for the output rotation + block. + translation_scale: Scalar multiplier applied to the translation block. + rotation_scale: Scalar multiplier applied to the rotation block. 
Used to + match the loss scale of the rotation block to the translation block. + The decoder must divide by the same factor before reconstructing the + rotation matrix. + + Returns: + A ``float32`` action vector whose first three values are translation and + whose remaining values are the rotation in ``rotation_output_format``. + """ + delta_np = np.asarray(delta_T, dtype=np.float32) + if delta_np.shape != (4, 4): + raise ValueError(f"delta_T must have shape (4, 4), got {delta_np.shape}") + + translation = delta_np[:3, 3] * translation_scale + rotation = np.asarray( + convert_rotation(delta_np[:3, :3], input_format="matrix", output_format=rotation_output_format), + dtype=np.float32, + ) + rotation = rotation * rotation_scale + return np.concatenate([translation, rotation]).astype(np.float32) + + +def _pose_vector_to_delta_transform( + pose_vector: np.ndarray, + rotation_input_format: RotationConvention, + translation_scale: float, + normalize_rotation: bool, + rotation_scale: float = 1.0, +) -> np.ndarray: + """Decode an action vector back into a relative homogeneous transform. + + This is the inverse of `_delta_transform_to_pose_vector()` when the same + rotation convention and scale are used. + + Args: + pose_vector: Relative-pose action vector with layout + ``[translation(3), rotation(...)]``. + rotation_input_format: Concrete convention used by the rotation block. + translation_scale: Scalar used to undo the translation scaling applied during + encoding. + normalize_rotation: Whether to project the decoded rotation to a valid + matrix before assembling the transform. + rotation_scale: Scalar used to undo the rotation scaling applied during + encoding. Must match the value used by + `_delta_transform_to_pose_vector()`. + + Returns: + A relative homogeneous transform with shape ``(4, 4)`` and dtype + ``float32``. 
+ """ + pose_vector_np = np.asarray(pose_vector, dtype=np.float32) + rotation_block = pose_vector_np[3:] / rotation_scale + + rotation_matrix = np.asarray( + convert_rotation( + rotation_block, + input_format=rotation_input_format, + output_format="matrix", + normalize_matrix=normalize_rotation, + ), + dtype=np.float32, + ) + + delta_T = np.eye(4, dtype=np.float32) + delta_T[:3, 3] = pose_vector_np[:3] / translation_scale + delta_T[:3, :3] = rotation_matrix + return delta_T + + +def _get_relative_delta_transform( + poses_abs: np.ndarray, + inv_poses_abs: np.ndarray, + frame_idx: int, + pose_convention: PoseConvention, +) -> np.ndarray: + """Compute one relative transform from an absolute-pose trajectory. + + Args: + poses_abs: Absolute poses of shape ``(T, 4, 4)``. + inv_poses_abs: Precomputed inverses of ``poses_abs`` with the same shape. + frame_idx: Index of the step to encode, in ``[0, T - 2]``. + pose_convention: Pose convention controlling which two poses + define the delta and whether it is framewise or anchored. + + Returns: + The relative transform ``delta_T`` with shape ``(4, 4)`` for the + requested step and convention. + """ + if pose_convention == "backward_framewise": + return inv_poses_abs[frame_idx] @ poses_abs[frame_idx + 1] + if pose_convention == "backward_anchored": + return inv_poses_abs[0] @ poses_abs[frame_idx + 1] + raise ValueError( + f"Unsupported pose_convention={pose_convention!r}. Expected one of: backward_framewise, backward_anchored." + ) + + +def _apply_relative_delta_transform( + current_pose: np.ndarray, + initial_pose: np.ndarray, + delta_T: np.ndarray, + pose_convention: PoseConvention, +) -> np.ndarray: + """Recover the next absolute pose from a decoded relative transform. + + Args: + current_pose: The current reconstructed pose for framewise modes. + initial_pose: The anchor pose used by anchored modes. + delta_T: Relative transform for the current step. 
+ pose_convention: Pose convention controlling how ``delta_T`` + should be composed back into an absolute pose. + + Returns: + The next absolute pose with shape ``(4, 4)``. + """ + if pose_convention == "backward_framewise": + return current_pose @ delta_T + if pose_convention == "backward_anchored": + return initial_pose @ delta_T + raise ValueError( + f"Unsupported pose_convention={pose_convention!r}. Expected one of: backward_framewise, backward_anchored." + ) + + +def pose_abs_to_rel( + poses_abs: np.ndarray, + rotation_format: RotationConvention = "rot9d", + pose_convention: PoseConvention = "backward_framewise", + translation_scale: float = 1.0, + rotation_scale: float = 1.0, +) -> np.ndarray: + """Convert an absolute-pose trajectory into relative-pose action vectors. + + Args: + poses_abs: Absolute poses with shape ``(T, 4, 4)``. These are typically + object-in-world or camera-to-world transforms. + rotation_format: Rotation convention used for the output rotation block. + Any flat convention accepted by `convert_rotation()` works: ``rot9d``, + ``rot6d``, ``quat_xyzw``, ``quat_wxyz``, ``euler_xyz``, and ``axisangle``. + ``matrix`` is not supported here because its ``(3, 3)`` output cannot be + concatenated with the translation block. + pose_convention: Pose convention: + - ``backward_framewise``: ``delta_T = T_i^{-1} @ T_{i+1}`` + - ``backward_anchored``: ``delta_T = T_0^{-1} @ T_{i+1}`` + translation_scale: Scalar multiplier applied to the translation block of each + encoded action vector. + rotation_scale: Scalar multiplier applied to the rotation block of each + encoded action vector. Use this to match the loss scale of rotation + and translation. `pose_rel_to_abs()` must be called with the same + value to invert the scaling. + + Returns: + An array of shape ``(T - 1, D)`` where ``D = 3 + rotation_dim``. + + Raises: + AssertionError: If fewer than two absolute poses are provided. 
+ """ + num_frames = len(poses_abs) + assert num_frames > 1, "At least 2 frames are required to compute relative poses" + + # Compute inverse poses + inv_poses_abs = np.linalg.inv(poses_abs) + + poses_rel = [] + # We produce num_frames - 1 relative poses + for i in range(num_frames - 1): + delta_T = _get_relative_delta_transform(poses_abs, inv_poses_abs, i, pose_convention) + poses_rel.append( + _delta_transform_to_pose_vector( + delta_T, + rotation_output_format=rotation_format, + translation_scale=translation_scale, + rotation_scale=rotation_scale, + ) + ) + + return np.stack(poses_rel).astype(np.float32) # [T-1,D] + + +def pose_rel_to_abs( + poses_rel: np.ndarray, + rotation_format: RotationConvention = "rot9d", + pose_convention: PoseConvention = "backward_framewise", + initial_pose: np.ndarray | None = None, + normalize_rotation: bool = True, + translation_scale: float = 1.0, + rotation_scale: float = 1.0, +) -> np.ndarray: + """Reconstruct an absolute-pose trajectory from relative-pose action vectors. + + Args: + poses_rel: Relative-pose action vectors with shape ``(T - 1, D)`` and + layout ``[translation(3), rotation(...)]``. + rotation_format: Convention used by the rotation block of ``poses_rel``. + pose_convention: Pose convention used when the vectors were + encoded. This must match the convention passed to `pose_abs_to_rel()`. + initial_pose: Absolute pose for the first frame. If ``None``, the + identity transform is used. + normalize_rotation: Whether to project decoded rotations onto ``SO(3)`` + before composing them back into the trajectory. + translation_scale: Scalar used to undo the translation scaling applied during + `pose_abs_to_rel()`. + rotation_scale: Scalar used to undo the rotation scaling applied during + `pose_abs_to_rel()`. Must match the value passed there. + + Returns: + Absolute poses with shape ``(T, 4, 4)`` where ``T = len(poses_rel) + 1``. 
+ """ + if initial_pose is None: + initial_pose = np.eye(4) + + poses_abs = [initial_pose] + current_pose = initial_pose + + num_poses_rel = poses_rel.shape[0] + + for i in range(num_poses_rel): + delta_T = _pose_vector_to_delta_transform( + poses_rel[i], + rotation_input_format=rotation_format, + translation_scale=translation_scale, + normalize_rotation=normalize_rotation, + rotation_scale=rotation_scale, + ) + next_pose = _apply_relative_delta_transform(current_pose, initial_pose, delta_T, pose_convention) + + poses_abs.append(next_pose) + current_pose = next_pose + + return np.stack(poses_abs) # [T,4,4] + + +# ----------------------------------------------------------------------------- +# Idle-frame detection +# ----------------------------------------------------------------------------- + + +def _identity_rotation_vector(rotation_format: RotationConvention) -> np.ndarray: + """Return the identity-rotation vector for a given rotation convention. + + Used by :func:`compute_idle_frames` to test whether a rotation block is + close to "no rotation" in its current encoding. + """ + if rotation_format in ("matrix", "rot9d"): + return np.array([1, 0, 0, 0, 1, 0, 0, 0, 1], dtype=np.float32) + if rotation_format == "rot6d": + return np.array([1, 0, 0, 0, 1, 0], dtype=np.float32) + if rotation_format == "quat_xyzw": + return np.array([0, 0, 0, 1], dtype=np.float32) + if rotation_format == "quat_wxyz": + return np.array([1, 0, 0, 0], dtype=np.float32) + if rotation_format in ("euler_xyz", "axisangle"): + return np.array([0, 0, 0], dtype=np.float32) + raise ValueError(f"Unsupported rotation_format={rotation_format!r}") + + +def _rotation_angle_per_arm(rotations: np.ndarray, rotation_format: str) -> np.ndarray: + """Geodesic angle (rad) from identity for each arm at each frame. + + ``rotations`` has shape ``(T, n_arms, n_per_arm)``; the returned array has + shape ``(T, n_arms)``. 
The angle is rotation-format aware so a fixed + ``eps_r`` threshold has consistent geometric meaning across formats: + + - ``rot6d`` → reconstruct ``trace(R)`` in closed form from the two stored + columns ``a, b`` (already unit-orthogonal as they came from a valid + rotation matrix). The third column is ``a × b``, so + ``trace(R) = a[0] + b[1] + a[0]·b[1] - a[1]·b[0]``. + ``angle = arccos(clip((trace - 1) / 2, -1, 1))``. + - ``rot9d`` → reshape to ``(..., 3, 3)`` and use + ``trace(R) = R[0,0] + R[1,1] + R[2,2]``. + - ``quat_xyzw`` / ``quat_wxyz`` → ``angle = 2 · arccos(|q_w|)``; the + absolute value handles the double cover (``q`` and ``-q`` represent the + same rotation). + - ``axisangle`` → the magnitude of the axis-angle vector *is* the angle. + - ``euler_xyz`` → no closed-form angle; use ``‖euler‖`` as a conservative + upper bound (exact for single-axis rotations, an overestimate for + composed ones — fine for idle detection where small angles are the + regime of interest). + """ + if rotation_format == "rot6d": + a = rotations[..., :3] + b = rotations[..., 3:6] + trace = a[..., 0] + b[..., 1] + a[..., 0] * b[..., 1] - a[..., 1] * b[..., 0] + return np.arccos(np.clip((trace - 1.0) / 2.0, -1.0, 1.0)) + if rotation_format == "rot9d": + mat = rotations.reshape(*rotations.shape[:-1], 3, 3) + trace = mat[..., 0, 0] + mat[..., 1, 1] + mat[..., 2, 2] + return np.arccos(np.clip((trace - 1.0) / 2.0, -1.0, 1.0)) + if rotation_format in ("quat_xyzw", "quat_wxyz"): + qw = rotations[..., 3] if rotation_format == "quat_xyzw" else rotations[..., 0] + return 2.0 * np.arccos(np.clip(np.abs(qw), 0.0, 1.0)) + if rotation_format == "axisangle": + return np.linalg.norm(rotations, axis=-1) + if rotation_format == "euler_xyz": + # Exact for single-axis rotations, overestimate for composed ones — + # safe for idle thresholds since overestimation can only mark a frame + # as non-idle, never spuriously idle. 
+ return np.linalg.norm(rotations, axis=-1) + raise ValueError(f"Unsupported rotation_format={rotation_format!r}") + + +def _consecutive_streaks(idle: np.ndarray, min_streak: int) -> np.ndarray: + """Zero out idle bits not belonging to a run of ``>= min_streak`` Trues. + + Pure-numpy two-pointer scan. ``min_streak <= 1`` is a no-op (returns the + input mask unchanged). + """ + if min_streak <= 1: + return idle + out = np.zeros_like(idle) + n = len(idle) + i = 0 + while i < n: + if not idle[i]: + i += 1 + continue + j = i + while j < n and idle[j]: + j += 1 + if j - i >= min_streak: + out[i:j] = True + i = j + return out + + +def compute_idle_frames( + action_raw: torch.Tensor | np.ndarray, + spec: "ActionSpec", # noqa: F821 — forward ref, real import is in action_spec.py + *, + eps_t: float = 1e-3, + eps_r: float = math.radians(5.0), + eps_g: float = 1e-2, + joint_threshold: float = 5e-4, + min_streak: int = 3, +) -> int: + """Count idle frames in a raw (un-normalized) action chunk. + + Idle detection runs per-DimType (driven by ``spec.types``); a frame is + *raw-idle* iff every relevant type group is idle on that frame, and + counts toward the final tally only if it belongs to a run of at least + ``min_streak`` consecutive raw-idle frames. The streak filter rejects + isolated low-motion frames (instantaneous slowdowns) which carry weak + physical meaning and add noise to the IdleFrames training signal. + + DimType branches: + + - ``POS`` → combined ``‖action[pos_idx]‖`` (L2 across all POS dims) + < ``eps_t``. For single-arm specs (3 dims) this is the standard ``‖t‖`` + check; for multi-arm specs the combined norm is slightly stricter than + a per-arm check. + - ``ROT`` → per-arm geodesic rotation angle (rad) from identity + < ``eps_r``. The angle is computed in a rotation-format aware way (see + :func:`_rotation_angle_per_arm`) so the threshold has consistent + geometric meaning regardless of the encoding. 
+ - ``GRIPPER`` → ``max |action[t] - action[t-1]| < eps_g``. ``np.diff`` + with the first frame prepended makes the step-0 difference ``0`` (treated as "no change"); + with the streak filter this can no longer create a spurious single-frame + idle event. + - ``JOINT`` → same frame-diff scheme as gripper with + ``joint_threshold`` (rad / step). + - ``RESERVED`` → ignored. + + Defaults (in the units of the un-normalized action): + + - ``eps_t = 1e-3`` → 1 mm per-frame translation + - ``eps_r = 5°`` → 5° per-frame rotation (geodesic angle) + - ``eps_g = 1e-2`` → 1 % gripper command change + - ``joint_threshold = 5e-4`` → ~0.03° / step joint angle change + - ``min_streak = 3`` → require a run of >= 3 consecutive idle frames + + The input must be **un-normalized** so the identity transform sits at + known coordinates (translation ≈ 0, rotation ≈ identity). The action + vector is also assumed to be encoded in a per-step / framewise convention + (e.g. ``backward_framewise``); anchored conventions (``backward_anchored``) + accumulate over the chunk and would silently break the POS/ROT idle + checks. Callers (e.g. the LeRobot base class) gate on pose convention + before calling this function. + """ + if isinstance(action_raw, torch.Tensor): + action = action_raw.detach().cpu().numpy().astype(np.float32, copy=False) + else: + action = np.asarray(action_raw, dtype=np.float32) + + if action.ndim != 2: + raise ValueError(f"action_raw must be 2-D (T, D); got shape {action.shape}") + num_frames, action_dim = action.shape + if num_frames == 0: + return 0 + if action_dim != len(spec.types): + raise ValueError(f"action_dim={action_dim} does not match spec.dim={len(spec.types)}") + + # Import locally to avoid a circular import at module load time + # (action_spec.py imports RotationConvention from this file). 
+ from cosmos3._src.vfm.datasets.action.action_spec import DimType + + pos_idx = [i for i, t in enumerate(spec.types) if t == DimType.POS] + rot_idx = [i for i, t in enumerate(spec.types) if t == DimType.ROT] + grip_idx = [i for i, t in enumerate(spec.types) if t == DimType.GRIPPER] + joint_idx = [i for i, t in enumerate(spec.types) if t == DimType.JOINT] + + idle = np.ones(num_frames, dtype=bool) + + # POS: combined L2 norm across all translation dims. + if pos_idx: + idle &= np.linalg.norm(action[:, pos_idx], axis=1) < eps_t + + # ROT: per-arm geodesic angle (rad). + if rot_idx: + rot_id = _identity_rotation_vector(spec.rotation_format) + n_per_arm = rot_id.shape[0] + if len(rot_idx) % n_per_arm != 0: + raise ValueError( + f"ROT dims ({len(rot_idx)}) not a multiple of " + f"rotation_format={spec.rotation_format!r} dim ({n_per_arm})" + ) + rotations = action[:, rot_idx].reshape(num_frames, -1, n_per_arm) + angles = _rotation_angle_per_arm(rotations, spec.rotation_format) # (T, n_arms) + idle &= angles.max(axis=1) < eps_r + + # GRIPPER: max |Δgripper| across all gripper dims; step 0's diff is 0. + if grip_idx: + gripper = action[:, grip_idx] + diff = np.abs(np.diff(gripper, axis=0, prepend=gripper[:1])) + idle &= diff.max(axis=1) < eps_g + + # JOINT: same frame-diff scheme with joint_threshold. + if joint_idx: + joints = action[:, joint_idx] + diff = np.abs(np.diff(joints, axis=0, prepend=joints[:1])) + idle &= diff.max(axis=1) < joint_threshold + + if min_streak > 1: + idle = _consecutive_streaks(idle, min_streak) + + return int(idle.sum()) diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/transforms.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/transforms.py new file mode 100644 index 00000000..1dce8595 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/transforms.py @@ -0,0 +1,678 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Dataset transform wrappers for the Action project. + +This module provides the ``ActionTransformPipeline`` and spatial padding utilities. + +The reflection padding snaps each sample to the closest predefined resolution from +``VIDEO_RES_SIZE_INFO`` (matching VFM's approach), guaranteeing a bounded set of +output shapes that are all multiples of 16. + +See :func:`~.unified_dataset.wrap_dataset` for the convenience factory that +combines datasets with transforms, and :class:`~.unified_dataset.MapToIterableAdapter` +for the map-to-iterable wrapper. 
+""" + +from __future__ import annotations + +import torch +import torchvision.transforms.functional as transforms_F + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.action.json_formatter import ActionPromptJsonFormatter +from cosmos3._src.vfm.datasets.action.viewpoint_utils import ViewpointTextInfo +from cosmos3._src.vfm.datasets.augmentors.duration_fps_text_timestamps import DurationFPSTextTimeStamps +from cosmos3._src.vfm.datasets.augmentors.idle_frames_text_info import IdleFramesTextInfo +from cosmos3._src.vfm.datasets.augmentors.resolution_text_info import ResolutionTextInfo +from cosmos3._src.vfm.datasets.augmentors.text_tokenizer import TextTokenizerTransform +from cosmos3._src.vfm.datasets.sequence_packing import SequencePlan +from cosmos3._src.vfm.datasets.utils import VIDEO_RES_SIZE_INFO +from cosmos3._src.vfm.utils.data_utils import get_vision_data_resolution + + +def _should_append_idle_frame_info(mode: object) -> bool: + """Return whether idle-frame prompt metadata should be surfaced.""" + return mode != "inverse_dynamics" + + +def pad_action_to_max_dim(action: torch.Tensor, max_action_dim: int) -> torch.Tensor: + """Pad action tensor to max_action_dim along the last dimension. + + Args: + action: Action tensor of shape (T, D) where D is the current action dimension. + max_action_dim: Target action dimension to pad to. + + Returns: + Padded action tensor of shape (T, max_action_dim). 
+ """ + if action.shape[-1] > max_action_dim: + raise ValueError(f"Action dimension {action.shape[-1]} is greater than max_action_dim {max_action_dim}") + elif action.shape[-1] == max_action_dim: + return action + else: + padding_size = max_action_dim - action.shape[-1] + zero_padding = torch.zeros( + *action.shape[:-1], padding_size, dtype=action.dtype, device=action.device + ) # [T,padding_size] + return torch.cat([action, zero_padding], dim=-1) # [T,max_action_dim] + + +def find_closest_target_size(h: int, w: int, resolution: str | int) -> tuple[int, int]: + """Find the closest predefined target size for a given input resolution. + + Looks up ``VIDEO_RES_SIZE_INFO[resolution]`` and selects the aspect ratio + whose ``H/W`` ratio is closest to the input ``h/w``. + + Args: + h: Input height in pixels. + w: Input width in pixels. + resolution: Resolution tier key (e.g. ``"256"``, ``"480"``, ``"720"``). + + Returns: + ``(target_w, target_h)`` from the predefined table. + + Raises: + ValueError: If *resolution* is not a key in ``VIDEO_RES_SIZE_INFO``. + """ + if isinstance(resolution, int): + resolution = str(resolution) + if resolution not in VIDEO_RES_SIZE_INFO: + raise ValueError( + f"Resolution '{resolution}' not found in VIDEO_RES_SIZE_INFO. Available: {list(VIDEO_RES_SIZE_INFO.keys())}" + ) + + candidates = VIDEO_RES_SIZE_INFO[resolution] + input_ratio = h / w + + best_key: str | None = None + best_diff = float("inf") + for aspect_key, (cand_w, cand_h) in candidates.items(): + cand_ratio = cand_h / cand_w + diff = abs(input_ratio - cand_ratio) + if diff < best_diff: + best_diff = diff + best_key = aspect_key + + assert best_key is not None + target_w, target_h = candidates[best_key] + return target_w, target_h + + +def reflection_pad_to_target( + data_dict: dict, + keys: list[str], + keep_aspect_ratio: bool, + target_w: int, + target_h: int, +) -> dict: + """Resize (aspect-preserving) and reflection-pad tensors to exact target size. 
+ + For each key in *keys*, the tensor is: + + 1. Resized so its spatial dimensions fit within ``(target_h, target_w)`` + while preserving the aspect ratio (matching VFM's + ``ResizeLargestSideAspectPreserving``). + 2. Reflection-padded (or edge-padded when the padding exceeds the spatial + dimension) to reach exactly ``(target_h, target_w)`` (matching VFM's + ``ReflectionPadding``). + + After processing, the following entries are added to *data_dict*: + + - ``"image_size"``: ``torch.Tensor`` of shape ``(4,)`` containing + ``[target_h, target_w, orig_h_resized, orig_w_resized]`` where + ``target_h/w`` is the padded canvas size and ``orig_h/w_resized`` + is the original spatial size after aspect-preserving resize (i.e. + the content region before padding). After ``default_collate`` + this becomes ``(B, 4)``; the ``IterativeJointDataLoader`` then + splits it into per-sample ``(1, 4)`` tensors so the model can + index as ``data_batch["image_size"][i][0][0]``. + + Args: + data_dict: The sample dictionary (mutated in-place). + keys: Data-dict keys whose tensors should be resized and padded. + Tensors must have shape ``(C, H, W)`` or ``(C, T, H, W)``. + keep_aspect_ratio: Whether to keep the aspect ratio of the input tensor. + target_w: Target width in pixels. + target_h: Target height in pixels. + + Returns: + The mutated *data_dict*. + """ + orig_h_resized: int = 0 + orig_w_resized: int = 0 + + for key in keys: + if key not in data_dict: + continue + tensor = data_dict[key] + if not isinstance(tensor, torch.Tensor): + continue + + # Extract spatial dims + if tensor.ndim == 3: + orig_h, orig_w = tensor.shape[-2:] + elif tensor.ndim == 4: + orig_h, orig_w = tensor.shape[-2:] + else: + raise ValueError(f"Unexpected tensor ndim={tensor.ndim} for key '{key}', expected 3 or 4") + + # Step 1: aspect-preserving resize to fit within (target_h, target_w) + if keep_aspect_ratio: + # Prevent upscaling the video by setting the upper bound of scaling_ratio to 1.0. 
+ scaling_ratio = min(target_w / orig_w, target_h / orig_h, 1.0) + orig_h_resized = int(scaling_ratio * orig_h + 0.5) + orig_w_resized = int(scaling_ratio * orig_w + 0.5) + assert orig_h_resized <= target_h and orig_w_resized <= target_w, ( + f"Resize error: orig ({orig_h}, {orig_w}) target ({target_h}, {target_w}) " + f"computed ({orig_h_resized}, {orig_w_resized})" + ) + else: + orig_h_resized = target_h + orig_w_resized = target_w + + if orig_h_resized != orig_h or orig_w_resized != orig_w: + tensor = transforms_F.resize( + tensor, + size=[orig_h_resized, orig_w_resized], + interpolation=transforms_F.InterpolationMode.BICUBIC, + antialias=True, + ) + + # Step 2: padding to exact target size (bottom and right only) + if orig_w_resized != target_w or orig_h_resized != target_h: + padding_right = target_w - orig_w_resized + padding_bottom = target_h - orig_h_resized + padding = [0, 0, padding_right, padding_bottom] + + if padding_right >= orig_w_resized or padding_bottom >= orig_h_resized: + tensor = transforms_F.pad(tensor, padding, padding_mode="edge") + else: + tensor = transforms_F.pad(tensor, padding, padding_mode="reflect") + + data_dict[key] = tensor + + # image_size: shape (4,) — [target_h, target_w, orig_h_resized, orig_w_resized]. + # Matches VFM's item_dataset convention. default_collate stacks to (B, 4); + # IterativeJointDataLoader._get_next_sample slices to (1, 4) per sample so + # the model can index [i][0][0]. + data_dict["image_size"] = torch.tensor( + [target_h, target_w, orig_h_resized, orig_w_resized], dtype=torch.float + ) # [4] + + return data_dict + + +def remove_reflection_padding( + tensor: torch.Tensor, + image_size: torch.Tensor, +) -> torch.Tensor: + """Remove reflection padding added by :func:`reflection_pad_to_target`. + + Content is at top-left; crops to ``(orig_h_resized, orig_w_resized)``. + + Args: + tensor: Tensor whose last two dimensions are the padded spatial dims. + Supports any leading dimensions, e.g. 
``(C, T, H, W)`` or + ``(C, H, W)``. + image_size: 1-D tensor of shape ``(4,)`` containing + ``[target_h, target_w, orig_h_resized, orig_w_resized]`` where + ``orig_h/w_resized`` is the original spatial size after + aspect-preserving resize (i.e. the content region before + padding), the same convention stored by + :func:`reflection_pad_to_target` and VFM's + ``ReflectionPadding``. + + Returns: + Cropped tensor of shape ``(..., orig_h_resized, orig_w_resized)``. + """ + target_h = int(image_size[0].item()) + target_w = int(image_size[1].item()) + orig_h_resized = int(image_size[2].item()) + orig_w_resized = int(image_size[3].item()) + + if orig_h_resized == target_h and orig_w_resized == target_w: + return tensor + + return tensor[..., :orig_h_resized, :orig_w_resized].contiguous() + + +def build_sequence_plan_from_mode( + mode: str, + video_length: int, + action_length: int, + has_text: bool = True, + video_temporal_downsample: int = 4, + num_history_actions: int = 0, +) -> SequencePlan: + """Build a SequencePlan based on the training mode. + + This function determines whether action should be included and computes the + appropriate condition frame indexes for vision and action based on the mode. + + Args: + mode: Training mode. One of: + - "image2video": Image-to-video generation (no action) + - "forward_dynamics": Predict video given first frame and all actions + - "inverse_dynamics": Predict actions given all video frames + - "policy": Predict both actions and video given first frame + video_length: Number of video frames (including the conditioning frame). + action_length: Number of action steps, including any prepended history + actions (typically video_length - 1 when there is no history). + has_text: Whether text conditioning is available. Defaults to True. + video_temporal_downsample: Temporal downsampling factor of the video + tokenizer. Used to compute condition frame indexes for inverse + dynamics mode. Defaults to 4. + num_history_actions: Number of history action steps prepended to the + action sequence and counted in ``action_length``. History steps are + always treated as conditioning. Defaults to 0. + + Returns: + SequencePlan instance with appropriate settings.
+ Use ``sequence_plan.has_action`` to check if action should be included. + + Raises: + ValueError: If mode is not one of the supported modes. + + Example: + >>> sequence_plan = build_sequence_plan_from_mode( + ... mode="policy", + ... video_length=5, + ... action_length=4, + ... ) + >>> sequence_plan.has_action + True + >>> sequence_plan.as_dict() + {'has_text': True, 'has_vision': True, 'has_action': True, + 'condition_frame_indexes_vision': [0], 'condition_frame_indexes_action': []} + """ + valid_modes = ["image2video", "forward_dynamics", "inverse_dynamics", "policy"] + if mode not in valid_modes: + raise ValueError(f"Invalid mode: {mode!r}. Must be one of {valid_modes}") + + # Determine if action should be included based on mode + # image2video mode: no action (pure image-to-video generation) + # forward_dynamics, inverse_dynamics, policy: action is needed + has_action = mode != "image2video" + + # Determine condition frame indexes based on mode + # image2video/forward_dynamics/policy: first frame is clean (conditioning) + # inverse_dynamics: all frames are provided as context + if mode in ["image2video", "forward_dynamics", "policy"]: + condition_frame_indexes_vision = [0] + elif mode == "inverse_dynamics": + # All frames are observed for inverse dynamics + condition_frame_indexes_vision = list(range(0, (video_length - 1) // video_temporal_downsample + 1)) + else: + condition_frame_indexes_vision = [] + + # For action conditioning indexes: + # forward_dynamics: all action steps are clean (conditioning) + # inverse_dynamics/policy: action is supervised (predicted) + # History frames (prepended) are always conditioning. 
+ base_action_length = action_length - num_history_actions + # Excluding history, the action sequence must have either video_length - 1 steps + # (one action per frame transition) or video_length steps (the first action is then + # a conditioning action). Any other length would leave the variables below unassigned. + if base_action_length not in (video_length - 1, video_length): + raise ValueError( + f"Action length excluding history ({base_action_length}) must equal video_length - 1 " + f"or video_length (video_length={video_length})" + ) + if mode == "forward_dynamics": + condition_frame_indexes_action = list(range(action_length)) + elif base_action_length == video_length - 1: + condition_frame_indexes_action = list(range(num_history_actions)) + else: + condition_frame_indexes_action = list(range(num_history_actions + 1)) + + if base_action_length == video_length - 1: + action_start_frame_offset = 1 - num_history_actions + else: + action_start_frame_offset = -num_history_actions + + return SequencePlan( + has_text=has_text, + has_vision=True, + has_action=has_action, + condition_frame_indexes_vision=condition_frame_indexes_vision, + condition_frame_indexes_action=condition_frame_indexes_action, + action_start_frame_offset=action_start_frame_offset, + ) + + +class VideoResize: + """Resize and reflection-pad video-aligned tensors for a single sample. + + Resolution is supplied at call time. When ``resolution`` is ``None``, the + tier is auto-detected from the sample's ``"video"`` spatial dimensions. + + Args: + pad_keys: Data-dict keys whose values should be resized and padded. + Pass an empty list to disable padding entirely. Defaults to + ``["video"]``. + keep_aspect_ratio: Whether to resize aspect-preservingly to the closest + predefined target size before padding. Defaults to ``True``. + log_prefix: Prefix used in debug logging.
+ """ + + def __init__( + self, + pad_keys: list[str] | None = None, + keep_aspect_ratio: bool = True, + log_prefix: str = "VideoResize", + ) -> None: + self.pad_keys = pad_keys if pad_keys is not None else ["video"] + self.keep_aspect_ratio = keep_aspect_ratio + self.log_prefix = log_prefix + + def __call__(self, data_dict: dict, resolution: str | int | None) -> dict: + """Resize and pad a sample in-place. + + Args: + data_dict: Sample dictionary containing a ``"video"`` entry. + resolution: Resolution tier key (e.g. ``"256"``, ``"480"``, + ``"720"``). When ``None``, auto-detected from video dimensions. + + Returns: + The same dictionary, mutated in-place with padded tensors and an + ``"image_size"`` entry. + """ + video = data_dict.get("video") + assert isinstance(video, torch.Tensor), "video is required for reflection padding" + h, w = video.shape[-2:] + + if resolution is None: + resolution = get_vision_data_resolution((h, w)) + + if self.keep_aspect_ratio: + target_w, target_h = find_closest_target_size(h, w, resolution) + else: + target_w = int(resolution) + target_h = int(resolution) + reflection_pad_to_target(data_dict, self.pad_keys, self.keep_aspect_ratio, target_w, target_h) + + return data_dict + + def _log_shapes(self, data_dict: dict, when: str) -> None: + """Log tensor shapes for the configured pad keys.""" + for key in self.pad_keys: + val = data_dict.get(key) + if isinstance(val, torch.Tensor): + log.debug(f"{self.log_prefix}: {when} padding '{key}' shape = {tuple(val.shape)}") + + +class ActionTransformPipeline: + """A composable transform pipeline that chains ``VideoResize``, text + tokenization, and automatic sequence plan construction. + + Reflection padding snaps each sample to the closest predefined aspect + ratio from ``VIDEO_RES_SIZE_INFO[resolution]``, resizes + (aspect-preserving) to fit within the target, then reflection-pads to + the exact target size. 
This guarantees a bounded set of output shapes + (5 per resolution tier), all multiples of 16. Resolution is supplied + at call time via the required ``resolution`` argument to ``__call__``; + when ``resolution`` is ``None``, the tier is auto-detected from the + video's spatial dimensions via ``get_vision_data_resolution``. + + Text tokenization is enabled when ``tokenizer_config`` is provided. + + When the data dictionary contains a ``"mode"`` key, the pipeline automatically + builds a ``SequencePlan`` via :func:`build_sequence_plan_from_mode` and attaches + it as ``data_dict["sequence_plan"]``. For modes where action is not needed + (e.g. ``"image2video"``), the ``"action"`` and ``"domain_id"`` keys are set to + ``None``. + + Args: + pad_keys: Data-dict keys whose values should be resized and padded. Pass + an empty list to disable padding entirely. Defaults to ``["video"]``. + tokenizer_config: A lazy-instantiable config dict for the VLM tokenizer. When + ``None``, text tokenization is skipped. Defaults to ``None``. + cfg_dropout_rate: Probability of replacing the caption with an empty string for + classifier-free guidance. Only used when text tokenization is enabled. + Defaults to ``0.0``. + caption_key: The data-dict key that contains the input caption string. + Defaults to ``"ai_caption"``. + text_token_key: The data-dict key where tokenized text IDs will be stored. + Defaults to ``"text_token_ids"``. + video_temporal_downsample: Temporal downsampling factor of the video tokenizer. + Used when building a ``SequencePlan`` for ``"inverse_dynamics"`` mode. + Defaults to 4. + max_action_dim: Target action dimension to pad to. The ``"action"`` tensor + in every sample is padded along its last dimension via + :func:`pad_action_to_max_dim`. Defaults to 32. + action_channel_masking: When ``True`` (default), the original action + dimension is stored in ``"raw_action_dim"`` so that the model masks + loss/noise/velocity on zero-padded action channels. 
When ``False``, + ``"raw_action_dim"`` is set to ``None`` and the model treats all + ``max_action_dim`` channels equally (original main-branch behavior). + append_viewpoint_info: Whether to append viewpoint type metadata to the + caption (via ``ViewpointTextInfo`` augmentor). Requires that + samples contain a ``"viewpoint"`` key. Defaults to ``True``. + append_duration_fps_timestamps: Whether to append duration and FPS metadata to the + caption (matching VFM's ``DurationFPSTextTimeStamps`` augmentor). + Defaults to ``True``. + append_resolution_info: Whether to append resolution metadata to the + caption (matching VFM's ``ResolutionTextInfo`` augmentor). + Defaults to ``True``. + append_idle_frames: Whether to append the idle-frame count out of the + total action frames to the caption (Pi0.7-style metadata, via + ``IdleFramesTextInfo`` augmentor). The dataset is responsible for + populating ``data_dict["idle_frames"]``; samples without it are + silently skipped. Idle-frame text is skipped only for + ``"inverse_dynamics"`` mode. Defaults to ``False`` so existing + experiments are unaffected. + idle_frames_dropout: Per-field dropout rate for the idle-frame segment. + With this probability the augmentor leaves the caption unchanged + (matching Pi0.7's ~5% per-component dropout). Independent of the + global ``cfg_dropout_rate``, which empties the whole caption. + Defaults to 0.05. + format_prompt_as_json: Whether to replace the plain text prompt with a + structured JSON-compatible dictionary before tokenization. When + enabled, legacy string metadata appenders are skipped and the JSON + formatter owns viewpoint, action, resolution, duration, FPS, and + idle-frame fields. Defaults to ``False``. 
+ """ + + def __init__( + self, + pad_keys: list[str] | None = None, + keep_aspect_ratio: bool = True, + tokenizer_config: dict | None = None, + cfg_dropout_rate: float = 0.0, + caption_key: str = "ai_caption", + text_token_key: str = "text_token_ids", + video_temporal_downsample: int = 4, + max_action_dim: int = 32, + action_channel_masking: bool = True, + append_viewpoint_info: bool = True, + append_duration_fps_timestamps: bool = True, + append_resolution_info: bool = True, + append_idle_frames: bool = False, + idle_frames_dropout: float = 0.05, + format_prompt_as_json: bool = False, + ) -> None: + self.caption_key: str = caption_key + self.video_temporal_downsample: int = video_temporal_downsample + self.max_action_dim: int = max_action_dim + self.action_channel_masking: bool = action_channel_masking + + # --- Spatial resize/padding stage (resolution supplied at call time) --- + self.video_resize: VideoResize = VideoResize( + pad_keys=pad_keys, + keep_aspect_ratio=keep_aspect_ratio, + log_prefix="ActionTransformPipeline", + ) + self.pad_keys: list[str] = self.video_resize.pad_keys + self.keep_aspect_ratio: bool = self.video_resize.keep_aspect_ratio + + self.prompt_json_formatter: ActionPromptJsonFormatter | None = None + if format_prompt_as_json: + self.prompt_json_formatter = ActionPromptJsonFormatter(caption_key=caption_key) + + # --- Viewpoint text augmentor (runs after ai_caption, before duration/FPS) --- + self.viewpoint_augmentor: ViewpointTextInfo | None = None + if append_viewpoint_info and self.prompt_json_formatter is None: + self.viewpoint_augmentor = ViewpointTextInfo( + input_keys=[caption_key, "viewpoint"], + output_keys=[caption_key], + args={"caption_key": caption_key, "viewpoint_key": "viewpoint", "enabled": True}, + ) + + # --- Duration/FPS text augmentor (runs before tokenization) --- + self.duration_fps_augmentor: DurationFPSTextTimeStamps | None = None + if append_duration_fps_timestamps and self.prompt_json_formatter is None: + 
self.duration_fps_augmentor = DurationFPSTextTimeStamps( + input_keys=[caption_key, "video", "conditioning_fps"], + output_keys=[caption_key], + args={"caption_key": caption_key, "video_key": "video", "fps_key": "conditioning_fps"}, + ) + + # --- Resolution text augmentor (runs before tokenization) --- + self.resolution_info_augmentor: ResolutionTextInfo | None = None + if append_resolution_info and self.prompt_json_formatter is None: + self.resolution_info_augmentor = ResolutionTextInfo( + input_keys=[caption_key, "video", "image_size"], + output_keys=[caption_key], + args={"caption_key": caption_key, "video_key": "video", "enabled": True}, + ) + + # --- IdleFrames text augmentor (Pi0.7-style episode metadata) --- + # Runs after resolution info, before tokenization. Per-field dropout is + # independent from the tokenizer's global cfg_dropout_rate. + self.idle_frames_augmentor: IdleFramesTextInfo | None = None + if append_idle_frames and self.prompt_json_formatter is None: + self.idle_frames_augmentor = IdleFramesTextInfo( + input_keys=[caption_key, "idle_frames", "action"], + output_keys=[caption_key], + args={ + "caption_key": caption_key, + "idle_frames_key": "idle_frames", + "action_key": "action", + "dropout_rate": idle_frames_dropout, + "enabled": True, + }, + ) + + # --- Text tokenizer augmentor --- + self.text_tokenizer: TextTokenizerTransform | None = None + if tokenizer_config is not None: + self.text_tokenizer = TextTokenizerTransform( + input_keys=[caption_key], + output_keys=[text_token_key], + args={ + "tokenizer_config": tokenizer_config, + "cfg_dropout_rate": cfg_dropout_rate, + }, + ) + + def __call__(self, data_dict: dict, resolution: str | None) -> dict: + """Apply the transform pipeline to a single data dictionary. + + Resolution is required at call time and is the only source of truth + for this sample. When ``resolution`` is ``None``, the tier is + auto-detected from the video's spatial dimensions. + + The pipeline runs in order: + + 1. 
Resize + reflection-pad spatial dimensions to the closest + predefined target from ``VIDEO_RES_SIZE_INFO[resolution]``. + 2. Format the caption as a structured JSON prompt (if enabled); the + string appenders in steps 3-6 are then skipped. + 3. Otherwise, append viewpoint type metadata to caption (if enabled). + 4. Append duration/FPS metadata to caption (if enabled). + 5. Append resolution metadata to caption (if enabled). + 6. Append idle-frame metadata (Pi0.7-style) to caption unless the + sample is in inverse dynamics mode (if enabled). + 7. Tokenize caption text (if enabled). + 8. Build a ``SequencePlan`` from the required ``"mode"`` key. + 9. If action is needed by the plan, pad ``"action"`` to ``max_action_dim``. + 10. Otherwise, nullify ``"action"`` and ``"domain_id"`` (e.g. in + ``"image2video"`` mode). + + Args: + data_dict: A sample dictionary as returned by an Action dataset. + Must contain a ``"mode"`` entry. + resolution: Resolution tier key (e.g. ``"256"``, ``"480"``, ``"720"``) + for this sample. When ``None``, auto-detected from video dimensions. + + Returns: + The same dictionary, mutated in-place with padded tensors, + ``image_size``, tokenized text IDs, and a + ``"sequence_plan"`` entry added. + """ + mode = data_dict.get("mode") + assert mode is not None, "mode is required" + + # 1. Resize + reflection-pad spatial dimensions to the closest predefined target from ``VIDEO_RES_SIZE_INFO[resolution]``. + data_dict = self.video_resize(data_dict, resolution) + + # 2. Format the caption as structured JSON when requested; otherwise run the legacy string appenders. + if self.prompt_json_formatter is not None: + data_dict = self.prompt_json_formatter(data_dict) + else: + # 3. Append viewpoint type metadata to caption (if enabled). + if self.viewpoint_augmentor is not None: + result = self.viewpoint_augmentor(data_dict) + if result is not None: + data_dict = result + + # 4. Append duration/FPS metadata to caption (if enabled).
+ if self.duration_fps_augmentor is not None: + result = self.duration_fps_augmentor(data_dict) + if result is not None: + data_dict = result + + # 5. Append resolution metadata to caption (if enabled). + if self.resolution_info_augmentor is not None: + result = self.resolution_info_augmentor(data_dict) + if result is not None: + data_dict = result + + # 6. Append idle-frame metadata to caption (if enabled for this mode). + if self.idle_frames_augmentor is not None and _should_append_idle_frame_info(mode): + result = self.idle_frames_augmentor(data_dict) + if result is not None: + data_dict = result + + # 7. Tokenize caption text (if enabled). + if self.text_tokenizer is not None: + data_dict = self.text_tokenizer(data_dict) + + # 8. Build a ``SequencePlan`` from the ``"mode"`` key (if present). + video = data_dict.get("video") + action = data_dict.get("action") + assert video is not None, "video is required" + video_length = video.shape[1] # [C,T,H,W] -> T + action_length = action.shape[0] if isinstance(action, torch.Tensor) else max(video_length - 1, 0) + + # Prepend history action frames (ground-truth conditioning) if present. 
+ history_action = data_dict.pop("history_action", None) + num_history_actions = 0 + if history_action is not None and isinstance(action, torch.Tensor): + num_history_actions = history_action.shape[0] + action = torch.cat([history_action, action], dim=0) + action_length += num_history_actions + + sequence_plan = build_sequence_plan_from_mode( + mode=mode, + video_length=video_length, + action_length=action_length, + video_temporal_downsample=self.video_temporal_downsample, + num_history_actions=num_history_actions, + ) + data_dict["sequence_plan"] = sequence_plan + + if sequence_plan.has_action: + assert isinstance(action, torch.Tensor), "action tensor is required when sequence plan has action" + data_dict["raw_action_dim"] = torch.tensor(action.shape[1]) if self.action_channel_masking else None + data_dict["action"] = pad_action_to_max_dim(action, self.max_action_dim) + else: + # Nullify action-related fields when action is not needed so the + # collate function can simply stack all non-None actions. + data_dict["raw_action_dim"] = None + data_dict["action"] = None + data_dict["domain_id"] = None + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/action/unified_dataset.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/unified_dataset.py new file mode 100644 index 00000000..ca960e72 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/unified_dataset.py @@ -0,0 +1,594 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Unified iterable dataset for Action multi-embodiment robot data. + +``ActionUnifiedIterableDataset`` is the Layer 2 component of the Action data loading +pipeline. It wraps *all* Action datasets into a single ``IterableDataset`` and +handles: + +- **Rank-level dataset assignment** (Hare-Niemeyer proportional allocation) +- **Worker-level shard distribution** (round-robin within a dataset family) +- **Per-sample transforms** via :class:`~.transforms.ActionTransformPipeline` +- **Weighted random fallback** when worker assignment is not active + +See ``docs/dataloader.md`` for the full design document. +""" + +from __future__ import annotations + +import gc +import random +import warnings +from collections.abc import Iterator, Mapping, Sequence +from typing import Any + +from torch.utils.data import Dataset, IterableDataset + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.action.transforms import ActionTransformPipeline +from cosmos3._src.vfm.datasets.utils import VIDEO_RES_SIZE_INFO + +_iterable_dataset_len_warning_suppressed = False + + +def _suppress_iterable_dataset_len_warning() -> None: + """Register a one-time filter for PyTorch's IterableDataset len() warning. + + The inner datasets may not implement ``__len__``, so the wrapper reports + ``len()=0``. PyTorch's iterator then warns on every ``__next__`` when + samples are fetched. This filter suppresses that warning. 
+ """ + global _iterable_dataset_len_warning_suppressed + if _iterable_dataset_len_warning_suppressed: + return + _iterable_dataset_len_warning_suppressed = True + warnings.filterwarnings( + "ignore", + message="Length of IterableDataset.*was reported to be 0", + category=UserWarning, + module="torch.utils.data.dataloader", + ) + + +# --------------------------------------------------------------------------- +# Worker-side periodic garbage collection +# --------------------------------------------------------------------------- +# DataLoader workers running IterableDatasets are long-lived forked processes. +# Complex sample dictionaries (nested dicts, tensors, Arrow references) can +# create circular-reference chains that Python's reference counting alone +# cannot free. The generational GC *does* collect them eventually, but its +# default thresholds are too conservative for high-throughput data loading, +# causing RSS to grow monotonically until the node OOMs. +# +# Calling ``gc.collect()`` periodically inside the worker iteration loop +# eliminates the leak with negligible overhead (<1 ms per call vs ~6 s +# iteration time). +# +_GC_INTERVAL: int = 10 + + +def _maybe_gc(interval: int, count: int) -> int: + """Increment *count* and run ``gc.collect()`` every *interval* samples.""" + if interval <= 0: + return count + count += 1 + if count % interval == 0: + gc.collect() + return count + + +class ActionUnifiedIterableDataset(IterableDataset): + """Single IterableDataset wrapping all Action datasets. + + Handles worker-to-dataset assignment, shard distribution, and transforms. + + Args: + datasets: List of dicts, each with keys ``"name"`` (str identifier), + ``"dataset"`` (the dataset instance), and ``"ratio"`` (float + sampling weight). + transform: Transform pipeline applied to every yielded sample. + shard_across_workers: When ``True``, ranks are assigned to + dataset families via Hare-Niemeyer and workers get round-robin + shards. 
When ``False`` (default), every worker loads all + datasets and iterates with weighted random selection. + """ + + def __init__( + self, + datasets: list[dict[str, Any]], + transform: ActionTransformPipeline, + shard_across_workers: bool = False, + ) -> None: + super().__init__() + self._datasets = datasets + self._transform = transform + self._shard_across_workers = shard_across_workers + + # Set per-worker by assign_worker; None means not yet assigned. + self._dataset: Any | None = None + self._resolution: str | None = None # resolution for single-dataset path + self._sources_initialized = False + + # Backward compat: expose ``self.dataset`` pointing to the first + # inner dataset and ``self.transform`` exposing the pipeline + # (mirrors old TransformedIterableDataset interface). + self.dataset = datasets[0]["dataset"] if datasets else None + self.transform = transform + + # -- source initialization ------------------------------------------------ + + def _ensure_sources_registered(self) -> None: + if self._sources_initialized: + return + self._sources_initialized = True + for entry in self._datasets: + ds = entry["dataset"] + shard_roots = getattr(ds, "_all_shard_roots", []) + if shard_roots and hasattr(ds, "_register_sources"): + ds._register_sources() + + # -- backward-compat helpers ----------------------------------------------- + + def __len__(self) -> int: # type: ignore[override] + total = 0 + for entry in self._datasets: + ds = entry["dataset"] + try: + total += len(ds) # type: ignore[arg-type] + except TypeError: + pass + return total + + def __getattr__(self, name: str) -> Any: + """Forward attribute lookups to the first inner dataset.""" + if name.startswith("_") or not self._datasets: + raise AttributeError(name) + return getattr(self._datasets[0]["dataset"], name) + + # -- Hare-Niemeyer rank allocation ----------------------------------------- + + @staticmethod + def _compute_rank_ranges( + datasets: list[dict[str, Any]], + world_size: int, + ) -> 
list[tuple[int, int]]: + """Hare-Niemeyer allocation of ranks to datasets. + + Guarantees at least 1 rank per dataset, distributes the rest + proportionally. Returns a list of ``(start_rank, end_rank)`` ranges. + + Raises: + ValueError: If ``world_size < len(datasets)``. + """ + n_ds = len(datasets) + if world_size < n_ds: + raise ValueError(f"world_size ({world_size}) must be >= number of datasets ({n_ds})") + ratios = [d["ratio"] for d in datasets] + total = sum(ratios) + + # Hare-Niemeyer (largest-remainder) method: + # 1. Give every dataset a guaranteed minimum of 1 rank. + # 2. Distribute the leftover ranks proportionally to each dataset's + # ratio. Take the floor of each fractional allocation, then award + # the still-unassigned ranks one-by-one to datasets with the + # largest fractional remainders. + # Example: world_size=8, ratios=[3, 1] (2 datasets) + # remaining = 8 - 2 = 6 + # fractional = [6*3/4, 6*1/4] = [4.5, 1.5] + # floors = [4, 1], remainders = [0.5, 0.5], leftover = 1 + # award 1 extra to first dataset -> floors = [5, 1] + # counts = [1+5, 1+1] = [6, 2] + counts = [1] * n_ds + remaining = world_size - n_ds + if remaining > 0: + fractional = [remaining * r / total for r in ratios] + floors = [int(f) for f in fractional] + remainders = [f - fl for f, fl in zip(fractional, floors)] + leftover = remaining - sum(floors) + for idx in sorted(range(n_ds), key=lambda j: -remainders[j])[:leftover]: + floors[idx] += 1 + counts = [1 + f for f in floors] + + # Convert per-dataset counts into contiguous rank intervals. + # Example continued: counts=[6, 2] -> ranges=[(0,6), (6,8)] + # ranks 0..5 serve dataset 0, ranks 6..7 serve dataset 1. 
+ ranges: list[tuple[int, int]] = [] + cursor = 0 + for c in counts: + ranges.append((cursor, cursor + c)) + cursor += c + return ranges + + # -- worker assignment ----------------------------------------------------- + + def assign_worker( + self, + worker_id: int, + num_workers: int, + rank: int, + world_size: int, + ) -> None: + """Assign this worker to a dataset family and distribute shards. + + Called by the DataLoader's ``worker_init_fn`` (via + :func:`~.dataloaders.create_action_worker_init_fn`) -- not by the + dataset itself. + + Two-level assignment: + + 1. **Rank -> dataset family** (Hare-Niemeyer over *world_size* + ranks). Every rank is fully dedicated to one family. + 2. **Workers -> shards** (round-robin within the family's worker + pool). ``family_worker_id = rank_within_family * num_workers + + worker_id``. + + When ``shard_across_workers=False``: no assignment is performed. + Every worker loads all datasets and ``__iter__`` uses weighted + random selection. + """ + self._sources_initialized = True + if not self._shard_across_workers: + for entry in self._datasets: + ds = entry["dataset"] + shard_roots = getattr(ds, "_all_shard_roots", []) + if shard_roots and hasattr(ds, "_register_sources"): + ds._register_sources() + return + + rank_ranges = self._compute_rank_ranges(self._datasets, world_size) + + # Step 1: which dataset family does this rank belong to? + # ``rank_ranges`` is a list of (start_rank, end_rank) intervals -- one + # per dataset family -- produced by ``_compute_rank_ranges()`` above + # using Hare-Niemeyer allocation. The intervals are contiguous and + # non-overlapping, covering [0, world_size), so every rank belongs to + # exactly one family. + # + # We scan through the intervals to find the one containing this rank, + # then derive two values: + # - rank_within_family: this rank's 0-based position inside its + # family (used in Step 2 to build a globally unique worker id). 
+ # - num_family_ranks: total number of ranks assigned to this family + # (used in Step 2 to compute the family's worker pool size). + # + # ``self._dataset`` is set to the matched family's dataset object so + # that ``__iter__`` only yields samples from this one dataset. + # + # Example with world_size=8, ratios=[3,1] -> ranges=[(0,6), (6,8)]: + # rank 3 -> family 0, rank_within_family=3, num_family_ranks=6 + # rank 6 -> family 1, rank_within_family=0, num_family_ranks=2 + num_family_ranks = 1 + rank_within_family = 0 + for i, (start_rank, end_rank) in enumerate(rank_ranges): + if start_rank <= rank < end_rank: + entry = self._datasets[i] + self._dataset = entry["dataset"] + self._resolution = entry["resolution"] + rank_within_family = rank - start_rank + num_family_ranks = end_rank - start_rank + break + + # Step 2: distribute shards across workers within the family. + # Each rank spawns ``num_workers`` DataLoader workers (set by the + # DataLoader's ``num_workers`` arg). So the family's total worker + # pool is ``num_family_ranks * num_workers``. + # + # We flatten the 2D index (rank_within_family, worker_id) into a + # single linear ``family_worker_id`` so every worker in the family + # gets a globally unique id within that family: + # family_worker_id = rank_within_family * num_workers + worker_id + # + # Example: family has 3 ranks, each rank spawns 2 workers -> 6 total: + # rank_within_family=0: worker_id 0 -> fwid 0, worker_id 1 -> fwid 1 + # rank_within_family=1: worker_id 0 -> fwid 2, worker_id 1 -> fwid 3 + # rank_within_family=2: worker_id 0 -> fwid 4, worker_id 1 -> fwid 5 + # + # This linear id is then used for round-robin shard assignment below. + family_total_workers = num_family_ranks * num_workers + family_worker_id = rank_within_family * num_workers + worker_id + + # Round-robin assignment: worker k gets shards k, k+stride, k+2*stride, ... + # This ensures shards are evenly spread across the family's workers. 
+ # + # When family_total_workers > num_shards, some workers get an empty + # list from range() (any worker with family_worker_id >= num_shards, + # since start >= stop). The ``if not my_shards`` guard catches this + # and falls back to ``family_worker_id % num_shards``, wrapping the + # worker around to an existing shard so it shares rather than idles. + # + # Example: AgiBotWorld with 190 shards and 256 family workers: + # Workers 0-189 -> each gets 1 unique shard via range() + # Workers 190-255 -> empty range, fallback to family_worker_id % 190, + # sharing a shard with an earlier worker. + # + # Multiple workers reading the same shard is fine because each worker + # has a different RNG seed (``seed + rank * 9999 + worker_id``), so + # they produce different sample orderings from the same underlying data. + shard_roots = getattr(self._dataset, "_all_shard_roots", []) + if shard_roots and hasattr(self._dataset, "_register_sources"): + num_shards = len(shard_roots) + my_shards = list(range(family_worker_id, num_shards, family_total_workers)) + if not my_shards: + my_shards = [family_worker_id % num_shards] + self._dataset._register_sources(my_shards) + + # -- iteration ------------------------------------------------------------- + + def _iter_all_datasets_weighted(self) -> Iterator[dict[str, Any]]: + """Iterate all datasets with weighted random selection. + + Used when ``shard_across_workers=False`` (every worker sees all + datasets) or as the ``num_workers=0`` fallback. 
+ """ + iterators = [iter(d["dataset"]) for d in self._datasets] + ratios = [d["ratio"] for d in self._datasets] + total = sum(ratios) + weights = [r / total for r in ratios] + + gc_count = 0 + + while True: + chosen = random.choices(range(len(self._datasets)), weights=weights, k=1)[0] + resolution = self._datasets[chosen]["resolution"] + try: + yield self._transform(next(iterators[chosen]), resolution=resolution) + except StopIteration: + iterators[chosen] = iter(self._datasets[chosen]["dataset"]) + try: + yield self._transform(next(iterators[chosen]), resolution=resolution) + except StopIteration: + continue + gc_count = _maybe_gc(_GC_INTERVAL, gc_count) + + def __iter__(self) -> Iterator[dict[str, Any]]: + if self._dataset is not None: + gc_count = 0 + for sample in self._dataset: + yield self._transform(sample, resolution=self._resolution) + gc_count = _maybe_gc(_GC_INTERVAL, gc_count) + return + + if not self._shard_across_workers: + self._ensure_sources_registered() + yield from self._iter_all_datasets_weighted() + return + + # num_workers=0 fallback (shard_across_workers=True but no worker + # processes exist, so assign_worker was never called). + log.warning( + "ActionUnifiedIterableDataset: num_workers=0 fallback — " + "loading ALL datasets in main process. Use only for debugging." + ) + self._ensure_sources_registered() + yield from self._iter_all_datasets_weighted() + + +class MapToIterableAdapter(IterableDataset): + """Wraps a map-style ``Dataset`` as an ``IterableDataset``. + + Each iteration yields a sample from a uniformly random index, using + ``random.randint`` for O(1) time and zero extra memory. The per-worker + RNG seed (set by :func:`~.dataloaders.create_action_worker_init_fn`) ensures + different DataLoader workers produce different random sequences. + + Args: + dataset: A map-style ``Dataset`` with ``__len__`` and ``__getitem__``. 
+ """ + + def __init__(self, dataset: Dataset) -> None: + super().__init__() + self.dataset = dataset + + def __len__(self) -> int: # type: ignore[override] + return len(self.dataset) # type: ignore[arg-type] + + def __iter__(self) -> Iterator: + n = len(self.dataset) # type: ignore[arg-type] + while True: + yield self.dataset[random.randint(0, n - 1)] + + def __getattr__(self, name: str) -> Any: + """Forward attribute lookups to the inner dataset for transparency.""" + if name == "dataset": + raise AttributeError(name) + return getattr(self.dataset, name) + + +def dataset_entry( + name: str, + dataset: Dataset | IterableDataset, + ratio: float = 1.0, + resolution: str | None = None, +) -> dict: + """Factory for a single dataset descriptor used inside ``wrap_dataset``. + + Wrapping each entry with ``LazyCall(dataset_entry)(...)`` gives it a + ``_target_`` so that ``instantiate`` recurses into the nested dataset + config automatically. + + Args: + name: Identifier for the dataset. + dataset: The dataset instance. + ratio: Sampling weight. Defaults to 1.0. + resolution: Optional resolution tier (e.g. ``"256"``, ``"480"``) for + this dataset. When ``None``, falls back to ``wrap_dataset``'s + global ``resolution`` (which may be ``None`` for auto-detect). 
+ """ + return {"name": name, "dataset": dataset, "ratio": ratio, "resolution": resolution} + + +def wrap_dataset( + list_of_datasets: Sequence[dict] | list[dict] | Dataset | IterableDataset, + resolution: str | None = None, + pad_keys: list[str] | None = None, + keep_aspect_ratio: bool = True, + tokenizer_config: dict | None = None, + cfg_dropout_rate: float = 0.0, + caption_key: str = "ai_caption", + text_token_key: str = "text_token_ids", + video_temporal_downsample: int = 4, + max_action_dim: int = 32, + shard_across_workers: bool = False, + action_channel_masking: bool = True, + append_duration_fps_timestamps: bool = True, + append_resolution_info: bool = True, + append_idle_frames: bool = False, + idle_frames_dropout: float = 0.05, + format_prompt_as_json: bool = False, +) -> ActionUnifiedIterableDataset: + """Factory that wraps one or more datasets with the Action transform pipeline. + + ``list_of_datasets`` accepts either: + + * A **list of dicts**, where each dict has the keys: + - ``name`` (``str``): identifier for the dataset. + - ``dataset`` (``Dataset | IterableDataset``): the dataset instance. + - ``ratio`` (``float``, optional): sampling weight. Defaults to ``1``. + - ``resolution`` (``str | None``, optional): resolution tier for this + dataset. When missing, falls back to ``wrap_dataset``'s global + ``resolution`` (which may be ``None`` for auto-detect). + * A **single** ``Dataset`` or ``IterableDataset`` for backward compatibility + (auto-wrapped as ``[{"name": "default", "dataset": list_of_datasets, "ratio": 1}]``). + + Map-style datasets are automatically wrapped with + :class:`MapToIterableAdapter` so the returned dataset is always an + ``IterableDataset``. This means callers can mix map-style and + iterable-style datasets freely. + + Args: + list_of_datasets: The dataset(s) to wrap. + resolution: Resolution tier key (e.g. ``"256"``, ``"480"``, ``"720"``).
+ Spatial dimensions are resized and reflection-padded to the closest + predefined target from ``VIDEO_RES_SIZE_INFO``. When ``None``, the + tier is auto-detected per sample via ``get_vision_data_resolution``. + Defaults to ``None``. + pad_keys: Data-dict keys whose values should be resized and padded. Pass + an empty list or ``None`` to disable padding. Defaults to ``["video"]``. + tokenizer_config: A lazy-instantiable config dict for the VLM tokenizer. When + ``None``, text tokenization is skipped. Defaults to ``None``. + cfg_dropout_rate: Probability of replacing the caption with an empty string for + classifier-free guidance. Defaults to ``0.0``. + caption_key: The data-dict key that contains the input caption string. + Defaults to ``"ai_caption"``. + text_token_key: The data-dict key where tokenized text IDs will be stored. + Defaults to ``"text_token_ids"``. + video_temporal_downsample: Temporal downsampling factor of the video tokenizer. + Used when building a ``SequencePlan`` for ``"inverse_dynamics"`` mode. + Defaults to 4. + max_action_dim: Target action dimension to pad to. The ``"action"`` tensor + in every sample is padded along its last dimension. Defaults to 32. + action_channel_masking: When ``True`` (default), stores the original action + dimension in ``"raw_action_dim"`` so the model masks loss/noise/velocity + on padded channels. Set to ``False`` to disable (original behavior). + shard_across_workers: When ``True``, the returned dataset + supports rank-level dataset assignment and worker-level shard + distribution via ``assign_worker()``. When ``False`` (default), + every worker iterates all datasets with weighted random selection. + append_duration_fps_timestamps: Whether to append duration and FPS metadata to the + caption before tokenization. Defaults to ``True``. + append_resolution_info: Whether to append resolution metadata to the + caption before tokenization. Defaults to ``True``. 
+ append_idle_frames: Whether to append the idle-frame count out of the + total action frames (Pi0.7-style metadata) to the caption before + tokenization. The dataset is responsible for populating + ``data_dict["idle_frames"]``; samples without it are silently + skipped. Defaults to ``False`` so existing experiments are + unaffected. + idle_frames_dropout: Per-field dropout rate for the idle-frame segment. + Independent of ``cfg_dropout_rate`` (which empties the whole + caption). Defaults to 0.05. + format_prompt_as_json: Whether to replace the plain text prompt with a + structured JSON-compatible dictionary before tokenization. Defaults + to ``False``. + + Returns: + A :class:`ActionUnifiedIterableDataset` wrapping the dataset(s) with the + configured transforms applied. + + Raises: + TypeError: If the dataset(s) are not ``Dataset`` or ``IterableDataset``. + ValueError: If ``list_of_datasets`` is an empty list. + """ + if pad_keys is None: + pad_keys = ["video"] + + # ------------------------------------------------------------------ + # Backward compatibility: single dataset -> list-of-dicts + # ------------------------------------------------------------------ + if isinstance(list_of_datasets, (Dataset, IterableDataset)): + list_of_datasets = [{"name": "default", "dataset": list_of_datasets, "ratio": 1}] + + if ( + not isinstance(list_of_datasets, Sequence) + or isinstance(list_of_datasets, (str, bytes)) + or len(list_of_datasets) == 0 + ): + raise ValueError( + "list_of_datasets must be a non-empty list/sequence of dicts or a single Dataset/IterableDataset, " + f"got {type(list_of_datasets).__name__}" + ) + + # ------------------------------------------------------------------ + # Parse list-of-dicts, wrapping map-style datasets with + # MapToIterableAdapter so every dataset is iterable. Compute effective + # resolution per entry (per-entry overrides global). 
+ # ------------------------------------------------------------------ + datasets: list[dict] = [] + for entry in list_of_datasets: + if not isinstance(entry, Mapping): + raise TypeError(f"Each entry in list_of_datasets must be a dict/mapping, got {type(entry).__name__}") + name: str = entry["name"] + dataset: Dataset | IterableDataset = entry["dataset"] + ratio: float = float(entry.get("ratio", 1)) + entry_resolution: str | None = resolution if entry.get("resolution") is None else entry["resolution"] + if entry_resolution is not None: + res_key = str(entry_resolution) if isinstance(entry_resolution, int) else entry_resolution + if res_key not in VIDEO_RES_SIZE_INFO: + raise ValueError( + f"Resolution '{entry_resolution}' for dataset '{name}' not found in VIDEO_RES_SIZE_INFO. " + f"Available: {list(VIDEO_RES_SIZE_INFO.keys())}" + ) + if not isinstance(dataset, IterableDataset): + dataset = MapToIterableAdapter(dataset) + datasets.append({"name": name, "dataset": dataset, "ratio": ratio, "resolution": entry_resolution}) + + # ------------------------------------------------------------------ + # Build the transform pipeline (resolution supplied at call time) + # ------------------------------------------------------------------ + transform = ActionTransformPipeline( + pad_keys=pad_keys, + keep_aspect_ratio=keep_aspect_ratio, + tokenizer_config=tokenizer_config, + cfg_dropout_rate=cfg_dropout_rate, + caption_key=caption_key, + text_token_key=text_token_key, + video_temporal_downsample=video_temporal_downsample, + max_action_dim=max_action_dim, + action_channel_masking=action_channel_masking, + append_duration_fps_timestamps=append_duration_fps_timestamps, + append_resolution_info=append_resolution_info, + append_idle_frames=append_idle_frames, + idle_frames_dropout=idle_frames_dropout, + format_prompt_as_json=format_prompt_as_json, + ) + + _suppress_iterable_dataset_len_warning() + + return ActionUnifiedIterableDataset( + datasets=datasets, + transform=transform, + shard_across_workers=shard_across_workers, + ) diff --git
a/cosmos-inference/cosmos3/_src/vfm/datasets/action/viewpoint_utils.py b/cosmos-inference/cosmos3/_src/vfm/datasets/action/viewpoint_utils.py new file mode 100644 index 00000000..ebb99ab2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/action/viewpoint_utils.py @@ -0,0 +1,126 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Viewpoint type definitions and caption augmentor for Action datasets. + +Provides a ``Viewpoint`` type alias for camera perspective labels and a +``ViewpointTextInfo`` augmentor that appends a human-readable viewpoint +description to the caption string. 
+""" + +from __future__ import annotations + +from typing import Literal + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + +Viewpoint = Literal["ego_view", "third_person_view", "wrist_view", "concat_view"] + +DEFAULT_VIEWPOINT_TEMPLATES: dict[str, str] = { + "ego_view": "This video is captured from a first-person perspective looking at the scene.", + "third_person_view": "This video is captured from a third-person perspective looking towards the agent from the front.", + "wrist_view": "This video is captured from a wrist-mounted camera.", + "concat_view": "This video contains concatenated views from multiple camera perspectives.", +} + + +class ViewpointTextInfo(Augmentor): + """Augmentor that appends viewpoint type description to captions. + + Reads a viewpoint label from ``data_dict[viewpoint_key]`` and appends + the corresponding template sentence to the caption. Designed to run + after the raw ``ai_caption`` is set but before duration/FPS metadata + is appended. + + Args: + input_keys: Input keys (kept for API compatibility). + output_keys: Output keys (kept for API compatibility). + args: Configuration arguments: + - caption_key (str): Key for caption in data_dict. Default: ``"ai_caption"`` + - viewpoint_key (str): Key for viewpoint label. Default: ``"viewpoint"`` + - templates (dict): Override mapping from viewpoint to sentence. + Default: :data:`DEFAULT_VIEWPOINT_TEMPLATES` + - separator (str): Separator between caption and metadata. Default: ``". "`` + - enabled (bool): Whether augmentation is enabled. 
Default: ``True`` + """ + + def __init__( + self, + input_keys: list | None = None, + output_keys: list | None = None, + args: dict | None = None, + ) -> None: + super().__init__(input_keys or [], output_keys or [], args) + + self.caption_key: str = args.get("caption_key", "ai_caption") if args else "ai_caption" + self.viewpoint_key: str = args.get("viewpoint_key", "viewpoint") if args else "viewpoint" + self.templates: dict[str, str] = ( + args.get("templates", DEFAULT_VIEWPOINT_TEMPLATES) if args else DEFAULT_VIEWPOINT_TEMPLATES + ) + self.default_separator: str = args.get("separator", ". ") if args else ". " + self.enabled: bool = args.get("enabled", True) if args else True + + def __call__(self, data_dict: dict) -> dict | None: + """Append viewpoint description to the caption. + + If the sample provides an ``"additional_view_description"`` key (a + free-form string describing the concatenated camera layout), it is + appended after the generic ``concat_view`` template. This allows each + dataset to supply its own description of which cameras are tiled and + how. + + Args: + data_dict: Sample dictionary containing caption and viewpoint. + + Returns: + The mutated *data_dict*, or the original unchanged if the viewpoint is + unrecognized. Raises ``ValueError`` if the viewpoint key is missing. + """ + if not self.enabled: + return data_dict + + viewpoint = data_dict.get(self.viewpoint_key) + if viewpoint is None: + raise ValueError( + f"ViewpointTextInfo: missing key {self.viewpoint_key!r} in data_dict. " + f"All action datasets must provide a viewpoint label." + ) + + # Append dataset-specific concat_view details after the base template. + additional_view_description = data_dict.pop("additional_view_description", None) + template = self.templates.get(viewpoint) + + if template is None: + log.warning( + f"ViewpointTextInfo: unrecognized viewpoint {viewpoint!r}. " + f"Known viewpoints: {sorted(self.templates.keys())}.
Skipping.", + rank0_only=False, + ) + return data_dict + + if additional_view_description: + separator = " " if template.endswith(".") else self.default_separator + template = template + separator + additional_view_description.rstrip() + + caption = data_dict.get(self.caption_key) + if not isinstance(caption, str) or caption == "": + return data_dict + + caption = caption.rstrip() + separator = " " if caption.endswith(".") else self.default_separator + data_dict[self.caption_key] = caption + separator + template + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/__init__.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/append_fps_frames_for_image.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/append_fps_frames_for_image.py new file mode 100644 index 00000000..00f66796 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/append_fps_frames_for_image.py @@ -0,0 +1,37 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + + +class AppendFPSFramesForImage(Augmentor): + def __init__( + self, input_keys: Optional[list] = None, output_keys: Optional[list] = None, args: Optional[dict] = None + ) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Set the FPS and frame count for a single-image sample. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with ``fps`` set to 30.0 and ``num_frames`` set to 1. + """ + data_dict["fps"] = 30.0 # Images use fps = 30, the most common fps in our video training data. + data_dict["num_frames"] = 1 + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/audio_caption.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/audio_caption.py new file mode 100644 index 00000000..077dfbbf --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/audio_caption.py @@ -0,0 +1,107 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentor that appends audio captions to video captions. + +Reads an audio caption from the metadata JSON and appends it to the existing +video caption string before tokenization. This allows the model to condition +on both visual and audio descriptions. + +Placed AFTER text_transform (which sets ai_caption) and BEFORE text_tokenization. +""" + +import sys + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +def _debug(msg: str) -> None: + """Write debug message to stderr (unbuffered, reliable in worker processes).""" + sys.stderr.write(f"[AudioCaptionAppender] {msg}\n") + sys.stderr.flush() + + +class AudioCaptionAppender(Augmentor): + """Appends audio caption text from metadata to the video caption. + + Args: + input_keys: Expected to be ["metas", "ai_caption"] but read from data_dict directly. + output_keys: Not used. 
+ args: Dictionary with: + - audio_caption_key: Metadata key for audio caption (default: "caption_audio") + - separator: Text inserted between video and audio captions (default: " Audio description: ") + - sound_key: Key to check if sound data exists (default: "sound") + """ + + def __init__(self, input_keys: list, output_keys: list | None = None, args: dict | None = None) -> None: + super().__init__(input_keys, output_keys, args) + args = args or {} + self.audio_caption_key = args.get("audio_caption_key", "caption_audio") + self.separator = args.get("separator", " Audio description: ") + self.sound_key = args.get("sound_key", "sound") + self.caption_key = "ai_caption" + log.warning( + f"AudioCaptionAppender initialized: audio_caption_key='{self.audio_caption_key}', " + f"sound_key='{self.sound_key}', metas_key='{input_keys[0]}'", + rank0_only=True, + ) + + def _find_audio_caption(self, meta_dict: dict) -> str | None: + """Find audio caption in metas, supporting both flat and nested formats. + + Flat format (e.g., metas_w_audio_caps): + {"caption_audio": "...", ...} + + Nested format (e.g., midtrain dataset): + {"0_156": {"caption_sound": "..."}, ...} + The key is a frame range like "0_156" containing a dict with "caption_sound". + """ + # Try flat key first + value = meta_dict.get(self.audio_caption_key) + if isinstance(value, str) and len(value) > 0: + return value + + # Try nested: look for a dict value containing "caption_sound" + for key, val in meta_dict.items(): + if isinstance(val, dict) and "caption_sound" in val: + caption = val["caption_sound"] + if isinstance(caption, str) and len(caption) > 0: + return caption + + return None + + def __call__(self, data_dict: dict) -> dict | None: + """Append audio caption to the video caption if available. + + Only appends when sound data is present in the sample. If the metadata + does not contain the audio_caption_key, the video caption is left unchanged.
+ Always cleans up metas from data_dict since this is the last augmentor that reads it. + """ + metas_key = self.input_keys[0] + has_sound = self.sound_key in data_dict and data_dict.get(self.sound_key) is not None + meta_dict = data_dict.get(metas_key) + + if has_sound and meta_dict is not None: + audio_caption = self._find_audio_caption(meta_dict) + if isinstance(audio_caption, str) and len(audio_caption) > 0: + current_caption = data_dict.get(self.caption_key, "") + data_dict[self.caption_key] = current_caption + self.separator + audio_caption + + # Clean up metas from data_dict — this augmentor is the last consumer of metas + if metas_key in data_dict: + del data_dict[metas_key] + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/audio_parsing.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/audio_parsing.py new file mode 100644 index 00000000..344304e8 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/audio_parsing.py @@ -0,0 +1,149 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Audio parsing augmentor for T2A (Text-to-Audio) datasets. + +For audio-only datasets (AudioCaps, WavCaps, etc.) that have no video, +this augmentor: +1. Decodes audio from bytes +2. Creates a full-length dummy video (all zeros) matching the audio duration +3. 
Outputs data compatible with the v3 video training pipeline + +The dummy video ensures compatibility with the model architecture which +requires vision tokens in the sequence (sound_gen requires vision_gen). +The dummy video is fully conditioned (all frames clean), so it contributes +no loss, effectively making this a tv2s (text+video→sound) mode where +the video is a placeholder. +""" + +from typing import Optional + +import torch +from torchcodec.decoders import AudioDecoder + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +class AudioParsingForFullClips(Augmentor): + """Audio parsing augmentor for audio-only datasets. + + Loads audio from bytes, creates a dummy video of matching duration, + and outputs data compatible with the VideoParsingWithFullFrames pipeline. + + Args: + input_keys: [meta_key, audio_key], the keys used to fetch metadata and audio bytes + output_keys: Optional output keys + args: Dictionary with: + - target_sample_rate: Target audio sample rate (default: 48000) + - target_channels: Target audio channels (default: 2 for stereo) + - dummy_video_fps: FPS for dummy video (default: 24) + - dummy_video_size: (H, W) for dummy video (default: (256, 256)) + - max_audio_duration_sec: Max audio duration in seconds (default: 30.0) + - min_audio_duration_sec: Min audio duration in seconds (default: 1.0) + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + assert len(input_keys) == 2, "AudioParsingForFullClips requires two input keys: [meta_key, audio_key]" + self.meta_key = input_keys[0] + self.audio_key = input_keys[1] + args = args or {} # args defaults to None; fall back to an empty dict so the .get() calls below work + self.target_sample_rate = args.get("target_sample_rate", 48000) + self.target_channels = args.get("target_channels", 2) + self.dummy_video_fps = args.get("dummy_video_fps", 24.0) + self.dummy_video_size = args.get("dummy_video_size",
(256, 256)) + self.max_audio_duration_sec = args.get("max_audio_duration_sec", 30.0) + self.min_audio_duration_sec = args.get("min_audio_duration_sec", 1.0) + + def __call__(self, data_dict: dict) -> dict | None: + try: + meta_dict = data_dict[self.meta_key] + audio_bytes = data_dict[self.audio_key] + except Exception: + log.warning( + f"Cannot find audio data. url: {data_dict.get('__url__', '?')}, key: {data_dict.get('__key__', '?')}", + rank0_only=False, + ) + return None + + if not isinstance(audio_bytes, bytes): + log.warning("Audio data is not bytes, skipping", rank0_only=False) + return None + + # Decode audio + try: + audio_decoder = AudioDecoder(audio_bytes) + audio_metadata = audio_decoder.metadata + orig_sample_rate = audio_metadata.sample_rate + + audio_samples = audio_decoder.get_samples_played_in_range() + audio_chunk = audio_samples.data # [C,N_orig] + del audio_decoder + except Exception as e: + log.warning(f"Failed to decode audio: {e}", rank0_only=False) + return None + + # Compute duration + audio_duration_sec = audio_chunk.shape[1] / orig_sample_rate + + # Filter by duration + if audio_duration_sec < self.min_audio_duration_sec: + log.debug(f"Audio too short: {audio_duration_sec:.2f}s < {self.min_audio_duration_sec}s", rank0_only=False) + return None + if audio_duration_sec > self.max_audio_duration_sec: + # Crop to max duration + max_samples = int(self.max_audio_duration_sec * orig_sample_rate) + audio_chunk = audio_chunk[:, :max_samples] + audio_duration_sec = self.max_audio_duration_sec + + # Resample if needed + if orig_sample_rate != self.target_sample_rate: + import torchaudio + + audio_chunk = torchaudio.functional.resample( + audio_chunk, orig_freq=orig_sample_rate, new_freq=self.target_sample_rate + ) # [C,N_resampled] + + # Handle channel count (mono → stereo or vice versa) + if audio_chunk.shape[0] == 1 and self.target_channels == 2: + audio_chunk = audio_chunk.repeat(2, 1) # [2,N_resampled] + elif audio_chunk.shape[0] > 
self.target_channels: + audio_chunk = audio_chunk[: self.target_channels] # [C_target,N_resampled] + + # Create dummy video matching audio duration + # The VAE compresses time by 4x, with 1 frame as the condition → num_frames must be 1 + 4N + num_video_frames = int(audio_duration_sec * self.dummy_video_fps) + N = (num_video_frames - 1) // 4 + num_video_frames = max(1 + 4 * N, 1) + + h, w = self.dummy_video_size + dummy_video = torch.zeros(3, num_video_frames, h, w, dtype=torch.uint8) # [3,T,H,W] + + # Build output compatible with VideoParsingWithFullFrames + video_info = { + "frame_start": 0, + "frame_end": num_video_frames - 1, + "num_frames": num_video_frames, + "video": dummy_video, + "fps": self.dummy_video_fps, + "conditioning_fps": self.dummy_video_fps, + "n_orig_video_frames": num_video_frames, + "sound": audio_chunk, + "audio_sample_rate": self.target_sample_rate, + } + data_dict["video"] = video_info + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/caption_filter.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/caption_filter.py new file mode 100644 index 00000000..46cf4c0b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/caption_filter.py @@ -0,0 +1,173 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +class CaptionFilter(Augmentor): + """ + Caption filter augmentor for predict2 training. + + This augmentor filters video samples based on caption content with configurable behavior: + - contain_keyword=True: Only return videos that contain keywords in captions + - contain_keyword=False: Only return videos that do NOT contain keywords in captions + + When a sample doesn't match the filter criteria, it returns None, which causes + the webdataset pipeline to skip that sample and continue to the next one. + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + """ + Initialize the caption filter. + + Args: + input_keys: List containing the caption key (e.g., ["ai_caption"] or text embeddings key) + output_keys: Not used for filtering, can be None + args: Dictionary with filtering parameters: + - "keywords": List of keywords to filter by (e.g., ["camera pan"]) + - "contain_keyword": Boolean flag for filtering behavior: + * True: Only return videos that contain keywords + * False: Only return videos that do NOT contain keywords + - "log_filtered": Whether to log filtered samples (default: False) + - "filter_stats": Whether to track filtering statistics (default: True) + - "dont_apply_on_webdataset_names": List of webdataset names exempt from the filter; a sample whose __url__ root contains any of these names passes through without a keyword check + """ + super().__init__(input_keys, output_keys, args) + + # Parse arguments + if args is None: + args = {} + + self.keywords = args.get("keywords", []) + self.contain_keyword = args.get("contain_keyword", False) # Default to exclude mode + self.log_filtered = args.get("log_filtered", False) + self.filter_stats = args.get("filter_stats", True) + self.dont_apply_on_webdataset_names = 
args.get("dont_apply_on_webdataset_names", []) + + # Validate input_keys + if not input_keys or len(input_keys) == 0: + raise ValueError("CaptionFilter requires at least one input key for the caption field") + + self.caption_key = input_keys[0] # Use the first input key as the caption key + + # Statistics tracking + if self.filter_stats: + self.total_samples = 0 + self.filtered_samples = 0 + + # Validate configuration + if not self.keywords: + log.warning("CaptionFilter: No keywords provided, filter will not filter any samples") + + mode_str = "contain" if self.contain_keyword else "exclude" + log.info( + f"CaptionFilter initialized in '{mode_str}' mode with {len(self.keywords)} keywords using caption key '{self.caption_key}': {self.keywords}" + ) + + def __call__(self, data_dict: dict) -> Optional[dict]: + """ + Filter data based on caption content. + + This checks the caption field specified by the input_keys parameter. + Depending on contain_keyword flag: + - True: Returns data_dict only if caption contains any keyword, None otherwise + - False: Returns data_dict only if caption contains NO keywords, None otherwise + + Args: + data_dict: Input data dictionary containing the caption field specified in input_keys + + Returns: + data_dict: Original data dict if caption passes filter + None: If caption should be filtered out (causes sample to be skipped) + """ + data_dict_root = data_dict["__url__"].root + if any(n in data_dict_root for n in self.dont_apply_on_webdataset_names): + return data_dict + + if self.filter_stats: + self.total_samples += 1 + + # Check if caption key exists + if self.caption_key not in data_dict: + if self.log_filtered: + log.warning(f"CaptionFilter: No '{self.caption_key}' found in data_dict, passing through") + return data_dict + + caption = data_dict[self.caption_key] + if not isinstance(caption, str) or not caption.strip(): + if self.log_filtered: + log.warning(f"CaptionFilter: '{self.caption_key}' is empty or not a string, got 
{type(caption)}") + return data_dict + + # Check if any keywords are found in the caption + search_caption = caption.lower() + keyword_found = False + matched_keyword = None + + for keyword in self.keywords: + if keyword.lower() in search_caption: + keyword_found = True + matched_keyword = keyword + break + + # Apply filtering logic based on contain_keyword flag + should_filter = False + if self.contain_keyword: + # Include mode: filter out if NO keywords found + should_filter = not keyword_found + else: + # Exclude mode: filter out if ANY keyword found + should_filter = keyword_found + + if should_filter: + if self.log_filtered: + if self.contain_keyword: + log.info(f"CaptionFilter: excluded sample (no keywords found) - caption: '{caption[:100]}...'") + else: + log.info( + f"CaptionFilter: excluded sample due to keyword '{matched_keyword}' - caption: '{caption[:100]}...'" + ) + + if self.filter_stats: + self.filtered_samples += 1 + return None + + # Sample passes filter + return data_dict + + def get_filter_stats(self) -> dict: + """ + Get filtering statistics. + + Returns: + Dictionary with filtering statistics + """ + if not self.filter_stats: + return {"stats_disabled": True} + + filter_rate = (self.filtered_samples / self.total_samples * 100) if self.total_samples > 0 else 0 + mode_str = "contain" if self.contain_keyword else "exclude" + + return { + "total_samples": self.total_samples, + "filtered_samples": self.filtered_samples, + "passed_samples": self.total_samples - self.filtered_samples, + "filter_rate_percent": filter_rate, + "mode": mode_str, + "keywords": self.keywords, + } diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/cropping.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/cropping.py new file mode 100644 index 00000000..8e1d2a20 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/cropping.py @@ -0,0 +1,92 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. 
All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +import torch +import torchvision.transforms.functional as transforms_F +from PIL import Image + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + + +class CropToMultiple(Augmentor): + """Crops images/videos to the nearest multiple of a specified value using center crop. + + This augmentor crops the height and width of images/videos to be divisible by + a given multiple (default 16). The crop is centered, removing equal amounts + from opposite edges. + + Supports: + - PIL Images (for image data) + - Torch tensors with shape (C, H, W) or (C, T, H, W) (for video data) + + Example: + Input: 209x187 with multiple=16 + Output: 208x176 (center cropped to nearest lower multiple of 16) + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + self.multiple = 16 + if self.args is not None and "multiple" in self.args: + self.multiple = self.args["multiple"] + + def __call__(self, data_dict: dict) -> dict: + """Center crops images/videos to the nearest multiple of the specified value. + + Args: + data_dict (dict): Input data dict containing images/videos to crop. + + Returns: + data_dict (dict): Output dict with center cropped images/videos. 
+ """ + for key in self.input_keys: + if key not in data_dict: + continue + + data = data_dict[key] + + # Get dimensions based on data type + if isinstance(data, Image.Image): + # PIL Image: size returns (width, height) + w, h = data.size + elif isinstance(data, torch.Tensor): + # Torch tensor: (C, H, W) or (C, T, H, W) + if data.ndim == 3: + _, h, w = data.shape + elif data.ndim == 4: + _, _, h, w = data.shape + else: + raise ValueError(f"Unexpected tensor dimensions: {data.ndim}, expected 3 or 4") + else: + raise ValueError(f"Unexpected data type: {type(data)}, expected PIL Image or torch Tensor") + + # Calculate new dimensions (nearest lower multiple) + new_h = (h // self.multiple) * self.multiple + new_w = (w // self.multiple) * self.multiple + + # Center crop: calculate offsets to center the crop + if new_h != h or new_w != w: + top = (h - new_h) // 2 + left = (w - new_w) // 2 + # log.info(f"Data cropped from ({h}, {w}) to ({new_h}, {new_w})") + data_dict[key] = transforms_F.crop(data, top=top, left=left, height=new_h, width=new_w) + + # Store final dimensions for downstream use (e.g., resolution text info) + data_dict["final_height"] = new_h + data_dict["final_width"] = new_w + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/duration_fps_text_timestamps.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/duration_fps_text_timestamps.py new file mode 100644 index 00000000..998c014f --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/duration_fps_text_timestamps.py @@ -0,0 +1,135 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +import torch + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + +# Global template for duration and FPS text timestamps +DEFAULT_TEMPLATE = "The video is {duration:.1f} seconds long and is of {fps:.0f} FPS." + + +class DurationFPSTextTimeStamps(Augmentor): + """ + Augmentor that appends video duration and FPS as text timestamps to captions. + + This augmentor should run AFTER TextTransformForVideo to append metadata + to the already-selected caption in data_dict["ai_caption"]. + + IMPORTANT: Reads num_frames from the actual video tensor shape to get the + FINAL frame count after all video processing (subsampling, etc.) is complete. + + Example: + Original caption: "A cat playing with a ball" + Augmented caption: "A cat playing with a ball. The video is 1.0 seconds long and is of 24 FPS." (duration is truncated to whole seconds) + + Args: + input_keys (list): Input keys (not used, kept for API compatibility) + output_keys (list): Output keys (not used, kept for API compatibility) + args (dict): Configuration arguments: + - caption_key (str): Key for caption in data_dict. Default: "ai_caption" + - video_key (str): Key for video tensor in data_dict. Default: "video" + - fps_key (str): Key for FPS value in data_dict. Default: "conditioning_fps" + - template (str): Format string for metadata text. Default: DEFAULT_TEMPLATE constant + - separator (str): Separator between caption and metadata. Default: ". " + - enabled (bool): Whether augmentation is enabled. 
Default: True + - skip_on_error (bool): If True, skip on errors and return original data_dict. If False, return None. Default: True + - num_multiplier_key (str): Key for num_multiplier value in data_dict. Default: "num_multiplier" + """ + + def __init__( + self, input_keys: Optional[list] = None, output_keys: Optional[list] = None, args: Optional[dict] = None + ) -> None: + super().__init__(input_keys, output_keys, args) + + # Configuration with sensible defaults + self.caption_key = args.get("caption_key", "ai_caption") if args else "ai_caption" + self.video_key = args.get("video_key", "video") if args else "video" + self.fps_key = args.get("fps_key", "conditioning_fps") if args else "conditioning_fps" + self.template = args.get("template", DEFAULT_TEMPLATE) if args else DEFAULT_TEMPLATE + self.default_separator = args.get("separator", ". ") if args else ". " + self.enabled = args.get("enabled", True) if args else True + self.skip_on_error = args.get("skip_on_error", True) if args else True + self.num_multiplier_key = args.get("num_multiplier_key", "num_multiplier") if args else "num_multiplier" + + def __call__(self, data_dict: dict) -> dict | None: + """ + Append video duration and FPS as text timestamps to the caption. + + Args: + data_dict (dict): Input data dict containing caption, fps, and video tensor + + Returns: + data_dict (dict): Output dict with augmented caption, or None if error and skip_on_error=False + """ + if not self.enabled: + return data_dict + # Get caption - must exist at this point (set by TextTransformForVideo) + if self.caption_key not in data_dict: + if self.skip_on_error: + log.warning( + f"DurationFPSTextTimeStamps: '{self.caption_key}' not found in data_dict. 
Skipping.", + rank0_only=False, + ) + return data_dict + else: + return None + caption = data_dict[self.caption_key] + if (not isinstance(caption, str) and not isinstance(caption, dict)) or caption == "": + if self.skip_on_error: + return data_dict + else: + return None + + # Use pre-calculated conditioning_fps from VideoParsing augmentor + # This already accounts for frame skipping (fps / num_multiplier) + fps_value = data_dict[self.fps_key] + if isinstance(fps_value, torch.Tensor): + fps = fps_value.item() if fps_value.numel() == 1 else fps_value[0].item() + else: + fps = float(fps_value) + + # Extract ACTUAL number of frames from the video tensor shape + # This is critical - we need the final frame count after all processing + video = data_dict[self.video_key] + + # Video shape is (C, T, H, W) + num_frames = video.shape[1] + + # Compute duration and append to caption + if fps > 0: + duration = int(num_frames / fps) + if isinstance(caption, str): + # Case 1: Caption is a string (existing behavior). + metadata_text = self.template.format(duration=duration, fps=fps) + + # Choose separator based on whether caption ends with a period + separator = " " if caption.rstrip().endswith(".") else self.default_separator + + # Update caption text + data_dict[self.caption_key] = caption + separator + metadata_text + elif isinstance(caption, dict): + # Case 2: Caption is JSON. Add structured duration/FPS fields. + data_dict[self.caption_key].update( + { + "duration": str(duration) + "s", + "fps": fps, + } + ) + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/idle_frames_text_info.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/idle_frames_text_info.py new file mode 100644 index 00000000..e280fdbd --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/idle_frames_text_info.py @@ -0,0 +1,192 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentor that appends idle-frame count metadata to the caption. + +The label is a Pi0.7-style episode-metadata field encoded as plain text. It +records how many frames of the action chunk were "idle" out of the total action +frames (i.e. the relative-pose delta is close to identity and the gripper +command does not change). The upstream dataset is responsible for populating +``data_dict[idle_frames_key]`` via +:func:`projects.cosmos3.vfm.datasets.action.pose_utils.compute_idle_frames`. + +Per-field dropout (default 5%) is applied here, matching Pi0.7's approach of +independently dropping each metadata component. This is complementary to the +global ``cfg_dropout_rate`` in :class:`TextTokenizerTransform`, which still +empties the whole caption. +""" + +from __future__ import annotations + +import random + +import torch + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + +DEFAULT_TEMPLATE = "IdleFrames: {n} out of {m}." +FALLBACK_TEMPLATE = "IdleFrames: {n}." + + +class IdleFramesTextInfo(Augmentor): + """Augmentor that appends ``IdleFrames: N out of M.`` to the caption. + + Reads ``data_dict[idle_frames_key]`` (set by the dataset layer) and appends + a textual marker to the caption, modeled after + :class:`ResolutionTextInfo` and :class:`DurationFPSTextTimeStamps`. 
+ + Per-field dropout is supported: with probability ``dropout_rate`` the + segment is omitted entirely (the caption is left unchanged). This is + independent from the global classifier-free-guidance dropout in the + tokenizer. + + Example: + Original caption: "pick up the cup" + Augmented: "pick up the cup. IdleFrames: 0 out of 16." + + Args: + input_keys (list): Input keys (not used, kept for API compatibility). + output_keys (list): Output keys (not used, kept for API compatibility). + args (dict): Configuration arguments: + - caption_key (str): Key for caption in data_dict. Default ``"ai_caption"``. + - idle_frames_key (str): Key for the idle-frame integer in data_dict. + Default ``"idle_frames"``. + - total_frames_key (str): Optional key for the total frame integer + in data_dict. Default ``"idle_frames_total"``. + - action_key (str): Key for the action tensor used to infer total + frames when ``total_frames_key`` is missing. Default ``"action"``. + - template (str): Format string for the appended segment. + Default ``"IdleFrames: {n} out of {m}."``. + - separator (str): Separator inserted between the original caption + and the new segment. Default ``". "``. + - dropout_rate (float): Probability of skipping the append step + (per-field dropout). Default 0.05. + - enabled (bool): Whether the augmentor is active. Default True. + """ + + def __init__( + self, + input_keys: list | None = None, + output_keys: list | None = None, + args: dict | None = None, + ) -> None: + super().__init__(input_keys, output_keys, args) + + args = args or {} + self.caption_key: str = args.get("caption_key", "ai_caption") + self.idle_frames_key: str = args.get("idle_frames_key", "idle_frames") + self.total_frames_key: str = args.get("total_frames_key", "idle_frames_total") + self.action_key: str = args.get("action_key", "action") + self.template: str = args.get("template", DEFAULT_TEMPLATE) + self.default_separator: str = args.get("separator", ". 
") + self.dropout_rate: float = float(args.get("dropout_rate", 0.05)) + self.enabled: bool = bool(args.get("enabled", True)) + + if not 0.0 <= self.dropout_rate <= 1.0: + raise ValueError(f"dropout_rate must be in [0, 1]; got {self.dropout_rate}") + + def _get_scalar_int(self, value: object, key: str) -> int | None: + """Parse an optional scalar integer metadata value.""" + + if value is None: + return None + + if isinstance(value, torch.Tensor): + if value.numel() != 1: + log.warning( + f"IdleFramesTextInfo: expected scalar tensor at '{key}', got shape {tuple(value.shape)}. Skipping.", + rank0_only=False, + ) + return None + return int(value.item()) + + try: + return int(value) + except (TypeError, ValueError): + log.warning( + f"IdleFramesTextInfo: expected integer-compatible value at " + f"'{key}', got {type(value).__name__}. Skipping.", + rank0_only=False, + ) + return None + + def _get_total_frames(self, data_dict: dict) -> int | None: + """Resolve the total action-frame count for the idle-frame text.""" + + total_frames = self._get_scalar_int(data_dict.get(self.total_frames_key), self.total_frames_key) + if total_frames is not None: + return total_frames + + action = data_dict.get(self.action_key) + if isinstance(action, torch.Tensor): + if action.ndim == 0: + log.warning( + f"IdleFramesTextInfo: expected action tensor at " + f"'{self.action_key}' to have a frame dimension. Skipping total frames.", + rank0_only=False, + ) + return None + return int(action.shape[0]) + + try: + return len(action) if action is not None else None + except TypeError: + return None + + def __call__(self, data_dict: dict) -> dict | None: + """Append ``IdleFrames: N out of M.`` to ``data_dict[caption_key]`` in place. + + Returns the input dict unchanged when: + + - the augmentor is disabled, + - the per-field dropout fires, + - ``idle_frames_key`` is missing or ``None`` (e.g. non-action sample), + - the caption is missing, empty, or not a string/dict (unconditional case). 
+ + For dict-typed captions (the JSON-caption code path), the idle-frame + integer is added under ``"idle_frames"`` and the total count, when + available, is added under ``"idle_frames_total"``. + """ + if not self.enabled: + return data_dict + + if random.random() < self.dropout_rate: + return data_dict + + n = self._get_scalar_int(data_dict.get(self.idle_frames_key), self.idle_frames_key) + if n is None: + return data_dict + + m = self._get_total_frames(data_dict) + + if self.caption_key not in data_dict: + return data_dict + caption = data_dict[self.caption_key] + + if isinstance(caption, str): + if caption == "": + return data_dict + metadata_text = self.template.format(n=n, m=m) if m is not None else FALLBACK_TEMPLATE.format(n=n) + separator = " " if caption.rstrip().endswith(".") else self.default_separator + data_dict[self.caption_key] = caption + separator + metadata_text + elif isinstance(caption, dict): + data_dict[self.caption_key]["idle_frames"] = n + if m is not None: + data_dict[self.caption_key]["idle_frames_total"] = m + else: + return data_dict + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/image_editing_transform.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/image_editing_transform.py new file mode 100644 index 00000000..96315f4e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/image_editing_transform.py @@ -0,0 +1,382 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Augmentors for image editing tasks in the cosmos3 VFM pipeline. + +These augmentors process conversation-format image editing data and produce +the output format expected by the main training pipeline: + - images: List[torch.Tensor] (source + target images as a two-frame "video") + - image_size: List[torch.Tensor] + - ai_caption: List[str] + - selected_caption_type: List[str] + - fps: List[float] + - num_frames: List[int] + - dataset_name: str + - sequence_plan: SequencePlan +""" + +from __future__ import annotations + +import random + +import torch +import torchvision.transforms.functional as transforms_F +from PIL import Image + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.sequence_packing import SequencePlan + + +class ExtractImageEditingConversation(Augmentor): + """Extract and validate image editing conversation from standard annotation format. 
+ + This augmentor processes the cosmos-interleaved conversation format for image editing: + - Validates that the conversation has exactly one round (user + assistant) + - User message must contain at least one image and text instruction + - Assistant message must contain exactly one image (the edited result) + - If multi-round conversation is found, only the first round is kept + + Input Format (from data_dict): + - texts: Dict containing "content" with conversation data + - mllm_media_list: Dict mapping image keys to PIL images (for understanding) + - diffusion_media_list: Dict mapping image keys to PIL images (for diffusion/VAE) + + Output Format (added to data_dict): + - source_image: PIL.Image (the input image for editing) + - target_image: PIL.Image (the edited output image) + - editing_instruction: str (the user's editing instruction) + """ + + def __init__( + self, + input_keys: list | None = None, + max_round: int = 1, + args: dict | None = None, + ) -> None: + super().__init__(input_keys or [], None, args) + self.max_round = max_round + + def __call__(self, data_dict: dict) -> dict | None: + """Extract image editing conversation. + + Args: + data_dict: Input data dictionary. + + Returns: + Updated data_dict with source_image, target_image, editing_instruction, + or None if the data is invalid. 
+ """ + # Validate required keys + for required_key in ["mllm_media_list", "diffusion_media_list", "texts"]: + if required_key not in data_dict: + log.warning( + f"{required_key} not found in data_dict: {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + mllm_media_list = data_dict["mllm_media_list"] + diffusion_media_list = data_dict["diffusion_media_list"] + + # Get conversation content + try: + texts_content = data_dict["texts"].get("content") + if texts_content is None: + log.warning( + f"texts.content is None: {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + # Handle case where content is a list of conversation options + if isinstance(texts_content, list) and len(texts_content) > 0: + if isinstance(texts_content[0], list): + # Multiple conversation options, randomly select one + selected_conversations = random.choice(texts_content) + else: + selected_conversations = texts_content + else: + log.warning( + f"Unexpected texts.content format: {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + except Exception as e: + log.warning( + f"Error accessing texts.content: {data_dict.get('__key__', 'unknown')}, {str(e)}", + rank0_only=False, + ) + return None + + # For image editing, we only keep the first round (user + assistant) + # Trim to first round if multiple rounds exist + if len(selected_conversations) > 2: + log.warning( + f"Multi-round conversation found ({len(selected_conversations)} messages), " + f"keeping only first round: {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + selected_conversations = selected_conversations[:2] + + if len(selected_conversations) < 2: + log.warning( + f"Expected at least 2 messages (user + assistant), got {len(selected_conversations)}: " + f"{data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + # Validate roles: first must be user, second must be assistant + user_msg = selected_conversations[0] + 
assistant_msg = selected_conversations[1] + + if user_msg.get("role") != "user": + log.warning( + f"First message role is not 'user': {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + if assistant_msg.get("role") != "assistant": + log.warning( + f"Second message role is not 'assistant': {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + # Extract user content: must have at least one image and one text + user_content = user_msg.get("content", []) + if isinstance(user_content, str): + user_content = [{"type": "text", "text": user_content}] + + user_text_parts: list[str] = [] + user_image_key: str | None = None + + for item in user_content: + if not isinstance(item, dict): + continue + content_type = item.get("type") + if content_type == "text": + user_text_parts.append(item.get("text", "")) + elif content_type == "image": + if user_image_key is None: + user_image_key = item.get("image") + # If multiple user images, we only take the first one + + if user_image_key is None: + log.warning( + f"No image found in user message: {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + editing_instruction = " ".join(user_text_parts).strip() + if not editing_instruction: + log.warning( + f"No text instruction found in user message: {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + # Extract assistant content: must have exactly one image + assistant_content = assistant_msg.get("content", []) + if isinstance(assistant_content, str): + log.warning( + f"Assistant content is text-only (no image): {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + assistant_image_key: str | None = None + for item in assistant_content: + if not isinstance(item, dict): + continue + if item.get("type") == "image": + assistant_image_key = item.get("image") + break + + if assistant_image_key is None: + log.warning( + f"No image found in assistant message: 
{data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + # Validate images exist in media lists + for media_key in [user_image_key, assistant_image_key]: + if media_key not in diffusion_media_list: + log.warning( + f"Image {media_key} not found in diffusion_media_list: {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + # Get PIL images + source_image = diffusion_media_list[user_image_key] + target_image = diffusion_media_list[assistant_image_key] + + # Handle video (list of frames) - use first frame + if isinstance(source_image, list): + source_image = source_image[0] if source_image else None + if isinstance(target_image, list): + target_image = target_image[0] if target_image else None + + if source_image is None or target_image is None: + log.warning( + f"Source or target image is None: {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + data_dict["source_image"] = source_image + data_dict["target_image"] = target_image + data_dict["editing_instruction"] = editing_instruction + + return data_dict + + +class ImageEditingToTrainingFormat(Augmentor): + """Convert extracted image editing data to the training-compatible format. + + This augmentor takes the source image, target image, and editing instruction + and produces the output format expected by the main training pipeline. + + Images are assumed to have been already resized by an upstream augmentor + (e.g. ``OmniInterleavedMediaResize``). This augmentor only normalises the + PIL images to tensors and assembles the remaining metadata fields. 
+ + Input (from data_dict): + - source_image: PIL.Image (already resized by upstream augmentor) + - target_image: PIL.Image (already resized by upstream augmentor) + - editing_instruction: str + + Output (added to data_dict): + - images: list[torch.Tensor] — ``[source (C,H_s,W_s), target (C,H_t,W_t)]`` + - ai_caption: str + - selected_caption_type: str + - fps: float + - num_frames: int + - sequence_plan: SequencePlan + """ + + def __init__( + self, + input_keys: list | None = None, + mean: float = 0.5, + std: float = 0.5, + args: dict | None = None, + ) -> None: + super().__init__(input_keys or [], None, args) + self.mean = mean + self.std = std + + def _normalize_image(self, image: Image.Image) -> torch.Tensor: + """Convert PIL image to normalized tensor (C, H, W).""" + tensor = transforms_F.to_tensor(image) + tensor = transforms_F.normalize(tensor, mean=[self.mean] * 3, std=[self.std] * 3) + return tensor + + def __call__(self, data_dict: dict) -> dict | None: + """Convert image editing data to training format. + + Args: + data_dict: Input data dictionary with source_image, target_image, editing_instruction. + + Returns: + Updated data_dict with training-compatible fields, or None on error. + """ + source_image: Image.Image = data_dict.get("source_image") + target_image: Image.Image = data_dict.get("target_image") + editing_instruction: str = data_dict.get("editing_instruction", "") + + if source_image is None or target_image is None: + return None + + try: + # Normalize PIL images to tensors (upstream augmentor already handled resizing) + source_tensor = self._normalize_image(source_image) # [C,H_s,W_s] + target_tensor = self._normalize_image(target_image) # [C,H_t,W_t] + + # Store as list of tensors for the batch collation. + # Each image keeps its own spatial size; the model encodes them separately. 
+ data_dict["images"] = [source_tensor, target_tensor] + + # Set text fields + data_dict["ai_caption"] = editing_instruction + data_dict["selected_caption_type"] = "editing_instruction" + + # Set metadata + data_dict["fps"] = 30.0 # Same as standard image training + data_dict["num_frames"] = 2 # Source + target = 2 frames + data_dict["image_size"] = [ + torch.tensor( + [source_image.height, source_image.width, source_image.height, source_image.width], + dtype=torch.float, + ), # [4] + torch.tensor( + [target_image.height, target_image.width, target_image.height, target_image.width], + dtype=torch.float, + ), # [4] + ] + # Set the dataset name if not already present + if "dataset_name" not in data_dict: + data_dict["dataset_name"] = "image_editing" + + # Build sequence plan for image editing. + # The number of vision items per sample (e.g. 2 for source + target) is tracked + # by GenerationDataClean.num_vision_items_per_sample (set in get_data_and_condition). + # In pack_input_sequence, all items except the last are fully conditioned; + # the last item uses condition_frame_indexes_vision ([] = fully generated). + data_dict["sequence_plan"] = SequencePlan( + has_text=True, + has_vision=True, + condition_frame_indexes_vision=[], # Target (last item) is fully generated + ) + + except Exception as e: + log.warning( + f"Error processing image editing data: {data_dict.get('__key__', 'unknown')}, {str(e)}", + rank0_only=False, + ) + return None + + return data_dict + + +class RemoveKeys(Augmentor): + """Remove specified keys from the data dictionary. + + This is useful for cleaning up intermediate keys that are not needed + downstream (e.g. raw PIL images, media lists) so that every remaining + value is a tensor, number, dict, or list — as required by the dataloader + collation. + + Args: + input_keys: Keys to remove from ``data_dict``. 
+ """ + + def __init__( + self, + input_keys: list | None = None, + args: dict | None = None, + ) -> None: + super().__init__(input_keys or [], None, args) + + def __call__(self, data_dict: dict) -> dict: + for key in self.input_keys: + data_dict.pop(key, None) + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/image_resolution_filter.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/image_resolution_filter.py new file mode 100644 index 00000000..c77dd29d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/image_resolution_filter.py @@ -0,0 +1,56 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.vfm.datasets.utils import IMAGE_RES_SIZE_INFO + +# Map dataset_resolution_type to resolution tier key in IMAGE_RES_SIZE_INFO +_DATASET_RESOLUTION_TIER: dict[str, str] = {"gt480p": "480", "gt720p": "720", "gt1080p": "1080"} + + +class ImageResolutionFilter(Augmentor): + """ + Filters out image samples whose (width, height) are below the minimum for + the sample's aspect ratio when dataset_resolution_type is not "all". + Mirrors the resolution check used in video_parsing. 
+ """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + self.image_key = args.get("image_key", "images") if args else "images" + self.dataset_resolution_type = args.get("dataset_resolution_type", "all") if args else "all" + self.resolution_tier = _DATASET_RESOLUTION_TIER.get(self.dataset_resolution_type) + + def __call__(self, data_dict: dict) -> dict | None: + image = data_dict.get(self.image_key) + if image is None: + return data_dict + + # PIL Image has .size as (width, height) + width, height = image.size + + aspect_ratio: str | None = None + if "__url__" in data_dict: + aspect_ratio = data_dict["__url__"].meta.opts["aspect_ratio"] + + # If the resolution of the image is smaller than the minimum resolution for the aspect ratio, skip the sample. This will ensure that we do not upsample any image. + if self.resolution_tier is not None: + min_w, min_h = IMAGE_RES_SIZE_INFO[self.resolution_tier][aspect_ratio] + if width < min_w and height < min_h: + return None + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/interleaved_image_transform.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/interleaved_image_transform.py new file mode 100644 index 00000000..4eb93fb3 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/interleaved_image_transform.py @@ -0,0 +1,278 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Visual transformation augmentors for Omni models. +""" + +import math +from typing import Dict, List, Optional + +import torch +import torchvision.transforms.functional as transforms_F +from PIL import Image + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.image.misc import obtain_image_size + + +class ResizeToPaddingDivisor(Augmentor): + """Resize images so that both width and height are multiples of padding_divisor.""" + + def __init__(self, input_keys: list, padding_divisor: int = 16) -> None: + super().__init__(input_keys) + self.padding_divisor = padding_divisor + + def __call__(self, data_dict: dict) -> dict: + """Resize images to the nearest multiple of padding_divisor. 
+ + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with resized images and metadata + """ + + if self.output_keys is None: + self.output_keys = self.input_keys + + # Get original image size + orig_w, orig_h = obtain_image_size(data_dict, self.input_keys) + + # Calculate new dimensions as multiples of padding_divisor + new_w = math.ceil(orig_w / self.padding_divisor) * self.padding_divisor + new_h = math.ceil(orig_h / self.padding_divisor) * self.padding_divisor + + # Resize images + for inp_key, out_key in zip(self.input_keys, self.output_keys): + data_dict[out_key] = transforms_F.resize( + data_dict[inp_key], + size=(new_h, new_w), + interpolation=transforms_F.InterpolationMode.BICUBIC, + antialias=True, + ) + if out_key != inp_key: + del data_dict[inp_key] + + # Store image size information (new_h, new_w, orig_h, orig_w) + data_dict["image_size"] = torch.tensor([new_h, new_w, orig_h, orig_w], dtype=torch.float) # [4] + + return data_dict + + +class InterleavedMediaResize(Augmentor): + """Resizes interleaved media content (images and videos) for both diffusion and MLLM models. + + This augmentor processes mixed media content containing both images and videos, creating two + versions of each media item: one optimized for diffusion models and another for Multimodal + Large Language Models (MLLMs). It preserves aspect ratios while ensuring dimensions meet + specific constraints for each model type. + + The resizing process follows these steps: + 1. Maintains aspect ratio while ensuring no side exceeds the maximum allowed length + 2. Adjusts dimensions to be divisible by model-specific padding constants + 3. Uses high-quality LANCZOS resampling for optimal visual quality + + Args: + input_keys (List, optional): List containing the key to access media content in data_dict. + Must contain exactly one key. Defaults to ['media_list']. + max_diffusion_image_side_length (int, optional): Maximum side length for diffusion model + images. 
Defaults to 1024. + max_mllm_image_side_length (int, optional): Maximum side length for MLLM images. + Defaults to 768. + diffusion_image_padding_constant (int, optional): Divisor for diffusion model image + dimensions. Both width and height must be divisible by this value. Defaults to 16. + mllm_image_padding_constant: Not a constructor argument of this class. MLLM media is + not snapped to a divisor here; that step is left to the downstream processor. + use_center_crop (bool, optional): If True, uses center cropping to ensure dimensions + are divisible by padding constants, avoiding distortion. If False, uses resizing + which may cause slight distortion. Defaults to False. + args (Optional[dict], optional): Additional arguments passed to parent class. + Defaults to None. + + Input Format: + The data_dict should contain a key (specified in input_keys) with value structured as: + { + "image_0": PIL.Image, # Single image + "image_1": PIL.Image, # Another single image + "video_0": List[PIL.Image], # Video as list of frames + "video_1": List[PIL.Image], # Another video + ... + } + + Output Format: + The method adds two new keys to data_dict: + - 'diffusion_media_list': Resized media for diffusion models + - 'mllm_media_list': Resized media for MLLMs + + Both follow the same structure as the input, with resized versions of each media item. + + Example: + >>> # Using resize (default, may cause slight distortion) + >>> augmentor = InterleavedMediaResize( + ... input_keys=['media_list'], + ... max_diffusion_image_side_length=1024, + ... max_mllm_image_side_length=768 + ... ) + >>> + >>> # Using center crop (no distortion) + >>> augmentor_crop = InterleavedMediaResize( + ... input_keys=['media_list'], + ... max_diffusion_image_side_length=1024, + ... max_mllm_image_side_length=768, + ... use_center_crop=True + ... ) + >>> + >>> data_dict = { + ... 'media_list': { + ... 'image_0': pil_image, + ... 'video_0': [frame1, frame2, frame3] + ... } + ... 
} + >>> result = augmentor(data_dict) + >>> # result now contains 'diffusion_media_list' and 'mllm_media_list' + + Note: + - Images are only scaled down, never up, to preserve quality + - Videos are processed frame by frame, maintaining temporal consistency + - Unsupported media types will raise a ValueError + - When use_center_crop=True, images are center-cropped to achieve padding divisibility + without distortion. When False, images are resized which may cause slight distortion. + """ + + def __init__( + self, + input_keys: List = ["media_list"], + max_diffusion_image_side_length: int = 1024, + max_mllm_image_side_length: int = 768, + diffusion_image_padding_constant: int = 16, + use_center_crop: bool = False, + args: Optional[dict] = None, + ) -> None: + super().__init__(input_keys, None, args) + self.max_diffusion_image_side_length = max_diffusion_image_side_length + self.max_mllm_image_side_length = max_mllm_image_side_length + self.diffusion_image_padding_constant = diffusion_image_padding_constant + self.use_center_crop = use_center_crop + + def __call__(self, data_dict: Dict) -> Optional[Dict]: + assert len(self.input_keys) == 1, ( + "This transform only supports one input key. Try to organize all the media contents under one key." 
+ ) + if self.input_keys[0] not in data_dict: + print(f"Input key {self.input_keys[0]} not found in data_dict: {data_dict['__key__']}") + return None + original_media_content = data_dict[self.input_keys[0]] + + diffusion_media_content = {} + mllm_media_content = {} + + for key, media in original_media_content.items(): + # Check if it's an image or video + if isinstance(media, Image.Image): + # Process single image + diffusion_media_content[key] = self._resize_image( + media, + self.max_diffusion_image_side_length, + self.diffusion_image_padding_constant, + self.use_center_crop, + ) + mllm_media_content[key] = self._resize_image( + media, + self.max_mllm_image_side_length, + None, # we don't need to resize the mllm media content to a specific padding constant since it will be handled by the processor + self.use_center_crop, + ) + elif isinstance(media, list) and all(isinstance(frame, Image.Image) for frame in media): + # Process video (list of images) + diffusion_media_content[key] = [ + self._resize_image( + frame, + self.max_diffusion_image_side_length, + self.diffusion_image_padding_constant, + self.use_center_crop, + ) + for frame in media + ] + mllm_media_content[key] = [ + self._resize_image( + frame, + self.max_mllm_image_side_length, + None, # we don't need to resize the mllm media content to a specific padding constant since it will be handled by the processor + self.use_center_crop, + ) + for frame in media + ] + else: + raise ValueError(f"Unsupported media type for key {key}: {type(media)}") + + # Add the resized media content to data_dict + data_dict["diffusion_media_list"] = diffusion_media_content + data_dict["mllm_media_list"] = mllm_media_content + + return data_dict + + def _resize_image( + self, image: Image.Image, max_side_length: int, padding_divisor=None, use_center_crop: bool = False + ) -> Image.Image: + """Resize image while preserving aspect ratio and ensuring dimensions are divisible by padding_divisor. 
+ + Args: + image: Input PIL Image + max_side_length: Maximum allowed side length + padding_divisor: Both dimensions must be divisible by this value + use_center_crop: If True, use center crop to achieve divisibility; if False, use resize + + Returns: + Resized PIL Image + """ + # Get original dimensions + width, height = image.size + + # Calculate scale factor to ensure max side length constraint + scale_factor = min(max_side_length / width, max_side_length / height) + + # Only scale down, not up + if scale_factor < 1.0: + new_width = max(1, int(width * scale_factor)) + new_height = max(1, int(height * scale_factor)) + else: + new_width = width + new_height = height + + # Resize image to maintain aspect ratio + resized_image = image.resize((new_width, new_height), Image.Resampling.LANCZOS) + + # Calculate target dimensions that are divisible by padding_divisor + if padding_divisor is not None: + final_width = max(1, (new_width // padding_divisor)) * padding_divisor + final_height = max(1, (new_height // padding_divisor)) * padding_divisor + else: + final_width = new_width + final_height = new_height + + # If dimensions need adjustment + if final_width != new_width or final_height != new_height: + if use_center_crop: + # Use center crop to achieve target dimensions + left = (new_width - final_width) // 2 + top = (new_height - final_height) // 2 + right = left + final_width + bottom = top + final_height + resized_image = resized_image.crop((left, top, right, bottom)) + else: + # Use resize (may cause distortion) + resized_image = resized_image.resize((final_width, final_height), Image.Resampling.LANCZOS) + + return resized_image diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/interleaved_video_parsing.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/interleaved_video_parsing.py new file mode 100644 index 00000000..44d10892 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/interleaved_video_parsing.py @@ -0,0 
+1,583 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import random +from collections.abc import Callable +from typing import Optional + +import numpy as np +import omegaconf +import torch +from einops import rearrange +from torchcodec.decoders import VideoDecoder +from torchvision.transforms.v2 import Resize, UniformTemporalSubsample + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.image.misc import obtain_augmentation_size +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.augmentors.video_parsing import VideoParsingWithFullFrames + +# Local copies of the torchcodec decoder helpers so this module does not depend on +# private symbols of ``video_parsing.py``. Behavior matches the originals. 
+_PostDecodeTransforms = list[Callable[[torch.Tensor], torch.Tensor]] | None +_SUPPORTS_VIDEO_DECODER_TRANSFORMS: bool | None = None +_WARNED_POST_DECODE_TRANSFORMS = False + + +def _create_video_decoder( + video: bytes, + seek_mode: str, + num_ffmpeg_threads: int, + transforms: _PostDecodeTransforms = None, +) -> tuple[VideoDecoder, _PostDecodeTransforms]: + global _SUPPORTS_VIDEO_DECODER_TRANSFORMS, _WARNED_POST_DECODE_TRANSFORMS + + kwargs = {"seek_mode": seek_mode, "num_ffmpeg_threads": num_ffmpeg_threads} + if transforms is None: + return VideoDecoder(video, **kwargs), None + + if _SUPPORTS_VIDEO_DECODER_TRANSFORMS is not False: + try: + decoder = VideoDecoder(video, transforms=transforms, **kwargs) + _SUPPORTS_VIDEO_DECODER_TRANSFORMS = True + return decoder, None + except TypeError as e: + if "transforms" not in str(e): + raise + _SUPPORTS_VIDEO_DECODER_TRANSFORMS = False + + if not _WARNED_POST_DECODE_TRANSFORMS: + log.warning( + "Installed torchcodec does not support VideoDecoder(transforms=...); " + "applying video transforms after frame decode.", + rank0_only=False, + ) + _WARNED_POST_DECODE_TRANSFORMS = True + return VideoDecoder(video, **kwargs), transforms + + +def _apply_post_decode_transforms( + frames: torch.Tensor, transforms: _PostDecodeTransforms +) -> torch.Tensor: # frames: [T,C,H,W], returns: [T,C,H,W] + if transforms is None: + return frames + + for transform in transforms: + frames = transform(frames) # [T,C,H,W] + return frames + + +class VideoTransferAlignedFullFramesParsing(VideoParsingWithFullFrames): + """Decode RGB and precomputed control videos with one shared v3 frame plan. + + This is the variable-length counterpart of the fixed-window transfer parser. + The RGB stream determines the sampled stride and frame indices. Any extra + input video streams, such as depth or segmentation, are decoded with the same + frame indices so the control video stays temporally aligned with the target. 
+ """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + assert len(input_keys) >= 2, "VideoTransferAlignedFullFramesParsing requires [metas, video, ...]." + super().__init__(input_keys=input_keys[:2], output_keys=output_keys, args=args) + self.input_keys = input_keys + self.control_video_keys = input_keys[2:] + self.min_stride_key = self.args.get("min_stride_key", "_full_frames_min_stride") + + def _build_rgb_decode_transform(self, data_dict: dict, meta_dict: dict) -> list[Resize] | None: + if not self.perform_resize: + return None + + img_size = obtain_augmentation_size(data_dict, {"size": self.size}) + assert isinstance(img_size, (tuple, omegaconf.listconfig.ListConfig)), ( + f"Arg size in resize should be a tuple, get {type(img_size)}, {img_size}" + ) + img_w, img_h = img_size + orig_w, orig_h = meta_dict["width"], meta_dict["height"] + + scaling_ratio = min((img_w / orig_w), (img_h / orig_h)) + target_size = (int(scaling_ratio * orig_h + 0.5), int(scaling_ratio * orig_w + 0.5)) + assert target_size[0] <= img_h and target_size[1] <= img_w, ( + f"Resize error. orig {(orig_w, orig_h)} desire {img_size} compute {target_size}" + ) + return [Resize(target_size)] + + def _sample_frame_indices(self, decoder_len: int, min_stride_override: int | None = None) -> tuple[list[int], int]: + min_stride = int(min_stride_override) if min_stride_override is not None else self.min_stride + max_stride = max(self.max_stride, min_stride) + stride = self._sample_stride_with_bias(max_stride, min_stride) + frame_indices = np.arange(0, decoder_len, stride).tolist() + max_num_frames = min(len(frame_indices), self.args.get("max_num_frames", 1000)) + if max_num_frames < 1: + return [], stride + + # Wan VAE temporal compression expects 1 + 4N video frames. 
+ num_video_frames = 1 + 4 * ((max_num_frames - 1) // 4) + return frame_indices[:num_video_frames], stride + + def _probe_video_len(self, video: bytes) -> int: + video_decoder = VideoDecoder( + video, + seek_mode=self.seek_mode, + num_ffmpeg_threads=self.video_decode_num_threads, + ) + try: + return len(video_decoder) + finally: + del video_decoder + + def _decode_frames_at( + self, + video: bytes, + frame_indices: list[int], + transforms: list[Resize] | None = None, + ) -> torch.Tensor: # returns [C,T,H,W] + video_decoder, post_decode_transforms = _create_video_decoder( + video, + self.seek_mode, + self.video_decode_num_threads, + transforms, + ) + try: + frame_batch = video_decoder.get_frames_at(frame_indices) + frames = frame_batch.data # [T,C,H,W] + frames = _apply_post_decode_transforms(frames, post_decode_transforms) # [T,C,H,W] + frames = frames.permute(1, 0, 2, 3) # [C,T,H,W] + finally: + del video_decoder + return frames # [C,T,H,W] + + def __call__(self, data_dict: dict) -> dict | None: + try: + meta_dict = data_dict[self.meta_key] + video = data_dict[self.video_key] + except Exception: + log.warning( + f"Cannot find video. url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + + if not self._validate_and_probe(video, meta_dict, data_dict): + return None + + rgb_transform = self._build_rgb_decode_transform(data_dict, meta_dict) + control_videos: dict[str, bytes] = {} + try: + decoder_len = self._probe_video_len(video) + + control_decoder_lens = [] + for control_video_key in self.control_video_keys: + control_video = data_dict.get(control_video_key) + if not isinstance(control_video, bytes): + log.warning( + f"VideoTransferAlignedFullFramesParsing: missing bytes for {control_video_key}. 
" + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + control_videos[control_video_key] = control_video + control_decoder_lens.append(self._probe_video_len(control_video)) + + # Precomputed control streams can be one frame shorter than RGB; sample + # only frames present in every stream to keep all modalities aligned. + aligned_decoder_len = min([decoder_len, *control_decoder_lens]) if control_decoder_lens else decoder_len + + min_stride_override = data_dict.pop(self.min_stride_key, None) + frame_indices, stride = self._sample_frame_indices( + aligned_decoder_len, min_stride_override=min_stride_override + ) + if len(frame_indices) == 0: + log.warning( + f"VideoTransferAlignedFullFramesParsing: no valid frame indices. " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + + video_frames = self._decode_frames_at(video, frame_indices, rgb_transform) # [C,T,H,W] + except Exception as e: + log.warning( + f"Failed to decode RGB video. url: {data_dict['__url__']}, key: {data_dict['__key__']}, error: {e}", + rank0_only=False, + ) + return None + + base_video_info = { + "frame_start": frame_indices[0], + "frame_end": frame_indices[-1], + "frame_indices": frame_indices, + "num_frames": len(frame_indices), + "fps": meta_dict["framerate"], + "conditioning_fps": meta_dict["framerate"] / stride, + "num_multiplier": stride, + "n_orig_video_frames": decoder_len, + } + data_dict[self.video_key] = { + **base_video_info, + "video": video_frames, # [C,T,H,W] + } + + for control_video_key, control_video in control_videos.items(): + try: + control_frames = self._decode_frames_at(control_video, frame_indices) # [C,T,H,W] + except Exception as e: + log.warning( + f"Failed to decode {control_video_key}. 
" + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}, error: {e}", + rank0_only=False, + ) + return None + data_dict[control_video_key] = { + **base_video_info, + "video": control_frames, # [C,T,H,W] + } + + return data_dict + + +class VideoTransferAlignedLegacyChunkParsing(VideoTransferAlignedFullFramesParsing): + """Decode legacy caption-window transfer streams with shared RGB/control frame indices.""" + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys=input_keys, output_keys=output_keys, args=args) + self.key_for_caption = self.args["key_for_caption"] + assert self.key_for_caption in [ + "t2w_windows", + "i2w_windows_later_frames", + ], "key_for_caption must be either t2w_windows or i2w_windows_later_frames" + self.min_duration = self.args["min_duration"] + self.num_frames = self.args["num_video_frames"] + self.use_native_fps = self.args["use_native_fps"] + self.use_original_fps = self.args["use_original_fps"] + self.use_dynamic_fps = self.args.get("use_dynamic_fps", False) + self.low_fps_bias = self.args.get("low_fps_bias", 0.5) + assert 0.0 <= self.low_fps_bias <= 1.0, f"low_fps_bias must be in [0, 1], got {self.low_fps_bias}" + mode_count = sum([self.use_dynamic_fps, self.use_native_fps, self.use_original_fps]) + assert mode_count <= 1, ( + f"Only one FPS mode can be enabled at a time. 
Got: " + f"use_dynamic_fps={self.use_dynamic_fps}, " + f"use_native_fps={self.use_native_fps}, " + f"use_original_fps={self.use_original_fps}" + ) + self.allowed_num_multiplers = self.args.get("allowed_num_multiplers", list(range(1, 100))) + if self.num_frames > 0: + self.sampler = UniformTemporalSubsample(self.num_frames) + + def _sample_legacy_stride_with_bias(self, max_stride: int) -> int: + if max_stride == 1: + return 1 + + strides = np.arange(1, max_stride + 1) + weights = np.linspace(1 - self.low_fps_bias, self.low_fps_bias, max_stride) + weights = np.maximum(weights, 0.01) + probs = weights / weights.sum() + return int(np.random.choice(strides, p=probs)) + + def _decode_all_streams_at( + self, + video: bytes, + control_videos: dict[str, bytes], + frame_indices: list[int], + rgb_transform: list[Resize] | None, + ) -> tuple[torch.Tensor, dict[str, torch.Tensor]]: + video_frames = self._decode_frames_at(video, frame_indices, rgb_transform) # [C,T,H,W] + control_frames_by_key = { + control_video_key: self._decode_frames_at(control_video, frame_indices) # [C,T,H,W] + for control_video_key, control_video in control_videos.items() + } + return video_frames, control_frames_by_key + + def _subsample_all_streams( + self, video_frames: torch.Tensor, control_frames_by_key: dict[str, torch.Tensor] + ) -> tuple[torch.Tensor, dict[str, torch.Tensor]]: + video_frames = rearrange( + self.sampler(rearrange(video_frames, "c t h w -> t c h w")), "t c h w -> c t h w" + ) # [C,T,H,W] + control_frames_by_key = { + key: rearrange(self.sampler(rearrange(frames, "c t h w -> t c h w")), "t c h w -> c t h w") + for key, frames in control_frames_by_key.items() + } # [C,T,H,W] + return video_frames, control_frames_by_key + + def __call__(self, data_dict: dict) -> dict | None: + try: + meta_dict = data_dict[self.meta_key] + video = data_dict[self.video_key] + except Exception: + log.warning( + f"Cannot find video. 
url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + + if not self._validate_and_probe(video, meta_dict, data_dict): + return None + + control_videos: dict[str, bytes] = {} + control_decoder_lens: list[int] = [] + try: + decoder_len = self._probe_video_len(video) + for control_video_key in self.control_video_keys: + control_video = data_dict.get(control_video_key) + if not isinstance(control_video, bytes): + log.warning( + f"VideoTransferAlignedLegacyChunkParsing: missing bytes for {control_video_key}. " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + control_videos[control_video_key] = control_video + control_decoder_lens.append(self._probe_video_len(control_video)) + except Exception as e: + log.warning( + f"Failed to probe video streams. url: {data_dict['__url__']}, key: {data_dict['__key__']}, error: {e}", + rank0_only=False, + ) + return None + + aligned_decoder_len = min([decoder_len, *control_decoder_lens]) if control_decoder_lens else decoder_len + options: list = list((i, item) for i, item in enumerate(meta_dict[self.key_for_caption])) + if len(options) > 1: + options = options[:-1] + random.shuffle(options) + + rgb_transform = self._build_rgb_decode_transform(data_dict, meta_dict) + video_frames = None + control_frames_by_key: dict[str, torch.Tensor] = {} + dynamic_conditioning_fps = None + num_multiplier: float | int = 1 + frame_indices: list[int] = [] + chunk_index = 0 + start_frame = 0 + end_frame = 0 + + for chunk_index, option in options: + start_frame = int(option["start_frame"]) + end_frame = min(int(option["end_frame"]), aligned_decoder_len) + if (end_frame - start_frame) < self.min_duration * meta_dict["framerate"]: + continue + if self.use_native_fps or self.use_original_fps or self.use_dynamic_fps: + if "alpamayo" in data_dict["__url__"].root: + start_frame += 5 + if (end_frame - start_frame) < self.num_frames: + continue + total_frames = 
end_frame - start_frame + if self.use_dynamic_fps: + max_stride = total_frames // self.num_frames + if max_stride < 1: + continue + num_multiplier = self._sample_legacy_stride_with_bias(max_stride) + dynamic_conditioning_fps = meta_dict["framerate"] / num_multiplier + elif self.use_native_fps: + num_multiplier = total_frames // self.num_frames + if num_multiplier not in self.allowed_num_multiplers: + continue + else: + num_multiplier = 1 + + expected_length = self.num_frames * int(num_multiplier) + if total_frames < expected_length: + continue + frame_start = start_frame + (total_frames - expected_length) // 2 + frame_end = frame_start + expected_length + frame_indices = list(range(frame_start, frame_end, int(num_multiplier))) + else: + frame_indices = list(range(start_frame, end_frame)) + if "alpamayo" in data_dict["__url__"].root: + if len(frame_indices) < 5: + continue + frame_indices = frame_indices[5:] + start_frame += 5 + + try: + video_frames, control_frames_by_key = self._decode_all_streams_at( + video, control_videos, frame_indices, rgb_transform + ) # [C,T,H,W] + except Exception as e: + log.warning( + f"Failed to decode aligned video streams. " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}, error: {e}", + rank0_only=False, + ) + return None + break + + if video_frames is None: + log.warning( + f"No valid video frames found, return None. 
url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + + if self.num_frames > 0 and not (self.use_dynamic_fps or self.use_native_fps or self.use_original_fps): + video_frames, control_frames_by_key = self._subsample_all_streams( + video_frames, control_frames_by_key + ) # [C,T,H,W] + num_multiplier = (end_frame - start_frame) / self.num_frames + + # Do not add + # variable-length fields like ``frame_indices`` here -- ``video_flatten_keys`` in + # ``get_video_transfer_augmentor`` lists ``frame_indices``, and surfacing a + # per-sample list there would crash ``custom_collate_fn`` (default_collate requires + # equal-size elements across the batch). + base_video_info = { + "fps": meta_dict["framerate"], + "n_orig_video_frames": meta_dict["nb_frames"], + "chunk_index": chunk_index, + "frame_start": start_frame, + "frame_end": end_frame, + "num_frames": end_frame - start_frame, + "num_multiplier": num_multiplier, + "conditioning_fps": dynamic_conditioning_fps or meta_dict["framerate"] / num_multiplier, + } + data_dict[self.video_key] = { + **base_video_info, + "video": video_frames, # [C,T,H,W] + } + for control_video_key, control_frames in control_frames_by_key.items(): + data_dict[control_video_key] = { + **base_video_info, + "video": control_frames, # [C,T,H,W] + } + return data_dict + + +class VideoTransferAlignedChunkedFramesParsing(VideoTransferAlignedFullFramesParsing): + """Decode RGB and aligned control videos for a selected caption chunk.""" + + def _sample_frame_indices_for_chunk( + self, + decoder_len: int, + chunk_start: int, + chunk_end: int, + min_stride_override: int | None = None, + ) -> tuple[list[int], int]: + chunk_start = max(0, min(chunk_start, decoder_len)) + chunk_end = max(chunk_start, min(chunk_end, decoder_len)) + if chunk_end <= chunk_start: + return [], 0 + + min_stride = int(min_stride_override) if min_stride_override is not None else self.min_stride + max_stride = max(self.max_stride, min_stride) + stride =
self._sample_stride_with_bias(max_stride, min_stride) + frame_indices = np.arange(chunk_start, chunk_end, stride).tolist() + max_num_frames = min(len(frame_indices), self.args.get("max_num_frames", 1000)) + if max_num_frames < 1: + return [], stride + + # Wan VAE temporal compression expects 1 + 4N video frames. + num_video_frames = 1 + 4 * ((max_num_frames - 1) // 4) + return frame_indices[:num_video_frames], stride + + def __call__(self, data_dict: dict) -> dict | None: + try: + meta_dict = data_dict[self.meta_key] + video = data_dict[self.video_key] + except Exception: + log.warning( + f"Cannot find video. url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + + if not self._validate_and_probe(video, meta_dict, data_dict): + return None + + if "chunk_start_frame" not in data_dict or "chunk_end_frame" not in data_dict: + log.warning( + f"VideoTransferAlignedChunkedFramesParsing: missing chunk_start_frame/chunk_end_frame. " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + chunk_start = int(data_dict["chunk_start_frame"]) + chunk_end = int(data_dict["chunk_end_frame"]) + + rgb_transform = self._build_rgb_decode_transform(data_dict, meta_dict) + control_videos: dict[str, bytes] = {} + try: + decoder_len = self._probe_video_len(video) + + control_decoder_lens = [] + for control_video_key in self.control_video_keys: + control_video = data_dict.get(control_video_key) + if not isinstance(control_video, bytes): + log.warning( + f"VideoTransferAlignedChunkedFramesParsing: missing bytes for {control_video_key}. " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + control_videos[control_video_key] = control_video + control_decoder_lens.append(self._probe_video_len(control_video)) + + # Clamp the caption chunk to frames available in every loaded stream. 
+ aligned_decoder_len = min([decoder_len, *control_decoder_lens]) if control_decoder_lens else decoder_len + min_stride_override = data_dict.pop(self.min_stride_key, None) + frame_indices, stride = self._sample_frame_indices_for_chunk( + aligned_decoder_len, + chunk_start, + chunk_end, + min_stride_override=min_stride_override, + ) + if len(frame_indices) == 0: + log.warning( + f"VideoTransferAlignedChunkedFramesParsing: empty chunk after clamping/stride. " + f"chunk=[{chunk_start},{chunk_end}), aligned_decoder_len={aligned_decoder_len}, " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + + video_frames = self._decode_frames_at(video, frame_indices, rgb_transform) # [C,T,H,W] + except Exception as e: + log.warning( + f"Failed to decode RGB video. url: {data_dict['__url__']}, key: {data_dict['__key__']}, error: {e}", + rank0_only=False, + ) + return None + + base_video_info = { + "frame_start": frame_indices[0], + "frame_end": frame_indices[-1], + "frame_indices": frame_indices, + "num_frames": len(frame_indices), + "fps": meta_dict["framerate"], + "conditioning_fps": meta_dict["framerate"] / stride if stride > 0 else meta_dict["framerate"], + "num_multiplier": stride, + "n_orig_video_frames": decoder_len, + } + data_dict[self.video_key] = { + **base_video_info, + "video": video_frames, # [C,T,H,W] + } + + for control_video_key, control_video in control_videos.items(): + try: + control_frames = self._decode_frames_at(control_video, frame_indices) # [C,T,H,W] + except Exception as e: + log.warning( + f"Failed to decode {control_video_key}. 
" + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}, error: {e}", + rank0_only=False, + ) + return None + data_dict[control_video_key] = { + **base_video_info, + "video": control_frames, # [C,T,H,W] + } + + data_dict.pop("chunk_start_frame", None) + data_dict.pop("chunk_end_frame", None) + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/merge_datadict.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/merge_datadict.py new file mode 100644 index 00000000..b7056e39 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/merge_datadict.py @@ -0,0 +1,79 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +class KeyRenamer(Augmentor): + """Renames keys in data_dict. Runs as the first augmentor to normalize key names. + + Args: + input_keys: Not used (required by Augmentor interface). + output_keys: Not used. + args: Dictionary with: + - rename_map: dict[str, str] mapping old_key -> new_key. 
+ """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + self.rename_map: dict[str, str] = args.get("rename_map", {}) if args else {} + + def __call__(self, data_dict: dict) -> dict: + if not self.rename_map: + return data_dict + + for old_key, new_key in self.rename_map.items(): + if old_key in data_dict: + data_dict[new_key] = data_dict.pop(old_key) + return data_dict + + +class DataDictMerger(Augmentor): + def __init__(self, input_keys: list, output_keys: list, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict | None: + r"""Merge the dictionary associated with the input keys into data_dict. Only keys in output_keys are merged. + + Supports transfer-style keys (e.g. depth_pervideo_video_depth_anything): when "depth" in key + assigns key_dict["video"] to data_dict["depth"]; when "segmentation" in key assigns + key_dict["video"] or key_dict to data_dict["segmentation"]. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with dictionary associated with the input keys merged. 
+ """ + for key in self.input_keys: + if key not in data_dict: + log.warning( + f"DataDictMerger dataloader error: missing {key}, {data_dict['__url__']}, {data_dict['__key__']}", + rank0_only=False, + ) + return None + key_dict = data_dict.pop(key) + if "depth" in key and "depth" in self.output_keys: + data_dict["depth"] = key_dict["video"] + elif "segmentation" in key and "segmentation" in self.output_keys: + data_dict["segmentation"] = key_dict["video"] if "video" in key_dict else key_dict + if isinstance(key_dict, dict): + for sub_key in key_dict: + if sub_key in self.output_keys and sub_key not in data_dict: + data_dict[sub_key] = key_dict[sub_key] + del key_dict + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/pkl_to_media.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/pkl_to_media.py new file mode 100644 index 00000000..3a092f18 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/pkl_to_media.py @@ -0,0 +1,360 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +"""Augmentors for handling video loading from pickled bytes.""" + +import io +import pickle as pkl +import random +import re +from typing import Dict, Optional + +import numpy as np +import torch +from PIL import Image, UnidentifiedImageError +from qwen_vl_utils.vision_process import smart_nframes, smart_resize +from torchvision import transforms +from torchvision.transforms import InterpolationMode + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + +Image.MAX_IMAGE_PIXELS = 933120000 +_VIDEO_EXTENSIONS = "mp4 avi webm mov".split() + +VIDEO_DECODER_OPTIONS = {} + + +def token_to_pixels(token_length: int, patch_size: int = 14, temporal_patch_size: int = 2) -> int: + """Convert token length to pixels based on patch size and temporal patch size.""" + + merged_patch_size = patch_size * 2 + return token_length * merged_patch_size**2 * temporal_patch_size + + +def pixels_to_token(pixels: int, patch_size: int = 14, temporal_patch_size: int = 2) -> int: + """Convert pixels to token length based on patch size and temporal patch size.""" + + merged_patch_size = patch_size * 2 + return pixels // merged_patch_size**2 // temporal_patch_size + + +def tensor_to_pil_images(video_tensor): + """ + Convert a video tensor of shape (C, T, H, W) or (T, C, H, W) to a list of PIL images. 
+ + Args: + video_tensor (torch.Tensor): Video tensor with shape (C, T, H, W) or (T, C, H, W) + + Returns: + list[PIL.Image.Image]: List of PIL images + """ + # Check tensor shape and convert if needed + if video_tensor.shape[0] == 3 and video_tensor.shape[1] > 3: # (C, T, H, W) + # Convert to (T, C, H, W) + video_tensor = video_tensor.permute(1, 0, 2, 3) # [T,C,H,W] + + # Convert to numpy array with shape (T, H, W, C) + video_np = video_tensor.permute(0, 2, 3, 1).cpu().numpy() # [T,H,W,C] + + # Ensure values are in the right range for PIL (0-255, uint8) + if video_np.dtype == np.float32 or video_np.dtype == np.float64: + if video_np.max() <= 1.0: + video_np = (video_np * 255).astype(np.uint8) + else: + video_np = video_np.astype(np.uint8) + + # Convert each frame to a PIL image + pil_images = [Image.fromarray(frame) for frame in video_np] + + return pil_images + + +def _video_decoder_qwen_func( + key: str, + data: bytes, + min_fps_thres: int = 4, + max_fps_thres: int = 60, + target_fps: float = 2.0, + min_video_token_length: int = 16, + max_video_token_length: int = 8192, + num_threads: int = 0, + random_augmentation: bool = False, + fps_random_range: list[float] = [0.5, 1.5], + max_video_token_length_random_range: list[float] = [0.75, 1.25], + frame_count_random_range: Optional[list[int]] = None, + start_frame: Optional[int] = None, + end_frame: Optional[int] = None, + **kwargs, +) -> dict | None: + """Actual video decoder function. + + Args: + key (str): Video file name/key + data (bytes): Video binary data + min_fps_thres (int, optional): Minimum FPS threshold. Defaults to 4. + max_fps_thres (int, optional): Maximum FPS threshold. Defaults to 60. + target_fps (float, optional): Target FPS. Defaults to 2.0. + min_video_token_length (int, optional): Minimum token length. Defaults to 16. + max_video_token_length (int, optional): Maximum token length. Defaults to 8192. + num_threads (int, optional): Number of threads for decord. Defaults to 0. 
+ random_augmentation (bool, optional): Whether to randomize the FPS and max_video_token_length. Defaults to False. + fps_random_range (list[float], optional): Random multiplier range applied to target_fps. Defaults to [0.5, 1.5]. + max_video_token_length_random_range (list[float], optional): Random multiplier range applied to max_video_token_length. Defaults to [0.75, 1.25]. + frame_count_random_range (list[int], optional): Random frame count range. If provided, takes priority over fps_random_range. + start_frame (Optional[int], optional): Start frame. Defaults to None. If both start_frame and end_frame are provided, the video will be decoded from start_frame to end_frame. + end_frame (Optional[int], optional): End frame. Defaults to None. If both start_frame and end_frame are provided, the video will be decoded from start_frame to end_frame. + + Raises: + ValueError: Video fps lower than 1, skipping + ValueError: Video fps lower than min_fps_thres, skipping + ValueError: Video fps higher than max_fps_thres, skipping + + Returns: + dict | None: Dictionary with the video frames tensor ("videos", [C,T,H,W]) and the sampled FPS ("fps"), or None if the key has no supported video extension + """ + import decord + + # Check video extension + extension = re.sub(r".*[.]", "", key) + if extension.lower() not in _VIDEO_EXTENSIONS: + return None + + # Read video + video_buffer = io.BytesIO(data) + video_reader = decord.VideoReader(video_buffer, num_threads=num_threads) + total_frames, video_fps = len(video_reader), video_reader.get_avg_fps() + + if start_frame is not None and end_frame is not None: + total_frames = end_frame - start_frame + + if video_fps < 1: + raise ValueError("Video fps lower than 1, skipping") + if video_fps < min_fps_thres: + raise ValueError(f"Video fps {video_fps} lower than {min_fps_thres}, skipping") + if video_fps > max_fps_thres: + raise ValueError(f"Video fps {video_fps} higher than {max_fps_thres}, skipping") + + if random_augmentation: + if frame_count_random_range is not None: + # Random number of frames + min_frames_range, max_frames_range = frame_count_random_range +
min_frames_range = min(min_frames_range, total_frames) + max_frames_range = min(max_frames_range, total_frames) + target_frames = random.uniform(min_frames_range, max_frames_range) + target_fps = target_frames / total_frames * video_fps + else: + # randomize fps + target_fps = ( + random.uniform(fps_random_range[0], fps_random_range[1]) * target_fps + if random.random() < 0.5 + else target_fps + ) + # randomize max_video_token_length + max_video_token_length = int( + random.uniform(max_video_token_length_random_range[0], max_video_token_length_random_range[1]) + * max_video_token_length + ) + log.debug(f"random_augmentation: max_video_token_length: {max_video_token_length}, target_fps: {target_fps}") + + patch_size = 14 + min_height_width = 56 # https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py#L57 + temporal_patch_size = 2 + min_pixels: int = token_to_pixels(min_video_token_length, patch_size, temporal_patch_size) + max_pixels: int = token_to_pixels(max_video_token_length, patch_size, temporal_patch_size) + max_frames: int = max_pixels // (min_height_width) ** 2 // temporal_patch_size + + # sample based on target fps + nframes = smart_nframes(dict(fps=target_fps), total_frames=total_frames, video_fps=video_fps) + nframes = min(nframes, max_frames) + if start_frame is not None and end_frame is not None: + idx = torch.linspace(start_frame, end_frame - 1, nframes).round().long().tolist() # [nframes] + else: + idx = torch.linspace(0, total_frames - 1, nframes).round().long().tolist() # [nframes] + video_frames = video_reader.get_batch(idx).asnumpy() + video_frames = torch.tensor(video_frames).permute(0, 3, 1, 2) # [T,C,H,W] + sample_fps = nframes / max(total_frames, 1e-6) * video_fps + + # recompute max_pixels based on number of sampled frames + nframes, _, height, width = video_frames.shape + max_pixels = max_pixels // nframes + resized_height, resized_width = smart_resize( + height, + width, + 
min_pixels=min_pixels, + max_pixels=max_pixels, + ) + video_frames = transforms.functional.resize( + video_frames, + [resized_height, resized_width], + interpolation=InterpolationMode.BICUBIC, + antialias=True, + ).float() + video_frames = video_frames.permute(1, 0, 2, 3) # [C,T,H,W] + + # Clean up + video_reader.seek(0) # set video reader point back to 0 to clean up cache + del video_reader # delete the reader to avoid memory leak + + return dict(videos=video_frames, fps=sample_fps) + + +class PKLToMedia(Augmentor): + """ + Converts PKL bytes stored in a data dictionary into media. + + Handles input formats for the specified input key: + A dictionary mapping media names (str) to bytes objects. + + The output format is a dictionary mapping names to their respective decoded objects: + Input dict[str, bytes] -> Output dict[str, torch.Tensor | PIL.Image] + + Corrupted or non-decodable bytes are skipped with a warning. + """ + + def __init__( + self, + input_key: str = "media", + output_key: str = "media", + min_fps_thres: int = 4, + max_fps_thres: int = 60, + target_fps: float = 4.0, + min_video_token_length: int = 16, + max_video_token_length: int = 8192, + num_threads: int = 0, + random_augmentation: bool = False, + is_input_in_dict: bool = False, + use_start_frame_end_frame: bool = False, + frame_count_random_range: Optional[list[int]] = None, + ) -> None: + """ + Args: + input_key (str): Key in the data_dict containing video/image data. + output_key (str): Key to store the resulting video frame tensors or PIL images. + min_fps_thres (int): Minimum FPS threshold for video decoding. + max_fps_thres (int): Maximum FPS threshold for video decoding. + target_fps (float): Target FPS for video decoding. + min_video_token_length (int): Minimum token length for video decoding. + max_video_token_length (int): Maximum token length for video decoding. + num_threads (int): Number of threads for video decoding. 
+ random_augmentation (bool): Whether to apply random augmentation during decoding. + is_input_in_dict (bool): Whether the input key is in the data_dict instead of pkl files. (For cosmos predict2 videos) + use_start_frame_end_frame (bool): Whether to use start_frame and end_frame to decode the video. (For cosmos predict2 videos) + frame_count_random_range (list[int], optional): Random frame count range. Defaults to None. + """ + self.input_key = input_key + self.output_key = output_key + self.video_decoder_params = { + "min_fps_thres": min_fps_thres, + "max_fps_thres": max_fps_thres, + "target_fps": target_fps, + "min_video_token_length": min_video_token_length, + "max_video_token_length": max_video_token_length, + "num_threads": num_threads, + "random_augmentation": random_augmentation, + "frame_count_random_range": frame_count_random_range, + } + self.is_input_in_dict = is_input_in_dict + self.use_start_frame_end_frame = use_start_frame_end_frame + + def _bytes_to_video_frames(self, video_bytes: bytes, identifier: str = "video") -> Optional[Dict]: + """Converts video bytes to a list of PIL frames using the video decoder.""" + try: + result = _video_decoder_qwen_func( + key=f"{identifier}.mp4", # Add .mp4 extension for the decoder + data=video_bytes, + **self.video_decoder_params, + ) + if result is None: + log.warning(f"Skipping item '{identifier}': Video decoder returned None.") + return None + # Convert only after the None check so a skipped video is not reported as a decode error. + result["videos"] = tensor_to_pil_images(result["videos"]) # 3,T,H,W -> list of PIL images + return result + except Exception as e: + log.warning(f"Skipping item '{identifier}': Error decoding video bytes: {e}") + return None + + def _bytes_to_pil(self, image_bytes: bytes, identifier: str = "image") -> Optional[Image.Image]: + """Converts a single bytes object to a PIL Image.""" + try: + with io.BytesIO(image_bytes) as stream: + img = Image.open(stream) + img.load() # Verify the image data + return img.convert("RGB") # Convert to standard RGB format + except
UnidentifiedImageError: + log.warning(f"Skipping item '{identifier}': Cannot identify image file from bytes.") + except Exception as e: + log.warning(f"Skipping item '{identifier}': Error decoding image bytes: {e}") + return None + + def __call__(self, data_dict: Dict) -> Dict: + """ + Processes the data_dict to convert video/image bytes to their respective formats. + + Args: + data_dict (Dict): The input data dictionary. + + Returns: + Dict: The modified data dictionary with video frame tensors and/or PIL images. + """ + input_key = self.input_key + output_key = self.output_key + + if input_key not in data_dict: + log.debug( + f"Input key '{input_key}' not found in data_dict. Skipping PKLToMedia. Available keys: {data_dict.keys()}" + ) + return data_dict + + if not self.is_input_in_dict: + data = pkl.loads(data_dict[input_key]) + else: + data = data_dict[input_key] + + output_data = {} + + if isinstance(data, dict): + for name, item in data.items(): + if isinstance(item, bytes): + # Determine if this is video or image based on the key name + if "video" in name.lower(): + # Decode as video + result = self._bytes_to_video_frames(item, identifier=f"{input_key}['{name}']") + if result: + output_data[name] = result + elif "image" in name.lower(): + # Decode as image + result = self._bytes_to_pil(item, identifier=f"{input_key}['{name}']") + if result: + output_data[name] = result + else: + log.warning( + f"Skipping item with key '{name}' in '{input_key}': Key does not contain 'video' or 'image'." + ) + else: + log.warning(f"Skipping item with key '{name}' in '{input_key}': Expected bytes, got {type(item)}.") + else: + raise ValueError( + f"Input key '{input_key}' has unsupported type {type(data)}. " + f"Expected dict[str, bytes] for video/image data." 
+ ) + + # Add the processed data and optionally remove the input key + data_dict[output_key] = output_data + if input_key != output_key: + del data_dict[input_key] + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/resolution_text_info.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/resolution_text_info.py new file mode 100644 index 00000000..ef03adec --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/resolution_text_info.py @@ -0,0 +1,134 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + +# Default templates for resolution info +DEFAULT_IMAGE_TEMPLATE = "This image is of {height}x{width} resolution." +DEFAULT_VIDEO_TEMPLATE = "This video is of {height}x{width} resolution." + + +class ResolutionTextInfo(Augmentor): + """ + Augmentor that appends resolution (height x width) info to captions. + + This augmentor should run AFTER CropToMultiple (which sets final_height/final_width) + and AFTER text transforms (so ai_caption exists), but BEFORE tokenization. + + Reads resolution from metadata keys (final_height, final_width) set by CropToMultiple. + Does NOT fall back to tensor shape to avoid incorrect latent dimensions. 
+ + Automatically detects whether the input is an image or video based on which + key is present in the data_dict, and uses the appropriate template. + + Example: + Original caption: "A cat playing with a ball" + Augmented (image): "A cat playing with a ball. This image is of 512x512 resolution." + Augmented (video): "A cat playing with a ball. This video is of 480x854 resolution." + + Args: + input_keys (list): Input keys (not used, kept for API compatibility) + output_keys (list): Output keys (not used, kept for API compatibility) + args (dict): Configuration arguments: + - caption_key (str): Key for caption in data_dict. Default: "ai_caption" + - video_key (str): Key for video tensor in data_dict. Default: "video" + - image_size_key (str): Key for image size tensor in data_dict. Default: "image_size" + - image_template (str): Format string for image metadata. Default: DEFAULT_IMAGE_TEMPLATE + - video_template (str): Format string for video metadata. Default: DEFAULT_VIDEO_TEMPLATE + - separator (str): Separator between caption and metadata. Default: ". " + - enabled (bool): Whether augmentation is enabled. Default: True + """ + + def __init__( + self, input_keys: Optional[list] = None, output_keys: Optional[list] = None, args: Optional[dict] = None + ) -> None: + super().__init__(input_keys, output_keys, args) + + # Configuration with sensible defaults + self.caption_key = args.get("caption_key", "ai_caption") if args else "ai_caption" + self.image_key = args.get("image_key", "images") if args else "images" + self.video_key = args.get("video_key", "video") if args else "video" + self.image_size_key = args.get("image_size_key", "image_size") if args else "image_size" + self.image_template = args.get("image_template", DEFAULT_IMAGE_TEMPLATE) if args else DEFAULT_IMAGE_TEMPLATE + self.video_template = args.get("video_template", DEFAULT_VIDEO_TEMPLATE) if args else DEFAULT_VIDEO_TEMPLATE + self.default_separator = args.get("separator", ". ") if args else ".
" + self.enabled = args.get("enabled", True) if args else True + + def __call__(self, data_dict: dict) -> dict | None: + """ + Append resolution (height x width) as text timestamps to the caption. + + Args: + data_dict (dict): Input data dict containing caption and image/video tensor + + Returns: + data_dict (dict): Output dict with augmented caption. + """ + if not self.enabled: + return data_dict + + # Get caption - must exist at this point (set by text transforms) + assert self.caption_key in data_dict, f"caption_key '{self.caption_key}' not found in data_dict." + caption = data_dict[self.caption_key] + + if (not isinstance(caption, str) and not isinstance(caption, dict)) or caption == "": + # This is for unconditional case. + return data_dict + + # Detect image vs video to select template + is_video = self.video_key in data_dict + is_image = self.image_key in data_dict + + if isinstance(caption, str): + # Case 1: Caption is a string. In this case, we create a string template for + # resolution, aspect ratio info and add it + if not is_video and not is_image: + raise ValueError("Neither video_key nor image_key found in data_dict.") + + template = self.video_template if is_video else self.image_template + + # Get dimensions from metadata keys (set by CropToMultiple) + image_size = data_dict.get(self.image_size_key) + height = int(image_size[0]) + width = int(image_size[1]) + + # Format metadata text + metadata_text = template.format(height=height, width=width) + + # Choose separator based on whether caption ends with a period + separator = " " if caption.rstrip().endswith(".") else self.default_separator + + # Update caption + data_dict[self.caption_key] = caption + separator + metadata_text + + elif isinstance(caption, dict): + # Case 2: Caption is a dictionary. This is for the json caption case. 
+ # In this case, we add resolution and aspect ratio in json fields + aspect_ratio = data_dict["__url__"].meta.opts["aspect_ratio"] + height = int(data_dict[self.image_size_key][0]) + width = int(data_dict[self.image_size_key][1]) + data_dict[self.caption_key].update( + { + "resolution": {"H": height, "W": width}, + "aspect_ratio": aspect_ratio, + } + ) + + else: + raise ValueError(f"Unsupported caption type: {type(caption)}") + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/sequence_plan.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/sequence_plan.py new file mode 100644 index 00000000..3f28d562 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/sequence_plan.py @@ -0,0 +1,142 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentor for creating sequence plans with random conditional frames.
+ +Supports two sampling strategies: +- weighted dict (``conditioning_config``): explicit frame-count → probability pairs +- uniform (``uniform_conditioning=True``): k ~ Uniform{0, T_latent-1}, where T_latent + is computed from the actual video length using the VAE temporal compression factor +""" + +import random +from typing import Optional + +import torch + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.vfm.datasets.sequence_packing import SequencePlan + + +class SequencePlanAugmentor(Augmentor): + """Augmentor that creates SequencePlan with random conditional frames. + + Samples k conditioning frames and writes ``condition_frame_indexes_vision = list(range(k))`` + into the SequencePlan. Downstream packing code reads this field to set condition_mask. + + Args: + input_keys: List of input keys (not used, but required by Augmentor interface). + output_keys: List of output keys (not used, but required by Augmentor interface). + args: Dictionary containing: + - "conditioning_config" (dict[int, float], optional): Weighted distribution + mapping latent-frame counts to unnormalized probabilities. + Example: {0: 0.5, 4: 0.3, 8: 0.2}. Clamped to T_latent-1 at runtime. + - "uniform_conditioning" (bool, default False): When True, samples + k ~ Uniform{0, T_latent-1}. Takes precedence over conditioning_config when + both are set. At least one of uniform_conditioning or conditioning_config + must be provided. + - "temporal_compression_factor" (int, default 4): VAE temporal compression + factor used to convert pixel frame count N to T_latent = 1 + (N-1) // tcf. 
+ """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + if args is None: + args = {} + + self.conditioning_config = args.get("conditioning_config") + self.uniform_conditioning = args.get("uniform_conditioning", False) + self.temporal_compression_factor = args.get("temporal_compression_factor", 4) + + if self.conditioning_config is None and not self.uniform_conditioning: + raise ValueError("args must provide 'conditioning_config' or set 'uniform_conditioning=True'") + + # Validate and normalize probabilities + if self.conditioning_config is not None: + # Validate keys are non-negative integers + for num_frames, prob in self.conditioning_config.items(): + if not isinstance(num_frames, int) or num_frames < 0: + raise ValueError(f"conditioning_config keys must be non-negative integers, got {num_frames}") + if not isinstance(prob, (int, float)) or prob < 0: + raise ValueError(f"conditioning_config values must be non-negative numbers, got {prob}") + + # Normalize probabilities to sum to 1.0 + total_prob = sum(self.conditioning_config.values()) + if total_prob <= 0: + raise ValueError("conditioning_config probabilities must sum to a positive number") + + self.normalized_config = {k: v / total_prob for k, v in self.conditioning_config.items()} + else: + self.normalized_config = {0: 1.0} + + def __call__(self, data_dict: dict) -> dict: + """Create a SequencePlan with random conditional frames. + + Args: + data_dict: Input data dictionary. Should contain "video" key to determine + the number of frames available. + + Returns: + data_dict: Output dictionary with "sequence_plan" key added. + """ + # Get video to determine available frames + video = data_dict.get("video") + if video is None or (self.conditioning_config is None and not self.uniform_conditioning): + # This is an image batch + sequence_plan = SequencePlan( + has_text=True, # Has text prompt! 
+ has_vision=True, + condition_frame_indexes_vision=[], # No conditioning frames! + ) + data_dict["sequence_plan"] = sequence_plan + return data_dict + + # Determine number of frames + # Video should be a tensor with shape (C, T, H, W) by this point in the pipeline + if isinstance(video, torch.Tensor): + assert video.ndim == 4, "video should be a tensor with shape (C, T, H, W)" + num_frames = video.shape[1] + else: + # If video is not a tensor or dict, we can't determine the exact number + # Use a conservative approach - will be limited by max available frames + num_frames = None + + T_latent = 1 + (num_frames - 1) // self.temporal_compression_factor if num_frames is not None else 1 + + # Sample number of conditional frames + if self.uniform_conditioning: + num_conditional_frames = random.randint(0, max(0, T_latent - 1)) + else: + frames_options = list(self.normalized_config.keys()) + weights = list(self.normalized_config.values()) + num_conditional_frames = random.choices(frames_options, weights=weights, k=1)[0] + num_conditional_frames = min(num_conditional_frames, T_latent - 1) if num_frames is not None else 0 + + # Create condition_frame_indexes_vision list + # Conditional frames are always the first N frames + condition_frame_indexes_vision = list(range(num_conditional_frames)) + + # Create SequencePlan + sequence_plan = SequencePlan( + has_text=True, + has_vision=True, + condition_frame_indexes_vision=condition_frame_indexes_vision, + ) + + # Add sequence plan to data dict + data_dict["sequence_plan"] = sequence_plan + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/sound_sequence_plan.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/sound_sequence_plan.py new file mode 100644 index 00000000..14c98260 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/sound_sequence_plan.py @@ -0,0 +1,107 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentor that builds a SequencePlan for sound-enabled training. + +This augmentor creates a SequencePlan based on the presence of sound data +in the sample, following the same pattern as Action's ActionTransformPipeline +which builds sequence plans for action-enabled training. + +Placed at the END of the augmentor pipeline (after video/audio extraction +and text transforms) so that all data shapes are known. +""" + +from typing import Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.vfm.datasets.sound_data_utils import VALID_SOUND_MODES, build_sequence_plan_for_sound + + +class SoundSequencePlanBuilder(Augmentor): + """Builds a SequencePlan for sound-enabled samples. + + Inspects the data dict for sound data and creates an appropriate + SequencePlan. If no sound is present, creates a video-only plan. + + Args: + input_keys: Not used (reads from data_dict directly) + output_keys: Not used + args: Dictionary with: + - mode: Generation mode ("t2vs", "tv2s", "ts2v", "ti2sv"). Default: "t2vs" + - video_key: Key to find video data. Default: "video" + - sound_key: Key to find sound data. 
Default: "sound" + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + self.mode = (args or {}).get("mode", "t2vs") + self.video_key = (args or {}).get("video_key", "video") + self.sound_key = (args or {}).get("sound_key", "sound") + + assert self.mode in VALID_SOUND_MODES, f"Invalid mode: {self.mode}. Must be one of {VALID_SOUND_MODES}" + + def __call__(self, data_dict: dict) -> dict | None: + """Add sound fields to the existing SequencePlan. + + Only modifies ``has_sound`` and ``condition_frame_indexes_sound``. + All other fields (vision conditioning, action conditioning, etc.) set + by upstream augmentors are preserved. + + If no upstream plan exists, creates a minimal one with sensible defaults. + """ + video = data_dict.get(self.video_key) + sound = data_dict.get(self.sound_key) + + if video is None: + return None # Can't proceed without video + + if not hasattr(video, "shape"): + return None + + video_length = video.shape[1] # (C, T, H, W) → T + + existing_plan = data_dict.get("sequence_plan") + + if existing_plan is not None: + # Update only the sound fields on the existing plan + if sound is not None and hasattr(sound, "shape"): + sound_plan = build_sequence_plan_for_sound( + mode=self.mode, + video_latent_length=video_length, + sound_latent_length=0, + ) + existing_plan.has_sound = sound_plan.has_sound + existing_plan.condition_frame_indexes_sound = sound_plan.condition_frame_indexes_sound + else: + existing_plan.has_sound = False + existing_plan.condition_frame_indexes_sound = [] + else: + # No upstream plan; build a complete one from scratch + if sound is not None and hasattr(sound, "shape"): + data_dict["sequence_plan"] = build_sequence_plan_for_sound( + mode=self.mode, + video_latent_length=video_length, + sound_latent_length=0, + ) + else: + from cosmos3._src.vfm.datasets.sequence_packing import SequencePlan + + data_dict["sequence_plan"] = SequencePlan( 
has_text=True, + has_vision=True, + has_sound=False, + ) + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/text_tokenizer.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/text_tokenizer.py new file mode 100644 index 00000000..431599d2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/text_tokenizer.py @@ -0,0 +1,108 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +# Augmentor for tokenizing input text + +import json +import random +from typing import Optional + +import torch + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.lazy_config import instantiate as lazy_instantiate + +_MAX_NUM_TOKENS = 4096 + + +class TextTokenizerTransform(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + tokenizer_config = self.args["tokenizer_config"] + self.cfg_dropout_rate = self.args["cfg_dropout_rate"] + self.use_system_prompt = self.args.get("use_system_prompt", False) + + self._processor = lazy_instantiate(tokenizer_config) + + def __call__(self, data_dict: dict) -> dict: + input_caption = data_dict[self.input_keys[0]] + + if isinstance(input_caption, dict): + # Encode dict into a json string. This json string is then passed to the transformer tokenizer. + input_caption = json.dumps(input_caption) + data_dict[self.input_keys[0]] = input_caption + + if self.cfg_dropout_rate > 0: + # If CFG is used, randomly drop the input caption + # We drop the input caption by replacing it with an empty string + if random.random() < self.cfg_dropout_rate: + input_caption = "" + data_dict[self.input_keys[0]] = input_caption + + text_ids = self._processor.tokenize_text( + input_caption, + is_video=False, + use_system_prompt=self.use_system_prompt, + ) + text_ids = text_ids[:_MAX_NUM_TOKENS] # truncate the text ids to the maximum number of tokens + # This will take care of weird edge cases where we generate extremely long captions + data_dict[self.output_keys[0]] = torch.tensor(text_ids) # [N_tokens] + return data_dict + + +_SYSTEM_PROMPT_IMAGE_EDITING = "You are a helpful assistant who will edit images based on the user's instructions." 
+ +_SYSTEM_PROMPT_TRANSFER = "You are a helpful assistant that generates images or videos following the user's instructions and control signals (edge maps, blur, depth, or segmentation)." + +_SYSTEM_PROMPTS = { + "editing": _SYSTEM_PROMPT_IMAGE_EDITING, + "transfer": _SYSTEM_PROMPT_TRANSFER, +} + + +class TextTokenizerTransformForEditing(Augmentor): + """Tokenizer augmentor for interleaved tasks: image editing or transfer (control-conditioned generation). + + Uses a task-specific system prompt. Pass args["task"] = "editing" (default) or "transfer". + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + tokenizer_config = self.args["tokenizer_config"] + self.cfg_dropout_rate = self.args.get("cfg_dropout_rate", 0.0) + task = self.args.get("task", "editing") + self._system_prompt = _SYSTEM_PROMPTS.get(task, _SYSTEM_PROMPTS["editing"]) + + self._processor = lazy_instantiate(tokenizer_config) + + def __call__(self, data_dict: dict) -> dict | None: + input_caption = data_dict.get(self.input_keys[0], "") + if self.cfg_dropout_rate > 0 and random.random() < self.cfg_dropout_rate: + input_caption = "" + data_dict[self.input_keys[0]] = input_caption + text_ids = self._processor.tokenize_text(input_caption, system_prompt=self._system_prompt) + data_dict[self.output_keys[0]] = torch.tensor(text_ids) # [N_tokens] + return data_dict + + +class TextTokenizerTransformForTransfer(TextTokenizerTransformForEditing): + """Tokenizer augmentor for transfer (control-conditioned) generation. 
Uses transfer system prompt.""" + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + args = dict(args) if args else {} + args["task"] = "transfer" + super().__init__(input_keys, output_keys, args) diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/text_transforms_for_image.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/text_transforms_for_image.py new file mode 100644 index 00000000..fb2f147e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/text_transforms_for_image.py @@ -0,0 +1,308 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import json +import random +from typing import Optional + +from cosmos3._src.imaginaire.datasets.augmentors.v3_text_transforms import pad_and_resize +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.data_sources.data_registration import _CAPTION_EMBEDDING_KEY_MAPPING_IMAGES + +# For the qwen captions, we have 3 variants: short, medium, long +# In addition, for synthetic data, we create prompt embeddings as well. +# There is quite a bit of entropy in the way prompt data is saved. 
+# Captions are saved as "prompts", while the corresponding embeddings are saved as "original_prompt" +# This part will be cleaned after synthetic data is cleaned to be in the same format as real data. +_AVAILABLE_QWEN_CAPTIONS = ["qwen2p5_7b_short", "qwen2p5_7b_medium", "qwen2p5_7b_long"] +_AVAILABLE_QWEN3_30B_A3B_CAPTIONS = [ + "qwen3_30b_a3b_short", + "qwen3_30b_a3b_descriptive", + "qwen3_30b_a3b_dense", +] +# used for new caption in Nov 2025 +_AVAILABLE_CAPTIONS_V2 = ["caption_short", "caption_medium", "caption_long"] +# used for sft v1 +_AVAILABLE_CAPTIONS_SFT_V1 = [ + "gemini_v1_dense", + "gemini_v2_dense", + "qwen3vl_30B_v1_dense", + "qwen3vl_30B_v2_dense", + "qwen3vl_235B_v1_dense", + "qwen3vl_235B_v2_dense", +] +# used for genplan ablation +# captions are saved as "caption_long" as a JSON string, like {"dense": "xxx", "dense_bbox": "xxx"} +_AVAILABLE_CAPTIONS_GENPLAN = ["dense", "dense_bbox"] +_CAPTION_EMBEDDING_MAPPING = { + "qwen2p5_7b_short": "qwen2p5_7b_short", + "qwen2p5_7b_medium": "qwen2p5_7b_medium", + "qwen2p5_7b_long": "qwen2p5_7b_long", + "prompts": "original_prompt", +} + + +class TextTransformForImage(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs text transformation for image samples. 
+ + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with the caption and T5 text embeddings added + """ + + caption_type = self.args["caption_type"] + embedding_key_in_dict = _CAPTION_EMBEDDING_KEY_MAPPING_IMAGES[caption_type] + embedding_type = self.args["embedding_type"] + embedding_input_key_prefix = "" if embedding_type == "t5_xxl" else "umt5_" + + captions_key, embeddings_key = ( + f"captions_{caption_type}", + f"{embedding_input_key_prefix}embeddings_captions_{embedding_key_in_dict}", + ) + decoded_captions_ai = data_dict[captions_key] + decoded_embeddings_ai = data_dict[embeddings_key] + + try: + # Hotfix: Some captions are labeled as "captions" and some are labeled as "caption" + # This issue needs to be fixed in the synthetic data. This is a hack and will be removed + # once the data is cleaned. + caption_key = "captions" if "captions" in decoded_captions_ai else "caption" + embedding_key = "t5_xxl_fp8" if embedding_type == "t5_xxl" else "umt5_xxl" + if caption_type == "qwen2p5_7b_v4": + selected_caption_type = random.choice(_AVAILABLE_QWEN_CAPTIONS) + data_dict["ai_caption"] = decoded_captions_ai[caption_key][selected_caption_type] + t5_embedding = decoded_embeddings_ai[selected_caption_type]["embeddings"][embedding_key] + data_dict["selected_caption_type"] = selected_caption_type + elif caption_type == "prompts": + data_dict["ai_caption"] = decoded_captions_ai["caption"]["prompt"] + t5_embedding = decoded_embeddings_ai[_CAPTION_EMBEDDING_MAPPING[caption_type]]["embeddings"][ + embedding_key + ] + data_dict["selected_caption_type"] = caption_type + else: + assert caption_type == "ai_v3p1", f"Caption type {caption_type} not supported" + if decoded_captions_ai["had_parse_issue"]: + data_dict["ai_caption"] = decoded_captions_ai["captions"]["kosmos_2"] + t5_embedding = decoded_embeddings_ai["kosmos2"]["embeddings"][embedding_key] + else: + data_dict["ai_caption"] = decoded_captions_ai["captions"]["vfc"] + t5_embedding = 
decoded_embeddings_ai["vfc_fidelity"]["embeddings"][embedding_key] + + out_t5, out_t5_mask = pad_and_resize( + t5_embedding, + self.args["t5_tokens"]["num"], + is_mask_all_ones=self.args["is_mask_all_ones"], + ) + data_dict["t5_text_embeddings"] = out_t5 + data_dict["t5_text_mask"] = out_t5_mask + except Exception as e: + log.warning( + f"TextTransform dataloader error: {data_dict['__url__']}, {data_dict['__key__']}\n error {e}", + rank0_only=False, + ) + return None + + del data_dict[captions_key] + del data_dict[embeddings_key] + + return data_dict + + +class TextTransformForImageWithoutEmbeddings(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + self.caption_prefix = args.get("caption_prefix", None) if args else None + + def _apply_caption_prefix(self, data_dict: dict) -> None: + """Prepend caption_prefix to ai_caption if configured.""" + if not self.caption_prefix or not isinstance(data_dict.get("ai_caption"), str): + return + original = data_dict["ai_caption"] + data_dict["ai_caption"] = self.caption_prefix + " " + original.lstrip() + log.debug( + f"[caption_prefix] before: {original[:120]!r}... | after: {data_dict['ai_caption'][:120]!r}...", + rank0_only=False, + ) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs text transform without any embedding loading. + This is useful for online computation. 
+ + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with ai_caption added + """ + + caption_type = self.args["caption_type"] + captions_key = f"captions_{caption_type}" + decoded_captions_ai = data_dict[captions_key] + + train_on_captions = self.args.get("train_on_captions", []) + # if [], will infer based on the caption json + # otherwise it will only use the captions in the list + + try: + # Hotfix: Some captions are labeled as "captions" and some are labeled as "caption" + # This issue needs to be fixed in the synthetic data. This is a hack and will be removed + # once the data is cleaned. + caption_key = "captions" if "captions" in decoded_captions_ai else "caption" + if len(train_on_captions) == 0: + # infer which caption types are there + if caption_type in ("generated_gpt_oss_20b", "generated_gpt_oss_120b"): + selected_caption_type = "caption_long" + if caption_key in decoded_captions_ai: # sharded with sila pipeline + data_dict["ai_caption"] = decoded_captions_ai[caption_key][selected_caption_type] + else: + data_dict["ai_caption"] = decoded_captions_ai[selected_caption_type] + data_dict["selected_caption_type"] = selected_caption_type + elif caption_type == "qwen3_30b_a3b": + selected_caption_type = random.choice(_AVAILABLE_QWEN3_30B_A3B_CAPTIONS) + data_dict["ai_caption"] = decoded_captions_ai[selected_caption_type] + data_dict["selected_caption_type"] = selected_caption_type + elif caption_type == "qwen3_235b_a22b_v0": + # Synthetic scene-text data ingested via + # pipelines/image/text_rendering/ingest_webdataset.py stores captions as a flat + # dict {"caption_short": ..., "caption_long": ...} (no "caption"/"captions" + # nesting), so we index decoded_captions_ai directly. 
+ available = [k for k in _AVAILABLE_CAPTIONS_V2 if k in decoded_captions_ai] + if not available: + raise KeyError( + f"No known caption keys for {caption_type} in {list(decoded_captions_ai.keys())}" + ) + selected_caption_type = random.choice(available) + data_dict["ai_caption"] = decoded_captions_ai[selected_caption_type] + data_dict["selected_caption_type"] = selected_caption_type + elif any( + caption_type in _AVAILABLE_QWEN_CAPTIONS for caption_type in decoded_captions_ai[caption_key].keys() + ): + # qwen2p5_7b_v4 captions + selected_caption_type = random.choice(_AVAILABLE_QWEN_CAPTIONS) + data_dict["ai_caption"] = decoded_captions_ai[caption_key][selected_caption_type] + data_dict["selected_caption_type"] = selected_caption_type + elif caption_type == "cosmos_captioner_v1p1": + selected_caption_type = "caption_cosmos_captioner_image" + if decoded_captions_ai[caption_key].get(selected_caption_type, "") == "": + # xingqianx: a temporary skip as some data is imperfect. + return None # type: ignore + data_dict["ai_caption"] = decoded_captions_ai[caption_key][selected_caption_type] + data_dict["selected_caption_type"] = selected_caption_type + elif caption_type == "cosmos_captioner_v1p1_structured_json": + # this is made by mistake, should be removed in future. + # it is used for cosmos_lab_image_v1_human_sft. Once we fix it, this should be removed. 
+ selected_caption_type = "caption_cosmos_captioner_image_structured_json" + data_dict["ai_caption"] = decoded_captions_ai[caption_key][selected_caption_type] + data_dict["selected_caption_type"] = selected_caption_type + elif any( + caption_type in _AVAILABLE_CAPTIONS_V2 for caption_type in decoded_captions_ai[caption_key].keys() + ): + # v2 captions + selected_caption_type = random.choice(_AVAILABLE_CAPTIONS_V2) + data_dict["ai_caption"] = decoded_captions_ai[caption_key][selected_caption_type] + data_dict["selected_caption_type"] = selected_caption_type + elif caption_type == "prompts": + data_dict["ai_caption"] = decoded_captions_ai["caption"]["prompt"] + data_dict["selected_caption_type"] = caption_type + else: + assert caption_type == "ai_v3p1", f"Caption type {caption_type} not supported" + if decoded_captions_ai["had_parse_issue"]: + data_dict["ai_caption"] = decoded_captions_ai["captions"]["kosmos_2"] + else: + data_dict["ai_caption"] = decoded_captions_ai["captions"]["vfc"] + else: # use the designated captions + # Validate that all specified caption types exist (except genplan types which are nested) + for cap_type in train_on_captions: + if cap_type not in _AVAILABLE_CAPTIONS_GENPLAN: + assert cap_type in decoded_captions_ai[caption_key].keys(), ( + f"Caption type {cap_type} not found in data" + ) + + selected_caption_type = random.choice(train_on_captions) + + if selected_caption_type in _AVAILABLE_CAPTIONS_GENPLAN: + # Genplan captions are nested inside caption_long as a JSON string + caption_long_data = json.loads(decoded_captions_ai[caption_key]["caption_long"]) + data_dict["ai_caption"] = caption_long_data[selected_caption_type] + else: + data_dict["ai_caption"] = decoded_captions_ai[caption_key][selected_caption_type] + data_dict["selected_caption_type"] = selected_caption_type + + except Exception as e: + log.warning( + f"TextTransform dataloader error: {data_dict['__url__']}, {data_dict['__key__']}\n error {e}", + rank0_only=False, + ) + return 
None + + del data_dict[captions_key] + + self._apply_caption_prefix(data_dict) + return data_dict + + +class TextTransformForImageJsonCaption(Augmentor): + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + self.json_field_dropout_rate = args.get("json_field_dropout_rate", 0.0) if args else 0.0 + + def __call__(self, data_dict: dict) -> dict: + r"""Performs text transform without any embedding loading. + This is useful for online computation. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with ai_caption (the parsed caption JSON) added + """ + + caption_type = self.args["caption_type"] + captions_key = f"captions_{caption_type}" + + if "cosmos_captioner_v1p1_structured_json" in data_dict[captions_key]: + # This key was introduced by mistake and should be removed in the future. + # It is used for cosmos_lab_image_v1_human_sft. Once that is fixed, this branch should be removed. + selected_caption_type = "caption_cosmos_captioner_image_structured_json" + else: + selected_caption_type = "caption_cosmos_captioner_image" + caption_json = data_dict[captions_key]["caption"].get(selected_caption_type, "") + if caption_json == "": + # xingqianx: a temporary skip as some text data is imperfect. 
+ return None # type: ignore + caption_json = json.loads(caption_json) + + # In some erroneous cases, the caption_json is a list + if isinstance(caption_json, list): + caption_json = caption_json[0] + + assert isinstance(caption_json, dict), ( + f"Caption json is not a dict: {caption_json}, url: {data_dict['__url__']}, key: {data_dict['__key__']}" + ) + + # Randomly drop json keys during training + if self.json_field_dropout_rate > 0: + for key in list(caption_json.keys()): + if random.random() < self.json_field_dropout_rate: + caption_json.pop(key) + + data_dict["ai_caption"] = caption_json + del data_dict[captions_key] + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/text_transforms_for_video.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/text_transforms_for_video.py new file mode 100644 index 00000000..b2fa283c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/text_transforms_for_video.py @@ -0,0 +1,733 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import json +import random +from typing import Optional + +from cosmos3._src.imaginaire.datasets.augmentors.v3_text_transforms import pad_and_resize +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +class TextTransformForVideo(Augmentor): + def __init__(self, input_keys: dict, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + # our caption is saved in json with format: {"": "xxx", "": [{"start_frame": x, "end_frame": x, "": xxx}, ...], "": [{"start_frame":...]} + # our t5 embedding is saved in pickle with format: [{"": array1, "": array2}, ...] + self.captions_key: str = args[ + "captions_key" + ] # s3 folder that saves the captions; this get mapped to the key in data_dict to fetch the caption field + self.embeddings_key: Optional[str] = args[ + "embeddings_key" + ] # s3 folder that saves the embeddings; this get mapped to the key in data_dict to fetch the embedding field + self.caption_windows_key: str = args[ + "caption_windows_key" + ] # key to get the caption windows from the caption field + self.caption_type: str = args["caption_type"] # key of caption type to fetch the caption from caption windows + + self._load_embeddings = self.embeddings_key is not None + + if not self._load_embeddings: + # In this case, we don't load the embeddings + log.info("No embeddings key provided, we will not load embeddings") + self.embedding_caption_type = None + self.t5_tokens_num = None + self.is_mask_all_ones = None + self.embedding_style_mapping = None + else: + self.embedding_caption_type: str = args[ + "embedding_caption_type" + ] # key to get the embedding of a particular caption type from the embedding field + self.t5_tokens_num = args["t5_tokens"]["num"] # number of tokens we cap after padding + self.is_mask_all_ones = args["is_mask_all_ones"] # if true, set mask for t5 to all ones + + 
self.embedding_style_mapping = { + "long": self.embedding_caption_type, + "short": f"{self.embedding_caption_type}_short", + "medium": f"{self.embedding_caption_type}_medium", + "user": f"{self.embedding_caption_type}_user", + } + + self.caption_probs: dict[str, float] = args[ + "caption_probs" + ] # probabilities for user/short/medium/long captions + self.caption_style_mapping = { + "long": self.caption_type, + "short": f"{self.caption_type}_short", + "medium": f"{self.caption_type}_medium", + "user": f"{self.caption_type}_user", + } + assert self.caption_probs.keys() == self.caption_style_mapping.keys(), ( + "The keys for caption_probs, caption_style_mapping, and embedding_style_mapping should match" + ) + + if self._load_embeddings: + assert self.caption_style_mapping.keys() == self.embedding_style_mapping.keys(), ( + "The keys for caption_style_mapping and embedding_style_mapping should match" + ) + + def __call__(self, data_dict: dict) -> dict: + r"""Performs text transformation. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with captions and t5 embeddings added + """ + + try: + windows = data_dict[self.captions_key][self.caption_windows_key] + n_windows = len(windows) + chunk_index = data_dict["chunk_index"] + + if chunk_index == n_windows: + # This will only happen when the number of captions does not match number of chunks due to re-transcoding the videos. + log.warning( + f"TextTransform dataloader error: Found {data_dict['n_orig_video_frames']} in video but captioning is done with videos of {windows[-1]['end_frame']} frames. 
This mismatch is due to video re-transcoding.", + rank0_only=False, + ) + chunk_index -= 1 + + selected_caption_window = windows[chunk_index] + except Exception as e: + log.warning( + f"TextTransform dataloader error -- url: {data_dict['__url__']}, key: {data_dict['__key__']}, chunk_index: {data_dict['chunk_index']}\n error {e}", + rank0_only=False, + ) + return None + + sampled_caption_style = None + try: + available_caption_styles = [] + for k in selected_caption_window.keys(): + caption_style = k.replace(self.caption_type, "").replace("_", "") + if caption_style == "": # it is long caption by default + available_caption_styles.append("long") + elif caption_style in self.caption_style_mapping: + available_caption_styles.append(caption_style) + else: + assert caption_style in ["startframe", "endframe"], f"Unsupported caption_type {caption_style}" + + probabilities_for_available_caption_styles = { + k: v for k, v in self.caption_probs.items() if k in available_caption_styles + } + sampled_caption_style = random.choices( + list(probabilities_for_available_caption_styles), + weights=probabilities_for_available_caption_styles.values(), + )[0] + data_dict["ai_caption"] = selected_caption_window[self.caption_style_mapping[sampled_caption_style]] + except Exception as e: + log.warning( + f"TextTransform dataloader error -- url: {data_dict['__url__']}, key: {data_dict['__key__']}, selected_caption_window: {selected_caption_window}\n error {e}", + rank0_only=False, + ) + return None + if data_dict["ai_caption"] == "": + log.warning( + f"TextTransform dataloader error -- empty caption! 
url: {data_dict['__url__']}, key: {data_dict['__key__']}, selected_caption_window: {selected_caption_window}", + rank0_only=False, + ) + return None + + assert data_dict["ai_caption"] is not None and sampled_caption_style is not None + data_dict["sampled_caption_style"] = sampled_caption_style + + del data_dict[self.captions_key] # delete the field as we have extracted ai_caption from it + + if self._load_embeddings: + ai_caption_embedding_data = data_dict[self.embeddings_key] + try: + if self.embedding_caption_type == "vila_caption": + t5_embedding = ai_caption_embedding_data[chunk_index] + else: + t5_embedding = ai_caption_embedding_data[chunk_index][ + self.embedding_style_mapping[sampled_caption_style] + ] + except Exception as e: + log.warning( + f"TextTransform dataloader error -- url: {data_dict['__url__']}, key: {data_dict['__key__']}, chunk_index: {data_dict['chunk_index']}, n embeddings: {len(ai_caption_embedding_data)}, n captions: {n_windows} \n error {e}", + rank0_only=False, + ) + return None + out_t5, out_t5_mask = pad_and_resize( + t5_embedding, + self.t5_tokens_num, + is_mask_all_ones=self.is_mask_all_ones, + ) + data_dict["t5_text_embeddings"] = out_t5 + data_dict["t5_text_mask"] = out_t5_mask + del data_dict[self.embeddings_key] # delete the field as we have extracted t5 embedding from it + + return data_dict + + +class TextTransformForVideoWithFullFrames(Augmentor): + """ + Pair use with VideoParsingWithFullFrames to get the full frames of the video. + The caption is assumed to be for the entire video frames, rather than TextTransformForVideo + which assumes captions are for a specific chunk of frames. + + Audio captions are handled separately by AudioCaptionAppender, which appends + audio descriptions to the video caption after this augmentor runs. 
+ """ + + def __init__(self, input_keys: dict, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + assert len(input_keys) == 3, "TextTransformForVideoWithFullFrames augmentor only supports three input keys" + self.meta_key = input_keys[0] + self.video_key = input_keys[1] + self.sequence_plan_key = input_keys[2] + self.args = args + self.keep_metas = args.get("keep_metas", False) if args else False + self.caption_prefix = args.get("caption_prefix", None) if args else None + + def _apply_caption_prefix(self, data_dict: dict) -> None: + """Prepend caption_prefix to ai_caption if configured.""" + if not self.caption_prefix or not isinstance(data_dict.get("ai_caption"), str): + return + original = data_dict["ai_caption"] + data_dict["ai_caption"] = self.caption_prefix + " " + original.lstrip() + log.debug( + f"[caption_prefix] before: {original[:120]!r}... | after: {data_dict['ai_caption'][:120]!r}...", + rank0_only=False, + ) + + @staticmethod + def _resolve_multi_chunk_caption(raw_caption: str) -> str: + """Resolve a caption that may be in multi-chunk JSON format. + + Multi-chunk captions are JSON strings encoding a dict of chunks, e.g.: + {"chunk_0_300": {"caption": "...", "start_frame": 0, "end_frame": 300}, ...} + When detected, a chunk is randomly selected and its "caption" text returned. + Plain string captions are returned unchanged. + """ + if not isinstance(raw_caption, str): + return raw_caption + try: + parsed = json.loads(raw_caption) + except (json.JSONDecodeError, TypeError): + return raw_caption + if not isinstance(parsed, dict) or len(parsed) == 0: + return raw_caption + chunk = random.choice(list(parsed.values())) + if isinstance(chunk, dict) and "caption" in chunk: + return chunk["caption"] + return raw_caption + + def __call__(self, data_dict: dict) -> dict: + r"""Performs text transformation. + + Samples a video caption from metadata based on caption_config ratios. 
+ Supports both plain-string captions and multi-chunk JSON captions + (randomly selects one chunk when multiple chunks are present). + Audio captions are handled separately by AudioCaptionAppender. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with captions and t5 embeddings added + """ + caption_config = self.args["caption_config"] + meta_dict = data_dict[self.meta_key] + + for caption_type in caption_config: + assert caption_type in meta_dict, ( + f"Caption type {caption_type} not found in meta_dict (keys = {meta_dict.keys()})" + ) + + # First check if we are doing image to world or video to world + if self.sequence_plan_key in data_dict: + sequence_plan = data_dict[self.sequence_plan_key] + conditioning_frame_indexes_vision = sequence_plan.condition_frame_indexes_vision + if len(conditioning_frame_indexes_vision) > 0: + sampled_caption = self._resolve_multi_chunk_caption(meta_dict["caption_temporal"]) + data_dict["ai_caption"] = sampled_caption + data_dict["sampled_caption_style"] = "caption_temporal" + + self._apply_caption_prefix(data_dict) + if not self.keep_metas: + del data_dict[self.meta_key] + return data_dict + + # Text-to-world: sample from short, medium, long captions + caption_keys = list(caption_config.keys()) + caption_ratios = [caption_config[k]["ratio"] for k in caption_keys] + sampled_caption_type = random.choices(caption_keys, weights=caption_ratios, k=1)[0] + data_dict["ai_caption"] = self._resolve_multi_chunk_caption(meta_dict[sampled_caption_type]) + data_dict["sampled_caption_style"] = sampled_caption_type + + self._apply_caption_prefix(data_dict) + + # Clean up - delete the caption fields that were sampled from + for caption_type in caption_config.keys(): + if caption_type in meta_dict: + del meta_dict[caption_type] + + # Delete metas unless keep_metas=True (set when AudioCaptionAppender runs downstream) + if not self.keep_metas: + del data_dict[self.meta_key] + + return data_dict + + +class 
TextTransformForVideoTransferFullFrames(Augmentor): + """Read structured captions for the full-frame transfer pipeline. + + Two-level lookup: + + 1. A caption-source key is sampled (with weights) from ``caption_config``. + This key identifies the WebDataset folder / metadata field whose value + is a dict of annotations (e.g. + ``"structured_captions_qwen3-vl-8b-lora-v1.5-merged"``). The sampled + value is looked up first in ``data_dict`` (top-level) and then in + ``meta_dict``. + 2. Inside that caption dict the field ``caption_structured`` is hardcoded + as the JSON-encoded chunked annotation, of the form + ``{"chunk_0_300": {"caption": "", + "start_frame": ..., "end_frame": ...}}``. + + The full-frame pipeline always decodes from the start of the video, so the + first chunk is always selected and its inner JSON-encoded structured payload + is parsed back into a dict before being serialized as ``ai_caption``. + """ + + CAPTION_FIELD = "caption_structured" + + def __init__(self, input_keys: dict, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + assert len(input_keys) >= 1, "TextTransformForVideoTransferFullFrames requires a metadata input key" + self.meta_key = input_keys[0] + self.args = args or {} + self.keep_metas = self.args.get("keep_metas", False) + self.caption_options = self._normalize_caption_config(self.args["caption_config"]) + # This fixes transfer datasets that mix caption chunks with different + # lengths. Each caption source needs its own stride so the sampled video + # stays within the token budget while matching the selected caption. 
+ self.min_stride_key = self.args.get("min_stride_key", "_full_frames_min_stride") + + @staticmethod + def _normalize_caption_config(caption_config: dict | list) -> list[tuple[str, float, dict]]: + if isinstance(caption_config, dict): + options: list[tuple[str, float, dict]] = [] + for caption_key, config in caption_config.items(): + if isinstance(config, dict): + ratio = config.get("ratio", 1.0) + # Keep more than the sampling ratio because source-specific + # settings, like min_stride, are part of how caption/video + # alignment is preserved. + options.append((caption_key, float(ratio), dict(config))) + else: + options.append((caption_key, float(config), {})) + return options + + options = [] + for item in caption_config: + if isinstance(item, str): + options.append((item, 1.0, {})) + elif isinstance(item, dict): + caption_key = item.get("key") or item.get("caption_key") or item.get("caption_type") or item.get("name") + if caption_key is None: + raise ValueError(f"Caption config entry is missing a caption key: {item}") + options.append((caption_key, float(item.get("ratio", 1.0)), dict(item))) + else: + caption_key, ratio = item + options.append((caption_key, float(ratio), {})) + return options + + def _lookup_caption_dict(self, data_dict: dict, meta_dict: dict | None, caption_key: str) -> dict | None: + candidate = data_dict.get(caption_key) + if candidate is None and isinstance(meta_dict, dict): + candidate = meta_dict.get(caption_key) + if isinstance(candidate, dict): + return candidate + return None + + def __call__(self, data_dict: dict) -> dict | None: + meta_dict = data_dict.get(self.meta_key) + + available_options: list[tuple[str, float, dict]] = [] + for key, ratio, option in self.caption_options: + if ratio <= 0: + continue + if self._lookup_caption_dict(data_dict, meta_dict, key) is not None: + available_options.append((key, ratio, option)) + + if not available_options: + log.warning( + f"TextTransformForVideoTransferFullFrames: none of the configured 
caption keys " + f"{[key for key, _, _ in self.caption_options]} hold a caption dict in metadata/sample keys. " + f"url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}", + rank0_only=False, + ) + return None + + sampled_caption_key = random.choices( + [key for key, _, _ in available_options], + weights=[ratio for _, ratio, _ in available_options], + k=1, + )[0] + sampled_caption_option = next(option for key, _, option in available_options if key == sampled_caption_key) + caption_dict = self._lookup_caption_dict(data_dict, meta_dict, sampled_caption_key) + if caption_dict is None or self.CAPTION_FIELD not in caption_dict: + log.warning( + f"TextTransformForVideoTransferFullFrames: caption dict for {sampled_caption_key} is missing the " + f"hardcoded {self.CAPTION_FIELD} field. url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}", + rank0_only=False, + ) + return None + + try: + chunks = json.loads(caption_dict[self.CAPTION_FIELD]) + first_chunk = next(iter(chunks.values())) + structured = json.loads(first_chunk["caption"]) + except Exception as e: + log.warning( + f"TextTransformForVideoTransferFullFrames: failed to decode {sampled_caption_key}.{self.CAPTION_FIELD}. " + f"url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}, error: {e}", + rank0_only=False, + ) + return None + + data_dict["ai_caption"] = json.dumps(structured) + data_dict["sampled_caption_style"] = sampled_caption_key + if "min_stride" in sampled_caption_option: + # Without this override, 200-frame and 400-frame caption sources + # would share one stride and could either waste context or overflow + # the intended token length. 
+ data_dict[self.min_stride_key] = int(sampled_caption_option["min_stride"]) + + if not self.keep_metas: + data_dict.pop(self.meta_key, None) + for caption_key, _, _ in self.caption_options: + if caption_key in data_dict: + del data_dict[caption_key] + return data_dict + + +class TextTransformForVideoTransferChunkedFrames(TextTransformForVideoTransferFullFrames): + """Read structured captions and sample one chunk for transfer training. + + This keeps the full-frame caption-source sampling behavior, including + per-source options such as ``min_stride``, but emits the sampled chunk's + frame range so the downstream parser can decode the matching RGB/control + frames. + """ + + def __init__(self, input_keys: dict, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + # The parser still needs metadata for fps/resolution after this transform. + self.keep_metas = self.args.get("keep_metas", True) + self.min_num_frames = int(self.args.get("min_num_frames", 5)) + + def __call__(self, data_dict: dict) -> dict | None: + meta_dict = data_dict.get(self.meta_key) + + available_options: list[tuple[str, float, dict]] = [] + for key, ratio, option in self.caption_options: + if ratio <= 0: + continue + if self._lookup_caption_dict(data_dict, meta_dict, key) is not None: + available_options.append((key, ratio, option)) + + if not available_options: + log.warning( + f"TextTransformForVideoTransferChunkedFrames: none of the configured caption keys " + f"{[key for key, _, _ in self.caption_options]} hold a caption dict in metadata/sample keys. 
" + f"url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}", + rank0_only=False, + ) + return None + + sampled_caption_key = random.choices( + [key for key, _, _ in available_options], + weights=[ratio for _, ratio, _ in available_options], + k=1, + )[0] + sampled_caption_option = next(option for key, _, option in available_options if key == sampled_caption_key) + caption_dict = self._lookup_caption_dict(data_dict, meta_dict, sampled_caption_key) + if caption_dict is None or self.CAPTION_FIELD not in caption_dict: + log.warning( + f"TextTransformForVideoTransferChunkedFrames: caption dict for {sampled_caption_key} is missing the " + f"hardcoded {self.CAPTION_FIELD} field. url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}", + rank0_only=False, + ) + return None + + try: + chunks = json.loads(caption_dict[self.CAPTION_FIELD]) + if not isinstance(chunks, dict) or len(chunks) == 0: + log.warning( + f"TextTransformForVideoTransferChunkedFrames: empty chunk dict for {sampled_caption_key}. " + f"url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}", + rank0_only=False, + ) + return None + + eligible_chunk_keys: list[str] = [] + for chunk_key, chunk in chunks.items(): + try: + start_frame = int(chunk["start_frame"]) + end_frame = int(chunk["end_frame"]) + except (KeyError, TypeError, ValueError): + continue + if end_frame - start_frame >= self.min_num_frames: + eligible_chunk_keys.append(chunk_key) + + if not eligible_chunk_keys: + log.warning( + f"TextTransformForVideoTransferChunkedFrames: no chunks with >= {self.min_num_frames} frames " + f"in {sampled_caption_key}. 
url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}", + rank0_only=False, + ) + return None + + sampled_chunk_key = random.choice(eligible_chunk_keys) + sampled_chunk = chunks[sampled_chunk_key] + chunk_start_frame = int(sampled_chunk["start_frame"]) + chunk_end_frame = int(sampled_chunk["end_frame"]) + structured = json.loads(sampled_chunk["caption"]) + except Exception as e: + log.warning( + f"TextTransformForVideoTransferChunkedFrames: failed to decode {sampled_caption_key}.{self.CAPTION_FIELD}. " + f"url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}, error: {e}", + rank0_only=False, + ) + return None + + data_dict["chunk_start_frame"] = chunk_start_frame + data_dict["chunk_end_frame"] = chunk_end_frame + data_dict["ai_caption"] = json.dumps(structured) + data_dict["sampled_caption_style"] = sampled_caption_key + data_dict["sampled_chunk_key"] = sampled_chunk_key + if "min_stride" in sampled_caption_option: + data_dict[self.min_stride_key] = int(sampled_caption_option["min_stride"]) + + if not self.keep_metas: + data_dict.pop(self.meta_key, None) + for caption_key, _, _ in self.caption_options: + if caption_key in data_dict: + del data_dict[caption_key] + return data_dict + + +class TextTransformForVideoJsonCaption(Augmentor): + """ + Parses a per-chunk JSON caption and samples one chunk from it. + The caption field may be a JSON string or an already-decoded dict keyed by chunk. + The sampled chunk's caption is decoded to a dict, optionally with random field dropout applied. + The resulting dict (not a string) is stored as ``ai_caption``. + + When ``meta_dict["caption_audio"]`` is present and non-empty, its contents + are injected into the caption dict under the ``"audio_description"`` key. + This happens after the JSON field dropout so the audio description is + preserved whenever upstream metadata provides it. 
+ """ + + def __init__(self, input_keys: dict, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + assert len(input_keys) >= 2, ( + "TextTransformForVideoJsonCaption augmentor requires at least two input keys: [meta_key, video_key]" + ) + self.meta_key = input_keys[0] + self.video_key = input_keys[1] + self.args = args or {} + self.keep_metas = self.args.get("keep_metas", False) + self.caption_key = self.args.get("caption_key", "caption") + + @staticmethod + def _json_dict_or_none(raw_caption: object) -> dict | None: + if isinstance(raw_caption, dict): + return raw_caption + if not isinstance(raw_caption, str) or len(raw_caption) == 0: + return None + try: + parsed_caption = json.loads(raw_caption) + except (json.JSONDecodeError, TypeError): + return None + return parsed_caption if isinstance(parsed_caption, dict) else None + + @staticmethod + def _frame_count_or_large_bound(meta_dict: dict) -> int: + nb_frames = meta_dict.get("nb_frames") + if isinstance(nb_frames, int) and nb_frames > 0: + return nb_frames + length = meta_dict.get("length") + framerate = meta_dict.get("framerate") + if isinstance(length, (float, int)) and isinstance(framerate, (float, int)) and length > 0 and framerate > 0: + return max(1, int(round(length * framerate))) + # VideoParsingChunkedFrames clamps to the decoder length, so a large end frame is safe. 
+ return 10**9 + + def _find_audio_caption(self, meta_dict: dict) -> str | None: + audio_caption = meta_dict.get("caption_audio") + if isinstance(audio_caption, str) and len(audio_caption) > 0: + return audio_caption + for value in meta_dict.values(): + if isinstance(value, dict): + caption_sound = value.get("caption_sound") + if isinstance(caption_sound, str) and len(caption_sound) > 0: + return caption_sound + return None + + def _parse_legacy_full_video_caption(self, meta_dict: dict) -> dict[str, dict] | None: + """Build one full-video chunk from older caption schemas that do not have ``caption``.""" + caption_json = self._json_dict_or_none(meta_dict.get("caption_structured")) + if caption_json is None: + for caption_key in ( + "caption_rewrite_dense", + "caption_dense", + "caption_descriptive", + "caption_base", + "caption_temporal", + "caption_short", + ): + caption_text = meta_dict.get(caption_key) + if isinstance(caption_text, str) and len(caption_text) > 0: + caption_json = {"description": caption_text} + break + if caption_json is None: + return None + + end_frame = self._frame_count_or_large_bound(meta_dict) + return { + f"chunk_0_{end_frame}": { + "start_frame": 0, + "end_frame": end_frame, + "caption_json": caption_json, + } + } + + def _parse_audio_caption_chunks(self, meta_dict: dict) -> dict[str, dict] | None: + """Build chunk metadata from nested audio-caption metas when visual captions are absent.""" + chunks: dict[str, dict] = {} + for key, value in meta_dict.items(): + if not isinstance(key, str) or not isinstance(value, dict): + continue + caption_sound = value.get("caption_sound") + if not isinstance(caption_sound, str) or len(caption_sound) == 0: + continue + + try: + start_frame, end_frame = [int(part) for part in key.split("_", maxsplit=1)] + except ValueError: + start_frame = value.get("start_frame") + end_frame = value.get("end_frame") + if not isinstance(start_frame, int) or not isinstance(end_frame, int): + continue + + chunks[key] = { + 
"start_frame": start_frame, + "end_frame": end_frame, + "caption_json": {"audio_description": caption_sound}, + } + + return chunks or None + + def __call__(self, data_dict: dict) -> dict | None: + r"""Performs text transformation. + + Parses the per-chunk caption JSON, randomly samples one chunk, and writes + the chunk's frame range into ``data_dict`` so a downstream + ``VideoParsingChunkedFrames`` can decode only that frame range. When a + non-empty ``caption_audio`` field is present in the metadata, it is + injected into the caption dict under the ``"audio_description"`` key. + + Args: + data_dict (dict): Input data dict + Returns: + data_dict (dict): Output dict with captions and t5 embeddings added + """ + caption_config = self.args["caption_config"] + json_field_dropout_rate = caption_config["json_field_dropout_rate"] + + try: + meta_dict = data_dict[self.meta_key] + raw_caption = meta_dict.get(self.caption_key) + if raw_caption is not None: + caption = self._json_dict_or_none(raw_caption) + if caption is None: + raise ValueError(f"{self.caption_key} is not a JSON object") + else: + # Some sound midtrain shards use older full-video visual captions + # (caption_base/caption_structured/...) instead of the chunked + # ``caption`` field. Prefer those visual captions when present; + # otherwise fall back to nested audio-only chunks. + caption = self._parse_legacy_full_video_caption(meta_dict) + if caption is None: + caption = self._parse_audio_caption_chunks(meta_dict) + if caption is None: + raise KeyError(self.caption_key) + + # Contents of caption + # caption = { + # "chunk_0_300": { + # "caption": "...", + # "start_frame": 0, + # "end_frame": 300, + # }, + # "chunk_300_435": { + # "caption": "...", + # "start_frame": 300, + # "end_frame": 435, + # }, + # } + chunk_keys = list(caption.keys()) + if len(chunk_keys) == 0: + log.warning( + f"TextTransformForVideoJsonCaption: empty caption dict. 
url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}", + rank0_only=False, + ) + return None + + sampled_key = random.choice(chunk_keys) + sampled_chunk = caption[sampled_key] + + data_dict["chunk_index"] = chunk_keys.index(sampled_key) + data_dict["chunk_start_frame"] = int(sampled_chunk["start_frame"]) + data_dict["chunk_end_frame"] = int(sampled_chunk["end_frame"]) + + if "caption_json" in sampled_chunk: + caption_json = sampled_chunk["caption_json"] + else: + caption_json = json.loads(sampled_chunk["caption"]) + except Exception as e: + log.warning( + f"TextTransformForVideoJsonCaption dataloader error -- url: {data_dict.get('__url__')}, key: {data_dict.get('__key__')}\n error {e}", + rank0_only=False, + ) + return None + + # Randomly dropout json keys during training + if json_field_dropout_rate > 0: + for key in list(caption_json.keys()): + if random.random() < json_field_dropout_rate: + caption_json.pop(key) + + # Inject audio caption from metas as a new field when available. Added after the field + # dropout above so it is preserved whenever upstream metadata provides it. + audio_caption = self._find_audio_caption(meta_dict) + if isinstance(audio_caption, str) and len(audio_caption) > 0: + caption_json["audio_description"] = audio_caption + + data_dict["ai_caption"] = caption_json + + # Delete metas unless keep_metas=True (set when downstream augmentors still need them, + # e.g. VideoParsingChunkedFrames needs framerate/width/height/nb_frames). 
+ if not self.keep_metas: + del data_dict[self.meta_key] + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/__init__.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/__init__.py new file mode 100644 index 00000000..2cbc0d5d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/__init__.py @@ -0,0 +1,20 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Transfer control input augmentors (edge, blur, depth, seg) for cosmos3 VFM; copied from transfer2 to avoid cosmos dependency.""" + +from cosmos3._src.vfm.datasets.augmentors.transfer_control_input.control_input import AddControlInputComb + +__all__ = ["AddControlInputComb"] diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/blur.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/blur.py new file mode 100644 index 00000000..feb81552 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/blur.py @@ -0,0 +1,279 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import random +from typing import Callable, Optional + +import attrs +import cv2 +import numpy as np +import torch + +from cosmos3._src.vfm.datasets.augmentors.transfer_control_input.fast_blur import BilateralGaussian + + +@attrs.define +class BilateralFilterConfig: + """Configuration for Bilateral filter""" + + use_random: bool = False + # if use_random is False, then optionally define the param values + d: int = 30 + sigma_color: int = 150 + sigma_space: int = 100 + iter: int = 1 + + # if use_random is True, then optionally define the range + d_min: int = 15 + d_max: int = 50 + sigma_color_min: int = 100 + sigma_color_max: int = 300 + sigma_space_min: int = 50 + sigma_space_max: int = 150 + iter_min: int = 1 + iter_max: int = 2 + + # Whether to use GPU kernel (inference only) + use_cuda: bool = False + + +# Blur config default values are tuned for this resolution (longest side). 
+REFERENCE_RESOLUTION = 720 + + +def _scale_for_resolution(value: float, longest_side: int) -> float: + """Scale a blur parameter from REFERENCE_RESOLUTION to the given longest frame side.""" + if longest_side <= 0: + return value + return value * (longest_side / REFERENCE_RESOLUTION) + + +def _scale_ksize(ksize: int, longest_side: int) -> int: + """Scale kernel size for resolution; result is odd and >= 1.""" + scaled = max(1, int(round(_scale_for_resolution(float(ksize), longest_side)))) + return scaled + 1 if scaled % 2 == 0 else scaled + + +@attrs.define +class GaussianBlurConfig: + """Configuration for Gaussian blur""" + + use_random: bool = False + # if use_random is False, then optionally define the param values + ksize: int = 25 + sigmaX: float = 12.5 + + # if use_random is True, then optionally define the range + ksize_min: int = 21 + ksize_max: int = 29 + sigmaX_min: float = 10.5 + sigmaX_max: float = 14.5 + + +def apply_bilateral_filter( + frames: np.ndarray, + d: int = 9, + sigma_color: float = 75, + sigma_space: float = 75, + iter: int = 1, + bilateral_cuda_module: Optional[Callable] = None, +) -> np.ndarray: + if bilateral_cuda_module is not None: + blurred_image = [] + frames_tensor = torch.from_numpy(frames).cuda() + for _image in frames_tensor.permute(1, 2, 3, 0): + blurred_image.append(bilateral_cuda_module(_image, d // 3, (sigma_color // 2) ** 2, sigma_space**2)) + blurred_image = torch.stack(blurred_image).permute(3, 0, 1, 2) + return blurred_image.cpu().numpy() + + C, T, H, W = frames.shape + blurred_frames = np.empty_like(frames) + + for t in range(T): + frame = np.ascontiguousarray(frames[:, t].transpose(1, 2, 0)) + for _ in range(iter): + frame = cv2.bilateralFilter(frame, d, sigma_color, sigma_space) + if len(frame.shape) == 2: + frame = frame[..., None] + + blurred_frames[:, t] = frame.transpose(2, 0, 1) + + return blurred_frames + + +def _longest_frame_side(frames: np.ndarray) -> int: + """Return the longest spatial dimension (H or W) for 
CTHW frames.""" + # frames: (C, T, H, W) + return int(max(frames.shape[2], frames.shape[3])) + + +class BilateralFilter: + def __init__(self, config: BilateralFilterConfig) -> None: + self.use_random = config.use_random + self.config = config + assert not (self.use_random and self.config.use_cuda), "Cannot use GPU kernel for training." + self.bilateral_cuda_module = BilateralGaussian() if self.config.use_cuda else None + + def __call__(self, frames: np.ndarray) -> np.ndarray: + config = self.config + longest = _longest_frame_side(frames) + if self.use_random: + d = np.random.randint(config.d_min, config.d_max) + sigma_color = np.random.randint(config.sigma_color_min, config.sigma_color_max) + sigma_space = np.random.randint(config.sigma_space_min, config.sigma_space_max) + iter = np.random.randint(config.iter_min, config.iter_max) + else: + d = config.d + sigma_color = config.sigma_color + sigma_space = config.sigma_space + iter = config.iter + # Scale from reference resolution (720) to current frame size + d = max(1, int(round(_scale_for_resolution(float(d), longest)))) + d = d + 1 if d % 2 == 0 else d # cv2.bilateralFilter requires odd d + sigma_color = max(1.0, _scale_for_resolution(float(sigma_color), longest)) + sigma_space = max(1.0, _scale_for_resolution(float(sigma_space), longest)) + return apply_bilateral_filter(frames, d, sigma_color, sigma_space, iter, self.bilateral_cuda_module) + + +def apply_gaussian_blur(frames: np.ndarray, ksize: int = 5, sigmaX: float = 1.0) -> np.ndarray: + if ksize % 2 == 0: + ksize += 1 # ksize must be odd + + _, T, _, _ = frames.shape + blurred_frames = np.empty_like(frames) + + for t in range(T): + frame = np.ascontiguousarray(frames[:, t].transpose(1, 2, 0)) + frame = cv2.GaussianBlur(frame, (ksize, ksize), sigmaX=sigmaX) + if len(frame.shape) == 2: + frame = frame[..., None] + blurred_frames[:, t] = frame.transpose(2, 0, 1) + + return blurred_frames + + +class GaussianBlur: + def __init__(self, config: GaussianBlurConfig) 
-> None: + self.use_random = config.use_random + self.config = config + + def __call__(self, frames: np.ndarray) -> np.ndarray: + longest = _longest_frame_side(frames) + if self.use_random: + ksize = np.random.randint(self.config.ksize_min, self.config.ksize_max + 1) + sigmaX = np.random.uniform(self.config.sigmaX_min, self.config.sigmaX_max) + else: + ksize = self.config.ksize + sigmaX = self.config.sigmaX + ksize = _scale_ksize(int(ksize), longest) + sigmaX = max(0.1, _scale_for_resolution(float(sigmaX), longest)) + return apply_gaussian_blur(frames, ksize, sigmaX) + + +@attrs.define +class BlurCombinationConfig: + """Configuration for a combination of blurs with associated probability""" + + # list of choices are: ["gaussian", "bilateral"] + # the corresponding config must be defined for each item in this blur_types list + blur_types: list[str] + probability: float + gaussian_blur: GaussianBlurConfig | None = None + bilateral_filter: BilateralFilterConfig | None = None + + +@attrs.define +class BlurConfig: + """Configuration for blur augmentation with multiple combinations""" + + # probabilities from the list of combinations should add up to 1.0 + blur_combinations: list[BlurCombinationConfig] = [] + + +# For training +random_blur_config = BlurConfig( + blur_combinations=[ + BlurCombinationConfig( + blur_types=["bilateral"], + probability=0.3, + bilateral_filter=BilateralFilterConfig(use_random=True), + ), + BlurCombinationConfig( + blur_types=["gaussian"], + probability=0.5, + gaussian_blur=GaussianBlurConfig(use_random=True), + ), + BlurCombinationConfig( + blur_types=["bilateral", "gaussian"], + probability=0.2, + bilateral_filter=BilateralFilterConfig(use_random=True), + gaussian_blur=GaussianBlurConfig(use_random=True), + ), + ], +) + +# For inference +bilateral_blur_config = BlurConfig( + blur_combinations=[ + BlurCombinationConfig( + blur_types=["bilateral"], + probability=1.0, + bilateral_filter=BilateralFilterConfig(use_random=False), + ), + ], +) + + 
+class Blur: + def __init__(self, config: BlurConfig | None = None, use_random: bool = True) -> None: + if config is None: + config = random_blur_config if use_random else bilateral_blur_config + probabilities = [combo.probability for combo in config.blur_combinations] + total_prob = sum(probabilities) + assert abs(total_prob - 1.0) < 1e-6, f"Probabilities must sum to 1.0, got {total_prob}" + + self.blur_combinations = config.blur_combinations + self.probabilities = probabilities + self._set_blur_instances() + + def _set_blur_instances(self) -> None: + if not self.blur_combinations: + return + self.blur_combinations_instances = [] + + for blur_combination in self.blur_combinations: + blur_mapping = { + "gaussian": (GaussianBlur, blur_combination.gaussian_blur), + "bilateral": (BilateralFilter, blur_combination.bilateral_filter), + } + + cur_instances = [] + for blur_type in blur_combination.blur_types: + assert blur_type in blur_mapping, f"Unknown blur type {blur_type}. Correct blur_types or extend blur_mapping." + + blur_class, blur_config = blur_mapping[blur_type] + cur_instances.append(blur_class(blur_config)) + + self.blur_combinations_instances.append(cur_instances) + + assert len(self.blur_combinations_instances) == len(self.blur_combinations), ( + "Number of blur_combinations_instances needs to match number of blur_combinations." 
+ ) + + def __call__(self, frames: np.ndarray) -> np.ndarray: + blur_instances = random.choices(self.blur_combinations_instances, weights=self.probabilities, k=1)[0] + for ins in blur_instances: + frames = ins(frames) + return frames diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/control_input.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/control_input.py new file mode 100644 index 00000000..08d1a3dd --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/control_input.py @@ -0,0 +1,672 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import random +from typing import Optional, Union + +import cv2 +import numpy as np +import torch +import torchvision.transforms.functional as transforms_F +from pycocotools import mask as mask_utils + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.augmentors.transfer_control_input.blur import Blur, BlurConfig +from cosmos3._src.vfm.datasets.augmentors.transfer_control_input.seg import ( + decode_partial_rle_width1, + segmentation_color_mask, +) + +# Constants for segmentation color processing +# These parameters control the color-based mask extraction process in AddControlInputSeg + +# Color quantization bin size for grouping similar colors together +# Range: 1-100 (smaller values = more granular color detection, larger values = more color grouping) +# Typical range: 10-50, where 25 provides good balance between precision and grouping +_BIN_SIZE = 25 + +# Maximum number of unique colors to examine for mask generation (to limit computation time) +# Range: 10-500 (smaller values = faster processing, larger values = more thorough color search) +# Typical range: 50-200, where 100 balances thoroughness with performance +_MAX_UNIQUE_COLORS = 100 + +# Color distance tolerance for considering pixels as the same color +# Range: 1-100 (smaller values = stricter color matching, larger values = more lenient matching) +# Typical range: 10-60, where 30 provides good tolerance for natural color variations +_COLOR_TOLERANCE = 30 + +# RGB value threshold below which a color is considered "black" and filtered out +# Range: 0-100 (smaller values = stricter black detection, larger values = more colors considered black) +# Typical range: 20-80, where 50 effectively filters out dark/black regions +_BLACK_THRESHOLD = 50 + + +def _maybe_torch_to_numpy(frames: Union[torch.Tensor, list]) -> np.ndarray: + try: + return frames.numpy() + except AttributeError: + return 
np.array(frames) + + +class AddControlInputEdge(Augmentor): + """ + Add control input to the data dictionary. control input are expanded to 3-channels + steps to add new items: modify this file, configs/conditioner.py, conditioner.py + """ + + def __init__( + self, + input_keys: list, + output_keys: Optional[list] = ["control_input_edge"], + args: Optional[dict] = None, + use_random: Optional[bool] = True, + preset_strength: Optional[str] = "medium", + edge_t_lower: Optional[int] = None, + edge_t_upper: Optional[int] = None, + **kwargs, + ) -> None: + super().__init__(input_keys, output_keys, args) + self.use_random = use_random + self.preset_strength = preset_strength + self.t_lower = edge_t_lower + self.t_upper = edge_t_upper + + def __call__(self, data_dict: dict) -> dict: + if "control_input_edge" in data_dict: + return data_dict + key_img = self.input_keys[0] + key_out = self.output_keys[0] + frames = data_dict[key_img] + # log.info(f"Adding control input edge. Input key: {key_img}, Output key: {key_out}. Use random: {self.use_random}, Preset strength: {self.preset_strength}") + # Get lower and upper threshold for canny edge detection. 
+ if self.use_random: # always on for training, always off for inference + if self.t_lower is not None and self.t_upper is not None: + # Use provided t_lower and t_upper values + t_lower = self.t_lower + t_upper = self.t_upper + else: + # Generate random values as before + t_lower = np.random.randint(20, 100) # Random lower threshold + t_diff = np.random.randint(50, 150) # Random gap between the lower and upper thresholds + t_upper = t_lower + t_diff # Upper threshold = lower threshold + gap + else: + if self.preset_strength == "none" or self.preset_strength == "very_low": + t_lower, t_upper = 20, 50 + elif self.preset_strength == "low": + t_lower, t_upper = 50, 100 + elif self.preset_strength == "medium": + t_lower, t_upper = 100, 200 + elif self.preset_strength == "high": + t_lower, t_upper = 200, 300 + elif self.preset_strength == "very_high": + t_lower, t_upper = 300, 400 + else: + raise ValueError(f"Preset {self.preset_strength} not recognized.") + + frames = _maybe_torch_to_numpy(frames).astype(np.uint8) + is_image = len(frames.shape) < 4 + + # Compute the canny edge map by the two thresholds. + if is_image: + edge_maps = cv2.Canny(frames, t_lower, t_upper)[None, None] + else: + edge_maps = [ + cv2.Canny(img, t_lower, t_upper) for img in frames.transpose((1, 2, 3, 0)) + ] # (C, T, H, W) -> (T, H, W) + edge_maps = np.stack(edge_maps)[None] + edge_maps = torch.from_numpy(edge_maps).expand(3, -1, -1, -1) # [3,T,H,W] + if is_image: + edge_maps = edge_maps[:, 0] + data_dict[key_out] = edge_maps + return data_dict + + +class AddControlInputBlur(Augmentor): + """ + Main class for adding blurred input to the data dictionary. + self.output_keys[0] indicates the types of blur added to the input. 
+ For example, control_input_gaussian_guided indicates that both Gaussian and Guided filters are applied + """ + + def __init__( + self, + input_keys: list, # [key_load, key_img] + output_keys: Optional[list] = ["control_input_blur"], + args: Optional[dict] = None, # not used + use_random: bool = True, # whether to use random parameters + blur_config: BlurConfig | None = None, + downup_preset: str | int = "medium", # preset strength for downup factor + min_downup_factor: int = 4, # minimum downup factor + max_downup_factor: int = 16, # maximum downup factor + downsize_before_blur: bool = True, # whether to downsize before applying blur and then upsize or downup after blur + blur_downsize_factor: list[int] = list(range(1, 5)), # downscale factor for blur + resize_cuda: bool = False, # whether to do resizing on GPU, the result is still moved back to CPU for compatibility. + **kwargs, + ) -> None: + super().__init__(input_keys, output_keys, args) + self.use_random = use_random + downup_preset_values = { + "none": 1, + "very_low": min_downup_factor, + "low": min_downup_factor, + "medium": (min_downup_factor + max_downup_factor) // 2, + "high": max_downup_factor, + "very_high": max_downup_factor, + } + blur_downup_preset_values = { + "none": 1, + "very_low": 1, + "low": 4, + "medium": 2, + "high": 1, + "very_high": 4, + } + self.blur = Blur(config=blur_config, use_random=use_random) + + self.preset_strength = downup_preset + self.downup_preset = downup_preset if isinstance(downup_preset, int) else downup_preset_values[downup_preset] + self.downsize_before_blur = downsize_before_blur + self.min_downup_factor = min_downup_factor + self.max_downup_factor = max_downup_factor + self.blur_downsize_factor = blur_downsize_factor + self.blur_downup_preset = blur_downup_preset_values[downup_preset] + self.resize_cuda = resize_cuda + assert not (self.use_random and self.resize_cuda), "Cannot use resize on GPU during training." 
+ + def _load_frame(self, data_dict: dict) -> tuple[np.ndarray, bool]: + key_img = self.input_keys[0] + frames = data_dict[key_img] + frames = _maybe_torch_to_numpy(frames) + is_image = False + if len(frames.shape) < 4: + frames = frames.transpose((2, 0, 1))[:, None] + is_image = True + return frames, is_image + + def __call__(self, data_dict: dict) -> dict: + if "control_input_blur" in data_dict: + # already processed + data_dict[self.output_keys[0]] = data_dict["control_input_blur"] + return data_dict + + key_out = self.output_keys[0] + + frames, is_image = self._load_frame(data_dict) + if self.preset_strength == "none": + if is_image: + frames = frames[:, 0] + data_dict[key_out] = torch.from_numpy(frames) + return data_dict + C, T, H, W = frames.shape + + # --- 1. Downscale Before Blur --- + if self.use_random: + downscale_factor = random.choice(self.blur_downsize_factor) + else: + downscale_factor = self.blur_downup_preset + + if self.downsize_before_blur: + new_W, new_H = W // downscale_factor, H // downscale_factor + downscaled = np.empty((C, T, new_H, new_W), dtype=frames.dtype) + + for t in range(T): + frame = np.ascontiguousarray(frames[:, t].transpose(1, 2, 0)) + resized = cv2.resize(frame, (new_W, new_H), interpolation=cv2.INTER_AREA) + if len(resized.shape) == 2: + resized = resized[..., None] + downscaled[:, t] = resized.transpose(2, 0, 1) + + frames = downscaled + # -------------------------- + + # --- 2. Apply Blur --- + frames = self.blur(frames) + # -------------------------- + + # --- 3. Upscale After Blur --- + if self.downsize_before_blur: + upscaled = np.empty((C, T, H, W), dtype=frames.dtype) + for t in range(T): + frame = np.ascontiguousarray(frames[:, t].transpose(1, 2, 0)) + resized = cv2.resize(frame, (W, H), interpolation=cv2.INTER_LINEAR) + if len(resized.shape) == 2: + resized = resized[..., None] + upscaled[:, t] = resized.transpose(2, 0, 1) + frames = upscaled + + # --- 4. 
Final Downup Augmentation --- + if self.use_random: + scale_factor = random.randint(self.min_downup_factor, self.max_downup_factor + 1) + else: + scale_factor = self.downup_preset + + final_W, final_H = int(W / scale_factor), int(H / scale_factor) + + final_frames = np.empty_like(frames) + for t in range(T): + frame = np.ascontiguousarray(frames[:, t].transpose(1, 2, 0)) + small = cv2.resize(frame, (final_W, final_H), interpolation=cv2.INTER_CUBIC) + large = cv2.resize(small, (W, H), interpolation=cv2.INTER_CUBIC) + if len(large.shape) == 2: + large = large[..., None] + final_frames[:, t] = large.transpose(2, 0, 1) + + if is_image: + final_frames = final_frames[:, 0] + data_dict[key_out] = torch.from_numpy(final_frames) + + return data_dict + + +class AddControlInputDepth(Augmentor): + """ + Add control input to the data dictionary. control input are expanded to 3-channels + steps to add new items: modify this file, configs/conditioner.py, conditioner.py + """ + + def __init__( + self, + input_keys: list, + output_keys: Optional[list] = ["control_input_depth"], + args: Optional[dict] = None, + **kwargs, + ) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + if "control_input_depth" in data_dict: + # already processed + return data_dict + + key_out = self.output_keys[0] + depth = data_dict["depth"] + + frames = data_dict["video"] + _, T, H, W = frames.shape + depth = transforms_F.resize( + depth, + size=(H, W), + interpolation=transforms_F.InterpolationMode.BILINEAR, + ) + data_dict[key_out] = depth + return data_dict + + +class AddControlInputSeg(Augmentor): + """ + Add control input to the data dictionary. 
control input are expanded to 3-channels + steps to add new items: modify this file, configs/conditioner.py, conditioner.py + """ + + def __init__( + self, + input_keys: list, + output_keys: Optional[list] = ["control_input_seg"], + thres_mb_python_decode: Optional[int] = 256, # required: <= 512 for 7b + use_fixed_color_list: bool = False, + num_masks_max: int = 100, + random_sample_num_masks: bool = True, + min_mask_size: float = 0.2, + args: Optional[dict] = None, + **kwargs, + ) -> None: + """ + Args: + thres_mb_python_decode: int, threshold of memory usage for python decode, in MB + use_fixed_color_list: bool, if True, use predefined colors for segmentation masks. If False, generate random colors for segmentation masks. + num_masks_max: int, maximum number of masks to sample + random_sample_num_masks: bool, if True, sample number of masks randomly. If False, sample all masks in the data. + min_mask_size: float, minimum size of the mask area, in fraction of the entire frame. + """ + super().__init__(input_keys, output_keys, args) + self.use_fixed_color_list = use_fixed_color_list + self.num_masks_max = num_masks_max + self.thres_mb_python_decode = thres_mb_python_decode + self.random_sample_num_masks = random_sample_num_masks + self.min_mask_size = min_mask_size + + def get_masks(self, data_dict: dict, num_masks: int = 1) -> tuple[torch.Tensor, bool]: + """ + Get a single mask from the data dictionary. 
+ segmentation: list of dicts + phrase: str + segmentation_mask_rle: dict + data: dict + size: [N, 1] + counts: bytes + mask_shape: [T, H, W] + """ + frames = data_dict["video"] + _, T, H, W = frames.shape + + if not isinstance(data_dict["segmentation"], dict) and num_masks == 1: + # this video is a color-coded segmentation mask, where each color corresponds to a different object + # we need to extract the binary mask for a single object from the video + seg_video = data_dict["segmentation"] + seg_video = transforms_F.resize( + seg_video, + size=(H, W), + interpolation=transforms_F.InterpolationMode.NEAREST, + ) + # Get the first frame of the segmentation video + first_frame = seg_video[:, 0, :, :] + # Get a list of unique colors from the first frame and calculate mask size for each unique color + unique_colors = (first_frame // _BIN_SIZE).view(3, -1).permute(1, 0).unique( + dim=0 + ) * _BIN_SIZE # [N_colors,3] + # Randomly shuffle unique colors and take first N colors + perm = torch.randperm(len(unique_colors)) + unique_colors = unique_colors[perm] + unique_colors = unique_colors[:_MAX_UNIQUE_COLORS] # check up to max colors to save time + mask_sizes = [] + for color in unique_colors: + color_diff = first_frame.to(torch.float32) - color[:, None, None] + color_dists = torch.sqrt(torch.sum(color_diff**2, dim=0)) + mask = color_dists < _COLOR_TOLERANCE + mask_size = mask.sum() / (H * W) # Size as fraction of frame + mask_sizes.append(mask_size) + + # Only keep colors that produce masks >= min_mask_size of frame and not black + valid_color_indices = [ + i + for i, size in enumerate(mask_sizes) + if size >= self.min_mask_size and (unique_colors[i] > _BLACK_THRESHOLD).sum() > 0 + ] + if len(valid_color_indices) == 0: + # If no masks are large enough, return all ones + log.critical("No masks are large enough, returning all ones") + all_masks = np.ones((num_masks, T, H, W)).astype(bool) + return torch.from_numpy(all_masks), False # [num_masks,T,H,W] + else: + # Randomly 
select one of the valid large masks + valid_color_idx = valid_color_indices[np.random.randint(len(valid_color_indices))] + target_color = unique_colors[valid_color_idx] + # Create binary mask where True means within tolerance of target color + color_diff = seg_video.to(torch.float32) - target_color[:, None, None, None] + color_dists = torch.sqrt(torch.sum(color_diff**2, dim=0, keepdim=True)) # [1,T,H,W] + mask = (color_dists < _COLOR_TOLERANCE).to(torch.bool) # [1,T,H,W] + return mask, True + frame_indices = data_dict["frame_indices"] + frame_start, frame_end = frame_indices[0], frame_indices[-1] + 1 + is_continuous_frame_indices = (frame_end - frame_start) == T + assert len(frame_indices) == T, ( + f"frame_indices length {len(frame_indices)} != T {T}, likely due to video decoder using different fps, i.e. sample with stride. Need to return frame indices from video decoder." + ) + + all_masks = np.ones((num_masks, T, H, W)).astype(bool) # [num_masks,T,H,W] + + # sample number of masks + mask_ids = np.arange(len(data_dict["segmentation"])).tolist() + if len(data_dict["segmentation"]) == 0 or num_masks == 0: + return torch.from_numpy(all_masks), False # [num_masks,T,H,W] + if num_masks == 1: # Try up to 16 masks to find a large enough mask + mask_ids_select = np.random.choice(mask_ids, min(len(mask_ids), 16), replace=False) + else: + mask_ids_select = np.random.choice(mask_ids, num_masks, replace=False) + + for idx, mid in enumerate(mask_ids_select): + mask = data_dict["segmentation"][mid] + if type(mask) != dict: # data has sharding issue, skip this mask + return torch.from_numpy(all_masks), False # [num_masks,T,H,W] + shape = mask["segmentation_mask_rle"]["mask_shape"] + num_byte_per_mb = 1024 * 1024 + # total number of elements in uint8 (1 byte) / num_byte_per_mb + if mask["segmentation_mask_rle"]["data"]["size"][0] / num_byte_per_mb > self.thres_mb_python_decode: + # Switch to python decode if the mask is too large to avoid out of shared memory + if 
is_continuous_frame_indices and ( + T * shape[1] * shape[2] / num_byte_per_mb <= self.thres_mb_python_decode + ): + log.critical( + f"Using python decode for mask of shape {shape}, Continuous frame indices, frame_start: {frame_start}, frame_end: {frame_end}" + ) + rle = decode_partial_rle_width1( + mask["segmentation_mask_rle"]["data"], + frame_start * shape[1] * shape[2], + frame_end * shape[1] * shape[2], + ) + partial_shape = (frame_end - frame_start, shape[1], shape[2]) + rle = rle.reshape(partial_shape) * 255 + rle = np.stack( + [cv2.resize(_image_np, (W, H), interpolation=cv2.INTER_NEAREST) for _image_np in rle] + ) + else: # need to call decode_partial_rle_width1 multiple times + # It takes too much time to decode the mask, so we skip it and select another modality instead + log.critical(f"Skipping python decode for mask of shape {shape}") + return torch.from_numpy(all_masks), False # [num_masks,T,H,W] + else: + rle = mask_utils.decode(mask["segmentation_mask_rle"]["data"]) + rle = rle.reshape(shape) * 255 + # Select the frames that are in the video + if len(rle) < frame_end: # Pad the mask if it is shorter than original video + rle = np.vstack([rle, [rle[-1]] * (frame_end - len(rle))]) + rle = np.stack([cv2.resize(rle[i], (W, H), interpolation=cv2.INTER_NEAREST) for i in frame_indices]) + if num_masks == 1: # if we only need one mask and the current mask is large enough, return it + if (rle > 0).sum() / rle.size >= self.min_mask_size: + # log.critical(f"Found a large enough mask with size {(rle > 0).sum() / rle.size}") + all_masks[0] = rle.astype(bool) + break + elif idx == len(mask_ids_select) - 1: + log.critical("No large enough mask found, returning all ones") + else: # if we need multiple masks, return all masks + all_masks[idx] = rle.astype(bool) + del rle + return torch.from_numpy(all_masks), True # [num_masks,T,H,W] + + def __call__(self, data_dict: dict) -> dict: + if "control_input_seg" in data_dict: + # already processed + return data_dict + + 
key_out = self.output_keys[0] + if not isinstance(data_dict["segmentation"], dict): + # already have a color-coded segmentation mask video, directly use it + seg = data_dict["segmentation"] + seg = transforms_F.resize( + seg, + size=data_dict["video"].shape[-2:], + interpolation=transforms_F.InterpolationMode.NEAREST, + ) + data_dict[key_out] = seg + return data_dict + + # sample number of masks + if self.random_sample_num_masks: + num_masks = np.random.randint(0, min(self.num_masks_max + 1, len(data_dict["segmentation"]) + 1)) + else: + num_masks = len(data_dict["segmentation"]) + + all_masks, success = self.get_masks(data_dict, num_masks) + + if not success: + data_dict["preprocess_failed"] = True + del all_masks # free memory + return data_dict + + key_out = self.output_keys[0] + # control_input_seg is the colored segmentation mask, value in [0,255], shape (3, T, H, W) + data_dict[key_out] = torch.from_numpy( + segmentation_color_mask(all_masks, self.use_fixed_color_list) + ) # [3,T,H,W] + if num_masks > 0: + data_dict[key_out + "_mask"] = all_masks[random.randint(0, num_masks - 1)].clone()[None] # [1,T,H,W] + del all_masks # free memory + return data_dict + + +class AddControlInputIdentity(Augmentor): + def __init__( + self, + input_keys: list, + output_keys: Optional[list] = ["control_input_identity"], + args: Optional[dict] = None, + **kwargs, + ) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict: + key_img = self.input_keys[0] + key_out = self.output_keys[0] + frames = _maybe_torch_to_numpy(data_dict[key_img]) # CTHW for video, HWC for image + is_image = len(frames.shape) < 4 + if is_image: + frames = frames.transpose((2, 0, 1)) + data_dict[key_out] = torch.from_numpy(frames).clone() # [C,T,H,W] for video, [C,H,W] for image + return data_dict + + +class AddControlInputHdmapBbox(Augmentor): + """ + Add control input to the data dictionary. 
control input are expanded to 3-channels + steps to add new items: modify this file, configs/conditioner.py, conditioner.py + """ + + def __init__( + self, + input_keys: list, + output_keys: Optional[list] = ["control_input_hdmap_bbox"], + args: Optional[dict] = None, + use_random: Optional[bool] = True, + preset_strength: Optional[str] = "medium", + **kwargs, + ) -> None: + super().__init__(input_keys, output_keys, args) + self.use_random = use_random + self.preset_strength = preset_strength + + def __call__(self, data_dict: dict) -> dict: + if "control_input_hdmap_bbox" in data_dict: + return data_dict + key_input = self.input_keys[0] + key_out = self.output_keys[0] + data_dict[key_out] = data_dict[key_input] + return data_dict + + +CTRL_HINT_KEYS = { + "control_input_edge": AddControlInputEdge, + "control_input_blur": AddControlInputBlur, + "control_input_depth": AddControlInputDepth, + "control_input_seg": AddControlInputSeg, + "control_input_inpaint": AddControlInputIdentity, + "control_input_hdmap_bbox": AddControlInputHdmapBbox, +} + + +class AddControlInputComb(Augmentor): + """ + Add control input to the data dictionary. 
control input are expanded to 3-channels + steps to add new items: modify this file, configs/conditioner.py, conditioner.py + """ + + def __init__( + self, + input_keys: list, + output_keys: Optional[list] = None, + args: Optional[dict] = None, + use_random: bool = True, + control_input_type: str = "edge_blur_depth_seg", + use_control_mask_prob: float = 0.0, + num_control_inputs_prob: list[float] = [1.0, 0.0, 0.0, 0.0], + num_control_inputs: int | None = None, + **kwargs, + ) -> None: + super().__init__(input_keys, output_keys, args) + self.use_random = use_random + self.control_hint_keys = ["control_input_" + key for key in control_input_type.split("_")] + self.use_control_mask_prob = use_control_mask_prob + self.num_control_inputs_prob = num_control_inputs_prob[: len(self.control_hint_keys)] + self.num_control_inputs = num_control_inputs + self.comb = {} + for output_key, class_name in CTRL_HINT_KEYS.items(): + aug = class_name( + input_keys=input_keys, output_keys=[output_key], args=args, use_random=use_random, **kwargs + ) + self.comb[output_key] = aug + + def __call__(self, data_dict: dict) -> dict: + success = False + if self.use_random: + # Randomly select a number of control inputs (or use fixed num_control_inputs when set) + num_keys_prob = self.num_control_inputs_prob + ctrl_hint_keys = self.control_hint_keys + if self.num_control_inputs is not None: + num_keys = max(1, min(self.num_control_inputs, len(ctrl_hint_keys))) + else: + num_keys = random.choices(range(len(ctrl_hint_keys)), weights=num_keys_prob, k=1)[0] + 1 + output_keys = np.random.choice(ctrl_hint_keys, size=num_keys, replace=False) + # output_keys = np.random.choice(["control_input_edge", "control_input_blur", "control_input_depth"], size=num_keys, replace=False) + zero_input = torch.zeros_like(data_dict[self.input_keys[0]]) # [C,T,H,W] + zero_mask = torch.zeros(*data_dict[self.input_keys[0]][:1].shape, dtype=torch.bool) # [1,T,H,W] + ones_mask = 
torch.ones(*data_dict[self.input_keys[0]][:1].shape, dtype=torch.bool) # [1,T,H,W] + use_control_mask = random.random() < self.use_control_mask_prob + for cur_key in ctrl_hint_keys: + cur_mask_key = cur_key + "_mask" + if cur_key in output_keys: + data_dict["preprocess_failed"] = False + data_dict = self.comb[cur_key](data_dict) + # log.critical(f"self.use_control_mask_prob: {self.use_control_mask_prob}") + if use_control_mask or cur_key == "control_input_inpaint": + # Get mask for the control input + if cur_mask_key not in data_dict: + data_dict[cur_mask_key], success = self.comb["control_input_seg"].get_masks( + data_dict, num_masks=1 + ) + else: + data_dict[cur_mask_key] = ones_mask + + # If preprocess failed or cannot get inpaint mask, use control_input_edge instead + if data_dict["preprocess_failed"] or (cur_key == "control_input_inpaint" and not success): + data_dict[cur_key] = zero_input + data_dict[cur_mask_key] = zero_mask + if num_keys == 1 and "control_input_edge" in ctrl_hint_keys: + new_key = "control_input_edge" + log.critical(f"Preprocess failed for {cur_key}, using {new_key} instead") + if new_key in data_dict: + del data_dict[new_key] + data_dict = self.comb[new_key](data_dict) + data_dict[new_key + "_mask"] = ones_mask + else: + data_dict[cur_key] = zero_input + data_dict[cur_mask_key] = zero_mask + + # Free memory: remove unused depth/segmentation to avoid OOM later + if "control_input_depth" not in output_keys and "depth" in data_dict: + del data_dict["depth"] + if "control_input_seg" not in output_keys and "segmentation" in data_dict: + del data_dict["segmentation"] + if "segmentation" in data_dict and isinstance(data_dict["segmentation"], dict): + del data_dict["segmentation"] + + if "control_input_inpaint" in output_keys and success: # Post-process the inpaint mask + inpaint_mask_key = "control_input_inpaint_mask" + if random.random() < 0.5: # randomly negate the mask + data_dict[inpaint_mask_key] = ~data_dict[inpaint_mask_key] + # Make sure 
the inpaint mask does not overlap with other masks + for cur_key in ctrl_hint_keys: + cur_mask_key = cur_key + "_mask" + if cur_mask_key == inpaint_mask_key: + continue + if torch.all(data_dict[cur_mask_key]) or torch.all(~data_dict[cur_mask_key]): # dummy mask + continue + # Remove overlap by clearing the overlapping regions from the inpaint mask + overlap = data_dict[cur_mask_key] & data_dict[inpaint_mask_key] + if overlap.any(): + data_dict[inpaint_mask_key] = data_dict[inpaint_mask_key] & ~overlap + + else: + for k, v in self.comb.items(): + data_dict = v(data_dict) + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/fast_blur.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/fast_blur.py new file mode 100644 index 00000000..453ae7ac --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/fast_blur.py @@ -0,0 +1,104 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import ctypes +from ctypes import POINTER, c_float, c_int, c_ubyte, sizeof + +import torch + + +class BilateralGaussian: + class NppiSize(ctypes.Structure): + _fields_ = [("width", c_int), ("height", c_int)] + + class NppiPoint(ctypes.Structure): + _fields_ = [("x", c_int), ("y", c_int)] + + def __init__(self): + self.npp_i_lib = self._load_npp_library() + self._setup_buffer_size_function() + self._setup_bilateral_function() + + def _load_npp_library(self): + return ctypes.CDLL("libnppif.so") + + def _setup_buffer_size_function(self): + self.get_buffer_size_func = self.npp_i_lib.nppiFilterCannyBorderGetBufferSize + self.get_buffer_size_func.restype = c_int + self.get_buffer_size_func.argtypes = [BilateralGaussian.NppiSize, POINTER(c_int)] # oSizeROI # bufferSize + + def _setup_bilateral_function(self): + self.bilateral_function = self.npp_i_lib.nppiFilterBilateralGaussBorder_8u_C3R + self.bilateral_function.restype = c_int + self.bilateral_function.argtypes = [ + POINTER(c_ubyte), # pSrc + c_int, # nSrcStep + BilateralGaussian.NppiSize, # oSrcSize + BilateralGaussian.NppiPoint, # oSrcOffset + POINTER(c_ubyte), # pDst + c_int, # nDstStep + BilateralGaussian.NppiSize, # oSizeROI + c_int, # nRadius + c_int, # nStepBetweenSrcPixels + c_float, # nValSquareSigma + c_float, # nPosSquareSigma + c_int, # eBorderType + ] + + def _prepare_input(self, image_tensor): + if not image_tensor.is_cuda: + image_tensor = image_tensor.cuda() + if image_tensor.dtype != torch.uint8: + image_tensor = (image_tensor * 255).byte() + return image_tensor + + def _get_buffer_size(self, roi): + buffer_size = c_int(0) + status = self.get_buffer_size_func(roi, ctypes.byref(buffer_size)) + if status != 0: + raise RuntimeError(f"Failed to get buffer size, status: {status}") + return buffer_size.value + + def __call__(self, image_tensor, radius=30, color_sigma_square=150 * 150, sigma_space_square=100 * 100): + # Prepare input + image_tensor = self._prepare_input(image_tensor) + + height, width, 
channels = image_tensor.shape + output = torch.empty_like(image_tensor) + + src_ptr = ctypes.cast(image_tensor.data_ptr(), POINTER(c_ubyte)) + dst_ptr = ctypes.cast(output.data_ptr(), POINTER(c_ubyte)) + + roi = BilateralGaussian.NppiSize(width, height) + + status = self.bilateral_function( + src_ptr, + width * channels * sizeof(c_ubyte), + BilateralGaussian.NppiSize(width, height), + BilateralGaussian.NppiPoint(0, 0), + dst_ptr, + width * channels * sizeof(c_ubyte), + roi, + c_int(radius), + 1, # step size + c_float(color_sigma_square), + c_float(sigma_space_square), + 2, # border replicate + ) + + if status != 0: + raise RuntimeError(f"NPP bilateral Gaussian filter failed with status {status}") + + return output diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/seg.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/seg.py new file mode 100644 index 00000000..bc2967e5 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_input/seg.py @@ -0,0 +1,183 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import random + +import matplotlib.colors as mcolors +import numpy as np + +# Array of 23 highly distinguishable colors in RGB format +PREDEFINED_COLORS_SEGMENTATION = np.array( + [ + [255, 0, 0], # Red + [0, 255, 0], # Green + [0, 0, 255], # Blue + [255, 255, 0], # Yellow + [0, 255, 255], # Cyan + [255, 0, 255], # Magenta + [255, 140, 0], # Dark Orange + [255, 105, 180], # Hot Pink + [0, 0, 139], # Dark Blue + [0, 128, 128], # Teal + [75, 0, 130], # Indigo + [128, 0, 128], # Purple + [255, 69, 0], # Red-Orange + [34, 139, 34], # Forest Green + [128, 128, 0], # Olive + [70, 130, 180], # Steel Blue + [255, 215, 0], # Gold + [255, 222, 173], # Navajo White + [144, 238, 144], # Light Green + [255, 99, 71], # Tomato + [221, 160, 221], # Plum + [0, 255, 127], # Spring Green + [255, 255, 255], # White + ] +) + + +def generate_distinct_colors() -> np.ndarray: + """ + Generate one randomized RGB color. + + Returns: + np.ndarray, shape (3,), dtype uint8, values in [0, 255] + """ + # Randomize hue, saturation, and value (HSV) within a range + hue = random.uniform(0, 1) # Full spectrum of hues + saturation = random.uniform(0.1, 1) # Avoid fully desaturated (gray) colors + lightness = random.uniform(0.2, 1.0) # Avoid too dark + + r, g, b = mcolors.hsv_to_rgb((hue, saturation, lightness)) + return (np.array([r, g, b]) * 255).astype(np.uint8) + + +def segmentation_color_mask(segmentation_mask: np.ndarray, use_fixed_color_list: bool = False) -> np.ndarray: + """ + Convert segmentation mask to color mask + Args: + segmentation_mask: np.ndarray, shape (num_masks, T, H, W) + Returns: + np.ndarray, shape (3, T, H, W), with each mask converted to a color mask, value [0,255] + """ + + num_masks, T, H, W = segmentation_mask.shape + segmentation_mask_sorted = [segmentation_mask[i] for i in range(num_masks)] + # Sort the segmentation mask by the number of non-zero pixels, from most to least + segmentation_mask_sorted = sorted(segmentation_mask_sorted, key=lambda x: np.count_nonzero(x), reverse=True) + + output = 
np.zeros((3, T, H, W), dtype=np.uint8) + if use_fixed_color_list: + predefined_colors_permuted = PREDEFINED_COLORS_SEGMENTATION[ + np.random.permutation(len(PREDEFINED_COLORS_SEGMENTATION)) + ] + else: + predefined_colors_permuted = [generate_distinct_colors() for _ in range(num_masks)] + # Paint masks from largest to smallest so that smaller masks overwrite larger ones where they overlap + for i in range(num_masks): + mask = segmentation_mask_sorted[i] + color = predefined_colors_permuted[i % len(predefined_colors_permuted)] + + # Create boolean mask and use it for assignment + bool_mask = mask > 0 + for c in range(3): + output[c][bool_mask] = color[c] + + return output + + +def decode_partial_rle_width1(rle_obj: dict, start_row: int, end_row: int) -> np.ndarray: + """ + Decode a partial RLE encoded mask with width = 1. In SAM2 output, the video mask (num_frame, height, width) is reshaped to (total_size, 1). + Sometimes the video mask could be large, e.g. 1001x1080x1092 shape and it takes >1GB memory if using pycocotools, resulting in segmentation faults when training with multiple GPUs and data workers. + This function is used to decode the mask for a subset of frames to reduce memory usage. + + Args: + rle_obj (dict): RLE object containing: + - 'size': A list [height, width=1] indicating the dimensions of the mask. + - 'counts': A bytes or string object containing the RLE encoded data. + start_row (int): The starting row (inclusive). It's computed from frame_start * height * width. + end_row (int): The ending row (exclusive). It's computed from frame_end * height * width. + + Returns: + numpy.ndarray: Decoded binary mask for the specified rows, with shape (end_row - start_row, 1).
+ """ + height, width = rle_obj["size"] + + # Validate row range + if width != 1: + raise ValueError("This function is optimized for width=1.") + if start_row < 0 or end_row > height or start_row >= end_row: + raise ValueError("Invalid row range specified.") + + # Decode the RLE counts + counts = rle_obj["counts"] + if isinstance(counts, str): + counts = np.frombuffer(counts.encode("ascii"), dtype=np.uint8) + elif isinstance(counts, bytes): + counts = np.frombuffer(counts, dtype=np.uint8) + else: + raise ValueError("Unsupported format for counts. Must be str or bytes.") + + # Interpret counts as a sequence of run lengths + run_lengths = [] + current_val = 0 + i = 0 + while i < len(counts): + x = int(0) + k = 0 + more = True + while more: + c = int(counts[i]) - 48 + x |= (c & 0x1F) << (5 * k) + more = (c & 0x20) != 0 + i += 1 + k += 1 + if not more and (c & 0x10): + x |= -1 << (5 * k) + if len(run_lengths) > 2: + x += run_lengths[-2] + + run_lengths.append(x) + current_val += x + if current_val > end_row: + break + # Initialize the partial mask + idx_start = start_row + idx_end = end_row + partial_mask = np.zeros(idx_end - idx_start, dtype=np.uint8) + partial_height = end_row - start_row + idx = 0 # Current global index + for i, run in enumerate(run_lengths): + run_start = idx + run_end = idx + run + if run_end <= idx_start: + # Skip runs entirely before the region + idx = run_end + continue + if run_start >= idx_end: + # Stop decoding once we pass the region + break + + # Calculate overlap with the target region + start = max(run_start, idx_start) + end = min(run_end, idx_end) + if start < end: + partial_start = start - idx_start + partial_end = end - idx_start + partial_mask[partial_start:partial_end] = i % 2 + + idx = run_end + return partial_mask.reshape((partial_height, 1), order="F") diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_transform.py 
b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_transform.py new file mode 100644 index 00000000..b44428e8 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/transfer_control_transform.py @@ -0,0 +1,329 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Augmentors for transfer (control-conditioned) image and video generation in the cosmos3 VFM pipeline. + +Transfer training conditions the model on control signals (edge, blur, depth, or segmentation) +to generate images or videos, aligned with cosmos/transfer2. This module provides: + +- **TransferToTrainingFormat**: Converts (control_input, target) into the joint dataloader format + with SequencePlan (condition frame + generated frame), for both image and video outputs. + +- **VideoTransferSampleFrame**: For video→image transfer: samples a single frame index consistently + across control and video tensors, producing image-sized tensors from 4D video inputs. +- **AddControlFromVideoComb**: Uses AddControlInputComb (in transfer_control_input) to compute one of edge/blur/depth/seg + from video or precomputed fields and writes the chosen control to data_dict["control_input"]. 
+- **SampleResolution**: Samples a resolution from a list and sets data_dict["_res_size_map"] so downstream + resize/padding use that resolution (used to combine multiple resolutions in one dataloader). +""" + +from __future__ import annotations + +import random +from typing import cast + +import torch +import torchvision.transforms.functional as transforms_F + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.augmentors.transfer_control_input import AddControlInputComb +from cosmos3._src.vfm.datasets.sequence_packing import SequencePlan +from cosmos3._src.vfm.datasets.utils import VIDEO_RES_SIZE_INFO + + +class SampleResolution(Augmentor): + """Sample one resolution from a list and set data_dict['_res_size_map'] for downstream resize/padding. + + When used before ResizeLargestSideAspectPreserving and ReflectionPadding, those augmentors will + use obtain_augmentation_size(), which reads _res_size_map when present. This allows one dataloader + to produce samples at different resolutions (e.g. 480 and 720) by sampling per sample. + + resolutions_weights: Optional sampling weights for each resolution (same length as resolutions). + Weights are used by random.choices and need not sum to 1. If None, sampling is uniform. + """ + + def __init__( + self, + input_keys: list, + output_keys: list | None = None, + args: dict | None = None, + resolutions: list[str] | None = None, + resolutions_weights: list[float] | None = None, + **kwargs, + ) -> None: + super().__init__(input_keys, output_keys, args) + self.resolutions = list(resolutions) if resolutions else [] + assert len(self.resolutions) > 0, "SampleResolution requires at least one resolution." 
+ for r in self.resolutions: + assert r in VIDEO_RES_SIZE_INFO, f"Unknown resolution {r}; known: {list(VIDEO_RES_SIZE_INFO.keys())}" + self.resolutions_weights = resolutions_weights + if self.resolutions_weights is not None: + assert len(self.resolutions_weights) == len(self.resolutions), ( + "resolutions_weights must have same length as resolutions." + ) + + def __call__(self, data_dict: dict) -> dict: + if self.resolutions_weights is not None: + res = random.choices(self.resolutions, weights=self.resolutions_weights, k=1)[0] + else: + res = random.choice(self.resolutions) + data_dict["_res_size_map"] = VIDEO_RES_SIZE_INFO[res] + return data_dict + + +class TransferToTrainingFormat(Augmentor): + """Convert (control_input, target) into joint-dataloader training format with SequencePlan. + + Reads data_dict["control_input"] and data_dict["video"] (target). Normalizes control to + mean/std 0.5, then writes [control_tensor, target_tensor] into data_dict[output_media_key] + ("images" for image transfer, "video" for video transfer). Also sets num_frames, + dataset_name, fps, ai_caption, selected_caption_type, sequence_plan, and image_size. + + Supports both image (3D: C,H,W) and video (4D: C,T,H,W); for image output, 4D tensors + are sliced to the first frame. Same output structure as ImageEditingToTrainingFormat. 
+ """ + + def __init__( + self, + input_keys: list | None = None, + mean: float = 0.5, + std: float = 0.5, + output_media_key: str = "images", + conditioning_config: dict[int, float] | None = None, + share_vision_temporal_positions: bool = True, + args: dict | None = None, + ) -> None: + super().__init__(input_keys or [], None, args) + self.mean = mean + self.std = std + self.output_media_key = output_media_key + self.conditioning_config = conditioning_config + self.share_vision_temporal_positions = share_vision_temporal_positions + + if self.conditioning_config is not None: + for num_frames, prob in self.conditioning_config.items(): + if not isinstance(num_frames, int) or num_frames < 0: + raise ValueError(f"conditioning_config keys must be non-negative integers, got {num_frames}") + if not isinstance(prob, (int, float)) or prob < 0: + raise ValueError(f"conditioning_config values must be non-negative numbers, got {prob}") + total_prob = sum(self.conditioning_config.values()) + if total_prob <= 0: + raise ValueError("conditioning_config probabilities must sum to a positive number") + self.normalized_conditioning_config = {k: v / total_prob for k, v in self.conditioning_config.items()} + else: + self.normalized_conditioning_config = None + + def _normalize_tensor(self, x: torch.Tensor) -> torch.Tensor: + """Normalize channel-wise to given mean/std. 
Accepts values in [0,1] or [0,255] (auto-detected).""" + if x.dtype == torch.uint8 or x.max() > 1.0: + x = x.float() / 255.0 + return transforms_F.normalize(x, mean=[self.mean] * 3, std=[self.std] * 3) + + def __call__(self, data_dict: dict) -> dict | None: + control_norm = data_dict.get("control_input") + target_norm = data_dict.get("video") + + if control_norm is None or target_norm is None: + log.warning( + f"TransferToTrainingFormat: missing control or target (video): {data_dict.get('__key__', 'unknown')}", + rank0_only=False, + ) + return None + + try: + if control_norm.dim() == 2: + control_norm = control_norm.unsqueeze(0).expand(3, -1, -1) # [3,H,W] + # is_video = control.dim() == 4 and isinstance(target_norm, torch.Tensor) and target_norm.dim() == 4 + if self.output_media_key == "video": + # Video: (C, T, H, W) each; normalize per frame + # control_norm = self._normalize_tensor(control.float()) + num_frames = control_norm.shape[1] + data_dict["video"] = [control_norm, target_norm] + data_dict["num_frames"] = num_frames + data_dict["dataset_name"] = "video_transfer" + data_dict["fps"] = data_dict.get("fps", 24.0) + else: + # Image: (C, H, W) + if target_norm.dim() == 4: + target_norm = target_norm[:, 0] + if control_norm.dim() == 4: + control_norm = control_norm[:, 0] + # control_norm = self._normalize_tensor(control.float()) + data_dict["images"] = [control_norm, target_norm] + data_dict["num_frames"] = 2 + data_dict["dataset_name"] = "image_transfer" + data_dict["fps"] = 30.0 + data_dict.setdefault("ai_caption", "") + data_dict.setdefault("selected_caption_type", "transfer_caption") + + num_condition_frames = 0 + if self.normalized_conditioning_config is not None: + frames_options = list(self.normalized_conditioning_config.keys()) + weights = list(self.normalized_conditioning_config.values()) + num_condition_frames = random.choices(frames_options, weights=weights, k=1)[0] + if self.output_media_key == "video" and target_norm.dim() == 4: + max_cond = 
target_norm.shape[1] - 1 + num_condition_frames = min(num_condition_frames, max_cond) + + if num_condition_frames > 0 and target_norm.shape[1] > 1: + condition_frames_indexes = list(range(num_condition_frames)) + else: + condition_frames_indexes = [] + + data_dict["sequence_plan"] = SequencePlan( + has_text=True, + has_vision=True, + condition_frame_indexes_vision=condition_frames_indexes, + # ControlNet-style transfer: control item and target item are + # spatio-temporally aligned (same source video, frame-synced). + # Forces shared temporal mRoPE grid across both items so the + # model sees control_t=k and target_t=k as the same time index. + share_vision_temporal_positions=self.share_vision_temporal_positions, + ) + except Exception as e: + log.warning( + f"TransferToTrainingFormat error: {data_dict.get('__key__', 'unknown')}, {e}", + rank0_only=False, + ) + return None + + # duplicate image_size for each vision input/output + data_dict["image_size"] = [data_dict["image_size"]] * len(data_dict[self.output_media_key]) + + return data_dict + + +class VideoTransferSampleFrame(Augmentor): + """Sample a single frame index from video tensors for video→image transfer. + + For each key in input_keys (default ["control_input", "video"]), resolves the + tensor (e.g. unwraps data_dict["video"]["video"]). Picks one temporal index t + (random if random_frame=True, else 0) and for every 4D tensor (C, T, H, W) + replaces its entry in data_dict with the slice at t, yielding (C, 1, H, W). 3D tensors + are left unchanged. All keys must be present; returns None if any is missing.
+ """ + + def __init__( + self, + input_keys: list | None = None, + args: dict | None = None, + random_frame: bool = True, + ) -> None: + self.input_keys = input_keys or ["control_input", "video"] + super().__init__(self.input_keys, None, args) + self.random_frame = random_frame + + def _get_tensor(self, data_dict: dict, key: str) -> torch.Tensor | None: + """Return the tensor for key; if key is 'video' and value is a dict, return value['video'].""" + val = data_dict.get(key) + if val is None: + return None + if isinstance(val, dict) and key == "video": + return val.get("video") + return val + + def __call__(self, data_dict: dict) -> dict | None: + # Resolve tensors; find T from first 4D tensor. Require all keys present. + tensors: list[tuple[str, torch.Tensor]] = [] + T: int | None = None + for key in self.input_keys: + raw = self._get_tensor(data_dict, key) + if raw is None or not isinstance(raw, torch.Tensor): + return None + tensor = cast(torch.Tensor, raw) + if tensor.dim() == 4: + if T is None: + T = tensor.shape[1] + if T == 0: + return None + tensors.append((key, tensor)) + else: + tensors.append((key, tensor)) + + if T is None: + # No 4D tensor; nothing to sample + return data_dict + + t_idx = random.randint(0, T - 1) if self.random_frame else 0 + + for key, tensor in tensors: + if tensor.dim() == 4: + sampled = tensor[:, t_idx : t_idx + 1] + else: + sampled = tensor + data_dict[key] = sampled + + return data_dict + + +class AddControlFromVideoComb(Augmentor): + """Compute one control signal from video via AddControlInputComb and set control_input. + + Delegates to AddControlInputComb (edge/blur computed from video; depth/seg from data_dict + when present). After the comb runs, selects the first non-zero control among + control_input_edge, control_input_blur, control_input_depth, control_input_seg, + writes it to data_dict["control_input"], and removes the temporary control keys. + + Args: + control_input_type: e.g. 
"edge_blur", "edge_blur_depth_seg" (which controls to consider). + num_control_inputs_prob: Probability distribution over number of combined controls; + this wrapper uses only the single chosen control. + """ + + CONTROL_KEYS = ("control_input_edge", "control_input_blur", "control_input_depth", "control_input_seg") + + def __init__( + self, + input_keys: list, + output_keys: list | None = None, + args: dict | None = None, + control_input_type: str = "edge_blur_depth_seg", + use_random: bool = True, + num_control_inputs_prob: tuple[float, ...] = (1.0, 0.0, 0.0, 0.0), + num_control_inputs: int | None = None, + **kwargs, + ) -> None: + super().__init__(input_keys, output_keys or ["control_input"], args) + self._comb = AddControlInputComb( + input_keys=input_keys, + output_keys=None, + use_random=use_random, + control_input_type=control_input_type, + num_control_inputs_prob=list(num_control_inputs_prob), + num_control_inputs=num_control_inputs, + **kwargs, + ) + + def __call__(self, data_dict: dict) -> dict | None: + data_dict = self._comb(data_dict) + if data_dict is None: + return None + # Pick first control key that exists and has non-zero data (comb sets unchosen to zeros). + for key in self.CONTROL_KEYS: + if key in data_dict: + t = data_dict[key] + if isinstance(t, torch.Tensor) and t.numel() > 0 and t.abs().sum() > 0: + data_dict["control_input"] = t + break + else: + # No break: no valid control found (e.g. all chosen controls failed or are zero). 
+ log.warning("AddControlFromVideoComb: no non-zero control found", rank0_only=False) + return None + for key in self.CONTROL_KEYS: + data_dict.pop(key, None) + data_dict.pop(key + "_mask", None) + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/video_parsing.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/video_parsing.py new file mode 100644 index 00000000..ba78bf9b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/video_parsing.py @@ -0,0 +1,853 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import random +from typing import Optional + +import numpy as np +import omegaconf +import torch +from einops import rearrange +from torchcodec.decoders import AudioDecoder, VideoDecoder +from torchvision.transforms.v2 import Resize, UniformTemporalSubsample + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.image.misc import obtain_augmentation_size +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.utils import VIDEO_RES_SIZE_INFO + +# Map dataset_resolution_type to resolution tier key in VIDEO_RES_SIZE_INFO +_DATASET_RESOLUTION_TIER: dict[str, str] = {"gt480p": "480", "gt720p": "720", "gt1080p": "1080"} + +_MIN_FPS = 10 +_MAX_FPS = 60 + + +class VideoParsing(Augmentor): + """ + This augmentor is used to parse the video bytes and get the video frames. + the return dict is back-compatible with old datasets, which video decoding happens in the decoder stage. + + Now uses torchcodec instead of decord for video decoding, with optional audio extraction. + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + assert len(input_keys) == 2, "VideoParsing augmentor only supports two input keys" + self.meta_key = input_keys[0] + self.video_key = input_keys[1] + + self.key_for_caption = args["key_for_caption"] + assert self.key_for_caption in [ + "t2w_windows", + "i2w_windows_later_frames", + ], "key_for_caption must be either t2w_windows or i2w_windows_later_frames" + self.min_duration = args["min_duration"] + self.min_fps = args["min_fps"] + self.max_fps = args["max_fps"] + self.num_frames = args["num_video_frames"] + self.use_native_fps = args["use_native_fps"] # orginal fps if (total_frames // self.num_frames == 1). 
+ # a list of allowed num_multiplers (how many frames are skipped) + # default is 1 - 100 which allows virtually any num_multipler possible + self.allowed_num_multiplers = args.get("allowed_num_multiplers", list(range(1, 100))) + log.info(f"allowed_num_multiplers in video_parsing with use_native_fps: {self.allowed_num_multiplers}") + self.use_original_fps = args["use_original_fps"] # use original fps without sampling + + # Dynamic FPS mode: sample stride from valid range based on video properties + self.use_dynamic_fps = args.get("use_dynamic_fps", False) + # low_fps_bias: 0.0 = favor original FPS (stride=1), 0.5 = uniform, 1.0 = favor slow-mo (high stride) + self.low_fps_bias = args.get("low_fps_bias", 0.5) + assert 0.0 <= self.low_fps_bias <= 1.0, f"low_fps_bias must be in [0, 1], got {self.low_fps_bias}" + + # Validate mutually exclusive modes + mode_count = sum([self.use_dynamic_fps, self.use_native_fps, self.use_original_fps]) + assert mode_count <= 1, ( + f"Only one FPS mode can be enabled at a time. 
Got: " + f"use_dynamic_fps={self.use_dynamic_fps}, " + f"use_native_fps={self.use_native_fps}, " + f"use_original_fps={self.use_original_fps}" + ) + + if self.use_dynamic_fps: + log.info( + f"use_dynamic_fps mode enabled: stride will be sampled from valid range per video " + f"with low_fps_bias={self.low_fps_bias} (0.0=favor original FPS, 0.5=uniform, 1.0=favor slow-mo)" + ) + + if self.use_native_fps or self.use_original_fps: + assert self.num_frames > 0, ( + "num_frames must be greater than 0 when use_native_fps or use_original_fps is True" + ) + if self.use_dynamic_fps: + assert self.num_frames > 0, "num_frames must be greater than 0 when use_dynamic_fps is True" + if self.num_frames > 0: + self.sampler = UniformTemporalSubsample(self.num_frames) + self.video_decode_num_threads = args.get("video_decode_num_threads", 1) + + # Audio extraction parameters + self.extract_audio = args.get("extract_audio", False) + self.audio_sample_rate = args.get("audio_sample_rate", 44100) + self.seek_mode = args.get("seek_mode", "exact") + + def _extract_audio_chunk( + self, video_bytes: bytes, video_fps: float, frame_indices: list[int] + ) -> torch.Tensor | None: # returns [C,N_audio] or None + """ + Extract audio chunk corresponding to the given frame indices.
+ + Args: + video_bytes: Raw video bytes + video_fps: Video frames per second + frame_indices: List of frame indices being extracted + + Returns: + Audio tensor of shape (C, N) or None if audio extraction fails + """ + try: + # Create audio decoder + audio_decoder = AudioDecoder(video_bytes) + + # Calculate time range for audio corresponding to video frames + time_start = frame_indices[0] / video_fps + time_end = (frame_indices[-1] + 1) / video_fps # +1 to include the last frame's duration + + # Get audio samples for the specific time range + audio_metadata = audio_decoder.metadata + orig_sample_rate = audio_metadata.sample_rate + + audio_samples = audio_decoder.get_samples_played_in_range(start_seconds=time_start, stop_seconds=time_end) + audio_chunk = audio_samples.data # [C,N_orig] + + # Resample if needed + if orig_sample_rate != self.audio_sample_rate: + import librosa + + audio_np = audio_chunk.numpy() + resampled_audio_np = librosa.resample( + audio_np, orig_sr=orig_sample_rate, target_sr=self.audio_sample_rate, axis=-1 + ) + audio_chunk = torch.from_numpy(resampled_audio_np) # [C,N_resampled] + + # Clean up audio decoder + del audio_decoder + + return audio_chunk + + except Exception as e: + log.warning(f"Failed to extract audio: {e}", rank0_only=False) + return None + + def _sample_stride_with_bias(self, max_stride: int) -> int: + """Sample a stride from [1, max_stride] with bias controlled by low_fps_bias. + + Args: + max_stride: Maximum valid stride value. + + Returns: + Sampled stride value. 
+ + The bias controls the probability distribution: + - low_fps_bias=0.0: Favor stride=1 (original FPS) + - low_fps_bias=0.5: Uniform distribution + - low_fps_bias=1.0: Favor high strides (slow-mo / lower FPS) + """ + if max_stride == 1: + return 1 + + # Linear interpolation from (1 - bias) to bias, clamped to min 0.01 + strides = np.arange(1, max_stride + 1) + weights = np.linspace(1 - self.low_fps_bias, self.low_fps_bias, max_stride) + weights = np.maximum(weights, 0.01) + probs = weights / weights.sum() + + return int(np.random.choice(strides, p=probs)) + + def __call__(self, data_dict: dict) -> dict | None: + try: + meta_dict = data_dict[self.meta_key] + video = data_dict[self.video_key] + except Exception as e: + log.warning( + f"Cannot find video. url: {data_dict['__url__']}, key: {data_dict['__key__']}", rank0_only=False + ) + return None + + if not isinstance(video, bytes): + return data_dict + + video_info = { + "fps": meta_dict["framerate"], + "n_orig_video_frames": meta_dict["nb_frames"], + } + + if video_info["fps"] < self.min_fps: + log.warning(f"Video FPS {video_info['fps']} is less than min_fps {self.min_fps}", rank0_only=False) + return None + if video_info["fps"] > self.max_fps: + log.warning(f"Video FPS {video_info['fps']} is greater than max_fps {self.max_fps}", rank0_only=False) + return None + + options: list = list((i, item) for i, item in enumerate(meta_dict[self.key_for_caption])) + + # Skip the last window if possible. + # All windows except the last are 5 seconds long. The last window has a duration in the range [2.5s, 7.5), which is less preferred. 
+ if len(options) > 1: + options = options[:-1] + + # shuffle options + random.shuffle(options) + video_frames = None + dynamic_conditioning_fps = None # Track conditioning FPS for dynamic mode + for chunk_index, option in options: + start_frame = option["start_frame"] + end_frame = option["end_frame"] + if (end_frame - start_frame) < self.min_duration * video_info["fps"]: + continue + + if self.use_native_fps or self.use_original_fps or self.use_dynamic_fps: + if (end_frame - start_frame) < self.num_frames: + continue + + # Create video decoder with torchcodec (directly from bytes) + video_decoder = VideoDecoder( + video, seek_mode=self.seek_mode, num_ffmpeg_threads=self.video_decode_num_threads + ) + + if self.use_dynamic_fps or self.use_native_fps or self.use_original_fps: + # Shared: Handle alpamayo - skip first 5 frames + if "alpamayo" in data_dict["__url__"].root: + start_frame += 5 + if (end_frame - start_frame) < self.num_frames: + continue + + total_frames = end_frame - start_frame + + # Compute num_multiplier based on mode + if self.use_dynamic_fps: + # Dynamic FPS mode: compute valid strides and sample with bias + max_stride = total_frames // self.num_frames + if max_stride < 1: + # Not enough frames even for stride=1, skip this chunk + continue + + # Sample stride with low_fps_bias controlling the distribution + num_multiplier = self._sample_stride_with_bias(max_stride) + + # Compute conditioning FPS based on sampled stride + dynamic_conditioning_fps = video_info["fps"] / num_multiplier + + fps_mode_desc = ( + "original_fps (contiguous)" if num_multiplier == 1 else f"subsampled (stride={num_multiplier})" + ) + log.info( + f"Dynamic FPS mode: video_fps={video_info['fps']}, total_frames={total_frames}, " + f"max_stride={max_stride}, sampled_stride={num_multiplier}, " + f"conditioning_fps={dynamic_conditioning_fps:.2f}, mode={fps_mode_desc}, " + f"low_fps_bias={self.low_fps_bias}", + rank0_only=False, + ) + elif self.use_native_fps: + # take mid 
self.num_frames frames from start frame to end frame. + # always try lower fps if possible. + num_multiplier = total_frames // self.num_frames + if num_multiplier not in self.allowed_num_multiplers: + log.debug( + f"Skipping chunk (native_fps): stride not allowed. num_multiplier={num_multiplier}, allowed={self.allowed_num_multiplers}" + ) + continue + else: # self.use_original_fps + # Original FPS mode: no frame skipping + num_multiplier = 1 + + # Shared: Check if we have enough frames for the selected stride + expected_length = self.num_frames * num_multiplier + if total_frames < expected_length: + log.info( + f"Skipping chunk: not enough frames for stride. total_frames={total_frames}, expected={expected_length}, num_multiplier={num_multiplier}", + rank0_only=False, + ) + continue + + # Shared: Select frames from the center of the window + _start_frame = start_frame + (total_frames - expected_length) // 2 + _end_frame = _start_frame + expected_length + frame_indices = list(range(_start_frame, _end_frame, num_multiplier)) + assert len(frame_indices) == self.num_frames, "frame_indices length is not equal to num_frames" + + # Decode frames with torchcodec + frame_batch = video_decoder.get_frames_at(frame_indices) + video_frames = frame_batch.data # [T,C,H,W] + video_frames = video_frames.permute(1, 0, 2, 3) # [C,T,H,W] + + # Clean up video decoder + del video_decoder + + # Extract audio if requested + audio_chunk = None + if self.extract_audio: + audio_chunk = self._extract_audio_chunk(video, video_info["fps"], frame_indices) # [C,N_audio] + + break + + else: + frame_indices = list(range(start_frame, end_frame)) + num_multiplier = 1 # No frame skipping in this block of code. + + # online hot-fix for alpamayo data. Skip the first 5 frames as there is chance that the first five frames contain black frames. + if "alpamayo" in data_dict["__url__"].root: + assert len(frame_indices) >= 5, ( + "Getting less than 5 frames for alpamayo videos. 
There is no way to skip the first five frames." + ) + frame_indices = frame_indices[5:] + start_frame += 5 + + # Decode frames with torchcodec + try: + frame_batch = video_decoder.get_frames_at(frame_indices) + except Exception as e: + # Some segmentation videos for Transfer are not long enough as the target video, skip them. + log.warning( + f"Video is not long enough, return None. url: {data_dict['__url__']}, key: {data_dict['__key__']}, error: {e}, start_frame: {start_frame}, end_frame: {end_frame}, frame_indices: {frame_indices}", + rank0_only=False, + ) + return None + video_frames = frame_batch.data # [T,C,H,W] + video_frames = video_frames.permute(1, 0, 2, 3) # [C,T,H,W] + + # Clean up video decoder + del video_decoder + + # Extract audio if requested + audio_chunk = None + if self.extract_audio: + audio_chunk = self._extract_audio_chunk(video, video_info["fps"], frame_indices) # [C,N_audio] + + break + + if video_frames is None: + log.warning( + f"No valid video frames found, return None. url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + + video_info["chunk_index"] = chunk_index + video_info["frame_start"] = start_frame + video_info["frame_end"] = end_frame + video_info["num_frames"] = end_frame - start_frame # type: ignore + if self.num_frames > 0 and not (self.use_dynamic_fps or self.use_native_fps or self.use_original_fps): + # Uniform temporal subsampling mode (default when no FPS mode is enabled) + video_frames = rearrange( + self.sampler(rearrange(video_frames, "c t h w -> t c h w")), "t c h w -> c t h w" + ) # [C,T_sub,H,W] where T_sub = self.num_frames + num_multiplier = ( + end_frame - start_frame + ) / self.num_frames # Specifically for the uniform temporal subsampling case. + + video_info["video"] = video_frames + video_info["num_multiplier"] = num_multiplier # Store the frame skipping multiplier + + + # 1. Our video parser stores the original video FPS of the video. + # 2. 
We have multiple modes of frame selection -- consecutive chunk of frames or subsampled frames. + # Here's what we do in each case: + # + # A. Dynamic FPS mode (use_dynamic_fps=True): + # - We compute max possible stride based on total_frames // num_frames. + # - We sample a stride uniformly from [1, max_stride]. + # - We compute conditioning_fps = native_fps / stride. + # - This gives us a diverse range of effective FPS values. + # + # B. Consecutive chunk of frames (use_original_fps=True): + # - We use the stored FPS and the number of frames in the video. + # - We calculate the duration in seconds using the above two values. + # - conditioning_fps = native_fps (num_multiplier=1) + # + # C. Subsampled frames (use_native_fps=True or uniform subsampling): + # - We check the skipping_rate (1 / num_multiplier) in case of subsampling. + # - We adjust the conditioning FPS by the skipping_rate (faithful to original video's motion). + # - conditioning_fps = native_fps / num_multiplier + # - We calculate the duration in seconds using the adjusted conditioning FPS and the number of frames. + if dynamic_conditioning_fps is not None: + # Dynamic FPS mode: use the pre-computed conditioning FPS + video_info["conditioning_fps"] = dynamic_conditioning_fps + else: + # Other modes: compute effective FPS from stride + video_info["conditioning_fps"] = ( + video_info["fps"] / num_multiplier + ) # Effective FPS for RoPE modulation and text timestamps + + # Add audio if extracted + if audio_chunk is not None: + video_info["audio"] = audio_chunk + video_info["audio_sample_rate"] = self.audio_sample_rate + + # update data_dict, make it back-compatible with old datasets, which video decoding happens in the decoder stage. + data_dict[self.video_key] = video_info + + return data_dict + + +class VideoParsingWithFullFrames(Augmentor): + """ + This augmentor is used to parse the video bytes and get the video frames. 
+ The caption is assumed to be for the entire video frames, rather than VideoParsing which assume captions are for a specific chunk of frames + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + assert len(input_keys) == 2, "VideoParsingWithFullFrames augmentor only supports two input keys" + self.meta_key = input_keys[0] + self.video_key = input_keys[1] + self.args = args + + # Dynamic FPS mode options + # If use_dynamic_fps=True, then we sample fps from a valid range of values. + # If use_dynamic_fps=False, then we use the original fps of the video (no frame skipping). + self.use_dynamic_fps = args.get("use_dynamic_fps", False) + # low_fps_bias: 0.0 = favor original FPS (stride=1), 0.5 = uniform, 1.0 = favor slow-mo (high stride) + self.max_stride = args.get("max_stride", 3) + self.min_stride = args.get("min_stride", 1) + assert self.max_stride >= self.min_stride, ( + f"max_stride ({self.max_stride}) must be >= min_stride ({self.min_stride})" + ) + self.min_fps = args.get("min_fps", _MIN_FPS) + self.max_fps = args.get("max_fps", _MAX_FPS) + if self.use_dynamic_fps: + log.info(f"use_dynamic_fps mode enabled: stride will be sampled from valid range per video ") + + self.video_decode_num_threads = args.get("video_decode_num_threads", 1) + self.seek_mode = args.get("seek_mode", "exact") + + self.size = args.get("size", None) + self.perform_resize = self.size is not None + + # Audio extraction parameters + self.extract_audio = args.get("extract_audio", False) + self.audio_sample_rate = args.get("audio_sample_rate", 48000) + # When True, emit placeholder sound=None and audio_sample_rate + # without extracting audio. Keeps output keys consistent across + # datasets that share the same dataloader (some with audio, some + # without). 
+ self.emit_placeholder_sound = args.get("emit_placeholder_sound", False) + + # Resolution filter: when not "all", skip samples whose (width, height) are below the + # minimum for this aspect ratio in VIDEO_RES_SIZE_INFO[tier]. + self.dataset_resolution_type = args.get("dataset_resolution_type", "all") + self.resolution_tier = _DATASET_RESOLUTION_TIER.get(self.dataset_resolution_type) + + def _sample_stride_with_bias(self, max_stride: int, min_stride: int = 1) -> int: + """Sample a stride from [min_stride, max_stride] with bias controlled by low_fps_bias. + + Args: + max_stride: Maximum valid stride value. + min_stride: Minimum valid stride value. + + Returns: + Sampled stride value. + max_stride=3, min_stride=1, probs = [0.86681333, 0.11731043, 0.01587624] + These values are chosen to approximately match our old ablations. + TODO @pchattopadhy: Do ablations with this scheme + """ + assert max_stride >= min_stride, f"max_stride ({max_stride}) must be >= min_stride ({min_stride})" + if max_stride == min_stride: + return min_stride + + # Samples native fps stride mostly and picks low fps with some probability. + strides = np.arange(min_stride, max_stride + 1) + weights = np.exp(-2 * strides) + probs = weights / weights.sum() + return int(np.random.choice(strides, p=probs)) + + def _validate_and_probe(self, video: Optional[bytes], meta_dict: dict, data_dict: dict) -> bool: + """Validate video bytes, back-fill missing metadata via probing, and + enforce fps/resolution filters. + Returns True if the video is valid, False otherwise. + """ + + if not isinstance(video, bytes): + raise ValueError(f"Video is not bytes. url: {data_dict['__url__']}, key: {data_dict['__key__']}") + + if len(video) == 0: + log.warning( + f"Empty video bytes. url: {data_dict['__url__']}, key: {data_dict['__key__']}", rank0_only=False + ) + return False + + # Back-fill missing metadata keys (width, height, framerate, nb_frames) by probing the + # video stream header. 
Also probe when the sidecar framerate looks abnormal to verify + # against the actual video stream. + _needs_probe = any(k not in meta_dict for k in ("width", "height", "framerate", "nb_frames")) + _metadata_fps = meta_dict.get("framerate", 0) + _fps_suspicious = _metadata_fps > _MAX_FPS or _metadata_fps < _MIN_FPS + _needs_probe = _needs_probe or _fps_suspicious + if _needs_probe: + _probe = VideoDecoder(video, seek_mode=self.seek_mode) + meta_dict.setdefault("width", _probe.metadata.width) + meta_dict.setdefault("height", _probe.metadata.height) + meta_dict.setdefault("nb_frames", _probe.metadata.num_frames) + meta_dict["framerate"] = _probe.metadata.average_fps + del _probe + + # Skip videos with framerates outside [min_fps, max_fps] + if meta_dict["framerate"] > self.max_fps: + log.warning( + f"Skipping video with framerate {meta_dict['framerate']} > max_fps {self.max_fps}. " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return False + if meta_dict["framerate"] < self.min_fps: + log.warning( + f"Skipping video with framerate {meta_dict['framerate']} < min_fps {self.min_fps}. " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return False + + # Resolution check: skip sample if (width, height) are below the minimum for this aspect ratio + width = meta_dict["width"] + height = meta_dict["height"] + aspect_ratio: str | None = None + + if "__url__" in data_dict: + aspect_ratio = data_dict["__url__"].meta.opts["aspect_ratio"] + + # If the resolution of the video is smaller than the minimum resolution for the aspect ratio, skip the sample. This will ensure that we do not upsample any video. 
+ if self.resolution_tier is not None: + min_w, min_h = VIDEO_RES_SIZE_INFO[self.resolution_tier][aspect_ratio] + if width < min_w and height < min_h: + return False + + return True + + def __call__(self, data_dict: dict) -> dict | None: + + # if in future we need to train with batch size > 1, need to pad frames + try: + meta_dict = data_dict[self.meta_key] + video = data_dict[self.video_key] + except Exception as e: + log.warning( + f"Cannot find video. url: {data_dict['__url__']}, key: {data_dict['__key__']}", rank0_only=False + ) + return None + + if not self._validate_and_probe(video, meta_dict, data_dict): + return None + + # Resize video frames if size is specified. This computes a scaling ratio that fits the + # video within the target size bounds while preserving the original aspect ratio. + # The resize transform is applied during decoding via VideoDecoder's transforms parameter. + if self.perform_resize: + img_size = obtain_augmentation_size(data_dict, {"size": self.size}) + assert isinstance(img_size, (tuple, omegaconf.listconfig.ListConfig)), ( + f"Arg size in resize should be a tuple, get {type(img_size)}, {img_size}" + ) + img_w, img_h = img_size + orig_w, orig_h = meta_dict["width"], meta_dict["height"] + + # Compute uniform scaling ratio to fit video within target bounds (aspect-ratio preserving) + scaling_ratio = min((img_w / orig_w), (img_h / orig_h)) + target_size = (int(scaling_ratio * orig_h + 0.5), int(scaling_ratio * orig_w + 0.5)) + + assert target_size[0] <= img_h and target_size[1] <= img_w, ( + f"Resize error. orig {(orig_w, orig_h)} desire {img_size} compute {target_size}" + ) + transform = [Resize(target_size)] + else: + transform = None + + # Wrap decoding in try/except because some of the data is bad and video decoding can fail.
+ try: + video_decoder = VideoDecoder( + video, + seek_mode=self.seek_mode, + num_ffmpeg_threads=self.video_decode_num_threads, + transforms=transform, + ) + num_video_frames = len(video_decoder) + + stride = self._sample_stride_with_bias(self.max_stride, self.min_stride) + frame_indices = np.arange(0, num_video_frames, stride).tolist() + + # VAE compress temporal by 4x, with 1 as condition + # thus the max_video_frames must be 1 + 4N + num_video_frames = min(len(frame_indices), self.args.get("max_num_frames", 1000)) + N = (num_video_frames - 1) // 4 + num_video_frames = 1 + 4 * N + frame_indices = frame_indices[0:num_video_frames] + + frame_batch = video_decoder.get_frames_at(frame_indices) + video_frames = frame_batch.data # [T,C,H,W] + video_frames = video_frames.permute(1, 0, 2, 3) # [C,T,H,W] (T = num_video_frames) + + del video_decoder + except Exception as e: + log.warning( + f"Failed to decode video. url: {data_dict['__url__']}, key: {data_dict['__key__']}, error: {e}", + rank0_only=False, + ) + return None + + video_info = { + "frame_start": frame_indices[0], + "frame_end": frame_indices[-1], + "num_frames": len(frame_indices), + "video": video_frames, + "fps": meta_dict["framerate"], + "conditioning_fps": meta_dict["framerate"] / stride, + "n_orig_video_frames": num_video_frames, + } + + # Extract audio for the same time range as the video frames + if self.extract_audio: + audio_chunk = self._extract_audio_chunk( + video_bytes=video, video_fps=meta_dict["framerate"], frame_indices=frame_indices + ) + if audio_chunk is not None: + video_info["sound"] = audio_chunk + else: + video_info["sound"] = None + # Always include audio_sample_rate when extract_audio is enabled, + # even if audio extraction failed, so the collate function has a + # consistent set of keys across all samples in the batch. 
+ video_info["audio_sample_rate"] = self.audio_sample_rate + elif self.emit_placeholder_sound: + video_info["sound"] = None + video_info["audio_sample_rate"] = self.audio_sample_rate + + data_dict[self.video_key] = video_info + + return data_dict + + def _extract_audio_chunk( + self, video_bytes: bytes, video_fps: float, frame_indices: list[int] + ) -> torch.Tensor | None: # returns [C,N_audio] or None + """Load audio from the clip, resample, and truncate to match video duration. + + Args: + video_bytes: Raw video bytes + video_fps: Video frames per second, used to compute video duration for truncation. + frame_indices: Frame indices extracted from the video. + + Returns: + Audio tensor of shape (C, N) or None if extraction fails. + """ + try: + # Quick check: probe container for audio streams before AudioDecoder init. + # AudioDecoder is slow when no audio stream exists. We use torchcodec._core + # (internal API) to read container metadata without setting up a decode pipeline. + # If this breaks on a future torchcodec upgrade, remove this block — AudioDecoder + # will still work, just slower on videos without audio. + try: + from torchcodec._core import create_from_bytes, get_container_metadata + + _handle = create_from_bytes(video_bytes) + _meta = get_container_metadata(_handle) + _has_audio = _meta.best_audio_stream_index is not None + del _handle, _meta + if not _has_audio: + return None + except (ImportError, AttributeError): + pass # Fall through to AudioDecoder if _core API is unavailable + + audio_decoder = AudioDecoder(video_bytes) + all_samples = audio_decoder.get_all_samples() + audio = all_samples.data # [C,N_orig] + orig_sr = all_samples.sample_rate + del audio_decoder, all_samples + + if orig_sr != self.audio_sample_rate: + import librosa + + audio = torch.from_numpy( + librosa.resample(audio.numpy(), orig_sr=orig_sr, target_sr=self.audio_sample_rate, axis=-1) + ) # [C,N_resampled] + + # Truncate audio to match the extracted video frame duration. 
+ if len(frame_indices) > 0 and video_fps > 0: + video_duration = (frame_indices[-1] + 1) / video_fps + max_audio_samples = int(video_duration * self.audio_sample_rate) + if audio.shape[-1] > max_audio_samples: + audio = audio[:, :max_audio_samples] # [C,N_truncated] + + return audio.clone() # [C,N_audio] + + except Exception as e: + log.warning(f"Failed to extract audio: {e}", rank0_only=False) + return None + + +class VideoParsingChunkedFrames(VideoParsingWithFullFrames): + """ + This augmentor is used to parse the video bytes and get the video frames for a chunk of frames. + In the new scheme, we process + - Full frames if num_frames < 400 + - If num_frames >= 400, we caption only for the first n frame chunk + In this case, the video extraction needs to only extract the first n frame chunk + + Additionally, in robotics and AV data, we do multi-chunk captioning. + In this case, we need to sample a chunk uniformly at random and extract the video frames only for that chunk. + + The chunk's frame range is supplied by an upstream ``TextTransformForVideoJsonCaption`` + augmentor via ``data_dict["chunk_start_frame"]`` and ``data_dict["chunk_end_frame"]``. + Only frames in ``[chunk_start_frame, chunk_end_frame)`` (and the matching audio range) + are decoded. + """ + + def __init__(self, input_keys: list, output_keys: Optional[list] = None, args: Optional[dict] = None) -> None: + super().__init__(input_keys, output_keys, args) + + def __call__(self, data_dict: dict) -> dict | None: + + # if in future we need to train with batch size > 1, need to pad frames + try: + meta_dict = data_dict[self.meta_key] + video = data_dict[self.video_key] + except Exception as e: + log.warning( + f"Cannot find video. url: {data_dict['__url__']}, key: {data_dict['__key__']}", rank0_only=False + ) + return None + + if not self._validate_and_probe(video, meta_dict, data_dict): + return None + + # The chunk frame range must be supplied by an upstream caption-parsing augmentor + # (e.g. 
TextTransformForVideoJsonCaption). + if "chunk_start_frame" not in data_dict or "chunk_end_frame" not in data_dict: + log.warning( + f"VideoParsingChunkedFrames: missing chunk_start_frame/chunk_end_frame in data_dict. " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + return None + chunk_start = int(data_dict["chunk_start_frame"]) + chunk_end = int(data_dict["chunk_end_frame"]) + + # Resize video frames if size is specified. This computes a scaling ratio that fits the + # video within the target size bounds while preserving the original aspect ratio. + # The resize transform is applied during decoding via VideoDecoder's transforms parameter. + if self.perform_resize: + img_size = obtain_augmentation_size(data_dict, {"size": self.size}) + assert isinstance(img_size, (tuple, omegaconf.listconfig.ListConfig)), ( + f"Arg size in resize should be a tuple, get {type(img_size)}, {img_size}" + ) + img_w, img_h = img_size + orig_w, orig_h = meta_dict["width"], meta_dict["height"] + + # Compute uniform scaling ratio to fit video within target bounds (aspect-ratio preserving) + scaling_ratio = min((img_w / orig_w), (img_h / orig_h)) + target_size = (int(scaling_ratio * orig_h + 0.5), int(scaling_ratio * orig_w + 0.5)) + + assert target_size[0] <= img_h and target_size[1] <= img_w, ( + f"Resize error. orig {(orig_w, orig_h)} desire {img_size} compute {target_size}" + ) + transform = [Resize(target_size)] + else: + transform = None + + # Wrap decoding in try/except because some of the data is bad and video decoding can fail. + try: + video_decoder = VideoDecoder( + video, + seek_mode=self.seek_mode, + num_ffmpeg_threads=self.video_decode_num_threads, + transforms=transform, + ) + decoder_len = len(video_decoder) + + # Clamp the chunk range to what the decoder actually has.
+ chunk_start_clamped = max(0, min(chunk_start, decoder_len)) + chunk_end_clamped = max(chunk_start_clamped, min(chunk_end, decoder_len)) + if chunk_end_clamped <= chunk_start_clamped: + log.warning( + f"VideoParsingChunkedFrames: empty chunk after clamping. " + f"chunk=[{chunk_start},{chunk_end}), decoder_len={decoder_len}, " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + del video_decoder + return None + + stride = self._sample_stride_with_bias(self.max_stride, self.min_stride) + frame_indices = np.arange(chunk_start_clamped, chunk_end_clamped, stride).tolist() + + # VAE compress temporal by 4x, with 1 as condition + # thus the max_video_frames must be 1 + 4N + num_video_frames = min(len(frame_indices), self.args.get("max_num_frames", 1000)) + N = (num_video_frames - 1) // 4 + num_video_frames = 1 + 4 * N + if num_video_frames < 1: + log.warning( + f"VideoParsingChunkedFrames: chunk too short for stride. " + f"chunk=[{chunk_start_clamped},{chunk_end_clamped}), stride={stride}, " + f"url: {data_dict['__url__']}, key: {data_dict['__key__']}", + rank0_only=False, + ) + del video_decoder + return None + frame_indices = frame_indices[0:num_video_frames] + if len(frame_indices) == 0: + del video_decoder + return None + + frame_batch = video_decoder.get_frames_at(frame_indices) + video_frames = frame_batch.data # [T,C,H,W] + video_frames = video_frames.permute(1, 0, 2, 3) # [C,T,H,W] (T = num_video_frames) + + del video_decoder + except Exception as e: + log.warning( + f"Failed to decode video. 
url: {data_dict['__url__']}, key: {data_dict['__key__']}, error: {e}", + rank0_only=False, + ) + return None + + video_info = { + "frame_start": frame_indices[0], + "frame_end": frame_indices[-1], + "num_frames": len(frame_indices), + "video": video_frames, + "fps": meta_dict["framerate"], + "conditioning_fps": meta_dict["framerate"] / stride, + "n_orig_video_frames": num_video_frames, + } + + # Extract audio for the same time range as the chunk's video frames. + if self.extract_audio: + audio_chunk = self._extract_audio_chunk( + video_bytes=video, video_fps=meta_dict["framerate"], frame_indices=frame_indices + ) + if audio_chunk is not None: + video_info["sound"] = audio_chunk + else: + video_info["sound"] = None + # Always include audio_sample_rate when extract_audio is enabled, + # even if audio extraction failed, so the collate function has a + # consistent set of keys across all samples in the batch. + video_info["audio_sample_rate"] = self.audio_sample_rate + elif self.emit_placeholder_sound: + video_info["sound"] = None + video_info["audio_sample_rate"] = self.audio_sample_rate + + data_dict[self.video_key] = video_info + + # Cleanup: this augmentor is the last consumer of metas in the json-caption pipeline. + # Also drop the chunk range markers now that the chunk has been decoded. + data_dict.pop(self.meta_key, None) + data_dict.pop("chunk_start_frame", None) + data_dict.pop("chunk_end_frame", None) + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/__init__.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/bytes_to_media.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/bytes_to_media.py new file mode 100644 index 00000000..9274f473 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/bytes_to_media.py @@ -0,0 +1,225 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentors for handling video loading from pickled bytes. +Copied from projects/cosmos/reason1/datasets/augmentors/bytes_to_media.py +Changes: + 1: fully support start frame end frame, s.t. 
we could remove the projects/cosmos/reason1/datasets/augmentors/bytes_to_media.py class for predict2 video support + 2: add processor in init, as we need to read the processing config during the video decoding process +""" + +import io +import pickle as pkl +from typing import Dict, Optional + +from PIL import Image, UnidentifiedImageError + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.vlm.video_decoder_qwen import _video_decoder_qwen_func +from projects.cosmos3.vlm.processors.qwen3vl_processor import Qwen3VLProcessor +from projects.cosmos3.vlm.utils.video_preprocess import tensor_to_pil_images + + +class BytesToMedia(Augmentor): + """ + Converts PKL bytes stored in a data dictionary into media. + + Handles input formats for the specified input key: + A dictionary mapping media names (str) to bytes objects. + + The output format is a dictionary mapping names to their respective decoded objects: + Input dict[str, bytes] -> Output dict[str, torch.Tensor | PIL.Image] + + Corrupted or non-decodable bytes are skipped with a warning. + """ + + def __init__( + self, + input_key: str = "media", + output_key: str = "media", + min_fps_thres: int = 4, + max_fps_thres: int = 60, + target_fps: float = 4.0, + min_video_token_length: int = 16, + max_video_token_length: int = 8192, + num_threads: int = 0, + random_augmentation: bool = False, + is_input_pickle_byptes: bool = True, + use_start_frame_end_frame: bool = False, + frame_count_random_range: Optional[list[int]] = None, + processor: Qwen3VLProcessor = None, + ) -> None: + """ + Args: + input_key (str): Key in the data_dict containing video/image data. + output_key (str): Key to store the resulting video frame tensors or PIL images. + min_fps_thres (int): Minimum FPS threshold for video decoding. + max_fps_thres (int): Maximum FPS threshold for video decoding. 
+ target_fps (float): Target FPS for video decoding. + min_video_token_length (int): Minimum token length for video decoding. + max_video_token_length (int): Maximum token length for video decoding. + num_threads (int): Number of threads for video decoding. + random_augmentation (bool): Whether to apply random augmentation during decoding. + is_input_pickle_byptes (bool): Whether the input key is in the data_dict instead of pkl files. (For cosmos predict2 videos) + use_start_frame_end_frame (bool): Whether to use start_frame and end_frame to decode the video. (For cosmos predict2 videos) + frame_count_random_range (list[int], optional): Random frame count range. Defaults to None. + """ + self.input_key = input_key + self.output_key = output_key + self.video_decoder_params = { + "min_fps_thres": min_fps_thres, + "max_fps_thres": max_fps_thres, + "target_fps": target_fps, + "min_video_token_length": min_video_token_length, + "max_video_token_length": max_video_token_length, + "num_threads": num_threads, + "random_augmentation": random_augmentation, + "frame_count_random_range": frame_count_random_range, + } + self.is_input_pickle_byptes = is_input_pickle_byptes + self.use_start_frame_end_frame = use_start_frame_end_frame + self.processor = processor + + def _bytes_to_video_frames( + self, video_bytes: bytes, identifier: str = "video", start_frame: int = None, end_frame: int = None + ) -> Optional[Dict]: + """Converts video bytes to video frame tensors using the video decoder.""" + try: + result = _video_decoder_qwen_func( + key=f"{identifier}.mp4", # Add .mp4 extension for the decoder + data=video_bytes, + processor=self.processor, + start_frame=start_frame, + end_frame=end_frame, + **self.video_decoder_params, + ) + if result is not None: + result["videos"] = tensor_to_pil_images(result["videos"]) # 3,T,H,W -> list of PIL images + return result + else: + log.warning(f"Skipping item '{identifier}': Video decoder returned None.") + return None + except Exception as e:
+ log.warning(f"Skipping item '{identifier}': Error decoding video bytes: {e}") + return None + + def _perhaps_unpickle_image_bytes(self, image_bytes: bytes) -> bytes: + """Unpickles the image bytes if it's double-pickled.""" + if image_bytes[:3] == b"\x80\x04\x95": + nested_data = pkl.loads(image_bytes) + if isinstance(nested_data, dict) and "image" in nested_data: + image_bytes = nested_data["image"] + else: + image_bytes = nested_data + return image_bytes + + def _bytes_to_pil(self, image_bytes: bytes, identifier: str = "image") -> Optional[Image.Image]: + """Converts a single bytes object to a PIL Image.""" + image_bytes = self._perhaps_unpickle_image_bytes(image_bytes) + try: + with io.BytesIO(image_bytes) as stream: + img = Image.open(stream) + img.load() # Verify the image data + return img.convert("RGB") # Convert to standard RGB format + except UnidentifiedImageError: + log.warning(f"Skipping item '{identifier}': Cannot identify image file from bytes.") + except Exception as e: + log.warning(f"Skipping item '{identifier}': Error decoding image bytes: {e}") + return None + + def __call__(self, data_dict: Dict) -> Dict: + """ + Processes the data_dict to convert video/image bytes to their respective formats. + + Args: + data_dict (Dict): The input data dictionary. + + Returns: + Dict: The modified data dictionary with video frame tensors and/or PIL images. + """ + input_key = self.input_key + output_key = self.output_key + + if input_key not in data_dict: + log.debug( + f"Input key '{input_key}' not found in data_dict. Skipping BytesToMedia. 
Available keys: {data_dict.keys()}" + ) + return data_dict + + raw = data_dict[input_key] + if self.is_input_pickle_byptes: + if isinstance(raw, bytes): + # Old webdataset (<1.0.2): .pkl files not auto-decoded, raw bytes arrive here + data = pkl.loads(raw) + elif isinstance(raw, dict): + # New webdataset (>=1.0.2): basichandlers runs as default post-handler, + # auto-decoding .media.pkl before this augmentor — use directly + data = raw + else: + raise ValueError(f"Input key '{input_key}' has unexpected type {type(raw)}; expected bytes or dict.") + else: + data = raw + output_data = {} + + if isinstance(data, dict): + for name, item in data.items(): + if isinstance(item, bytes): + # Determine if this is video or image based on the key name + if ("video" in name.lower() or ".mp4" in name.lower()) and not self.use_start_frame_end_frame: + # Decode as video + result = self._bytes_to_video_frames(item, identifier=f"{input_key}['{name}']") + if result: + output_data[name] = result + elif ("video" in name.lower() or ".mp4" in name.lower()) and self.use_start_frame_end_frame: + assert "start_frame" in data_dict.keys() and "end_frame" in data_dict.keys(), ( + f"start_frame and end_frame are not in data_dict.keys(): {data_dict.keys()}" + ) + start_frame = data_dict["start_frame"] + end_frame = data_dict["end_frame"] + result = self._bytes_to_video_frames( + item, identifier=f"{input_key}['{name}']", start_frame=start_frame, end_frame=end_frame + ) + if result: + output_data[name] = result + + elif ( + "image" in name.lower() + or ".jpg" in name.lower() + or ".jpeg" in name.lower() + or ".png" in name.lower() + ): + # Decode as image + result = self._bytes_to_pil(item, identifier=f"{input_key}['{name}']") + if result: + output_data[name] = result + else: + log.warning( + f"Skipping item with key '{name}' in '{input_key}': Key does not contain 'video', '.mp4', '.jpg', '.jpeg', '.png', or 'image'." 
+ ) + else: + log.warning(f"Skipping item with key '{name}' in '{input_key}': Expected bytes, got {type(item)}.") + else: + raise ValueError( + f"Input key '{input_key}' has unsupported type {type(data)}. " + f"Expected dict[str, bytes] for video/image data." + ) + + # Add the processed data and optionally remove the input key + data_dict[output_key] = output_data + if input_key != output_key: + del data_dict[input_key] + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/filter_output_key.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/filter_output_key.py new file mode 100644 index 00000000..8c3debce --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/filter_output_key.py @@ -0,0 +1,70 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +"""Augmentations to remove keys from the output data_dict""" + +from typing import Dict, List, Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +class FilterOutputKey(Augmentor): + """ + Keep a subset of keys in the output data_dict + """ + + def __init__( + self, + input_keys: List = [], + output_keys: Optional[list] = [ + "__key__", + "__url__", + "dialog_str", + "input_ids", + "token_mask", + "attention_mask", + "pixel_values_videos", + "video_grid_thw", + "second_per_grid_ts", + "raw_video", # for debugging + "pixel_values", + "image_grid_thw", + "raw_image", # for debugging + # For collate_fn + "pad_token_id", + "ignore_index", + "labels", + ], + text_only: bool = False, + args: Optional[dict] = None, + ) -> None: + self.output_keys = output_keys + self.text_only = text_only + + def __call__(self, data_dict: Dict) -> Dict: + data_dict = {k: data_dict[k] for k in self.output_keys if k in data_dict} + + has_media = "pixel_values" in data_dict or "pixel_values_videos" in data_dict + has_text = "input_ids" in data_dict and "labels" in data_dict + is_valid_data = has_media or has_text + if not self.text_only and not is_valid_data: + log.critical( + f"No media input in data_dict: {data_dict.keys()} | __url__: {data_dict['__url__']} | __key__: {data_dict['__key__']} | dialog_str: {data_dict.get('dialog_str', '')} | does not contain pixel_values or pixel_values_videos", + rank0_only=False, + ) + return None + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/filter_seq_length.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/filter_seq_length.py new file mode 100644 index 00000000..222f50ed --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/filter_seq_length.py @@ -0,0 +1,65 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentations to truncate or filter out samples that exceed a maximum token length""" + +from typing import Dict, List, Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from projects.cosmos3.vlm.processors.qwen3vl_processor import Qwen3VLProcessor + + +class FilterSeqLength(Augmentor): + """ + Check the sequence length of the input data_dict. Samples longer than max_token_length are truncated when the overflow contains no image or video tokens, and filtered out (None is returned) otherwise. + """ + + def __init__( + self, + input_keys: List = ["input_ids"], + output_keys: Optional[list] = ["input_ids"], + max_token_length: int = 24000, + processor: Qwen3VLProcessor = None, + ) -> None: + self.max_token_length = max_token_length + self.processor = processor + + def __call__(self, data_dict: Dict) -> Dict: + input_ids = data_dict["input_ids"] + if input_ids.shape[-1] > self.max_token_length: + # check whether any video or image tokens fall beyond max_token_length; if none do, truncate the input ids + input_ids_extra = input_ids[self.max_token_length :] + has_video_tokens = sum(input_ids_extra == self.processor.video_token_id) > 0 + has_image_tokens = sum(input_ids_extra == self.processor.image_token_id) > 0 + if not has_video_tokens and not has_image_tokens: + log.debug( + f"Truncating
input_ids from {input_ids.shape[-1]} to {self.max_token_length} because there are no video or image tokens in the remaining tokens | __url__: path={data_dict['__url__'].path} root={data_dict['__url__'].root} | __key__: {data_dict['__key__']} | dialog_str: {data_dict.get('dialog_str', '')}" + ) + data_dict["input_ids"] = data_dict["input_ids"][: self.max_token_length] + data_dict["token_mask"] = data_dict["token_mask"][: self.max_token_length] + data_dict["attention_mask"] = data_dict["attention_mask"][: self.max_token_length] + data_dict["labels"] = data_dict["labels"][: self.max_token_length] + return data_dict + + if input_ids.shape[-1] > self.max_token_length: + msg = f"Input ids length {input_ids.shape[-1]} is greater than max token length {self.max_token_length} | __url__: path={data_dict['__url__'].path} root={data_dict['__url__'].root} | __key__: {data_dict['__key__']} | dialog_str: {data_dict.get('dialog_str', '')}" + if "pixel_values" in data_dict: + msg += f" | pixel_values: {data_dict['pixel_values'].shape}" + if "pixel_values_videos" in data_dict: + msg += f" | pixel_values_videos: {data_dict['pixel_values_videos'].shape}" + log.critical(msg, rank0_only=False) + return None + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/floating_number_format.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/floating_number_format.py new file mode 100644 index 00000000..b6e0b6a6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/floating_number_format.py @@ -0,0 +1,86 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import re +from typing import Dict + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + + +def format_floating_number(text: str, decimal_places: int) -> str: + """ + Round the numbers found in text to a fixed number of decimal places. + + Args: + text: Input text containing floating point numbers + decimal_places: Number of decimal places to keep; trailing zeros are stripped, and 0 truncates to an integer + + Returns: + Text with its numbers reformatted to at most decimal_places decimal places + """ + # Pattern to match floating point numbers (including integers that could be floats) + # Matches: integers, decimals like 123.456, scientific notation, etc. + pattern = r"-?\d+\.?\d*(?:[eE][+-]?\d+)?" + + def replace_float(match: re.Match) -> str: + try: + num = float(match.group()) + # Round to decimal_places, then strip trailing zeros and a dangling decimal point + # When decimal_places is 0, truncate toward zero with int() instead + formatted = f"{num:.{decimal_places}f}".rstrip("0").rstrip(".") if decimal_places > 0 else str(int(num)) + return formatted + except (ValueError, TypeError): + # If conversion fails, return the original match + return match.group() + + # Replace all floating point numbers in the text + formatted_text = re.sub(pattern, replace_float, text) + return formatted_text + + +class FloatingNumberFormat(Augmentor): + def __init__( + self, + input_key: str = "conversation", + decimal_places: int = 2, + urls_needs_format: list = [], + processor=None, + ) -> None: + """ + Args: + input_key (str): Key in data_dict that holds the conversation to format.
+ """ + self.input_key = input_key + self.decimal_places = decimal_places + self.urls_needs_format = urls_needs_format + + def __call__(self, data_dict: Dict) -> Dict: + url = data_dict["__url__"] + if not any(url_pattern in url.root for url_pattern in self.urls_needs_format): + return data_dict + + for item in data_dict[self.input_key]: + if item["role"] == "user": + for content in item["content"]: + if content["type"] == "text": + content["text"] = format_floating_number(content["text"], self.decimal_places) + elif item["role"] == "assistant": + if isinstance(item["content"], list): + assert len(item["content"]) == 1 + assert item["content"][0]["type"] == "text" + item["content"] = format_floating_number(item["content"][0]["text"], self.decimal_places) + else: + item["content"] = format_floating_number(item["content"], self.decimal_places) + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/format_describe_anything.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/format_describe_anything.py new file mode 100644 index 00000000..18ac7317 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/format_describe_anything.py @@ -0,0 +1,285 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentors for image captions with SoM prompting. 
+Copied from projects/cosmos/reason1/datasets/augmentors/format_describe_anything.py +Changes: + 1. Unify system prompt to 'You are a helpful assistant.' + 2. Move task requirements from system prompts to the end of user prompts. +""" + +import json +import random +from typing import Dict, List, Literal + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.vfm.datasets.augmentors.vlm.timestamp import markdown_to_list + + +# reorder dict entries +def reorder_dict_entries(conversation_data: List[Dict]) -> List[Dict]: + key_order = ["subject_id", "category", "caption"] + output_dict = {} + for key in key_order: + if key in conversation_data: + output_dict[key] = conversation_data[key] + return output_dict + + +def list_to_markdown(conversation_data: List[Dict]) -> str: + conversation_data = [reorder_dict_entries(item) for item in conversation_data] + json_string = json.dumps(conversation_data, indent=2) + return f"```json\n{json_string}\n```".strip() + + +def augment_assistant_message( + assistant_message: List[Dict], + output_format: Literal[ + "dense_image_caption_json_per_subject", + "dense_image_caption_plain_per_subject", + "caption_one_object", + "location_and_caption_json_one_category", + "location_and_caption_plain_one_category", + ], +): + if output_format == "dense_image_caption_json_per_subject": + output_message = list_to_markdown(assistant_message) + return output_message + elif output_format == "dense_image_caption_plain_per_subject": + output_message = "" + for item in assistant_message: + output_message += f"subject_id = <{item['subject_id']}> category = <{item['category']}> {item['caption']}\n" + return output_message + + elif output_format == "location_and_caption_json_one_category": + # remove category + assistant_message = [ + {"subject_id": item["subject_id"], "caption": item["caption"]} for item in assistant_message + ] + output_message = list_to_markdown(assistant_message) + return 
output_message + elif output_format == "location_and_caption_plain_one_category": + output_message = "" + for item in assistant_message: + output_message += f"subject_id = <{item['subject_id']}> {item['caption']}\n" + return output_message + + elif output_format == "caption_one_object": + return f"{assistant_message[0]['caption']}" + else: + raise ValueError(f"Invalid output format: {output_format}") + + +def augment_user_prompt( + assistant_message: List[dict], + output_format: Literal[ + "dense_image_caption_json_per_subject", + "dense_image_caption_plain_per_subject", + "caption_one_object", + "location_and_caption_json_one_category", + "location_and_caption_plain_one_category", + ], +): + if ( + output_format == "dense_image_caption_json_per_subject" + or output_format == "dense_image_caption_plain_per_subject" + ): + if random.random() < 0.5: + user_prompt = random.choice( + [ + "Caption the notable attributes in the provided image.", + "Describe the notable attributes in the provided image.", + "Summarize the notable attributes in the provided image.", + ] + ) + if random.random() < 0.5: + user_prompt = "Please " + user_prompt.lower() + else: + user_prompt = random.choice( + [ + "Can you caption the notable attributes in the provided image?", + "Can you describe the notable attributes in the provided image?", + "Can you summarize the notable attributes in the provided image?", + ] + ) + if output_format == "dense_image_caption_json_per_subject": + user_prompt += """ List and describe all marked subjects in the image with their categories and detailed captions using the following format: +```json +[ +{ +"subject_id": , +"category": , +"caption": , +}, +{ +"subject_id": , +"category": , +"caption": , +}, +] +``` +""" + else: + user_prompt += " Please provide captions of the tracked objects in the images using the following format: \nsubject_id = category = caption of event 1.\nsubject_id = category = caption of event 2.\n" + elif output_format == 
"caption_one_object": + event = assistant_message[0] + user_prompt = random.choice( + [ + f"What happen to the object with ID <{event['subject_id']}>?", + f"Describe the object with ID <{event['subject_id']}>?", + f"Provide a caption of the object with ID <{event['subject_id']}>?", + ] + ) + elif ( + output_format == "location_and_caption_json_one_category" + or output_format == "location_and_caption_plain_one_category" + ): + event = assistant_message[0] + + user_prompt = random.choice( + [ + f"Caption the attribute of the object with category <{event['category']}>.", + f"Please describe the attribute of the object with category <{event['category']}>.", + f"Please caption the attribute of the object with category <{event['category']}>.", + f"Summarize the attribute of the object with category <{event['category']}>.", + ] + ) + if output_format == "location_and_caption_json_one_category": + user_prompt += """ Find all marked subjects that belong to and describe them in detail using the following format: +```json +[ +{ + "subject_id": , + "caption": +}, +{ + "subject_id": , + "caption": +}, +] +```""" + else: + user_prompt += """ Find all marked subjects that belong to and describe them in detail using the following format: +subject_id = caption of event 1. +subject_id = caption of event 2.""" + else: + raise ValueError(f"Invalid output format: {output_format}") + return user_prompt + + +class FormatDescribeAnything(Augmentor): + def __init__( + self, + input_key: list = "media", + output_format: Literal[ + "dense_image_caption_per_subject", + "caption_one_object", + "location_and_caption_one_category", + "random", + ] = "random", + urls_needs_timestamp: list = ["tl_plm_sav_20250714"], + ) -> None: + """ + Args: + input_keys (list): List of input keys. 
+ """ + self.input_key = input_key + self.output_format = output_format + self.urls_needs_timestamp = urls_needs_timestamp + + def __call__(self, data_dict: Dict) -> Dict: + url = data_dict["__url__"] + if not any(url_pattern in url.root for url_pattern in self.urls_needs_timestamp): + return data_dict + + if self.output_format == "random": + output_format = random.choice( + [ + "dense_image_caption_per_subject", + "caption_one_object", + "location_and_caption_one_category", + ] + ) + else: + output_format = self.output_format + + if output_format == "dense_image_caption_per_subject": + output_format = random.choice( + ["dense_image_caption_json_per_subject", "dense_image_caption_plain_per_subject"] + ) + elif output_format == "location_and_caption_one_category": + output_format = random.choice( + ["location_and_caption_json_one_category", "location_and_caption_plain_one_category"] + ) + + # find the assistant message and parse into a list of dictionaries + for item in data_dict["conversation"]: + if item["role"] == "assistant": + """ + content dict: + ```json + [ + { + "subject_id": "4", + "category": "doughnut", + "caption": "This doughnut has a golden-brown exterior and a light, airy inside, evenly covered with a shiny, clear sugar glaze." + }, + { + "subject_id": "5", + "category": "tray", + "caption": "This stainless steel tray is rectangular with rounded edges and includes a sequence of symmetrical cut-outs shaped like simplified flowers or four-leaf clovers." 
+ }, + ] + ``` + """ + assistant_message = markdown_to_list(item["content"]) + if assistant_message is None: + return None # skip this sample + break + # sort assistant_message by subject_id + assistant_message = sorted(assistant_message, key=lambda x: int(x["subject_id"])) + + # for single-object captioning, sample one subject + if output_format in ["caption_one_object"]: + assistant_message = random.sample(assistant_message, 1) + elif output_format in ["location_and_caption_json_one_category", "location_and_caption_plain_one_category"]: + available_category = list( + set([assistant_message_i["category"] for assistant_message_i in assistant_message]) + ) + # sample one category and keep only the subjects that belong to it + category = random.choice(available_category) + assistant_message = [ + assistant_message_i + for assistant_message_i in assistant_message + if assistant_message_i["category"] == category + ] + + # process conversation + conversation = data_dict["conversation"] + for item in conversation: + if item["role"] == "system": + item["content"] = "You are a helpful assistant." + elif item["role"] == "user": + for content in item["content"]: + if content["type"] == "text": + content["text"] = augment_user_prompt(assistant_message, output_format) + if content["text"] is None: # parse error + return None + elif item["role"] == "assistant": + assistant_message = augment_assistant_message(assistant_message, output_format) + item["content"] = assistant_message + data_dict["conversation"] = conversation + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/nvlm_data_to_conversation.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/nvlm_data_to_conversation.py new file mode 100644 index 00000000..b2e721d2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/nvlm_data_to_conversation.py @@ -0,0 +1,268 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Visual-Text Transformations or Augmentations.""" + +import json +import random +from typing import Dict, Optional + +import numpy as np +from PIL import ImageDraw + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +def convert_conversation_role(messages): + """ + The original messages can be in the following format: + [ + {"from": "human", "value": "Hello, how are you?"}, + {"from": "gpt", "value": "I'm good, thank you!"}, + ] + or + [ + {"role": "user", "content": "Hello, how are you?"}, + {"role": "gpt", "content": "I'm good, thank you!"}, + ] + + The target format is: + [ + {"role": "user", "content": "Hello, how are you?"}, + {"role": "assistant", "content": "I'm good, thank you!"}, + ] + """ + role_mapping = { + "human": "user", + "gpt": "assistant", + "user": "user", + "assistant": "assistant", + "label": "assistant", + } + messages_converted = [] + for message in messages: + assert "from" in message or "role" in message, f"Invalid message: {message}" + assert "value" in message or "content" in message, f"Invalid message: {message}" + role = message["from"] if "from" in message else message["role"] + content = message["value"] if "value" in message else message["content"] + role = role_mapping[role] + messages_converted.append({"role": role, "content": content}) + return 
messages_converted + + +class NVLMImageDataConversation(Augmentor): + """ + This augmentor is used to convert the nvlm data to a conversation format. + It will take the data_dict with the following keys: + { + "data_class": str, + "images": List[PIL.Image.Image], + "text": str, + "words_boxes": Optional[List[List[int]]], + "words_text": Optional[List[str]], + "similarity_matrix": Optional[List[List[float]]], + } + and convert it to a dictionary with the following keys: + { + "conversation": List[Dict], # Can be taken by TokenizeData augmentors shared with all datasets + "media": Dict, + } + + The dataclass includes: + - SimilarityInterleavedWebdataset + - CaptioningWebdataset + - MultiChoiceVQAWebdataset + - VQAWebdataset + - OCRWebdataset + - TextOCRWebdataset + + SimilarityInterleavedWebdataset will come with the conversations + CaptioningWebdataset will come with the caption + MultiChoiceVQAWebdataset will come with the question and choices + VQAWebdataset will come with the question and answer + OCRWebdataset will come with the text and words_boxes + TextOCRWebdataset will come with the text and words_boxes + """ + + def __init__( + self, + input_keys: list = ["data_class", "images", "text", "words_boxes", "words_text"], + output_keys: Optional[list] = ["text"], + media_type: str = "image", + media_key_in_data_dict: str = "images", + ) -> None: + super().__init__(input_keys, output_keys, None) + self.media_type = media_type + self.media_key_in_data_dict = media_key_in_data_dict + + self.user_prompt_list = json.load( + open("projects/cosmos3/vlm/datasets/augmentors/user_prompt_caption_general.json", "r") + ) + self.user_prompt_ocr_list = json.load( + open("projects/cosmos3/vlm/datasets/augmentors/user_prompt_ocr.json", "r") + ) + + def __call__(self, data_dict: Dict) -> Dict: + try: + return self.try_parse(data_dict) + except Exception as e: + log.warning( + f"Error parsing data_dict: {e} | data_dict: {data_dict.keys()} | __url__: {data_dict['__url__']}" + ) + 
return None + + def try_parse(self, data_dict: Dict) -> Dict: + """ + The output data_dict will have the keys "conversation" and "media". + The value under the "conversation" key is a list of dicts: + + data['conversation'] = [ + {"role": "system", "content": "**"}, + { + "role": "user", + "content": [ + {"type": media_type, media_type: media_dict_key}, + {"type": "text", "text": user_prompt}, + ], + }, + {"role": "assistant", "content": caption}, + ] + """ + data_class = data_dict["data_class"] + media_dict = {} + user_content_list = [] + for media_id, media in enumerate(data_dict[self.media_key_in_data_dict]): + media_dict_key = f"{self.media_type}_{media_id}" + user_content_list.append({"type": self.media_type, self.media_type: media_dict_key}) + media_dict[media_dict_key] = media + + if data_class == "SimilarityInterleavedWebdataset": + messages = data_dict["texts"] + messages = convert_conversation_role(messages) + # Prepend the media entries in user_content_list to the user content + for message_id, message in enumerate(messages): + if message["role"] == "user": # Only the first user message gets the media entries + messages[message_id]["content"] = user_content_list + [ + {"type": "text", "text": messages[message_id]["content"]} + ] + break + + elif data_class == "CaptioningWebdataset": + raw_captions = data_dict["caption"] + user_prompt = random.choice(self.user_prompt_list) + user_content_list.append({"type": "text", "text": user_prompt}) + messages = [ + {"role": "user", "content": user_content_list}, + {"role": "assistant", "content": f"{raw_captions}"}, + ] + + elif data_class == "MultiChoiceVQAWebdataset": + if data_dict["correct_choice_idx"] == -1: + answer = data_dict["choices"] + user_prompt = "\nAnswer the question using a single word or phrase." + else: + answer = data_dict["correct_choice_idx"] + user_prompt = "\nAnswer with the option's letter from the given choices directly." + user_content_list.append( + {"type": "text", "text": f"{data_dict['context']} {data_dict['choices']}.
{user_prompt}"} + ) + messages = [ + {"role": "user", "content": user_content_list}, + {"role": "assistant", "content": f"{answer}"}, + ] + elif data_class == "VQAWebdataset": + user_prompt = "\nAnswer the question using a single word or phrase." + answer = data_dict["answers"] + if isinstance(answer, list): + # random sample one answer + answer = random.choice(answer) + user_content_list.append({"type": "text", "text": f"{data_dict['context']} {user_prompt}"}) + messages = [ + {"role": "user", "content": user_content_list}, + {"role": "assistant", "content": f"{answer}"}, + ] + elif data_class == "OCRWebdataset": + user_prompt = random.choice(self.user_prompt_ocr_list) + if ( + "words_boxes" in data_dict + and "words_text" in data_dict + and isinstance(data_dict["words_boxes"], list) + and isinstance(data_dict["words_text"], list) + and len(data_dict["words_boxes"]) == len(data_dict["words_text"]) + ): + boxes = data_dict["words_boxes"] + text = data_dict["words_text"] + # random sample one box and text + index = random.randint(0, len(boxes) - 1) + box = boxes[index] + text = text[index] + user_prompt = ( + user_prompt + f"\nbox: {box}; original image size: {np.array(media_dict[media_dict_key]).shape}" + ) + assert len(media_dict) == 1, ( + f"media_dict: {media_dict} | user_prompt: {user_prompt} | __url__: {data_dict['__url__']}" + ) + # Draw the box on the image + image = media_dict[media_dict_key] + + log.info( + f"box: {box} | text: {text} | media_dict_key: {media_dict_key} | __url__: {data_dict['__url__']} | image shape: {np.array(image).shape}" + ) + if len(box) == 4: + draw = ImageDraw.Draw(image) + draw.rectangle(box, outline="red", width=2) + media_dict[media_dict_key] = image + + reply = text + elif "words_text" in data_dict: + reply = data_dict["words_text"] + else: + reply = data_dict["text"] + user_content_list.append({"type": "text", "text": user_prompt}) + messages = [ + {"role": "user", "content": user_content_list}, + {"role": "assistant", 
"content": f"{reply}"}, + ] + else: + log.warning(f"Invalid data class: {data_class}") + return None + + # Remove image tag in the user content if any + def remove_image_tag(text): + text = text.replace("\n", "").replace("", "").replace("", "") + return text + + for message_id in range(len(messages)): + if messages[message_id]["role"] == "user" and isinstance(messages[message_id]["content"], list): + for content_id in range(len(messages[message_id]["content"])): + if messages[message_id]["content"][content_id]["type"] == "text": + text = messages[message_id]["content"][content_id]["text"] + text = remove_image_tag(text) + messages[message_id]["content"][content_id]["text"] = text + elif messages[message_id]["role"] == "user" and isinstance(messages[message_id]["content"], str): + messages[message_id]["content"] = remove_image_tag(messages[message_id]["content"]) + + # Make sure the assistant text content is a string not float or int + if messages[message_id]["role"] == "assistant" and not isinstance(messages[message_id]["content"], list): + if isinstance(messages[message_id]["content"], dict): + if messages[message_id]["content"]["type"] == "text": + messages[message_id]["content"]["text"] = f"{messages[message_id]['content']['text']}" + else: + messages[message_id]["content"] = f"{messages[message_id]['content']}" + + data_dict["conversation"] = messages + data_dict["media"] = media_dict + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/nvlm_data_unify.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/nvlm_data_unify.py new file mode 100644 index 00000000..37de1e68 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/nvlm_data_unify.py @@ -0,0 +1,132 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Visual-Text Transformations or Augmentations.""" + +import io +from typing import Dict, Optional + +from PIL import Image + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.augmentors.vlm.nvlm_sample_loaders_and_part_filters import ( + get_data_class, + get_part_filter, + get_sample_loader, +) + + +class NVLMImageDataUnify(Augmentor): + """ + This augmentor is used to unify the data format of the nvlm data. 
+ It will take the raw nvlm data tar and convert it to a dictionary with the following keys: + { + "__url__": str, + "__key__": str, + "data_class": str, + "images": List[PIL.Image.Image], + "text": str, + "words_boxes": Optional[List[List[int]]], + "words_text": Optional[List[str]], + "similarity_matrix": Optional[List[List[float]]], + } + """ + + def __init__( + self, + input_keys: list = ["raw_nvlm"], + output_keys: Optional[list] = [], + args: Optional[dict] = None, + data_path_prefix: list[str] = [ + "cosmos/ar/v2/nvlm/", + ], # prefix of the data in s3 + ) -> None: + super().__init__(input_keys, output_keys, args) + self.data_path_prefix = data_path_prefix + + def convert_image(self, img): + try: + if isinstance(img, bytes): + img = Image.open(io.BytesIO(img)).convert("RGB") + elif isinstance(img, Image.Image): + img = img.convert("RGB") + pass # Image is already in PIL format + elif isinstance(img, list): + for i in range(len(img)): + img[i], success = self.convert_image(img[i]) + if not success: + return Image.new("RGB", (256, 256), (0, 0, 0)), False + return img, True + else: + raise ValueError(f"Invalid image type: {type(img)}") + + success = True + except Exception as e: + log.warning(f"Error processing image: {e}. 
Creating an empty black image.", rank0_only=False) + img = Image.new("RGB", (256, 256), (0, 0, 0)) # Creates a 256x256 black image + success = False + return img, success + + def __call__(self, data_dict: Dict) -> Dict: + url = data_dict["__url__"] + data_path = "/".join(url.path.split("/")[:-1]) # remove the last part of the path + sample_loader = get_sample_loader(data_path) + part_filter = get_part_filter(data_path) + data_class = get_data_class(data_path) + assert sample_loader is not None and part_filter is not None and data_class is not None, ( + f"sample_loader({sample_loader}) or part_filter({part_filter}) or data_class({data_class}) is not found for {data_path}" + ) + + raw = {"__url__": url, "__key__": data_dict["__key__"]} + output = {"__url__": url, "__key__": data_dict["__key__"]} + for k, v in data_dict.items(): + ext = k.split(".")[-1] + if part_filter(ext): + raw[ext] = v + try: + output_converted = sample_loader(raw) + # Here output_converted will be a dictionary with the following keys: + # { + # "__key__": str, + # "image": PIL.Image.Image, + # "images": List[PIL.Image.Image], + # "text": str, + # "words_boxes": Optional + # "words_text": Optional + # "similarity_matrix": Optional + # } + except Exception as e: + log.warning( + f"Error in sample_loader: {e}, sample_loader: {sample_loader}, data_path: {data_path}, raw: {raw.keys()}, original_data_dict: {data_dict.keys()}, __url__: {url}, __key__: {data_dict['__key__']}" + ) + return None + + output.update(output_converted) + if "image" not in output_converted and "images" not in output_converted: + success = False + log.warning(f"image not found in {output_converted.keys()}") + if "image" in output_converted: # Single image case + img, success = self.convert_image(output["image"]) + output["images"] = [img] # What should be the format for the images? + elif "images" in output_converted: + output["images"] = output_converted["images"] + output["images"], success =
self.convert_image(output["images"]) + if not success: + log.warning(f"image conversion failed for {data_dict['__key__']} url: {url} | Skip this data") + return None + output["data_class"] = data_class + + return output diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/nvlm_sample_loaders_and_part_filters.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/nvlm_sample_loaders_and_part_filters.py new file mode 100644 index 00000000..083f682c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/nvlm_sample_loaders_and_part_filters.py @@ -0,0 +1,2918 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Combined Sample Loaders +# Auto-generated script combining all sample_loader.py files (Dont edit this file! Edit the projects/cosmos/reasoning/v1/scripts/create_sample_loader_and_part_filter_file.py instead) + +import io + +import torch +from PIL import Image + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.data_sources.vlm.nvlm import data_path_mapping + +# This file was automatically generated by `nvgpt4 data prepare`. 
+ +# import torch + + +def sample_loader_0(raw: dict) -> dict: # Note: Images are already decoded to tensors + + if "text" in raw: + caption = raw["text"] + else: + caption = raw["json"]["caption"] + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + caption=caption, # expected type: str + ) + + +def part_filter_0(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg", "text") + + +# Loader for: /lustre/fsw/portfolios/llmservice/users/zhuoliny/extended-sci/data/merged/CoT +# This file was automatically generated by `energon prepare`. + + + +def sample_loader_1(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_1(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/users/zhuoliny/extended-sci/data/merged/single-choice +# This file was automatically generated by `energon prepare`. + + + +def sample_loader_2(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_2(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/users/zhuoliny/extended-sci/data/extended-sci-3/CoT +# This file was automatically generated by `energon prepare`. + + + +def sample_loader_3(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_3(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/users/zhuoliny/extended-sci/data/extended-sci-3/single-choice +# This file was automatically generated by `energon prepare`. + + + +def sample_loader_4(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_4(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/SceMQA_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_5(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_5(part: str) -> bool: + + # E.g. 
if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/vqa_collection_doc_text_st_chart_scale_textbook_LRV_Screen +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_6(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + key = raw["__key__"] + if "docvqa" in key: + context = json_item["question"] + answers = json_item["answers"] + image = raw["jpg"] + answer_weights = json_item["answer_weights"] + elif "textvqa" in key or "lrv_instruct" in key: + context = json_item["question"] + answers = json_item["answer"] + image = raw["jpg"] + answer_weights = None + elif "stvqa" in key: + context = json_item["question"] + answers = json_item["answers"] + image = raw["jpg"] + answer_weights = [1.0] * len(json_item["answers"]) + elif "chartqa" in key: + context = json_item["query"] + answers = json_item["label"] + image = raw["png"] + answer_weights = None + elif "screenqa" in key: + image = raw["jpg"] + context = json_item["question"] + answers = json_item["ground_truth"] + answer_weights = [1.0] * len(json_item["ground_truth"]) + elif "HME100K" in key: + image = raw["jpg"] + context = "Please write out the expression of the formula in the image using LaTeX format." + answers = json_item["latex_formula"] + answer_weights = None + else: # scale, textbook + image = raw["jpg"] + context = json_item["question"] + answers = json_item["answer"] + answer_weights = None + + return dict( + __key__=key, + image=image, + context=context, + answers=answers, + answer_weights=answer_weights, + ) + + +def part_filter_6(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("json", "jpg", "png") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/plotqa/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_7(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + context=j["question_string"], # expected type: str + answers=j["answer"], # expected type: typing.Union[typing.List[str], NoneType], default: None + answer_weights=None, # expected type: typing.Union[torch.Tensor, NoneType], default: None + ) + + +def part_filter_7(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/clevr-math/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_8(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + context=j["question"], # expected type: str + answers=str(j["answer"]), # expected type: typing.Optional[typing.List[str]], default: None + answer_weights=None, # expected type: typing.Optional[torch.Tensor], default: None + ) + + +def part_filter_8(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/MMC-Instruction/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_9(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + context=j["question"].strip(), # expected type: str + answers=j["gt_answer"].strip(), # expected type: typing.Union[typing.List[str], NoneType], default: None + answer_weights=None, # expected type: typing.Union[torch.Tensor, NoneType], default: None + ) + + +def part_filter_9(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/ocrvqa/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_10(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + return dict( + image=raw["jpg"], # expected type: torch.Tensor + context=j["question"], # expected type: str + answers=j["answer"], # expected type: typing.Optional[typing.List[str]], default: None + answer_weights=None, # expected type: typing.Optional[torch.Tensor], default: None + ) + + +def part_filter_10(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/dude/processed +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_11(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + context=j["question"], + answers=j["answer"], + answer_weights=None, + ) + + +def part_filter_11(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/VisualMRC/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_12(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + context=j["question"], # expected type: str + answers=j["answer"], # expected type: typing.Optional[typing.List[str]], default: None + answer_weights=None, # expected type: typing.Optional[torch.Tensor], default: None + ) + + +def part_filter_12(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/mcvqa_collection_scienceqa_ai2d_geoqaplus_geometry3k_tqa +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_13(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + key = raw["__key__"] + + if "geoqa_plus" in key or "tqa" in key: + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + context=json_item["question"], + choices=json_item["choices"], + correct_choice_idx=json_item["correct_answer_index"], + ) + elif "geometry3k" in key: + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + context=json_item["question"], + choices=json_item["choices"], + correct_choice_idx=ord(json_item["answer"].lower()) - 97, + ) + else: # science_qa, ai2d + image_key = "png" if "png" in raw else "jpg" + if image_key not in raw: + log.warning(f"Image key {image_key} not found in raw keys: {raw.keys()}") + return dict( + __key__=raw["__key__"], # science_qa_sample_{idx} + image=raw[image_key], # expected type: torch.Tensor + context=json_item["question"], # expected type: str + choices=json_item["choices"], # expected type: typing.Union[typing.List[str], NoneType], default: None + correct_choice_idx=json_item["correct_choice_index"], + ) + + +def part_filter_13(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "png", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/arxiv_qa/processed +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_14(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + return dict( + __key__=raw["__key__"], # arxiv_qa_sample_{idx} + image=raw["jpg"], # expected type: torch.Tensor + context=json_item["question"], # expected type: str + choices=json_item["options"], # expected type: typing.Union[typing.List[str], NoneType], default: None + correct_choice_idx=json_item["correct_choice_index"], + ) + + +def part_filter_14(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/tabmwp/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_15(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + if json_item["question_type"] == "multi_choice": + correct_choice_idx = json_item["choices"].index(json_item["answer"]) + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + context=json_item["question"], + choices=json_item["choices"], + correct_choice_idx=correct_choice_idx, + ) + else: + # A temporary hack for non multi-choice samples. + # If correct_choice_idx=-1, we should route it to the VQAWebdataset dataloading method. + # (74.7% free-text questions, 25.3% multi-choice questions) + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + context=json_item["question"], + choices=[json_item["answer"]], + correct_choice_idx=-1, + ) + + +def part_filter_15(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/ocr_vqa_aug/processed/ +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_16(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__="llava-{}".format(raw["__key__"]), images=[raw["jpg"]], texts=j["conversations"], similarity_matrix=None + ) + + +def part_filter_16(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/dvqa_full/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_17(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_17(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/LLaVA-v1.5_shuffle/no_refcoco_vg_ocrvqa +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_18(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_18(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/vqa/more_data/infographics_vqa/processed/train/ +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_19(raw: dict) -> dict: # Note: Images are already decoded to tensors + + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, # expected type: torch.Tensor + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_19(part: str) -> bool: + + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/sharegpt4o/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_20(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__="llava-{}".format(raw["__key__"]), images=[raw["jpg"]], texts=j["conversations"], similarity_matrix=None + ) + + +def part_filter_20(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/sparse_ocr_data/merged +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_21(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__="llava-{}".format(raw["__key__"]), images=[raw["png"]], texts=j["conversations"], similarity_matrix=None + ) + + +def part_filter_21(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "png") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/nayeonl/data/blendv4/MetaMathQA/processed/train_text_image +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_22(raw: dict) -> dict: # Note: Images are already decoded to tensors + + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, # expected type: torch.Tensor + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_22(part: str) -> bool: + + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/nayeonl/data/blendv4/gsm8k/processed/train_text_image +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_23(raw: dict) -> dict: # Note: Images are already decoded to tensors + + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, # expected type: torch.Tensor + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_23(part: str) -> bool: + + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/docmatix/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_24(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_24(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/bentham_hw_squad/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_25(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__="llava-{}".format(raw["__key__"]), images=[raw["jpg"]], texts=j["conversations"], similarity_matrix=None + ) + + +def part_filter_25(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/WikiTableQA/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_26(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__="llava-{}".format(raw["__key__"]), images=[raw["jpg"]], texts=j["conversations"], similarity_matrix=None + ) + + +def part_filter_26(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/figureqa/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_27(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_27(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/ai2d_combined_processed +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_28(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_28(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/math_combined_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_29(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_29(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/robut_combined_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_30(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_30(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/llavar_20k_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_31(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_31(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/tallyqa_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_32(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_32(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/ureader_ie_processed +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_33(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_33(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/visual7w_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_34(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_34(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/mavis_math_rule_geo_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_35(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_35(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/ureader_kg_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_36(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_36(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/ureader_qa_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_37(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_37(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/ocr_multi_collection_cocotext_textocr_ReCTs +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_38(raw: dict) -> dict: + j = raw["json"] + + if "ReCTs" in raw["__key__"]: + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + text="", + words_boxes=j["quads_1k_normalized"], + words_text=j["texts"], + ) + else: # coco-text-multi, textocr-multi + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + text="", + words_boxes=j["bboxes_1k_normalized"], + words_text=j["texts"], + ) + + +def part_filter_38(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/pdfa-eng-wds/processed_word_len_500 +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_39(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + return dict( + image=raw["jpg"], # expected type: torch.Tensor + text=" ".join(j["lines"]["text"]), # expected type: str + ) + + +def part_filter_39(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/super_clevr_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_40(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_40(part: str) -> bool: + + # E.g. 
if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/icon_qa_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_41(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_41(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/chartqa_aug +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_42(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_42(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/gpt_chartqa +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_43(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_43(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/gpt_docvqa +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_44(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_44(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/docvqa_text +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_45(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_45(part: str) -> bool: + + # E.g. 
if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/textvqa_text +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_46(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_46(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/i2s-musicsheet +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_47(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_47(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/music +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_48(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_48(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/invoice +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_49(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_49(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/k12 +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_50(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_50(part: str) -> bool: + + # E.g. 
if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/MTVQA +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_51(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + for i, turn in enumerate(json_item["conversations"]): + if i > 0 and turn["from"] == "human" and "" in turn["value"]: + turn["value"] = turn["value"].replace("\n", "") + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_51(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/VisualWebInstruct +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_52(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_52(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/financeqa +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_53(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + # for i, turn in enumerate(json_item['conversations']): + # if i > 0 and turn['from'] == 'human' and '' in turn['value']: + # turn['value'] = turn['value'].replace("\n", "") + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_53(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/docreason +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_54(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_54(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/gpt_mtwi +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_55(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_55(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/geos_gpt +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_56(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_56(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/cauldron_vistext +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_57(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_57(part: str) -> bool: + + # E.g. 
if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/memes +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_58(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_58(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/gpt_roadtext +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_59(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_59(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/indoor_qa +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_60(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_60(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/colpali +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_61(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_61(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/pmc_vqa +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_62(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_62(part: str) -> bool: + + # E.g. 
if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/pathvqa +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_63(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_63(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/sciqa +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_64(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_64(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/chinese_meme +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_65(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_65(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/gpt_hiertext +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_66(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_66(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/cauldron_cocoqa +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_67(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_67(part: str) -> bool: + + # E.g. 
if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("img", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/cmm-math/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_68(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_68(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/mmtab/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_69(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_69(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/simchart9k/processed +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_70(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_70(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/mapqa_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_71(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_71(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/llava-onevision/vizwiz_processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_72(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + images = [raw["jpg"]] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_72(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/gpt_infovqa +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_73(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_73(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "img") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/augmentations/viquae +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_74(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + img = Image.open(io.BytesIO(raw["img"])) + images = [img] + + return dict( + __key__="llava-{}".format(raw["__key__"]), + images=images, + texts=json_item["conversations"], + similarity_matrix=None, + ) + + +def part_filter_74(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "img") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/captioning/ccs_recaptioned/webdataset +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_75(raw: dict) -> dict: # Note: Images are already decoded to tensors + + if "text" in raw: + caption = raw["text"] + else: + caption = raw["json"]["caption"] + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + caption=caption, # expected type: str + ) + + +def part_filter_75(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg", "text") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/captioning/laion115m-clean +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_76(raw: dict) -> dict: # Note: Images are already decoded to tensors + + if "text" in raw: + caption = raw["text"] + else: + caption = raw["json"]["caption"] + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + caption=caption, # expected type: str + ) + + +def part_filter_76(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg", "text") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/dvqa_full/processed_pt +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_77(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + total = len(json_item["conversations"]) // 2 + idx = random.randrange(total) # noqa: F821 + human = json_item["conversations"][idx * 2] + out = json_item["conversations"][idx * 2 + 1] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + context=human["value"].replace("\n", ""), + answers=out["value"], + answer_weights=None, + ) + + +def part_filter_77(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/docmatix/processed_pt +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_78(raw: dict) -> dict: # Note: Images are already decoded to tensors + json_item = raw["json"] + + total = len(json_item["conversations"]) // 2 + idx = random.randrange(total) # noqa: F821 + human = json_item["conversations"][idx * 2] + out = json_item["conversations"][idx * 2 + 1] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + context=human["value"].replace("\n", ""), + answers=out["value"], + answer_weights=None, + ) + + +def part_filter_78(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/vqa/VQAv2/stage1 + + +def sample_loader_79(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + if "answer" in j: + answers = [a[0] for a in j["answer"][0]] + answer_weights = torch.Tensor([float(a[1]) for a in j["answer"][0]]) + else: + answers = None + answer_weights = None + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + context=j["question"], # expected type: str + answers=answers, # expected type: typing.List[str] + answer_weights=answer_weights, # expected type: typing.Union[torch.Tensor, NoneType] + ) + + +def part_filter_79(part: str) -> bool: + # Filter for parts required by the sample_loader + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/vqa/Visual_Genome +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_80(raw: dict) -> dict: # Note: Images are already decoded to tensors + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + context=raw["json"]["question"], # expected type: str + answers=raw["json"]["answer"], # expected type: typing.Union[typing.List[str], NoneType], default: None + answer_weights=None, # expected type: typing.Union[torch.Tensor, NoneType], default: None + ) + + +def part_filter_80(part: str) -> bool: + + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/pdfa-eng-wds/processed_word_len_300 +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_81(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + return dict( + image=raw["jpg"], # expected type: torch.Tensor + text=" ".join(j["lines"]["text"]), # expected type: str + ) + + +def part_filter_81(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/textocr/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_82(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + text=j["text"], + words_boxes=j["bbox_1k_normalized"], + ) + + +def part_filter_82(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/coco-text/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_83(raw: dict) -> dict: + j = raw["json"] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + text=j["text"], + words_boxes=j["bbox_1k_normalized"], + ) + + +def part_filter_83(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/ArT/processed +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_84(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + text=j["text"], + words_boxes=j["bbox_1k_normalized"], + ) + + +def part_filter_84(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/ReCTs/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_85(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict(__key__=raw["__key__"], image=raw["jpg"], text=j["text"], words_boxes=j["quad_1k_normalized"]) + + +def part_filter_85(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/lsvt/processed +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_86(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + text=j["text"], + words_boxes=j["bbox_1k_normalized"], + ) + + +def part_filter_86(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/RCTW/processed +# This file was automatically generated by `nvgpt4 data prepare`. 
+ + + +def sample_loader_87(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + + quad = j["quad"] + quad = [val for point in quad for val in point] + + return dict( + image=raw["jpg"], # expected type: torch.Tensor + text=j["text"], # expected type: str + words_boxes=quad, # expected type: typing.Optional[torch.Tensor], default: None + ) + + +def part_filter_87(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/coco-text/processed_multi +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_88(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + text="", + words_boxes=j["bboxes_1k_normalized"], + words_text=j["texts"], + ) + + +def part_filter_88(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/textocr/processed_multi +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_89(raw: dict) -> dict: + j = raw["json"] + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + text="", + words_boxes=j["bboxes_1k_normalized"], + words_text=j["texts"], + ) + + +def part_filter_89(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. 
If you need all, keep as is + return part in ("jpg", "json") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/nvlm/wdai/data/ReCTs/processed_multi +# This file was automatically generated by `nvgpt4 data prepare`. + + + +def sample_loader_90(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + return dict( + __key__=raw["__key__"], + image=raw["jpg"], + text="", + words_boxes=j["quads_1k_normalized"], + words_text=j["texts"], + ) + + +def part_filter_90(part: str) -> bool: + + # E.g. if your dataset contains jpeg, txt and json, but you won't use json, + # remove it from the list, such that it is not decoded. If you need all, keep as is + return part in ("json", "jpg") + + +# Loader for: /lustre/fsw/portfolios/llmservice/projects/llmservice_nlp_fm/datasets/vqa/VQAv2/stage1 + + +def sample_loader_91(raw: dict) -> dict: # Note: Images are already decoded to tensors + j = raw["json"] + if "answer" in j: + answers = [a[0] for a in j["answer"][0]] + answer_weights = torch.Tensor([float(a[1]) for a in j["answer"][0]]) + else: + answers = None + answer_weights = None + + return dict( + __key__=raw["__key__"], + image=raw["jpg"], # expected type: torch.Tensor + context=j["question"], # expected type: str + answers=answers, # expected type: typing.List[str] + answer_weights=answer_weights, # expected type: typing.Union[torch.Tensor, NoneType] + ) + + +def part_filter_91(part: str) -> bool: + # Filter for parts required by the sample_loader + return part in ("jpg", "json") + + +# Dataset -> Sample Loader Mapping +dataset_loader_mapping = { + "coco_train_val_restval": { + "sample_loader": "sample_loader_0", + "part_filter": "part_filter_0", + "data_class": "CaptioningWebdataset", + "data_weight": 0.01, + }, + "extended-sci/data/merged/CoT": { + "sample_loader": "sample_loader_1", + "part_filter": "part_filter_1", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.006, + }, + 
"extended-sci/data/merged/single-choice": { + "sample_loader": "sample_loader_2", + "part_filter": "part_filter_2", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.004, + }, + "extended-sci/data/extended-sci-3/CoT": { + "sample_loader": "sample_loader_3", + "part_filter": "part_filter_3", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.0006, + }, + "extended-sci/data/extended-sci-3/single-choice": { + "sample_loader": "sample_loader_4", + "part_filter": "part_filter_4", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.0004, + }, + "nvlm/wdai/data/SceMQA_processed": { + "sample_loader": "sample_loader_5", + "part_filter": "part_filter_5", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.0006, + }, + "nvlm/wdai/data/vqa_collection_doc_text_st_chart_scale_textbook_LRV_Screen": { + "sample_loader": "sample_loader_6", + "part_filter": "part_filter_6", + "data_class": "VQAWebdataset", + "data_weight": 0.08, + }, + "nvlm/wdai/data/plotqa/processed": { + "sample_loader": "sample_loader_7", + "part_filter": "part_filter_7", + "data_class": "VQAWebdataset", + "data_weight": 0.095, + }, + "nvlm/wdai/data/clevr-math/processed": { + "sample_loader": "sample_loader_8", + "part_filter": "part_filter_8", + "data_class": "VQAWebdataset", + "data_weight": 0.02, + }, + "nvlm/wdai/data/MMC-Instruction/processed": { + "sample_loader": "sample_loader_9", + "part_filter": "part_filter_9", + "data_class": "VQAWebdataset", + "data_weight": 0.07, + }, + "nvlm/wdai/data/ocrvqa/processed": { + "sample_loader": "sample_loader_10", + "part_filter": "part_filter_10", + "data_class": "VQAWebdataset", + "data_weight": 0.06, + }, + "nvlm/wdai/data/dude/processed": { + "sample_loader": "sample_loader_11", + "part_filter": "part_filter_11", + "data_class": "VQAWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/VisualMRC/processed": { + "sample_loader": "sample_loader_12", + "part_filter": "part_filter_12", + 
"data_class": "VQAWebdataset", + "data_weight": 0.015, + }, + "nvlm/wdai/data/mcvqa_collection_scienceqa_ai2d_geoqaplus_geometry3k_tqa": { + "sample_loader": "sample_loader_13", + "part_filter": "part_filter_13", + "data_class": "MultiChoiceVQAWebdataset", + "data_weight": 0.025, + }, + "nvlm/wdai/data/arxiv_qa/processed": { + "sample_loader": "sample_loader_14", + "part_filter": "part_filter_14", + "data_class": "MultiChoiceVQAWebdataset", + "data_weight": 0.02, + }, + "nvlm/wdai/data/tabmwp/processed": { + "sample_loader": "sample_loader_15", + "part_filter": "part_filter_15", + "data_class": "MultiChoiceVQAWebdataset", + "data_weight": 0.015, + }, + "nvlm/wdai/data/ocr_vqa_aug/processed": { + "sample_loader": "sample_loader_16", + "part_filter": "part_filter_16", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.055, + }, + "nvlm/wdai/data/dvqa_full/processed": { + "sample_loader": "sample_loader_17", + "part_filter": "part_filter_17", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.055, + }, + "nvlm/wdai/data/LLaVA-v1.5_shuffle/no_refcoco_vg_ocrvqa": { + "sample_loader": "sample_loader_18", + "part_filter": "part_filter_18", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.085, + }, + "vqa/more_data/infographics_vqa/processed/train": { + "sample_loader": "sample_loader_19", + "part_filter": "part_filter_19", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/sharegpt4o/processed": { + "sample_loader": "sample_loader_20", + "part_filter": "part_filter_20", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.02, + }, + "nvlm/wdai/data/sparse_ocr_data/merged": { + "sample_loader": "sample_loader_21", + "part_filter": "part_filter_21", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.045, + }, + "nvlm/nayeonl/data/blendv4/MetaMathQA/processed/train_text_image": { + "sample_loader": "sample_loader_22", + "part_filter": 
"part_filter_22", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.004, + }, + "nvlm/nayeonl/data/blendv4/gsm8k/processed/train_text_image": { + "sample_loader": "sample_loader_23", + "part_filter": "part_filter_23", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.003, + }, + "nvlm/wdai/data/docmatix/processed": { + "sample_loader": "sample_loader_24", + "part_filter": "part_filter_24", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.1, + }, + "nvlm/wdai/data/bentham_hw_squad/processed": { + "sample_loader": "sample_loader_25", + "part_filter": "part_filter_25", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/WikiTableQA/processed": { + "sample_loader": "sample_loader_26", + "part_filter": "part_filter_26", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.003, + }, + "nvlm/wdai/data/figureqa/processed": { + "sample_loader": "sample_loader_27", + "part_filter": "part_filter_27", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/llava-onevision/ai2d_combined_processed": { + "sample_loader": "sample_loader_28", + "part_filter": "part_filter_28", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/llava-onevision/math_combined_processed": { + "sample_loader": "sample_loader_29", + "part_filter": "part_filter_29", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.035, + }, + "nvlm/wdai/data/llava-onevision/robut_combined_processed": { + "sample_loader": "sample_loader_30", + "part_filter": "part_filter_30", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/llava-onevision/llavar_20k_processed": { + "sample_loader": "sample_loader_31", + "part_filter": "part_filter_31", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.007, + }, + 
"nvlm/wdai/data/llava-onevision/tallyqa_processed": { + "sample_loader": "sample_loader_32", + "part_filter": "part_filter_32", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.02, + }, + "nvlm/wdai/data/llava-onevision/ureader_ie_processed": { + "sample_loader": "sample_loader_33", + "part_filter": "part_filter_33", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.007, + }, + "nvlm/wdai/data/llava-onevision/visual7w_processed": { + "sample_loader": "sample_loader_34", + "part_filter": "part_filter_34", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.006, + }, + "nvlm/wdai/data/llava-onevision/mavis_math_rule_geo_processed": { + "sample_loader": "sample_loader_35", + "part_filter": "part_filter_35", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/llava-onevision/ureader_kg_processed": { + "sample_loader": "sample_loader_36", + "part_filter": "part_filter_36", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.005, + }, + "nvlm/wdai/data/llava-onevision/ureader_qa_processed": { + "sample_loader": "sample_loader_37", + "part_filter": "part_filter_37", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.02, + }, + "nvlm/wdai/data/ocr_multi_collection_cocotext_textocr_ReCTs": { + "sample_loader": "sample_loader_38", + "part_filter": "part_filter_38", + "data_class": "OCRWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/pdfa-eng-wds/processed_word_len_500": { + "sample_loader": "sample_loader_39", + "part_filter": "part_filter_39", + "data_class": "OCRWebdataset", + "data_weight": 0.015, + }, + "nvlm/wdai/data/llava-onevision/super_clevr_processed": { + "sample_loader": "sample_loader_40", + "part_filter": "part_filter_40", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.007, + }, + "nvlm/wdai/data/llava-onevision/icon_qa_processed": { + "sample_loader": "sample_loader_41", + "part_filter": 
"part_filter_41", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.009, + }, + "nvlm/wdai/data/augmentations/chartqa_aug": { + "sample_loader": "sample_loader_42", + "part_filter": "part_filter_42", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.005, + }, + "nvlm/wdai/data/augmentations/gpt_chartqa": { + "sample_loader": "sample_loader_43", + "part_filter": "part_filter_43", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.006, + }, + "nvlm/wdai/data/augmentations/gpt_docvqa": { + "sample_loader": "sample_loader_44", + "part_filter": "part_filter_44", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.006, + }, + "nvlm/wdai/data/augmentations/docvqa_text": { + "sample_loader": "sample_loader_45", + "part_filter": "part_filter_45", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.006, + }, + "nvlm/wdai/data/augmentations/textvqa_text": { + "sample_loader": "sample_loader_46", + "part_filter": "part_filter_46", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.008, + }, + "nvlm/wdai/data/augmentations/i2s-musicsheet": { + "sample_loader": "sample_loader_47", + "part_filter": "part_filter_47", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.0005, + }, + "nvlm/wdai/data/augmentations/music": { + "sample_loader": "sample_loader_48", + "part_filter": "part_filter_48", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.007, + }, + "nvlm/wdai/data/augmentations/invoice": { + "sample_loader": "sample_loader_49", + "part_filter": "part_filter_49", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.002, + }, + "nvlm/wdai/data/augmentations/k12": { + "sample_loader": "sample_loader_50", + "part_filter": "part_filter_50", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.019, + }, + "nvlm/wdai/data/augmentations/MTVQA": { + "sample_loader": "sample_loader_51", + "part_filter": 
"part_filter_51", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.007, + }, + "nvlm/wdai/data/augmentations/VisualWebInstruct": { + "sample_loader": "sample_loader_52", + "part_filter": "part_filter_52", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.028, + }, + "nvlm/wdai/data/augmentations/financeqa": { + "sample_loader": "sample_loader_53", + "part_filter": "part_filter_53", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.005, + }, + "nvlm/wdai/data/augmentations/docreason": { + "sample_loader": "sample_loader_54", + "part_filter": "part_filter_54", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.004, + }, + "nvlm/wdai/data/augmentations/gpt_mtwi": { + "sample_loader": "sample_loader_55", + "part_filter": "part_filter_55", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.005, + }, + "nvlm/wdai/data/augmentations/geos_gpt": { + "sample_loader": "sample_loader_56", + "part_filter": "part_filter_56", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.0001, + }, + "nvlm/wdai/data/augmentations/cauldron_vistext": { + "sample_loader": "sample_loader_57", + "part_filter": "part_filter_57", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.005, + }, + "nvlm/wdai/data/augmentations/memes": { + "sample_loader": "sample_loader_58", + "part_filter": "part_filter_58", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.005, + }, + "nvlm/wdai/data/augmentations/gpt_roadtext": { + "sample_loader": "sample_loader_59", + "part_filter": "part_filter_59", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.0002, + }, + "nvlm/wdai/data/augmentations/indoor_qa": { + "sample_loader": "sample_loader_60", + "part_filter": "part_filter_60", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.001, + }, + "nvlm/wdai/data/augmentations/colpali": { + "sample_loader": "sample_loader_61", + 
"part_filter": "part_filter_61", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.007, + }, + "nvlm/wdai/data/augmentations/pmc_vqa": { + "sample_loader": "sample_loader_62", + "part_filter": "part_filter_62", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/augmentations/pathvqa": { + "sample_loader": "sample_loader_63", + "part_filter": "part_filter_63", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.004, + }, + "nvlm/wdai/data/augmentations/sciqa": { + "sample_loader": "sample_loader_64", + "part_filter": "part_filter_64", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.027, + }, + "nvlm/wdai/data/augmentations/chinese_meme": { + "sample_loader": "sample_loader_65", + "part_filter": "part_filter_65", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.001, + }, + "nvlm/wdai/data/augmentations/gpt_hiertext": { + "sample_loader": "sample_loader_66", + "part_filter": "part_filter_66", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.003, + }, + "nvlm/wdai/data/augmentations/cauldron_cocoqa": { + "sample_loader": "sample_loader_67", + "part_filter": "part_filter_67", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.007, + }, + "nvlm/wdai/data/cmm-math/processed": { + "sample_loader": "sample_loader_68", + "part_filter": "part_filter_68", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.001, + }, + "nvlm/wdai/data/mmtab/processed": { + "sample_loader": "sample_loader_69", + "part_filter": "part_filter_69", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.008, + }, + "nvlm/wdai/data/simchart9k/processed": { + "sample_loader": "sample_loader_70", + "part_filter": "part_filter_70", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.001, + }, + "nvlm/wdai/data/llava-onevision/mapqa_processed": { + "sample_loader": "sample_loader_71", + 
"part_filter": "part_filter_71", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.005, + }, + "nvlm/wdai/data/llava-onevision/vizwiz_processed": { + "sample_loader": "sample_loader_72", + "part_filter": "part_filter_72", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.002, + }, + "nvlm/wdai/data/augmentations/gpt_infovqa": { + "sample_loader": "sample_loader_73", + "part_filter": "part_filter_73", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.001, + }, + "nvlm/wdai/data/augmentations/viquae": { + "sample_loader": "sample_loader_74", + "part_filter": "part_filter_74", + "data_class": "SimilarityInterleavedWebdataset", + "data_weight": 0.0005, + }, + "captioning/ccs_recaptioned/webdataset": { + "sample_loader": "sample_loader_75", + "part_filter": "part_filter_75", + "data_class": "CaptioningWebdataset", + "data_weight": 0.2, + }, + "captioning/laion115m-clean": { + "sample_loader": "sample_loader_76", + "part_filter": "part_filter_76", + "data_class": "CaptioningWebdataset", + "data_weight": 0.579, + }, + "nvlm/wdai/data/dvqa_full/processed_pt": { + "sample_loader": "sample_loader_77", + "part_filter": "part_filter_77", + "data_class": "VQAWebdataset", + "data_weight": 0.02, + }, + "nvlm/wdai/data/docmatix/processed_pt": { + "sample_loader": "sample_loader_78", + "part_filter": "part_filter_78", + "data_class": "VQAWebdataset", + "data_weight": 0.02, + }, + "vqa/VQAv2/stage1": { + "sample_loader": "sample_loader_91", + "part_filter": "part_filter_91", + "data_class": "VQAWebdataset", + "data_weight": 1.0, + }, + "vqa/Visual_Genome": { + "sample_loader": "sample_loader_80", + "part_filter": "part_filter_80", + "data_class": "VQAWebdataset", + "data_weight": 0.01, + }, + "nvlm/wdai/data/pdfa-eng-wds/processed_word_len_300": { + "sample_loader": "sample_loader_81", + "part_filter": "part_filter_81", + "data_class": "OCRWebdataset", + "data_weight": 0.08, + }, + "nvlm/wdai/data/textocr/processed": { + 
"sample_loader": "sample_loader_82", + "part_filter": "part_filter_82", + "data_class": "OCRWebdataset", + "data_weight": 0.02, + }, + "nvlm/wdai/data/coco-text/processed": { + "sample_loader": "sample_loader_83", + "part_filter": "part_filter_83", + "data_class": "OCRWebdataset", + "data_weight": 0.002, + }, + "nvlm/wdai/data/ArT/processed": { + "sample_loader": "sample_loader_84", + "part_filter": "part_filter_84", + "data_class": "OCRWebdataset", + "data_weight": 0.001, + }, + "nvlm/wdai/data/ReCTs/processed": { + "sample_loader": "sample_loader_85", + "part_filter": "part_filter_85", + "data_class": "OCRWebdataset", + "data_weight": 0.001, + }, + "nvlm/wdai/data/lsvt/processed": { + "sample_loader": "sample_loader_86", + "part_filter": "part_filter_86", + "data_class": "OCRWebdataset", + "data_weight": 0.005, + }, + "nvlm/wdai/data/RCTW/processed": { + "sample_loader": "sample_loader_87", + "part_filter": "part_filter_87", + "data_class": "OCRWebdataset", + "data_weight": 0.001, + }, + "nvlm/wdai/data/coco-text/processed_multi": { + "sample_loader": "sample_loader_88", + "part_filter": "part_filter_88", + "data_class": "OCRWebdataset", + "data_weight": 0.0003, + }, + "nvlm/wdai/data/textocr/processed_multi": { + "sample_loader": "sample_loader_89", + "part_filter": "part_filter_89", + "data_class": "OCRWebdataset", + "data_weight": 0.0004, + }, + "nvlm/wdai/data/ReCTs/processed_multi": { + "sample_loader": "sample_loader_90", + "part_filter": "part_filter_90", + "data_class": "OCRWebdataset", + "data_weight": 0.0003, + }, +} + + +def get_sample_loader(path): + """Returns the correct sample_loader function for a dataset.""" + if path not in dataset_loader_mapping: + path = data_path_mapping(path) + assert path in dataset_loader_mapping, f"path {path} not in dataset_loader_mapping" + return globals().get(dataset_loader_mapping.get(path, {}).get("sample_loader")) + + +def get_part_filter(path): + """Returns the correct part_filter function for a dataset.""" + if 
path not in dataset_loader_mapping: + path = data_path_mapping(path) + assert path in dataset_loader_mapping, f"path {path} not in dataset_loader_mapping" + return globals().get(dataset_loader_mapping.get(path, {}).get("part_filter")) + + +def get_data_class(path): + """Returns the correct data_class for a dataset.""" + if path not in dataset_loader_mapping: + path = data_path_mapping(path) + + assert path in dataset_loader_mapping, f"path {path} not in dataset_loader_mapping" + return dataset_loader_mapping[path]["data_class"] diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/prompt_format.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/prompt_format.py new file mode 100644 index 00000000..0f073906 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/prompt_format.py @@ -0,0 +1,133 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Visual-Text Transformations or Augmentations.""" + +import random +from typing import Dict, Literal + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor + +REASONING_SUFFIX = ( + "\nAnswer the question using the following format:\n\n" + "<think>\nYour reasoning.\n</think>\n\n" + "Write your final answer immediately after the </think> tag." 
+) + + +class PromptFormat(Augmentor): + def __init__( + self, + input_keys: list = ["texts"], + text_chat_order: Literal["text_end", "text_start", "random"] = "text_end", + ) -> None: + """ + Args: + input_keys (list): List of input keys. + text_chat_order (Literal["text_end", "text_start", "random"]): Order of text items in user messages. + """ + self.input_keys = input_keys + self.text_chat_order = text_chat_order + + def __call__(self, data_dict: Dict) -> Dict | None: + conversation_key = self.input_keys[0] + + # retrieve conversations from dict + try: + list_of_conversation = data_dict[conversation_key] + except KeyError: + url = data_dict["__url__"].root + "/" + data_dict["__url__"].path + print(f"KeyError: {conversation_key} not found in data_dict for url: {url}") + return None + + # check if this is list of list of dict or list of dict + + if isinstance(list_of_conversation[0], list): + selected_conversation = random.sample(list_of_conversation, 1)[0] + elif isinstance(list_of_conversation[0], dict): + + selected_conversation = list_of_conversation + else: + raise ValueError( + f"list_of_conversation is not a list of list of dict or list of dict: {list_of_conversation}" + ) + + # Now it should be list of dict + assert isinstance(selected_conversation, list) and isinstance(selected_conversation[0], dict), ( + f"selected_conversation is not a list of dict: {selected_conversation}" + ) + # Normalize all string content to list format + for message in selected_conversation: + if "content" in message and isinstance(message["content"], str): + message["content"] = [{"type": "text", "text": message["content"]}] + if "reasoning_content" in message and isinstance(message["reasoning_content"], str): + message["reasoning_content"] = [{"type": "text", "text": message["reasoning_content"]}] + + # Merge reasoning_content into assistant message content + for i, message in enumerate(selected_conversation): + if message.get("role") == "assistant" and 
message.get("reasoning_content"): + # Append reasoning instruction to the preceding user message + for j in range(i - 1, -1, -1): + if selected_conversation[j].get("role") == "user": + selected_conversation[j]["content"].append({"type": "text", "text": REASONING_SUFFIX}) + break + # Wrap reasoning items in <think>...</think> tags + reasoning_items = message["reasoning_content"] + think_start = [{"type": "text", "text": "<think>\n"}] + think_end = [{"type": "text", "text": "\n</think>\n\n"}] + message["content"] = think_start + reasoning_items + think_end + message["content"] + del message["reasoning_content"] + + data_dict["conversation"] = selected_conversation + + del data_dict[conversation_key] + + + # # enforce chat order + # self._enforce_text_chat_order(selected_conversation) + + return data_dict + + def _enforce_text_chat_order(self, conversation: list) -> None: + """ + Reorder text content within user messages based on text_chat_order setting. + NOTE (maxzhaoshuol): this does NOT work for interleaved data!!!!!! 
+ + Args: + conversation: List of message dictionaries + """ + for message in conversation: + if message.get("role") == "user" and "content" in message: + content = message["content"] + if isinstance(content, list): + # Separate text items from non-text items + text_items = [item for item in content if item.get("type") == "text"] + non_text_items = [item for item in content if item.get("type") != "text"] + + if text_items: + # Reorder based on text_chat_order + if self.text_chat_order == "text_start": + # Put text items at the beginning + message["content"] = text_items + non_text_items + elif self.text_chat_order == "text_end": + # Put text items at the end + message["content"] = non_text_items + text_items + elif self.text_chat_order == "random": + print("random") + # Randomly put text items at beginning or end + if random.random() < 0.5: + message["content"] = text_items + non_text_items + else: + message["content"] = non_text_items + text_items diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/shuffle_text_media_order.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/shuffle_text_media_order.py new file mode 100644 index 00000000..3a6ca826 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/shuffle_text_media_order.py @@ -0,0 +1,60 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Augmentations to randomly swap media/text order in user prompts if there is only one video/image and one text message. +Default swap probability is 1%. +""" + +import random +from typing import Dict, Optional + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +class ShuffleTextMediaOrder(Augmentor): + def __init__( + self, + shuffle_ratio: float = 0.01, + ) -> None: + """ + Args: + shuffle_ratio (float): Probability of swapping the media and text items in a user message. + """ + self.shuffle_ratio = shuffle_ratio + + def __call__(self, data_dict: Dict) -> Optional[Dict]: + url = data_dict["__url__"] + try: + # process conversation + conversation = data_dict["conversation"] + for item in conversation: + if item["role"] == "user": + if ( + len(item["content"]) == 2 + and item["content"][0]["type"] in ["video", "image"] + and item["content"][1]["type"] == "text" + ): + # swap media and text with probability shuffle_ratio + if random.random() < self.shuffle_ratio: + item["content"] = item["content"][::-1] + data_dict["conversation"] = conversation + return data_dict + except Exception as e: + log.warning( + f"Error shuffling text/media order: {e}. Skipping this sample {url.root} {data_dict['__key__']}." + ) + return None diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp.py new file mode 100644 index 00000000..de2d8978 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp.py @@ -0,0 +1,530 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentors for general video dense caption datasets. +Copied from projects/cosmos/reason1/datasets/augmentors/timestamp.py +Changes: + 1. Unify system prompt to 'You are a helpful assistant.' + 2. Move task requirements from system prompts to user prompts. + 3. Randomly change timestamp formats from ["seconds", "hh:mm:ss", "hh:mm:ss.sss", "mm:ss.sss"] + 4. Add json output format for event temporal localization. 
+""" + +import json +import random +from copy import deepcopy +from typing import Dict, List, Literal, Tuple + +from PIL import Image, ImageDraw, ImageFont + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log + + +def compute_timestamps(frame_index: int, fps: float, processor) -> float: + if processor is not None and "Qwen3" in processor.name: + frame_index_start = frame_index // processor.merge_size * processor.merge_size + frame_index_end = frame_index_start + processor.merge_size - 1 + timestamps_start = frame_index_start / fps + timestamps_end = frame_index_end / fps + timestamps = (timestamps_start + timestamps_end) / 2 + timestamps = float(f"{timestamps:.1f}") + return timestamps + else: + return frame_index / fps + + +def convert_timestamp(seconds: float | str, format: str = "hh:mm:ss") -> str: + if isinstance(seconds, str): + seconds = float(seconds) + # convert seconds to hh:mm:ss.sss format + minutes = int(seconds // 60) + hours = int(minutes // 60) + minutes = minutes % 60 + seconds = seconds % 60 + if format == "hh:mm:ss": + return f"{hours:02d}:{minutes:02d}:{int(seconds):02d}" + elif format == "hh:mm:ss.sss": + return f"{hours:02d}:{minutes:02d}:{seconds:06.3f}" + elif format == "mm:ss.sss": + return f"{minutes + 60 * hours:02d}:{seconds:06.3f}" + else: + raise ValueError(f"Invalid format: {format}") + + +def check_if_need_overlay_text(processor): + if processor is not None and ("Qwen3" in processor.name or "Nemotron" in processor.name): + return False + return True + + +timestamp_convertor = { + "seconds": lambda x: x, + "hh:mm:ss": lambda x: convert_timestamp(x, format="hh:mm:ss"), + "hh:mm:ss.sss": lambda x: convert_timestamp(x, format="hh:mm:ss.sss"), + "mm:ss.sss": lambda x: convert_timestamp(x, format="mm:ss.sss"), +} + + +def overlay_text( + images: List[Image.Image], + fps: float, + border_height: int = 28, # this is due to patch size of 28 + 
temporal_path_size: int = 2, # Number of positions to cycle through + font_size: int = 20, + font_color: str = "white", + processor=None, + debug=False, +) -> Tuple[List[Image.Image], List[float]]: + """ + Overlay text on a list of PIL images with black border. + The timestamp position cycles through available positions. + + Args: + images: List of PIL images to process + fps: Frames per second + border_height: Height of the black border in pixels (default: 28) + temporal_path_size: Number of positions to cycle through (default: 2) + font_size: Font size for the text (default: 20) + font_color: Color of the text (default: "white") + + Returns: + List of PIL images with text overlay + List of timestamps + """ + if not check_if_need_overlay_text(processor) and not debug: + # if debug is True, we still need to overlay text for visualization purpose + return images, [compute_timestamps(i, fps, processor) for i in range(len(images))] + + # Try to use DejaVu Sans Mono font for better readability + font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf", font_size) + + # Process each image + processed_images = [] + + for i, image in enumerate(images): + # Get original dimensions + width, height = image.size + + # Create new image with black border at the bottom + new_height = height + border_height + if debug: # add border_height for visualization purpose + new_height = new_height + border_height + new_image = Image.new("RGB", (width, new_height), color="black") + + # Paste original image at the top + new_image.paste(image, (0, 0)) + + # Draw text on the black border + draw = ImageDraw.Draw(new_image) + + # Calculate timestamp for current frame + total_seconds = compute_timestamps(i, fps, processor) + text = f"{total_seconds:.2f}s" + + # Get text dimensions + try: + # Get text bounding box + bbox = draw.textbbox((0, 0), text, font=font) + text_width = bbox[2] - bbox[0] + text_height = bbox[3] - bbox[1] + except AttributeError: + # Fallback for 
older PIL versions + text_width, text_height = draw.textsize(text, font=font) + + # Define available positions (cycling through horizontal positions) + position_idx = i % temporal_path_size + section_width = width // temporal_path_size + + # Calculate x position based on cycling position + section_center_x = position_idx * section_width + section_width // 2 + text_x = section_center_x - text_width // 2 + + # Ensure text doesn't go outside bounds + text_x = max(0, min(text_x, width - text_width)) + + # Center vertically in the border + text_y = height + (border_height - text_height) // 2 + + # Draw the single timestamp + draw.text((text_x, text_y), text, fill=font_color, font=font) + + processed_images.append(new_image) + + return processed_images, [compute_timestamps(i, fps, processor) for i in range(len(images))] + + +def markdown_to_list(conversation_data: str | List[Dict]) -> List[Dict]: + if isinstance(conversation_data, list): + assert ( + isinstance(conversation_data[0], dict) + and conversation_data[0]["type"] == "text" + and len(conversation_data) == 1 + ) + conversation_data = conversation_data[0]["text"] + cleaned_data = conversation_data.strip() + if cleaned_data.startswith("```json"): + cleaned_data = cleaned_data[7:] # Remove '```json' + if cleaned_data.endswith("```"): + cleaned_data = cleaned_data[:-3] # Remove '```' + cleaned_data = cleaned_data.strip() + return json.loads(cleaned_data) + + +def json_to_markdown(conversation_data: List[Dict] | Dict) -> str: + json_string = json.dumps(conversation_data, indent=2) + return f"```json\n{json_string}\n```".strip() + + +def snap_timestamps_to_existing(assistant_message: List[Dict], existing_timestamps: List[float]) -> List[Dict]: + """ + Snap conversation start/end timestamps to the nearest existing timestamps. 
+ + Args: + assistant_message: List of dictionaries with 'start', 'end', and 'caption' fields + existing_timestamps: List of existing timestamps (floats) to snap to + + Returns: + List of dictionaries with snapped timestamps + """ + snapped_message = [] + + for item in assistant_message: + if not isinstance(item, dict) or "start" not in item or "end" not in item: + raise ValueError("Each item must be a dictionary with 'start' and 'end' fields") + + snapped_item = item.copy() + + # Snap start and end timestamps to existing ones + snapped_item["start"] = min(existing_timestamps, key=lambda x: abs(x - item["start"])) + snapped_item["end"] = min(existing_timestamps, key=lambda x: abs(x - item["end"])) + + snapped_message.append(snapped_item) + + # Round the snapped timestamps to 2 decimal places + # Merge captions that share identical start and end timestamps + merged_events: Dict[Tuple[float, float], Dict] = {} + for item in snapped_message: + item["start"] = round(item["start"], 2) + item["end"] = round(item["end"], 2) + + key = (item["start"], item["end"]) + if key in merged_events: + # Concatenate captions for the same time interval.
+ merged_events[key]["caption"] = merged_events[key]["caption"].rstrip() + " " + item["caption"].lstrip() + else: + merged_events[key] = item + + merged_events[key]["caption"] = merged_events[key]["caption"].strip() + + # Sort the merged events by start timestamp to ensure chronological order + new_assistant_message = sorted(merged_events.values(), key=lambda x: x["start"]) + if len(new_assistant_message) == 0: + raise ValueError("No valid assistant message found for data.") + + return new_assistant_message + + +def augment_assistant_message( + assistant_message: List[Dict], + output_format: Literal[ + "dense_video_caption_json", + "dense_video_caption_plain", + "dense_video_caption_json_with_types", + "dense_video_caption_plain_with_types", + "temporal_localization_plain", + "temporal_localization_json", + "temporal_caption", + ], + timestamp_format: str = "hh:mm:ss", +): + # convert start/end timestamps to the requested timestamp_format + assistant_message = deepcopy(assistant_message) + for item in assistant_message: + item["start"] = timestamp_convertor[timestamp_format](item["start"]) + item["end"] = timestamp_convertor[timestamp_format](item["end"]) + if output_format == "dense_video_caption_json" or output_format == "dense_video_caption_json_with_types": + output_message = json_to_markdown(assistant_message) + return output_message + elif output_format == "dense_video_caption_plain" or output_format == "dense_video_caption_plain_with_types": + output_message = "" + for item in assistant_message: + if output_format == "dense_video_caption_plain": + output_message += f"{item['start']}, {item['end']}, {item['caption']}\n" + elif output_format == "dense_video_caption_plain_with_types": + output_message += f"{item['start']}, {item['end']}, {item['type']}, {item['caption']}\n" + return output_message + elif output_format == "temporal_localization_plain": + return f"{assistant_message[0]['start']}, {assistant_message[0]['end']}" + elif output_format == "temporal_localization_json": +
output_message = { + "start": assistant_message[0]["start"], + "end": assistant_message[0]["end"], + } + output_message = json_to_markdown(output_message) + return output_message + elif output_format == "temporal_caption": + return assistant_message[0]["caption"] + else: + raise ValueError(f"Invalid output format: {output_format}") + + +def augment_user_prompt( + assistant_message: List[dict], + output_format: Literal[ + "dense_video_caption_json", + "dense_video_caption_plain", + "dense_video_caption_json_with_types", + "dense_video_caption_plain_with_types", + "temporal_localization_plain", + "temporal_localization_json", + "temporal_caption", + ], + timestamp_format: str = "hh:mm:ss", +): + if ( + output_format == "dense_video_caption_json" + or output_format == "dense_video_caption_plain" + or output_format == "dense_video_caption_json_with_types" + or output_format == "dense_video_caption_plain_with_types" + ): + if random.random() < 0.5: + user_prompt = random.choice( + [ + "Caption the notable events in the provided video.", + "Describe the notable events in the provided video.", + "Summarize the notable events in the provided video.", + "Localize a series of activity events in the video, output the start and end timestamp and description for each event.", + ] + ) + if random.random() < 0.5: + user_prompt = "Please " + user_prompt.lower() + else: + user_prompt = random.choice( + [ + "Can you caption the notable events in the provided video?", + "Can you describe the notable events in the provided video?", + "Can you summarize the notable events in the provided video?", + ] + ) + if output_format == "dense_video_caption_json": + # add format requirement. 
+ if random.random() < 0.5: + user_prompt = user_prompt + ( + "\nPlease provide captions of all the events in the video with timestamps using the following format:\n" + "[\n" + " {\n" + f' "start": {timestamp_format},\n' + f' "end": {timestamp_format},\n' + ' "caption": \n' + " },\n" + " {\n" + f' "start": {timestamp_format},\n' + f' "end": {timestamp_format},\n' + ' "caption": \n' + " }\n" + "]" + ) + else: + user_prompt = ( + user_prompt + + f"\nProvide the result in json format with '{timestamp_format}' for time depiction for each event. Use keywords 'start', 'end' and 'caption' in the json output." + ) + elif output_format == "dense_video_caption_json_with_types": + # add format requirement. + if random.random() < 0.5: + user_prompt = user_prompt + ( + "\nPlease provide captions of all the events in the video with timestamps using the following format:\n" + "[\n" + " {\n" + f' "start": {timestamp_format},\n' + f' "end": {timestamp_format},\n' + ' "type": \n' + ' "caption": \n' + " },\n" + " {\n" + f' "start": {timestamp_format},\n' + f' "end": {timestamp_format},\n' + ' "type": \n' + ' "caption": \n' + " }\n" + "]" + ) + else: + user_prompt = ( + user_prompt + + f"\nProvide the result in json format with '{timestamp_format}' for time depiction for each event. Use keywords 'start', 'end', 'type' and 'caption' in the json output." + ) + elif output_format == "dense_video_caption_plain_with_types": + user_prompt = ( + user_prompt + + f"\nPlease provide captions of all the events in the video with start and end timestamps using the following format: \n{timestamp_format}, {timestamp_format}, , .\n{timestamp_format}, {timestamp_format}, , ." + ) + else: # plain format + user_prompt = ( + user_prompt + + f"\nPlease provide captions of all the events in the video with start and end timestamps using the following format: \n{timestamp_format}, {timestamp_format}, caption of event 1.\n{timestamp_format}, {timestamp_format}, caption of event 2." 
+ ) + + elif output_format == "temporal_localization_plain" or output_format == "temporal_localization_json": + event = assistant_message[0] + event_caption = event["caption"] + if not event_caption[-1].isalpha(): + event_caption = event_caption[:-1] + user_prompt = random.choice( + [ + f"When does the following event happen? {event_caption}.", + f"When does the event '{event_caption.lower()}' happen?", + f"Can you find the event '{event_caption.lower()}'?", + ] + ) + if output_format == "temporal_localization_json": + user_prompt = ( + user_prompt + + f"\nPlease provide the result in json format with '{timestamp_format}' for time depiction for the event. Use keywords 'start', 'end' in the json output." + ) + else: + user_prompt = ( + user_prompt + + f"\nPlease provide the start and end timestamp of the event in the following format: {timestamp_format}, {timestamp_format}." + ) + + elif output_format == "temporal_caption": + event = assistant_message[0] + if random.random() < 0.333333: + + start = round(event["start"]) + end = round(event["end"]) + elif random.random() < 0.666666: + + start = round(event["start"] * 2) / 2 + end = round(event["end"] * 2) / 2 + else: + start = event["start"] + end = event["end"] + if start == end: # HACK (maxzhaoshuol): remove events with start == end + raise ValueError("Start and end time are the same for data.") + if timestamp_format == "seconds": + if random.random() < 0.5: + start = f"{start}s" + end = f"{end}s" + else: + start = f"{start} seconds" + end = f"{end} seconds" + else: + start = convert_timestamp(start, format=timestamp_format) + end = convert_timestamp(end, format=timestamp_format) + user_prompt = random.choice( + [ + f"Caption the event between {start} and {end}.", + f"Please describe the event between {start} and {end}.", + f"Please caption the event between the start time {start} and the end time {end}.", + f"Summarize the event between {start} and {end}.", + ] + ) + return user_prompt + + +class 
TimeStamp(Augmentor): + def __init__( + self, + input_key: str = "media", + output_format: Literal[ + "dense_video_caption", "temporal_localization", "temporal_caption", "caption", "random" + ] = "dense_video_caption", + urls_needs_timestamp: list = ["av_reasoning_localization_20250627", "tl_activitynet_20250630"], + processor=None, + ) -> None: + """ + Args: + input_key (str): Key of the media entry in data_dict. + """ + self.input_key = input_key + self.output_format = output_format + self.urls_needs_timestamp = urls_needs_timestamp + self.processor = processor + + def __call__(self, data_dict: Dict) -> Dict: + url = data_dict["__url__"] + if not any(url_pattern in url.root for url_pattern in self.urls_needs_timestamp): + return data_dict + + media_data = data_dict[self.input_key] + for k, v in media_data.items(): + if "video" in k: + video_frames_with_timestamp, timestamps = overlay_text(v["videos"], v["fps"], processor=self.processor) + media_data[k]["videos"] = video_frames_with_timestamp + + if self.output_format == "random": + output_format = random.choice(["dense_video_caption", "temporal_localization", "temporal_caption"]) + elif self.output_format == "caption": + output_format = random.choice(["dense_video_caption", "temporal_caption"]) + else: + output_format = self.output_format + + if output_format == "dense_video_caption": + output_format = random.choice( + [ + "dense_video_caption_json", + "dense_video_caption_plain", + "dense_video_caption_json_with_types", + "dense_video_caption_plain_with_types", + ] + ) + elif output_format == "temporal_localization": + output_format = random.choice(["temporal_localization_plain", "temporal_localization_json"]) + + timestamp_format = random.choice(list(timestamp_convertor.keys())) + + try: + # find the assistant message and parse into a list of dictionaries + for item in data_dict["conversation"]: + if item["role"] == "assistant": + if isinstance(item["content"], list): + assert len(item["content"]) == 1 + assert
item["content"][0]["type"] == "text" + item["content"] = item["content"][0]["text"] + assistant_message = markdown_to_list(item["content"]) + assistant_message = snap_timestamps_to_existing(assistant_message, timestamps) + break + + if "types" in output_format: + for item in assistant_message: + if "type" not in item: # if type is not provided, use the default format + output_format = random.choice(["dense_video_caption_json", "dense_video_caption_plain"]) + break + + # if temporal localization or caption, sample one event + if output_format in ["temporal_localization_plain", "temporal_localization_json", "temporal_caption"]: + assistant_message = random.sample(assistant_message, 1) + + # process conversation + conversation = data_dict["conversation"] + for item in conversation: + if item["role"] == "system": + item["content"] = "You are a helpful assistant." + elif item["role"] == "user": + for content in item["content"]: + if content["type"] == "text": + content["text"] = augment_user_prompt(assistant_message, output_format, timestamp_format) + elif item["role"] == "assistant": + assistant_message = augment_assistant_message(assistant_message, output_format, timestamp_format) + item["content"] = assistant_message + data_dict["conversation"] = conversation + + return data_dict + except Exception as e: + log.warning(f"Error timestamping: {e}. Skipping this sample {url.root} {data_dict['__key__']}.") + return None diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp_with_subject_tracking.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp_with_subject_tracking.py new file mode 100644 index 00000000..629bf37d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp_with_subject_tracking.py @@ -0,0 +1,442 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentors for facebook/PLM-Video-Human dataset (video dense captions with SoM prompting). +Copied from projects/cosmos/reason1/datasets/augmentors/timestamp_with_subject_tracking.py +Changes: + 1. Unify system prompt to 'You are a helpful assistant.' + 2. Move task requirements from system prompts to user prompts. + 3. Randomly change timestamp formats from ["seconds", "hh:mm:ss", "hh:mm:ss.sss", "mm:ss.sss"] + 4. Add json output format for event temporal localization. +""" + +import random +from copy import deepcopy +from typing import Dict, List, Literal + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.augmentors.vlm.timestamp import ( + json_to_markdown, + markdown_to_list, + overlay_text, + timestamp_convertor, +) + + +# reorder dict entries +def reorder_dict_entries(conversation_data: Dict) -> Dict: + key_order = ["subject_id", "start", "end", "caption"] + output_dict = {} + for key in key_order: + if key in conversation_data: + output_dict[key] = conversation_data[key] + return output_dict + + +def snap_timestamps_to_existing(assistant_message: List[Dict], existing_timestamps: List[float]) -> List[Dict]: + """ + Snap conversation start/end timestamps to the nearest existing timestamps. 
+ + Args: + assistant_message: List of dictionaries with 'subject_id', 'start', 'end', and 'caption' fields + existing_timestamps: List of existing timestamps (floats) to snap to + + Returns: + List of dictionaries with snapped timestamps, or None if the input is invalid + """ + snapped_message = [] + + for item in assistant_message: + """ + { + "subject_id": "0", + "start": 0.0, + "end": 12.17, + "caption": "Woman is making garland from flowers beading by her hands." + } + """ + if not isinstance(item, dict) or "start" not in item or "end" not in item or "subject_id" not in item: + log.warning(f"Each item must be a dictionary with 'start', 'end', and 'subject_id' fields. Got {item}") + return None + + snapped_item = {"subject_id": item["subject_id"], "caption": item["caption"]} + + # Snap start and end timestamps to existing ones + snapped_item["start"] = min(existing_timestamps, key=lambda x: abs(x - item["start"])) + snapped_item["end"] = min(existing_timestamps, key=lambda x: abs(x - item["end"])) + + # round to 2 decimal places + snapped_item["start"] = round(snapped_item["start"], 2) + snapped_item["end"] = round(snapped_item["end"], 2) + + snapped_message.append(snapped_item) + + # Sort the snapped events by start timestamp to ensure chronological order + new_assistant_message = sorted(snapped_message, key=lambda x: x["start"]) + if len(new_assistant_message) == 0: + log.warning("No valid assistant message found for data.") + return None + + return new_assistant_message + + +def augment_assistant_message( + assistant_message: List[Dict], + output_format: Literal[ + "dense_video_caption_json_per_subject", + "dense_video_caption_plain_per_subject", + "dense_video_caption_json_one_subject", + "dense_video_caption_plain_one_subject", + "temporal_location_subject_plain", + "temporal_location_subject_json", + "temporal_caption_subject", + ], + timestamp_format: str = "hh:mm:ss", +): + # convert start/end timestamps to the requested timestamp_format + assistant_message = deepcopy(assistant_message) + for
item in assistant_message: + item["start"] = timestamp_convertor[timestamp_format](item["start"]) + item["end"] = timestamp_convertor[timestamp_format](item["end"]) + + if output_format == "dense_video_caption_json_per_subject": + output_message = json_to_markdown(assistant_message) + return output_message + elif output_format == "dense_video_caption_plain_per_subject": + output_message = "" + for item in assistant_message: + output_message += f"Subject {item['subject_id']}, {item['start']}, {item['end']}, {item['caption']}\n" + return output_message + + elif output_format == "dense_video_caption_json_one_subject": + # remove subject_id + assistant_message = [ + {"start": item["start"], "end": item["end"], "caption": item["caption"]} for item in assistant_message + ] + output_message = json_to_markdown(assistant_message) + return output_message + elif output_format == "dense_video_caption_plain_one_subject": + output_message = "" + for item in assistant_message: + output_message += f"{item['start']}, {item['end']}, {item['caption']}\n" + return output_message + elif output_format == "temporal_location_subject_plain": + return f"{assistant_message[0]['start']}, {assistant_message[0]['end']}" + elif output_format == "temporal_location_subject_json": + output_message = { + "start": assistant_message[0]["start"], + "end": assistant_message[0]["end"], + } + return json_to_markdown(output_message) + elif output_format == "temporal_caption_subject": + return assistant_message[0]["caption"] + else: + raise ValueError(f"Invalid output format: {output_format}") + + +def augment_user_prompt( + assistant_message: List[dict], + output_format: Literal[ + "dense_video_caption_json_per_subject", + "dense_video_caption_plain_per_subject", + "dense_video_caption_json_one_subject", + "dense_video_caption_plain_one_subject", + "temporal_location_subject_plain", + "temporal_location_subject_json", + "temporal_caption_subject", + ], + timestamp_format: str = "hh:mm:ss", +): + if ( + 
output_format == "dense_video_caption_json_per_subject" + or output_format == "dense_video_caption_plain_per_subject" + ): + if random.random() < 0.5: + user_prompt = random.choice( + [ + "Caption the notable events in the provided video.", + "Describe the notable events in the provided video.", + "Summarize the notable events in the provided video.", + "Localize a series of activity events in the video, output the start and end timestamp, subject id and description for each event.", + ] + ) + if random.random() < 0.5: + user_prompt = "Please " + user_prompt.lower() + else: + user_prompt = random.choice( + [ + "Can you caption the notable events in the provided video?", + "Can you describe the notable events in the provided video?", + "Can you summarize the notable events in the provided video?", + ] + ) + if output_format == "dense_video_caption_json_per_subject": + if random.random() < 0.5: + user_prompt = user_prompt + ( + "\nList and describe all marked subjects in the video using the following format:\n" + "[\n" + " {\n" + f' "start": {timestamp_format},\n' + f' "end": {timestamp_format},\n' + ' "subject_id": ,\n' + ' "caption": ,\n' + " },\n" + " {\n" + f' "start": {timestamp_format},\n' + f' "end": {timestamp_format},\n' + ' "subject_id": ,\n' + ' "caption": ,\n' + " },\n" + "]" + ) + else: + user_prompt = ( + user_prompt + + f"\nProvide the result in json format with '{timestamp_format}' for time depiction for each event. Use keywords 'start', 'end', 'subject_id' and 'caption' in the json output." 
+ ) + else: # plain format + user_prompt = ( + user_prompt + + f"\nList and describe all marked subjects in the video using the following format: \nSubject , {timestamp_format}, {timestamp_format}, caption of event 1.\nSubject , {timestamp_format}, {timestamp_format}, caption of event 2.\n" + ) + + elif output_format == "temporal_location_subject_plain" or output_format == "temporal_location_subject_json": + event = assistant_message[0] + user_prompt = random.choice( + [ + f"When does the following event happen to the tracked object with ID <{event['subject_id']}>? {event['caption']}", + f"When does the event '{event['caption'].lower()[:-1]}' happen to the tracked object with ID <{event['subject_id']}>?", + f"Can you find the event '{event['caption'].lower()[:-1]}' happen to the tracked object with ID <{event['subject_id']}>?", + ] + ) + if output_format == "temporal_location_subject_plain": + user_prompt = ( + user_prompt + + f"\nPlease provide the start and end timestamp in the following format: {timestamp_format}, {timestamp_format}." + ) + else: + user_prompt = ( + user_prompt + + f"\nPlease provide the result in json format with '{timestamp_format}' for time depiction for the event. Use keywords 'start', 'end' in the json output." + ) + + elif output_format == "temporal_caption_subject": + event = assistant_message[0] + if random.random() < 0.333333: + + start = round(event["start"]) + end = round(event["end"]) + elif random.random() < 0.666666: + + start = round(event["start"] * 2) / 2 + end = round(event["end"] * 2) / 2 + else: + start = event["start"] + end = event["end"] + if start == end: # HACK (maxzhaoshuol): remove events with start == end + log.warning(f"Start and end time are the same for data. 
{event}") + return None + + if timestamp_format == "seconds": + if random.random() < 0.5: + start = f"{start}s" + end = f"{end}s" + else: + start = f"{start} seconds" + end = f"{end} seconds" + else: + start = timestamp_convertor[timestamp_format](start) + end = timestamp_convertor[timestamp_format](end) + + user_prompt = random.choice( + [ + f"Caption the event between {start} and {end} of the tracked object with ID <{event['subject_id']}>.", + f"Please describe the event between {start} and {end} of the tracked object with ID <{event['subject_id']}>.", + f"Please caption the event between the start time {start} and the end time {end} of the tracked object with ID <{event['subject_id']}>.", + f"Summarize the event between {start} and {end} of the tracked object with ID <{event['subject_id']}>.", + ] + ) + elif ( + output_format == "dense_video_caption_json_one_subject" + or output_format == "dense_video_caption_plain_one_subject" + ): + event = assistant_message[0] + if random.random() < 0.5: + user_prompt = random.choice( + [ + f"Caption the notable events in the provided video for the tracked object with ID <{event['subject_id']}>.", + f"Describe the notable events in the provided video for the tracked object with ID <{event['subject_id']}>.", + f"Summarize the notable events in the provided video for the tracked object with ID <{event['subject_id']}>.", + f"Localize a series of activity events in the video for the tracked object with ID <{event['subject_id']}>, output the start and end timestamp and description for each event.", + ] + ) + if random.random() < 0.5: + user_prompt = "Please " + user_prompt.lower() + else: + user_prompt = random.choice( + [ + f"Can you caption the notable events in the provided video for the tracked object with ID <{event['subject_id']}>?", + f"Can you describe the notable events in the provided video for the tracked object with ID <{event['subject_id']}>?", + f"Can you summarize the notable events in the provided video for the
tracked object with ID <{event['subject_id']}>?", + ] + ) + if output_format == "dense_video_caption_json_one_subject": + if random.random() < 0.5: + user_prompt = user_prompt + ( + f"\nSummarize the notable events of the subject marked with ID <{event['subject_id']}> with timestamps in the video using the following format:\n" + "[\n" + " {\n" + f' "start": {timestamp_format},\n' + f' "end": {timestamp_format},\n' + ' "caption": ,\n' + " },\n" + " {\n" + f' "start": {timestamp_format},\n' + f' "end": {timestamp_format},\n' + ' "caption": ,\n' + " },\n" + "]" + ) + else: + user_prompt = ( + user_prompt + + f"\nProvide the result in json format with '{timestamp_format}' for time depiction for each event. Use keywords 'start', 'end' and 'caption' in the json output." + ) + else: # plain format + user_prompt = ( + user_prompt + + f"\nPlease provide captions of all the events of the tracked object with given ID in the video with start and end timestamps using the following format:\n{timestamp_format}, {timestamp_format}, caption of event 1.\n{timestamp_format}, {timestamp_format}, caption of event 2.\n" + ) + return user_prompt + + +class TimeStampWithSubjectTracking(Augmentor): + def __init__( + self, + input_key: str = "media", + output_format: Literal[ + "dense_video_caption_per_subject", + "dense_video_caption_one_subject", + "temporal_location_subject", + "temporal_caption_subject", + "random", + ] = "random", + urls_needs_timestamp: list = ["tl_plm_sav_20250714"], + processor=None, + ) -> None: + """ + Args: + input_keys (list): List of input keys. 
+ """ + self.input_key = input_key + self.output_format = output_format + self.urls_needs_timestamp = urls_needs_timestamp + self.processor = processor + + def __call__(self, data_dict: Dict) -> Dict: + url = data_dict["__url__"] + if not any(url_pattern in url.root for url_pattern in self.urls_needs_timestamp): + return data_dict + + media_data = data_dict[self.input_key] + for k, v in media_data.items(): + if "video" in k: + video_frames_with_timestamp, timestamps = overlay_text(v["videos"], v["fps"], processor=self.processor) + media_data[k]["videos"] = video_frames_with_timestamp + + if self.output_format == "random": + output_format = random.choice( + [ + "dense_video_caption_per_subject", + "dense_video_caption_one_subject", + "temporal_location_subject", + "temporal_caption_subject", + ] + ) + else: + output_format = self.output_format + + if output_format == "dense_video_caption_per_subject": + output_format = random.choice( + ["dense_video_caption_json_per_subject", "dense_video_caption_plain_per_subject"] + ) + elif output_format == "dense_video_caption_one_subject": + output_format = random.choice( + ["dense_video_caption_json_one_subject", "dense_video_caption_plain_one_subject"] + ) + elif output_format == "temporal_location_subject": + output_format = random.choice(["temporal_location_subject_plain", "temporal_location_subject_json"]) + + # find the assistant message and parse into a list of dictionaries + for item in data_dict["conversation"]: + if item["role"] == "assistant": + """ + content dict: + ```json + [ + { + "subject_id": "0", + "start": 10.67, + "end": 11.17, + "caption": "A person enters the frame from the left riding a bike on the road towards the right frame wearing a yellow helmet." 
+ } + ] + ``` + """ + assistant_message = markdown_to_list(item["content"]) + assistant_message = snap_timestamps_to_existing(assistant_message, timestamps) + if assistant_message is None: + return None # skip this sample + break + + # if temporal localization or caption, sample one event + if output_format in [ + "temporal_location_subject_plain", + "temporal_location_subject_json", + "temporal_caption_subject", + ]: + assistant_message = random.sample(assistant_message, 1) + elif output_format in ["dense_video_caption_json_one_subject", "dense_video_caption_plain_one_subject"]: + available_subject_ids = list( + set([assistant_message_i["subject_id"] for assistant_message_i in assistant_message]) + ) + # sample one subject id + subject_id = random.choice(available_subject_ids) + assistant_message = [ + assistant_message_i + for assistant_message_i in assistant_message + if assistant_message_i["subject_id"] == subject_id + ] + + timestamp_format = random.choice(list[str](timestamp_convertor.keys())) + + # process conversation + conversation = data_dict["conversation"] + for item in conversation: + if item["role"] == "system": + item["content"] = "You are a helpful assistant." 
+ elif item["role"] == "user": + for content in item["content"]: + if content["type"] == "text": + content["text"] = augment_user_prompt(assistant_message, output_format, timestamp_format) + if content["text"] is None: # parse error + return None + elif item["role"] == "assistant": + assistant_message = augment_assistant_message(assistant_message, output_format, timestamp_format) + item["content"] = assistant_message + data_dict["conversation"] = conversation + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp_without_augment_message.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp_without_augment_message.py new file mode 100644 index 00000000..e7dec18d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp_without_augment_message.py @@ -0,0 +1,226 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Copied from projects/cosmos/reason1/datasets/augmentors/timestamp_without_augment_message.py +Changes: + - overlay_text is now imported from cosmos3. 
+""" + +import json +import random +import re +from typing import Dict, List, Literal, Tuple + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.vfm.datasets.augmentors.vlm.timestamp import overlay_text + + +def list_to_markdown(conversation_data: List[Dict]) -> str: + json_string = json.dumps(conversation_data, indent=2) + return f"```json\n{json_string}\n```".strip() + + +def snap_timestamps_to_existing(assistant_message: List[Dict], existing_timestamps: List[float]) -> List[Dict]: + """ + Snap conversation start/end timestamps to the nearest existing timestamps. + + Args: + assistant_message: List of dictionaries with 'start', 'end', and 'caption' fields + existing_timestamps: List of existing timestamps (floats) to snap to + + Returns: + List of dictionaries with snapped timestamps + """ + snapped_message = [] + + for item in assistant_message: + if not isinstance(item, dict) or "start" not in item or "end" not in item: + raise ValueError("Each item must be a dictionary with 'start' and 'end' fields") + + snapped_item = item.copy() + + # Snap start and end timestamps to existing ones + snapped_item["start"] = min(existing_timestamps, key=lambda x: abs(x - item["start"])) + snapped_item["end"] = min(existing_timestamps, key=lambda x: abs(x - item["end"])) + + snapped_message.append(snapped_item) + + # Round the snapped timestamps to 2 decimals, then + # merge captions that share identical start and end timestamps + merged_events: Dict[Tuple[float, float], Dict] = {} + for item in snapped_message: + item["start"] = round(item["start"], 2) + item["end"] = round(item["end"], 2) + + key = (item["start"], item["end"]) + if key in merged_events: + # Concatenate captions for the same time interval. 
+ merged_events[key]["caption"] = merged_events[key]["caption"].rstrip() + " " + item["caption"].lstrip() + else: + merged_events[key] = item + + merged_events[key]["caption"] = merged_events[key]["caption"].strip() + + # Sort the merged events by start timestamp to ensure chronological order + new_assistant_message = sorted(merged_events.values(), key=lambda x: x["start"]) + if len(new_assistant_message) == 0: + raise ValueError("No valid assistant message found for data.") + + return new_assistant_message + + +def augment_assistant_message( + assistant_message: List[Dict], + output_format: Literal[ + "dense_video_caption_json", "dense_video_caption_plain", "temporal_localization", "temporal_caption" + ], +): + if output_format == "dense_video_caption_json": + output_message = list_to_markdown(assistant_message) + return output_message + elif output_format == "dense_video_caption_plain": + output_message = "" + for item in assistant_message: + output_message += f"<{item['start']}> <{item['end']}> {item['caption']}\n" + return output_message + elif output_format == "temporal_localization": + return f"<{assistant_message[0]['start']}> <{assistant_message[0]['end']}>" + elif output_format == "temporal_caption": + return assistant_message[0]["caption"] + else: + raise ValueError(f"Invalid output format: {output_format}") + + +def augment_system_prompt( + system_prompt: str, + output_format: Literal[ + "dense_video_caption_json", "dense_video_caption_plain", "temporal_localization", "temporal_caption" + ], + need_overlay_text=True, +): + if output_format == "dense_video_caption_json": + system_prompt = system_prompt + elif output_format == "dense_video_caption_plain": + system_prompt = re.sub(r"Please.*?\]", "", system_prompt, flags=re.DOTALL) # strip off existing format + system_prompt += """Please provide captions of all the events in the video with timestamps using the following format: + caption of event 1.\n caption of event 2.\n""" + elif output_format == 
"temporal_localization": + system_prompt = re.sub(r"Please.*?\]", "", system_prompt, flags=re.DOTALL) # strip off existing format + system_prompt += "Please locate the start and end time of a given event specified by the user using the following format: ." + elif output_format == "temporal_caption": + system_prompt = re.sub(r"Please.*?\]", "", system_prompt, flags=re.DOTALL) # strip off existing format + system_prompt += "Please provide a caption of the duration in the video based on the start and end time specified by the user." + else: + raise ValueError(f"Invalid output format: {output_format}") + + if need_overlay_text: + system_prompt = ( + system_prompt + + "\nAt each frame, the timestamp is embedded at the bottom of the video. You need to extract the timestamp and answer the user question." + ) + else: + system_prompt = system_prompt + "\nYou need to extract the timestamp and answer the user question." + + return system_prompt + + +def augment_user_prompt( + assistant_message: List[dict], + output_format: Literal[ + "dense_video_caption_json", "dense_video_caption_plain", "temporal_localization", "temporal_caption" + ], +): + if output_format == "dense_video_caption_json" or output_format == "dense_video_caption_plain": + if random.random() < 0.5: + user_prompt = random.choice( + [ + "Caption the notable events in the provided video.", + "Describe the notable events in the provided video.", + "Summarize the notable events in the provided video.", + ] + ) + if random.random() < 0.5: + user_prompt = "Please " + user_prompt.lower() + else: + user_prompt = random.choice( + [ + "Can you caption the notable events in the provided video?", + "Can you describe the notable events in the provided video?", + "Can you summarize the notable events in the provided video?", + ] + ) + elif output_format == "temporal_localization": + event = assistant_message[0] + user_prompt = random.choice( + [ + f"When does the following event happen? 
{event['caption']}", + f"When does the event '{event['caption'].lower()[:-1]}' happen?", + f"Can you find the event '{event['caption'].lower()[:-1]}'?", + ] + ) + elif output_format == "temporal_caption": + event = assistant_message[0] + if random.random() < 0.5: + + start = round(event["start"]) + end = round(event["end"]) + else: + + start = round(event["start"] * 2) / 2 + end = round(event["end"] * 2) / 2 + if start == end: # HACK (maxzhaoshuol): remove events with start == end + raise ValueError("Start and end time are the same for data.") + user_prompt = random.choice( + [ + f"Caption the event between {start}s and {end}s.", + f"Please describe the event between {start} and {end}.", + f"Please caption the event between the start time {start}s and the end time {end}s.", + f"Summarize the event between <{start}s, {end}s>.", + ] + ) + return user_prompt + + +class TimeStampWithoutAugmentMessage(Augmentor): + def __init__( + self, + input_key: str = "media", + output_format: Literal[ + "dense_video_caption", "temporal_localization", "temporal_caption", "random" + ] = "dense_video_caption", + urls_needs_timestamp: list = ["av_reasoning_localization_20250627", "tl_activitynet_20250630"], + processor=None, + ) -> None: + """ + Args: + input_keys (list): List of input keys. 
+ """ + self.input_key = input_key + self.output_format = output_format + self.urls_needs_timestamp = urls_needs_timestamp + self.processor = processor + + def __call__(self, data_dict: Dict) -> Dict: + url = data_dict["__url__"] + if not any(url_pattern in url.root for url_pattern in self.urls_needs_timestamp): + return data_dict + + media_data = data_dict[self.input_key] + for k, v in media_data.items(): + if "video" in k: + video_frames_with_timestamp, timestamps = overlay_text(v["videos"], v["fps"], processor=self.processor) + media_data[k]["videos"] = video_frames_with_timestamp + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp_without_end_time.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp_without_end_time.py new file mode 100644 index 00000000..e2f8f38c --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/timestamp_without_end_time.py @@ -0,0 +1,330 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Augmentors for video dense caption datasets without end time. +Copied from projects/cosmos/reason1/datasets/augmentors/timestamp_without_end_time.py +Changes: + 1. Unify system prompt to 'You are a helpful assistant.' + 2. Move task requirements from system prompts to user prompts. + 3. 
Randomly change timestamp formats from ["seconds", "hh:mm:ss", "hh:mm:ss.sss", "mm:ss.sss"] + 4. Add json output format for event temporal localization. +""" + +import random +from copy import deepcopy +from typing import Dict, List, Literal, Tuple + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.augmentors.vlm.timestamp import ( + json_to_markdown, + markdown_to_list, + overlay_text, + timestamp_convertor, +) + + +def snap_timestamps_to_existing(assistant_message: List[Dict], existing_timestamps: List[float]) -> List[Dict]: + """ + Snap conversation start/end timestamps to the nearest existing timestamps. + + Args: + assistant_message: List of dictionaries with 'start', 'end', and 'caption' fields + existing_timestamps: List of existing timestamps (floats) to snap to + + Returns: + List of dictionaries with snapped timestamps + """ + snapped_message = [] + for item in assistant_message: + if "caption" not in item and "event" in item: + # This is the nexar dataset + item["caption"] = item["event"] + del item["event"] + + if not isinstance(item, dict) or "start" not in item: + raise ValueError(f"Each item must be a dictionary with 'start' field. getting {item}") + + snapped_item = item.copy() + + # Snap start and end timestamps to existing ones + snapped_item["start"] = min(existing_timestamps, key=lambda x: abs(x - item["start"])) + + snapped_message.append(snapped_item) + + # Sort the merged events by start timestamp to ensure chronological order + # Merge captions that share identical start and end timestamps + merged_events: Dict[Tuple[float], Dict] = {} + for item in snapped_message: + item["start"] = round(item["start"], 2) + + key = (item["start"],) + if key in merged_events: + # Concatenate captions for the same time interval. 
+ merged_events[key]["caption"] = merged_events[key]["caption"].rstrip() + " " + item["caption"].lstrip() + else: + merged_events[key] = item + + merged_events[key]["caption"] = merged_events[key]["caption"].strip() + + # Sort the merged events by start timestamp to ensure chronological order + new_assistant_message = sorted(merged_events.values(), key=lambda x: x["start"]) + if len(new_assistant_message) == 0: + raise ValueError("No valid assistant message found for data.") + + return new_assistant_message + + +def augment_assistant_message( + assistant_message: List[Dict], + output_format: Literal[ + "dense_video_caption_json", + "dense_video_caption_plain", + "single_event_localization", + "multiple_events_localization_json", + "temporal_caption", + ], + timestamp_format: str = "hh:mm:ss", +): + # Convert start timestamps to the requested timestamp_format (on a copy, so the caller's list is untouched) + assistant_message = deepcopy(assistant_message) + for item in assistant_message: + item["start"] = timestamp_convertor[timestamp_format](item["start"]) + + if output_format == "dense_video_caption_json": + output_message = json_to_markdown(assistant_message) + return output_message + elif output_format == "dense_video_caption_plain": + output_message = "" + for item in assistant_message: + output_message += f"{item['start']}, {item['caption']}\n" + return output_message + elif output_format == "single_event_localization": + return f"{assistant_message[0]['start']}" + elif output_format == "multiple_events_localization_json": + for item in assistant_message: + if "start" in item: + item["time"] = item["start"] + del item["start"] + if "caption" in item: + item["event"] = item["caption"] + del item["caption"] + output_message = json_to_markdown(assistant_message) + return output_message + elif output_format == "temporal_caption": + return assistant_message[0]["caption"] + else: + raise ValueError(f"Invalid output format: {output_format}") + + +def augment_user_prompt( + 
assistant_message: List[dict], + output_format: Literal[ + 
"dense_video_caption_json", + "dense_video_caption_plain", + "single_event_localization", + "multiple_events_localization_json", + "temporal_caption", + ], + timestamp_format: str = "hh:mm:ss", +): + if output_format == "dense_video_caption_json" or output_format == "dense_video_caption_plain": + if random.random() < 0.5: + user_prompt = random.choice( + [ + "Caption the notable events in the provided video.", + "Describe the notable events in the provided video.", + "Summarize the notable events in the provided video.", + ] + ) + if random.random() < 0.5: + user_prompt = "Please " + user_prompt.lower() + else: + user_prompt = random.choice( + [ + "Can you caption the notable events in the provided video?", + "Can you describe the notable events in the provided video?", + "Can you summarize the notable events in the provided video?", + ] + ) + if output_format == "dense_video_caption_json": + if random.random() < 0.5: + user_prompt = user_prompt + ( + "\nPlease identify all the events in the following driving video with timestamps using the following format:\n" + "[\n" + " {\n" + f' "start": {timestamp_format},\n' + ' "event": ,\n' + " },\n" + " {\n" + f' "start": {timestamp_format},\n' + ' "event": ,\n' + " },\n" + "]\n" + "Each event corresponds to one and only one of the following five types: collision, near collision, hard brake, harsh acceleration, sharp cornering.\n" + ) + else: + user_prompt = ( + user_prompt + + f"\nPlease provide the result in json format with '{timestamp_format}' for time depiction for each event. Use keywords 'start', 'event' in the json output." + ) + else: + user_prompt = ( + user_prompt + + f"\nPlease provide short descriptions of all the events in the video with timestamps using the following format: \n{timestamp_format}, caption of event 1.\n{timestamp_format}, caption of event 2." 
+ ) + elif output_format == "single_event_localization": + event = assistant_message[0] + event_caption = event["caption"] + if not event_caption[-1].isalpha(): + event_caption = event_caption[:-1] + user_prompt = random.choice( + [ + f"When does the following event happen? {event_caption}.", + f"When does the event '{event_caption.lower()}' happen?", + f"Can you find the event '{event_caption.lower()}'?", + ] + ) + user_prompt = ( + user_prompt + + f"\nPlease provide the start timestamp of the event in the following format: {timestamp_format}." + ) + + elif output_format == "multiple_events_localization_json": + user_prompt = random.choice( + [ + f"You should find the following {len(assistant_message)} events in the input video:", + f"Please find the following {len(assistant_message)} events based on descriptions:", + f"Please identify the following events in the input video:", + ] + ) + for i, event in enumerate(assistant_message): + user_prompt += f"\nEvent {i + 1}: {event['caption']}" + user_prompt += f"\nPlease provide the result in json format as a list of dictionaries. Use '{timestamp_format}' for time depiction for each event. Use keywords 'time', 'event' in each dictionary." 
+ + elif output_format == "temporal_caption": + event = assistant_message[0] + # Draw once so the three rounding modes are (roughly) equally likely + rounding_draw = random.random() + if rounding_draw < 0.333333: + start = round(event["start"]) + elif rounding_draw < 0.666666: + start = round(event["start"] * 2) / 2 + else: + start = event["start"] + + if timestamp_format == "seconds": + if random.random() < 0.5: + start = f"{start}s" + else: + start = f"{start} seconds" + else: + start = timestamp_convertor[timestamp_format](start) + + user_prompt = random.choice( + [ + f"Caption the event starting at {start}.", + f"Please describe the event starting at {start}.", + f"Please caption the event starting at {start}.", + f"Summarize the event starting at {start}.", + ] + ) + return user_prompt + + +class TimeStampWithoutEndTime(Augmentor): + def __init__( + self, + input_key: str = "media", + output_format: Literal[ + "dense_video_caption", "temporal_localization", "temporal_caption", "random" + ] = "dense_video_caption", + urls_needs_timestamp: list = ["av_reasoning_localization_20250627", "tl_activitynet_20250630"], + processor=None, + ) -> None: + """ + Args: + input_key (str): Key of the media dict in data_dict.
+ """ + self.input_key = input_key + self.output_format = output_format + self.urls_needs_timestamp = urls_needs_timestamp + self.processor = processor + + def __call__(self, data_dict: Dict) -> Dict: + url = data_dict["__url__"] + if not any(url_pattern in url.root for url_pattern in self.urls_needs_timestamp): + return data_dict + + media_data = data_dict[self.input_key] + for k, v in media_data.items(): + if "video" in k: + video_frames_with_timestamp, timestamps = overlay_text(v["videos"], v["fps"], processor=self.processor) + media_data[k]["videos"] = video_frames_with_timestamp + + if self.output_format == "random": + output_format = random.choice(["dense_video_caption", "temporal_localization", "temporal_caption"]) + else: + output_format = self.output_format + + if output_format == "dense_video_caption": + output_format = random.choice(["dense_video_caption_json", "dense_video_caption_plain"]) + + try: + # find the assistant message and parse into a list of dictionaries + for item in data_dict["conversation"]: + if item["role"] == "assistant": + if isinstance(item["content"], list): + assert len(item["content"]) == 1 + assert item["content"][0]["type"] == "text" + item["content"] = item["content"][0]["text"] + assistant_message = markdown_to_list(item["content"]) + assistant_message = snap_timestamps_to_existing(assistant_message, timestamps) + break + + # remove end time if it exists + for item in assistant_message: + if "end" in item: + del item["end"] + + # if temporal localization or caption, sample one event + if output_format == "temporal_localization": + output_format = random.choice(["single_event_localization", "multiple_events_localization_json"]) + if output_format in ["single_event_localization", "temporal_caption"]: + assistant_message = random.sample(assistant_message, 1) + elif output_format == "multiple_events_localization_json": + assistant_message = random.sample(assistant_message, min(len(assistant_message), random.randint(1, 5))) + 
random.shuffle(assistant_message) + + timestamp_format = random.choice(list(timestamp_convertor.keys())) + # process conversation + conversation = data_dict["conversation"] + for item in conversation: + if item["role"] == "system": + item["content"] = "You are a helpful assistant." + elif item["role"] == "user": + for content in item["content"]: + if content["type"] == "text": + content["text"] = augment_user_prompt(assistant_message, output_format, timestamp_format) + elif item["role"] == "assistant": + assistant_message = augment_assistant_message(assistant_message, output_format, timestamp_format) + item["content"] = assistant_message + data_dict["conversation"] = conversation + + return data_dict + + except Exception as e: + log.warning(f"Error timestamping: {e}. Skipping this sample {url.root} {data_dict['__key__']}.") + return None diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/tokenize_data.py b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/tokenize_data.py new file mode 100644 index 00000000..8fa7b459 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/tokenize_data.py @@ -0,0 +1,375 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +"""Visual-Text Transformations or Augmentations.""" + +import re +from typing import Dict, Optional + +import numpy as np +import torch +from PIL import Image + +from cosmos3._src.imaginaire.datasets.webdataset.augmentors.augmentor import Augmentor +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.vlm.video_decoder_qwen import token_to_pixels +from projects.cosmos3.vlm.processors.qwen3vl_processor import Qwen3VLProcessor as Processor +from projects.cosmos3.vlm.utils.constant import IGNORE_INDEX, PROCESSOR_KEYS_TO_ADD + + +def maybe_subsample_frames(model_name_or_path, list_of_pil_image, max_video_token_length, processor): + """ + Why do we need to subsample frames? Models like eagle_er do not support smart downsampling in the processor, + and all the frames are resized to the same size. There are 2 scenarios in which the context length can easily exceed the limit. + 1: the video has >32 frames; at 256 tokens per frame this creates more than 256*32=8192 tokens, which exceeds the limit. + 2: there are multiple images; by default, each image will be tiled into (at most) 13 tiles. Each tile is 256 tokens. + So if there are multiple images, or many frames in the video, we need to subsample the frames to shorten the context length. + """ + if "Qwen/Qwen2.5-VL" in model_name_or_path: + return list_of_pil_image + elif "eagle_er" in model_name_or_path or "InternVL3_5" in model_name_or_path: + tokens_per_tile = processor.tokens_per_tile + # 1 frame maps to tokens_per_tile tokens (256 in the example above) + estimated_num_frames = max_video_token_length // tokens_per_tile + if len(list_of_pil_image) > estimated_num_frames: + # Evenly sample frames + sample_idx = np.linspace(0, len(list_of_pil_image) - 1, estimated_num_frames).astype(int) + return [list_of_pil_image[i] for i in sample_idx] + else: + return list_of_pil_image + else: + return list_of_pil_image + + +def convert_all_images_to_rgb(conversation): + """ + Convert all images to RGB. 
Otherwise the tokenizer will raise an error for images in LA mode.
+ """ + new_conversation = [] + for conversation_round in conversation: + if isinstance(conversation_round["content"], list): + new_content_list = [] + for content in conversation_round["content"]: + if "type" not in content: + log.critical( + f"content: {content} | conversation_round: {conversation_round} | full conversation: {conversation}" + ) + content = {"type": "text", "text": content} + content_type = content["type"] + if content_type in ["image", "video"]: + if isinstance(content[content_type], Image.Image): + content[content_type] = content[content_type].convert("RGB") + elif isinstance(content[content_type], list): + content_i = content[content_type] + new_content_i = [] + for img in content_i: + if isinstance(img, Image.Image): + img = img.convert("RGB") + new_content_i.append(img) + content[content_type] = new_content_i + new_content_list.append(content) + conversation_round["content"] = new_content_list + new_conversation.append(conversation_round) + + return new_conversation + + +def compress_repeated_tokens(dialog_str): + pattern = re.compile(r"((<\|[^|]+\|>|<|[^<>]+|>|\[[^\]]+\]))\1+") + + def replacer(match): + token = match.group(1) + count = len(match.group(0)) // len(token) + return f"{token}*{count}times" + + # Cap length to avoid regex hang on very long decoded sequences + max_len = 16 * 1024 + if len(dialog_str) > max_len: + dialog_str = dialog_str[:max_len] + "...[truncated]" + return pattern.sub(replacer, dialog_str) + + +class TokenizeData(Augmentor): + """ + Image-Text Transform for Supervised Fine-Tuning (SFT) data, for Vision-Language Model training. + """ + + def __init__( + self, + processor: Optional[Processor] = None, + max_video_token_length: int = 8192, + max_image_token_length: int = 8192, + add_system_prompt_if_missing: bool = False, + text_only: bool = False, + ) -> None: + """ + Args: + processor (Processor): Text/Image processor for tokenization. + max_video_token_length (int): Maximum number of video tokens to use. 
Defaults to 8192. + """ + # Store the processor and the tokenization limits + self.text_only = text_only + self.processor = processor # Expecting an ImageTextTokenizer + self.max_video_token_length = max_video_token_length + self.max_image_token_length = max_image_token_length + self.add_system_prompt_if_missing = add_system_prompt_if_missing + + def __call__(self, data_dict: Dict) -> Dict: + r"""Tokenize a dialog and pad the sequence. + + "media" is a dict of + { + "video_1": {"videos": [PIL.Image.Image, ...], "fps": int}, + "image_1": PIL.Image.Image, + } + + "conversation" is a list of dicts, each dict has the following fields: + { + "role": "user" or "assistant", + "content": [ + {"type": "video", "video": media_key_in_media_dict}, + {"type": "image", "image": media_key_in_media_dict}, + {"type": "text", "text": str}, + ], + } + or + { + "role": "user" or "assistant", + "content": str, + } + + Args: + data_dict (dict): Input data dict + + Returns: + data_dict (dict): Output dict, or None if the sample is skipped + """ + conversation = data_dict["conversation"] + processor_kwargs = {} + total_images = 0 + total_videos = 0 + raw_images = [] + # Pre-compute the total_images and total_videos + for message in conversation: + if not isinstance(message, dict): + raise ValueError( + f"message is not a dict: {message} | conversation: {conversation} | data_dict: {data_dict} | __url__: {data_dict['__url__'].root}, {data_dict['__url__'].path}" + ) + if message["role"] == "user" and isinstance(message["content"], list): + total_images += len([content for content in message["content"] if content["type"] == "image"]) + total_videos += len([content for content in message["content"] if content["type"] == "video"]) + + assert total_videos == 1 or total_videos == 0, "Only one video is supported for now" + + # url + url = data_dict["__url__"].root + "/" + data_dict["__url__"].path + + # go through each message in the conversation + for message in conversation: + # for user message, we 
insert the media + + if message["role"] == "user" and 
isinstance( + message["content"], list + ): # Otherwise it's text and content is a string + images_content_idx_full = [ + content_idx for content_idx, content in enumerate(message["content"]) if content["type"] == "image" + ] + images_content_idx_subsampled = maybe_subsample_frames( + self.processor.name, images_content_idx_full, self.max_image_token_length, self.processor + ) + if ( + len(images_content_idx_subsampled) > 0 + ): # for eagle, we need to reduce the max_dynamic_tiles and not use thumbnail. These args are only valid for the eagle_er processor. + processor_kwargs["max_dynamic_tiles"] = 1 + processor_kwargs["use_thumbnail"] = False + + new_content_list = [] + for content_idx, content in enumerate(message["content"]): + if content["type"] == "image": + if content_idx not in images_content_idx_subsampled: + continue + # for image, we do NOT use the temporal patch size, this leads to a smaller max_pixels + # Later, each image will be repeated temporal_patch_size times + max_total_pixels = token_to_pixels( + self.max_image_token_length, + patch_size=self.processor.patch_size, + temporal_patch_size=1, # Because this is image, not video + ) + max_pixels_per_image = max_total_pixels // total_images + + if self.processor.use_smart_resize: + min_pixels_per_image = self.processor.processor.image_processor.size["shortest_edge"] + if max_pixels_per_image < min_pixels_per_image: + log.critical( + f"max_pixels_per_image: {max_pixels_per_image} < min_pixels_per_image: {min_pixels_per_image} | self.max_image_token_length = {self.max_image_token_length} is not enough for total_images: {total_images}, as the default min_pixels is {min_pixels_per_image} | Either increase max_image_token_length or include max_pixels in the content or reduce min_pixels" + ) + return None + + # Add each image to the content list + if "media" not in data_dict: + log.critical( + f"[TokenizerDataError]media not found in data_dict, available keys: {data_dict.keys()}. 
url: {url}, content: {message['content']}", + rank0_only=False, + ) + return None + + elif content["image"] not in data_dict["media"]: + log.critical( + f"[TokenizerDataError]image {content['image']} not found in media, available keys: {data_dict['media'].keys()}. url: {url}", + rank0_only=False, + ) + return None + image = data_dict["media"][content["image"]] + content["image"] = image + content["max_pixels"] = max_pixels_per_image + raw_images.append(image) + + elif content["type"] == "video": + + # as tokenization will NOT upsample the video, we can use a larger value here at the cost of multiple video having 1.5x token length + max_total_pixels = token_to_pixels(self.max_video_token_length * 1.5, temporal_patch_size=2) + media_key = content["video"] + # Add each video to the content list + if "media" not in data_dict: + log.critical( + f"[TokenizerDataError]media not found in data_dict, available keys: {data_dict.keys()}. url: {url}, content: {message['content']}", + rank0_only=False, + ) + return None + if media_key not in data_dict["media"]: + log.info( + f"[TokenizerDataError]video {media_key} not found in media, available keys: {data_dict['media'].keys()}. url: {url}" + ) + return None + if "videos" not in data_dict["media"][media_key]: + log.info( + f"[TokenizerDataError]videos not found in media[{media_key}], available keys: {data_dict['media'][media_key].keys()}. 
url: {url}" + ) + return None + videos = data_dict["media"][media_key]["videos"] # list of PIL images + fps = data_dict["media"][media_key]["fps"] + + # this is because videos are decoded to be around "max_video_token_length" tokens + + videos = maybe_subsample_frames( + self.processor.name, videos, self.max_video_token_length, self.processor + ) + content["video"] = videos + + max_pixels_per_image = max_total_pixels // total_videos // len(videos) + content["fps"] = fps + content["max_pixels"] = max_pixels_per_image + + data_dict["raw_video"] = torch.from_numpy(np.array(videos)).permute( + 3, 0, 1, 2 + ) # [3,T,H,W], range [0, 255] + new_content_list.append(content) + message["content"] = new_content_list + + if len(raw_images) > 1: + # resize the raw_image to the size of the first image + image_size = raw_images[0].size + raw_images = [image.resize(image_size) for image in raw_images] + + if len(raw_images) > 0: + data_dict["raw_image"] = torch.from_numpy(np.array(raw_images)).permute(3, 0, 1, 2) # [3,num_images,H,W] + + if conversation[0]["role"] != "system" and self.add_system_prompt_if_missing: + conversation.insert(0, {"role": "system", "content": "You are a helpful assistant."}) + + if self.text_only and (total_images > 0 or total_videos > 0): + log.critical( + f"Images or videos found in the conversation but expect only text, __url__: {url} | data_dict: {data_dict.keys()} | conversation={conversation}" + ) + return None + + if total_images > 1 or total_videos > 1: + add_vision_id = True + else: + add_vision_id = False + + try: + conversation = convert_all_images_to_rgb(conversation) + except Exception as e: + log.critical( + f"Error in convert_all_images_to_rgb: {e} | conversation: {conversation} | __url__: {url} | data_dict: {data_dict.keys()}" + ) + return None + + try: + tokenizer_output = self.processor.apply_chat_template( + conversation, + tokenize=True, + add_generation_prompt=False, + add_vision_id=add_vision_id, + **processor_kwargs, + ) + except 
Exception as e: + log.critical( + f"Error in tokenizer_output: {e} | conversation: {conversation} | __url__: {url} | data_dict: {data_dict.keys()}" + ) + return None + input_ids = tokenizer_output["input_ids"] + if "image_grid_thw" in tokenizer_output and "raw_image" in data_dict: + # image_grid_thw: (1, t, h, w) + t, h, w = tokenizer_output["image_grid_thw"][0] + # interpolate raw_image to the size of the image grid * 14 + data_dict["raw_image"] = torch.nn.functional.interpolate( + data_dict["raw_image"], size=(h * 14, w * 14), mode="bilinear", align_corners=False + ) # [3,num_images,h*14,w*14] + + try: + # token_mask: True for tokens to compute loss on; False for tokens to ignore + token_mask = self.processor.add_assistant_tokens_mask(input_ids) + except Exception as e: + log.critical( + f"Error in add_assistant_tokens_mask: {e} | conversation: {conversation} | __url__: {url} | data_dict: {data_dict.keys()}" + ) + return None + + input_ids = torch.LongTensor(input_ids) # [N_token] + token_mask = torch.BoolTensor(token_mask) # [N_token]; True = compute loss on this token + + data_dict.update( + { + "input_ids": input_ids, + "token_mask": token_mask, + } + ) + for key in PROCESSOR_KEYS_TO_ADD: + if key in tokenizer_output: + data_dict[key] = tokenizer_output[key] + labels = tokenizer_output["input_ids"].clone() # [N_token] + labels[~token_mask] = IGNORE_INDEX + data_dict["labels"] = labels + data_dict["pad_token_id"] = self.processor.pad_id + data_dict["ignore_index"] = IGNORE_INDEX + + # keep raw text for debugging/logging purpose. Add \n\n after each <|im_end|>. 
+ dialog_str = self.processor.decode(input_ids) + data_dict["dialog_str"] = compress_repeated_tokens(dialog_str.replace("<|im_end|>", "<|im_end|>\n\n")) + + # For debugging purpose + msg = f"input_ids: {input_ids.shape[-1]} | __url__: {data_dict['__url__'].root}, {data_dict['__url__'].path} | __key__: {data_dict['__key__']}" + if "raw_video" in data_dict: + msg += f" | raw_video: {data_dict['raw_video'].shape} " + if "raw_image" in data_dict: + msg += f" | raw_image: {data_dict['raw_image'].shape} " + if "pixel_values" in data_dict: + msg += f" | pixel_values: {data_dict['pixel_values'].shape} " + + msg += f"original conversation: {data_dict['conversation']}" + + return data_dict diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/user_prompt_caption_general.json b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/user_prompt_caption_general.json new file mode 100644 index 00000000..8433bb71 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/user_prompt_caption_general.json @@ -0,0 +1,98 @@ +[ + "Describe the image.", + "What do you see in the image?", + "Provide a caption for this image.", + "Explain what is happening in the picture.", + "Summarize the content of the image.", + "Generate a description for the image.", + "Give a detailed description of this picture.", + "What is depicted in the image?", + "Interpret the scene in the image.", + "Analyze the contents of the image.", + "Write a caption for the image.", + "What information does the image convey?", + "Break down the elements in the picture.", + "Narrate what you see in the image.", + "Can you describe the image in detail?", + "Provide a textual representation of the image.", + "Create a description for the given picture.", + "List the main objects in the image.", + "What is the overall theme of the image?", + "Describe the colors and objects in the image.", + "Explain the emotions conveyed by the image.", + "Write a short summary of the image.", + 
"Generate an insightful caption for this picture.", + "Tell me what is present in the image.", + "Identify key elements in the picture.", + "What is the subject of this image?", + "Give a brief description of this image.", + "Describe the composition of the image.", + "Write an alt text for this image.", + "Create an image description in a sentence.", + "What’s the primary focus of the image?", + "Explain the scene depicted in the picture.", + "Summarize the visual content of the image.", + "Describe the setting of this picture.", + "What message does the image convey?", + "How would you describe this image to someone who can’t see it?", + "Describe the action occurring in the image.", + "Write a detailed breakdown of this picture.", + "What’s happening in this scene?", + "Provide an objective description of the image.", + "What does the image represent?", + "Describe the mood and atmosphere of the image.", + "What kind of scene is captured in this image?", + "Summarize this image in a few words.", + "What are the most prominent elements in this picture?", + "Tell me the meaning of this image.", + "Describe the perspective used in this image.", + "Identify the key features of the picture.", + "Write a natural language description of the image.", + "Give a creative description of this image.", + "Interpret the context of the image.", + "What do the objects in the image signify?", + "Tell me what story this image tells.", + "How would you caption this picture?", + "Describe this image like you would in an article.", + "What’s the background of this image?", + "Describe the lighting and shadows in this image.", + "How does this image make you feel?", + "Explain the composition and focal points of the image.", + "Write a summary of the visual elements in the image.", + "List and describe the objects in the image.", + "Break down the details of this picture.", + "Provide a short textual description of this image.", + "What’s the first thing you notice in this image?", + 
"Describe this image in a few sentences.", + "Write an accessible alt-text for this picture.", + "What’s visually interesting about this image?", + "Summarize the details of the picture.", + "What’s the story behind this image?", + "Give a brief analysis of this image.", + "Describe the symmetry or patterns in this image.", + "What key visual elements stand out in this picture?", + "Give an artistic interpretation of this image.", + "How would you explain this image to a child?", + "Provide a step-by-step breakdown of this image.", + "Explain the depth and perspective of the image.", + "Describe the relationships between elements in the image.", + "What’s the focal point of this image?", + "Write a caption for this image that explains it concisely.", + "Describe the textures visible in this image.", + "What visual style does this image have?", + "How does this image relate to its surroundings?", + "Describe the contrast and color palette of this image.", + "Tell me what action is happening in this scene.", + "Explain the significance of the objects in this image.", + "What symbols or metaphors are present in this image?", + "Describe the movement and flow in this picture.", + "What’s the central theme of this image?", + "Provide a written interpretation of this image.", + "Describe the balance and harmony in the picture.", + "Tell me what’s most eye-catching about this image.", + "How would you summarize this image in one line?", + "Write a caption that would fit well with this image.", + "Describe the emotions depicted in this image.", + "What’s unique about this image?", + "Explain the interaction between the subjects in this picture." 
+ ] diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/user_prompt_ocr.json b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/user_prompt_ocr.json new file mode 100644 index 00000000..349b4c0e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/augmentors/vlm/user_prompt_ocr.json @@ -0,0 +1,100 @@ +[ + "Extract the text from the image.", + "Read the text in this image.", + "Perform OCR on this image.", + "Identify and extract the words from the image.", + "Recognize the text in this picture.", + "Convert the text in the image into readable text.", + "What does the text in the image say?", + "Retrieve the text content from the image.", + "Scan and extract the words from the image.", + "Detect and read the text in the picture.", + "Provide the text found in the image.", + "Extract and transcribe the words in the image.", + "Can you recognize the text in this image?", + "Extract text and provide a readable version.", + "List all the words present in this image.", + "Convert the image text into digital text.", + "Recognize and extract the characters from the image.", + "What is the written content in the image?", + "Decode the text embedded in the image.", + "Transcribe the words appearing in this image.", + "Identify the letters and numbers in the image.", + "Convert the visual text into a readable format.", + "Extract the words visible in the picture.", + "Recognize and digitize the text in the image.", + "Scan and retrieve the written information from the image.", + "Convert the printed text in the image to text format.", + "Analyze and extract text from the picture.", + "Provide a text version of the content in the image.", + "Detect characters and words in this image.", + "Generate a transcript of the text in this image.", + "What is written in this image?", + "Identify and extract any text found in the image.", + "Extract and display the words from this picture.", + "Scan the image for any textual content.", + "Retrieve 
and present the text detected in the image.", + "Recognize the letters and words in this picture.", + "Extract and provide the readable text from the image.", + "What can be read from this image?", + "Process the image to retrieve any visible text.", + "Convert this image-based text into editable text.", + "Can you scan this image for text?", + "Read and extract all text content from this image.", + "Convert handwritten or printed text from the image into digital text.", + "Identify any words or symbols in this image.", + "Recognize and output the text from the image.", + "Detect and convert any text in the image to readable text.", + "Extract all readable content from this picture.", + "Find and transcribe the text in this image.", + "Interpret the words appearing in the image.", + "Identify and read any textual elements in the image.", + "Scan for text and output the recognized words.", + "Recognize and retrieve any text in this photo.", + "Extract the characters from this picture.", + "What does the text in this image say?", + "Convert this image's text to machine-readable format.", + "Digitize the text in the image.", + "Retrieve and format the words from this image.", + "Can you extract the letters in this image?", + "Detect and extract words from this scanned document.", + "Analyze the image and provide the text found in it.", + "Retrieve the information written in this image.", + "Extract readable text from this document image.", + "Convert the letters in the image into digital form.", + "Analyze and output the text detected in the image.", + "Scan and extract text information from this picture.", + "Identify and copy any words in this image.", + "Transcribe the image's text into plain text.", + "Extract the visible letters and numbers from this image.", + "What words are present in this image?", + "Extract text as if scanning a document.", + "Digitally transcribe the text in this image.", + "What characters can be detected in this image?", + "Identify and 
extract any visible phrases from the image.", + "Detect and recognize any writing in this picture.", + "Find any readable words in the image and extract them.", + "Read and transcribe the contents of this image.", + "Recognize and convert the text in the image to digital text.", + "Retrieve and list any textual elements present.", + "Analyze the image for legible words.", + "Extract the written words from this image.", + "Provide the extracted words from this scanned photo.", + "Convert the textual content in the image into an editable format.", + "List the letters and words recognized from the image.", + "Generate a machine-readable version of the text in the image.", + "Retrieve any alphanumeric text from the image.", + "Recognize any textual symbols present in the image.", + "Identify words from this image and transcribe them.", + "Detect and retrieve the text printed in this picture.", + "Recognize and output any readable text from this image.", + "Convert the text in the image into selectable text.", + "Extract letters, words, and numbers from this image.", + "Scan for any handwritten or typed words in the image.", + "Identify all readable characters in this image.", + "Recognize the text and return it in digital form.", + "What readable information does this image contain?", + "Perform OCR processing on this image and extract text.", + "Detect, extract, and format the text from this image.", + "Retrieve the textual content visible in this picture." + ] diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/joint_dataloader.py b/cosmos-inference/cosmos3/_src/vfm/datasets/joint_dataloader.py new file mode 100644 index 00000000..0bd49ad9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/joint_dataloader.py @@ -0,0 +1,989 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from collections import deque +from dataclasses import dataclass +from typing import Any, ClassVar, Dict, Union + +import numpy as np +import torch +import webdataset +from torch.utils.data.dataloader import default_collate + +from cosmos3._src.imaginaire.lazy_config import instantiate +from cosmos3._src.imaginaire.utils import log + +_TIMING_KEYS = {"_sample_time", "_aug_time", "_pre_aug_time", "_aug_step_times"} +_BATCH_TIMING_KEYS = { + "_worker_batch_time", + "_worker_aug_time", + "_worker_io_time", + "_worker_aug_step_times", + "_worker_id", +} + + +def custom_collate_fn(batch): + """ + Collate function that works like default_collate for all keys other than "text_token_ids", "images", and "video". + For "text_token_ids", "images", and "video" it simply returns them in a list, instead of stacking them as a tensor. + """ + list_collate_keys = { + "text_token_ids", + "images", + "video", + "action", + "domain_id", + "sequence_plan", + "sound", + "raw_action_dim", + "image_size", + } + + # Handle the case where the batch is already a dictionary (e.g. 
column-wise batching) + if isinstance(batch, dict): + return {key: (value if key in list_collate_keys else default_collate(value)) for key, value in batch.items()} + + # Handle standard list of samples + elem = batch[0] + if isinstance(elem, dict): + + # Some Action datasets add optional metadata keys (for example + # ``additional_view_description`` for concat-view captions) only for a + # subset of samples. PyTorch can batch such samples together when + # DataLoader batch_size > 1; collating only elem's keys and indexing + # every sample by that key turns the optional field into a fatal + # KeyError. Use the union of keys and skip optional keys that are not + # present in every sample. Required training keys still fail loudly via + # downstream assertions if actually missing. + result = {} + keys = set().union(*(d.keys() for d in batch)) + for key in keys: + if key in _TIMING_KEYS: + continue + values = [d.get(key) for d in batch] + if any(value is None for value in values): + continue + if key in list_collate_keys: + result[key] = values + else: + result[key] = default_collate(values) + result.update(_aggregate_worker_timing(batch)) + return result + else: + return default_collate(batch) + + +def _aggregate_worker_timing(samples: list[dict]) -> dict: + """Extract per-sample timing keys, aggregate into per-batch scalars.""" + info: dict[str, float | int] = {} + if "_sample_time" in samples[0]: + info["_worker_batch_time"] = sum(s.get("_sample_time", 0.0) for s in samples) + if "_aug_time" in samples[0]: + aug_total = sum(s.get("_aug_time", 0.0) for s in samples) + info["_worker_aug_time"] = aug_total + if "_worker_batch_time" in info: + info["_worker_io_time"] = info["_worker_batch_time"] - aug_total + if "_aug_step_times" in samples[0]: + agg: dict[str, float] = {} + for s in samples: + for step_name, t in s.get("_aug_step_times", {}).items(): + agg[step_name] = agg.get(step_name, 0.0) + t + info["_worker_aug_step_times"] = agg + worker_info = 
torch.utils.data.get_worker_info() + info["_worker_id"] = worker_info.id if worker_info is not None else 0 + return info + + +@dataclass +class _PackingMetrics: + """Per-batch packing statistics collected during the packing loop. + + Also serves as the single source of truth for packing-related metric names + via ``STATS_SPEC``, which the dataloading monitor callback consumes to + drive accumulation and logging. + """ + + current_sequence_length: int = 0 + num_samples: int = 0 + dropped_count: int = 0 + from_buffer: int = 0 + from_workers: int = 0 + + STATS_SPEC: ClassVar[list[tuple[str, str, str]]] = [ + # (batch_key, wandb_suffix, aggregation_type) + ("_num_tokens", "token_fraction", "scalar"), + ("_num_samples", "samples_per_batch", "list"), + ("_from_buffer", "from_buffer", "list"), + ("_from_workers", "from_workers", "list"), + ("_buffer_size", "buffer_size", "list"), + ("_dropped_count", "dropped", "scalar"), + ] + + def attach_to(self, output_batch: dict, buffer_size: int) -> None: + """Write packing statistics into the output batch dict.""" + output_batch["_num_tokens"] = self.current_sequence_length + output_batch["_num_samples"] = self.num_samples + output_batch["_from_buffer"] = self.from_buffer + output_batch["_from_workers"] = self.from_workers + output_batch["_buffer_size"] = buffer_size + output_batch["_dropped_count"] = self.dropped_count + + +class JointDataLoader(webdataset.WebLoader): + r""" + A joint dataloader that supports loading both images and videos. 
+ """ + + _DEFAULT_LOOKAHEAD_LIMIT: ClassVar[int] = 10 + + def __init__( + self, + dataloaders: Dict[str, Dict[str, Union[torch.utils.data.DataLoader, webdataset.WebLoader, int]]], + tokenizer_spatial_compression_factor: int, + tokenizer_temporal_compression_factor: int, + patch_spatial: int, + max_sequence_length: int | None, + max_samples_per_batch: int | None, + sound_latent_fps: float = 0, + audio_sample_rate: int = 48000, + prewarm: bool = True, + default_lookahead_limit: int = _DEFAULT_LOOKAHEAD_LIMIT, + lookahead_limits: Dict[str, int] | None = None, + ): + """ + Initialize the JointDataLoader with multiple datasets. + + The effective mini-batch size can be controlled with either max_sequence_length or + max_samples_per_batch. To use max_sequence_length, max_samples_per_batch needs to be None. + Vice versa, to use max_samples_per_batch, max_sequence_length needs to be None. + Exactly one of the two must be None: they cannot both be None, and they cannot both be set. + + Args: + dataloaders: key - dataset_name; value - {"dataloader": dataloader, "ratio": data_ratio} + tokenizer_spatial_compression_factor: The spatial compression factor of the tokenizer. + tokenizer_temporal_compression_factor: The temporal compression factor of the tokenizer. + patch_spatial: Spatial patchification factor. + max_sequence_length: Max number of tokens per packed batch (alternative to max_samples_per_batch). + max_samples_per_batch: Max number of samples per packed batch (alternative to max_sequence_length). + sound_latent_fps: Sound tokenizer latent rate in Hz (e.g. 25). If 0, sound tokens are not counted. + audio_sample_rate: Audio sample rate in Hz (e.g. 48000). Used with sound_latent_fps to estimate + sound token count. + prewarm: If True, pull one batch from every dataloader at construction time so that worker + startup happens before training begins. + default_lookahead_limit: Packing-loop look-ahead fallback for dataloaders not in + ``lookahead_limits``. + lookahead_limits: Optional ``{dataset_name: int}`` per-dataloader override. 
+ + Example: + joint_loader = IterativeJointDataLoader( + dataloaders={ + "image_data": { + "dataloader": webdataset.WebLoader(...), + "ratio": 4, + }, + "video_data": { + "dataloader": torch.utils.data.DataLoader(...), + "ratio": 1, + }, + } + ) + """ + self.dataloader_list, self.dataset_name_list, self.data_ratios = [], [], [] + self.lookahead_limits: list[int] = [] + self.tokenizer_spatial_compression_factor = tokenizer_spatial_compression_factor + self.tokenizer_temporal_compression_factor = tokenizer_temporal_compression_factor + self.patch_spatial = patch_spatial + self.max_sequence_length = max_sequence_length + self.max_samples_per_batch = max_samples_per_batch + self.sound_latent_fps = sound_latent_fps + self.audio_sample_rate = audio_sample_rate + self.default_lookahead_limit = int(default_lookahead_limit) + + assert (self.max_sequence_length is None) != (self.max_samples_per_batch is None), ( + "Exactly one of max_sequence_length and max_samples_per_batch must be None." 
+ ) + + _lookahead_overrides: Dict[str, int] = dict(lookahead_limits) if lookahead_limits else {} + unknown = set(_lookahead_overrides) - set(dataloaders) + assert not unknown, f"lookahead_limits references unknown dataloaders {unknown}; valid: {sorted(dataloaders)}" + + for dataset_name, dataloader_data in dataloaders.items(): + assert set(dataloader_data.keys()) == {"dataloader", "ratio"}, f"Invalid config: {dataloader_data}" + if dataloader_data["ratio"] <= 0: + continue + self.dataset_name_list.append(dataset_name) + self.dataloader_list.append(instantiate(dataloader_data["dataloader"], collate_fn=custom_collate_fn)) + self.data_ratios.append(dataloader_data["ratio"]) + self.lookahead_limits.append(int(_lookahead_overrides.get(dataset_name, self.default_lookahead_limit))) + + self.global_id = 0 + self.ratio_sum = sum(self.data_ratios) + + total = self.ratio_sum if self.ratio_sum > 0 else 1.0 + lines = [f"JointDataLoader: {len(self.dataset_name_list)} streams"] + for name, ratio in zip(self.dataset_name_list, self.data_ratios): + lines.append(f" {name}: ratio={ratio:.4g} ({ratio / total:.1%})") + log.info("\n".join(lines)) + + self.data_len = 0 + self.dataloaders = [iter(dataloader) for dataloader in self.dataloader_list] + self.buffers = [deque() for _ in range(len(self.dataloader_list))] + for data in self.dataloader_list: + self.data_len += len(data) + + # Pre-warm all dataloaders: force worker process spawning and first + # batch loading so that slow dataset initialisation (e.g. action + # datasets with spawn workers) happens here rather than mid-training + # where it would cause NCCL collective timeouts. + if prewarm: + self._prewarm_dataloaders() + else: + log.info( + "JointDataLoader: prewarm DISABLED (debug mode); first iteration may incur per-stream cold-load cost" + ) + + def _prewarm_dataloaders(self) -> None: + """Force all dataloader iterators to spawn workers and produce one batch. 
+ + The first ``next()`` call on an ``InfiniteDataLoader`` iterator triggers + ``DataLoader.__iter__()`` which spawns worker processes. For action + dataloaders using ``multiprocessing_context='spawn'``, each worker must + fully initialise heavy datasets (BridgeOrigLeRobotDataset, EMBODIMENT_A, etc.) + from scratch. If this happens lazily during training, the resulting + delay (potentially minutes) causes NCCL collective timeouts when faster + ranks enter the forward pass while slower ranks are still loading data. + + By pulling one batch from every dataloader here — before any training + iteration — we ensure all workers are alive and warmed up. The fetched + samples are pushed into the per-dataloader buffer so they are consumed + normally by the first iteration that selects that dataloader. + + A ``dist.barrier()`` at the end synchronises all ranks so that training + only begins once every rank has finished pre-warming. + """ + import time + + for i, (name, dl_iter) in enumerate(zip(self.dataset_name_list, self.dataloaders)): + t0 = time.monotonic() + try: + batch = next(dl_iter) + except StopIteration: + log.warning(f"Pre-warm: dataloader {name!r} is empty, skipping") + continue + elapsed = time.monotonic() - t0 + + # Split the collated batch into individual samples and push them + # into the buffer — identical to the splitting logic in + # _get_next_sample — so the samples are not wasted. 
+ is_image_batch = "images" in batch + input_images_or_videos = batch["images" if is_image_batch else "video"] + batch_size = len(input_images_or_videos) + + for j in range(batch_size): + sample = {} + for k, v in batch.items(): + if k in _BATCH_TIMING_KEYS: + sample[k] = v + elif isinstance(v, list) and k in self._MULTI_ITEM_KEYS: + elem = v[j] + if isinstance(elem, list): + sample[k] = elem + else: + sample[k] = v[j : j + 1] + elif isinstance(v, list): + sample[k] = v[j] + elif isinstance(v, torch.Tensor) and v.dim() > 0: + sample[k] = v[j : j + 1] + else: + sample[k] = v[j : j + 1] + self.buffers[i].append(sample) + + log.info( + f"Pre-warm: dataloader {name!r} ready — {batch_size} samples buffered in {elapsed:.1f}s", + rank0_only=False, + ) + + # Synchronise so training only starts once every rank is warmed up. + if torch.distributed.is_initialized(): + log.info("Pre-warm: waiting at barrier for all ranks …") + torch.distributed.barrier() + log.info("Pre-warm: all ranks ready") + + def _compute_num_tokens_per_sample(self, data_batch: dict) -> int: + """ + This function computes the number of tokens per sample in the data batch. + This includes text + vision generation tokens + action tokens. + + Args: + data_batch (dict): The data batch containing the text tokens. + + Returns: + int: The number of tokens per sample. 
+ """ + + # The token sequence we have is + # [] + # The spatial dimension of image tokens is compressed by + # the vae spatial downsampling factor and patchification + # The temporal dimension of image tokens is compressed by + # the vae temporal downsampling factor + # Action tokens have 1 token per time step (no spatial dimension) + + text_token_ids = data_batch["text_token_ids"] + if isinstance(text_token_ids, list): + num_text_tokens = text_token_ids[0].shape[0] + else: + num_text_tokens = text_token_ids.shape[1] + + num_tokens = num_text_tokens + 1 + + # Vision part + is_image_batch = "images" in data_batch + input_images_or_videos = data_batch["images" if is_image_batch else "video"] + + # iterate over all the media in the batch + for media in input_images_or_videos if isinstance(input_images_or_videos, list) else [input_images_or_videos]: + if is_image_batch: + _, H, W = media.shape + T = 1 + else: + _, T, H, W = media.shape + + vae_spatial_downsample = self.tokenizer_spatial_compression_factor * self.patch_spatial + vae_temporal_downsample = self.tokenizer_temporal_compression_factor + + latent_h_shape = H // vae_spatial_downsample + latent_w_shape = W // vae_spatial_downsample + latent_t_shape = 1 + (T - 1) // vae_temporal_downsample + + num_vision_tokens = latent_h_shape * latent_w_shape * latent_t_shape + 2 + num_tokens += num_vision_tokens + + # Action part: each action time step is 1 token. + # Action tensor shape is (T_action, D) per sample; stored as a single-element list. 
+ if "action" in data_batch: + list_of_actions = data_batch["action"] + for action in list_of_actions: + # skip None actions + if action is None: + continue + num_action_tokens = action.shape[0] + num_tokens += num_action_tokens + + # Sound part — estimate sound tokens from audio waveform length + if self.sound_latent_fps > 0 and "sound" in data_batch: + sound_data = data_batch["sound"] + if isinstance(sound_data, list) and len(sound_data) > 0: + first_sound = sound_data[0] + # Unwrap nested list if needed + if isinstance(first_sound, list): + first_sound = first_sound[0] + if first_sound is not None and isinstance(first_sound, torch.Tensor): + num_audio_samples = first_sound.shape[-1] + audio_duration = num_audio_samples / self.audio_sample_rate + num_sound_tokens = int(audio_duration * self.sound_latent_fps) + num_tokens += num_sound_tokens + + return num_tokens + + # Keys whose value per sample is a list of tensors to be flattened into one list in the batch + _FLATTEN_LIST_KEYS = {"image_size"} + + def _update_output_batch(self, output_batch: dict, output: dict): + for key, value in output.items(): + if key in _BATCH_TIMING_KEYS: + if key not in output_batch: + output_batch[key] = value + elif key in self._FLATTEN_LIST_KEYS and isinstance(value, list): + if key not in output_batch: + output_batch[key] = value + else: + output_batch[key].extend(value) + elif key not in output_batch: + output_batch[key] = [value] + else: + output_batch[key].append(value) + + def __len__(self) -> int: + return self.data_len + + # Keys where each sample may hold multiple tensors (e.g. multiple video + # clips in a packed sequence). Kept as single-element lists per sample + # via v[i:i+1] so that _update_output_batch yields list[list[Tensor]]. + _MULTI_ITEM_KEYS = {"text_token_ids", "images", "video", "action", "sound"} + + def _get_next_sample(self, index_id: int) -> dict: + """Pop the next single-sample dict from the buffer for the given dataloader. 
+ + If the buffer is empty, fetches the next collated batch from the inner + dataloader and splits it into individual samples. + + Splitting rules: + - Multi-item list values (keys in ``_MULTI_ITEM_KEYS``): sliced + via ``v[i:i+1]`` to yield ``[tensor]``, or taken as-is via ``v[i]`` + when the element is already a list (multiple items per key). + - Per-sequence metadata list values (all other list keys, e.g. + ``sequence_plan``, ``domain_id``): direct-indexed via ``v[i]`` + to yield the bare element. + - Tensor values ``(B, ...)``: sliced to ``(1, ...)`` via + ``v[i : i + 1]`` to preserve the batch dimension. + + After ``_update_output_batch`` accumulates samples, the packed output + batch has the following shapes: + - Multi-item keys (``text_token_ids``, ``video``, ``images``, + ``action``, ``sound``): ``list[list[Tensor]]``; each inner list + holds the item(s) of one sub-sample. + - Per-sequence metadata keys (``sequence_plan``, ``domain_id``, + ``dataset_name``, etc.): ``list[element]``, a flat list. + - Tensor-origin keys: ``list[Tensor(1, ...)]``. + + Args: + index_id: Index of the dataloader to fetch from. + + Returns: + A single-sample dictionary. + """ + buffer = self.buffers[index_id] + if not buffer: + try: + batch = next(self.dataloaders[index_id]) + except StopIteration: + raise + + is_image_batch = "images" in batch + input_images_or_videos = batch["images" if is_image_batch else "video"] + batch_size = len(input_images_or_videos) + + for i in range(batch_size): + sample = {} + for k, v in batch.items(): + if k in _BATCH_TIMING_KEYS: + sample[k] = v + elif isinstance(v, list) and k in self._MULTI_ITEM_KEYS: + # For multi-item keys (images, video, etc.), the collated + # value is a list with one element per sample. If the element + # is itself a list (e.g. image editing: [src, tgt]), use v[i] + # directly to avoid wrapping it in a redundant single-element + # list. Otherwise keep the v[i:i+1] slice so that + # _update_output_batch produces list[list[Tensor]]. 
+ elem = v[i] + if isinstance(elem, list): + sample[k] = elem + else: + sample[k] = v[i : i + 1] + elif isinstance(v, list): + sample[k] = v[i] + else: + sample[k] = v[i : i + 1] + buffer.append(sample) + + return buffer.popleft() + + def set_start_iteration(self, iteration: int): + self.global_id = iteration + + def __iter__(self): + raise NotImplementedError("__iter__ function is not implemented yet") + + +class IterativeJointDataLoader(JointDataLoader): + r""" + An iterative joint dataloader that supports loading multiple modalities. + + The behavior depends on the ``seed`` parameter: + + - **seed is not None** (Default): + The modality is randomly selected at each iteration based on the probability distribution + derived from the ratios. The random state is seeded with ``seed + global_id``, ensuring + that all ranks select the same modality at the same iteration (assuming synchronized global_id). + This prevents load imbalance due to mixed resolutions across ranks. + + - **seed is None**: + The modality selection follows a deterministic round-robin pattern based on the ratios. + For example, with 2 modalities (image and video) and ratio 2:1: + - Iterations 0, 1: all ranks process images + - Iteration 2: all ranks process videos + - ... and so on. + This also ensures all ranks process the same modality at the same iteration. 
+ """ + + def __init__( + self, + dataloaders: Dict[str, Dict[str, Union[torch.utils.data.DataLoader, webdataset.WebLoader, int]]], + tokenizer_spatial_compression_factor: int, + tokenizer_temporal_compression_factor: int, + patch_spatial: int, + max_sequence_length: int | None = None, + max_samples_per_batch: int | None = None, + sound_latent_fps: float = 0, + audio_sample_rate: int = 48000, + seed: int | None = 42, + prewarm: bool = True, + default_lookahead_limit: int = JointDataLoader._DEFAULT_LOOKAHEAD_LIMIT, + lookahead_limits: Dict[str, int] | None = None, + ): + super().__init__( + dataloaders, + tokenizer_spatial_compression_factor, + tokenizer_temporal_compression_factor, + patch_spatial, + max_sequence_length, + max_samples_per_batch, + sound_latent_fps=sound_latent_fps, + audio_sample_rate=audio_sample_rate, + prewarm=prewarm, + default_lookahead_limit=default_lookahead_limit, + lookahead_limits=lookahead_limits, + ) + self.seed = seed + # Calculate probabilities for random sampling + total_ratio = sum(self.data_ratios) + self.data_probs = np.array([ratio / total_ratio for ratio in self.data_ratios]) + + def __iter__(self): + while True: + if self.seed is not None: + rng = np.random.RandomState(self.seed + self.global_id) + index_id = rng.choice(len(self.dataloader_list), p=self.data_probs) + else: + data_id = self.global_id % self.ratio_sum + index_id = self._get_dataloader_index(data_id) + + metrics = _PackingMetrics() + output_batch = dict() + skipped_samples = deque() + lookahead_limit = self.lookahead_limits[index_id] + lookahead_count = 0 + + while True: + # Check max samples limit first + if self.max_samples_per_batch is not None and metrics.num_samples >= self.max_samples_per_batch: + break + + # If we have started packing and tried lookahead_limit times to find a fitting sample but failed, stop. 
+ if len(output_batch) > 0 and lookahead_count >= lookahead_limit: + break + + had_buffer = len(self.buffers[index_id]) > 0 + try: + output = self._get_next_sample(index_id) + except StopIteration: + break # No more data in this dataloader + + if had_buffer: + metrics.from_buffer += 1 + else: + metrics.from_workers += 1 + + num_tokens_in_current_sample = self._compute_num_tokens_per_sample(output) + + if ( + self.max_sequence_length is not None + and metrics.current_sequence_length + num_tokens_in_current_sample >= self.max_sequence_length + ): + if len(output_batch) == 0: + # This case happens when current_sequence_length = 0 and num_tokens_in_current_sample > self.max_sequence_length + # In this case, we should simply discard the current sample and get the next sample. + log.info( + f"Discarding oversized sample with {num_tokens_in_current_sample} tokens. Max sequence length: {self.max_sequence_length}", + rank0_only=False, + ) + metrics.dropped_count += 1 + continue + + # current_sequence_length > 0 and selected sample is too large to fit in the remaining space. + # Instead of stopping immediately (creating large padding), we buffer this large sample + # and try to find a smaller one that fits in the remaining space. + skipped_samples.append(output) + lookahead_count += 1 + continue + + metrics.current_sequence_length += num_tokens_in_current_sample + metrics.num_samples += 1 + output["dataset_name"] = self.dataset_name_list[index_id] + self._update_output_batch(output_batch, output) + + # Add back skipped samples to the buffer for the next batch. + # appendleft puts item at HEAD. So we insert S3, then S2, then S1. 
+ for sample in reversed(skipped_samples): + self.buffers[index_id].appendleft(sample) + + if len(output_batch) == 0: + return + + metrics.attach_to(output_batch, buffer_size=len(self.buffers[index_id])) + self.global_id += 1 + yield output_batch + + def _get_dataloader_index(self, data_id): + """Maps global id to the corresponding dataloader index based on ratio.""" + for i, r in enumerate(self.data_ratios): + if data_id < r: + return i + data_id -= r + raise ValueError("Invalid data_id") + + +class RankPartitionedDataLoader: + """Assigns each rank to exactly one dataset based on ratios. + + For N GPUs with datasets having ratios r_1:r_2:...:r_k, the first + N * r_1 / sum(r) ranks are assigned dataset 1, the next N * r_2 / sum(r) + ranks are assigned dataset 2, etc. Each rank instantiates a single + PyTorch DataLoader for its assigned dataset. + + The sharding information (``shard_world_size`` and ``shard_rank``) is set + on each dataset so that it shards data only across ranks that share the + same dataset, rather than across the full world. + + Example: + With 128 GPUs and datasets ``{"video": {"dataset": ..., "ratio": 3}, + "image": {"dataset": ..., "ratio": 1}}``: + + - Ranks 0-95 -> video (shard_world_size=96, shard_rank=0..95) + - Ranks 96-127 -> image (shard_world_size=32, shard_rank=0..31) + """ + + def __init__( + self, + datasets: dict[str, dict[str, Any]], + **dataloader_kwargs: Any, + ): + """ + Args: + datasets: Mapping of dataset name to config dict with keys: + + - ``"dataset"`` (required): a lazy config or dataset instance. + - ``"ratio"`` (required): positive int weight. + - ``"dataloader_kwargs"`` (optional): dict of keyword arguments + that override the top-level ``**dataloader_kwargs`` for this + dataset only (e.g. different ``num_workers`` or ``batch_size``). + + **dataloader_kwargs: Default kwargs forwarded to + ``torch.utils.data.DataLoader``. ``collate_fn`` defaults to + ``custom_collate_fn`` if not given. 
+ """ + world_size = torch.distributed.get_world_size() + rank = torch.distributed.get_rank() + log.info(f"RankPartitionedDataLoader: world_size: {world_size} and rank: {rank}", rank0_only=False) + + _VALID_KEYS = {"dataset", "ratio", "dataloader_kwargs"} + names: list[str] = [] + dataset_configs: list[Any] = [] + ratios: list[int] = [] + per_dataset_kwargs: list[dict[str, Any]] = [] + for name, cfg in datasets.items(): + extra = set(cfg.keys()) - _VALID_KEYS + assert not extra, f"Dataset {name!r}: unexpected keys {extra}. Allowed: {_VALID_KEYS}" + if cfg["ratio"] <= 0: + log.warning( + f"RankPartitionedDataLoader: Skipping dataset {name} with ratio {cfg['ratio']}", rank0_only=False + ) + continue + names.append(name) + dataset_configs.append(cfg["dataset"]) + ratios.append(cfg["ratio"]) + per_dataset_kwargs.append(cfg.get("dataloader_kwargs", {})) + + assert len(names) > 0, "No datasets with positive ratios provided." + assert world_size >= len(names), ( + f"world_size ({world_size}) must be >= number of datasets ({len(names)}) " + f"so each dataset gets at least one rank." 
+ ) + + total_ratio = sum(ratios) + ideal = [r / total_ratio * world_size for r in ratios] + allocations = [max(1, int(q)) for q in ideal] + remaining = world_size - sum(allocations) + if remaining > 0: + remainders = sorted(range(len(ratios)), key=lambda i: ideal[i] - allocations[i], reverse=True) + for j in range(remaining): + allocations[remainders[j]] += 1 + elif remaining < 0: + deficit = -remaining + while deficit > 0: + best = max( + (i for i in range(len(allocations)) if allocations[i] > 1), + key=lambda i: (allocations[i] - ideal[i], allocations[i]), + ) + allocations[best] -= 1 + deficit -= 1 + + expected_ratios = [r / total_ratio for r in ratios] + actual_ratios = [a / world_size for a in allocations] + lines = [f"RankPartitionedDataLoader allocation ({world_size} GPUs):"] + start = 0 + for i, (name, alloc) in enumerate(zip(names, allocations)): + end = start + alloc - 1 + lines.append( + f" {name} (ratio {ratios[i]}): ranks {start}-{end} ({alloc} GPUs) " + f"| expected {expected_ratios[i]:.2%}, actual {actual_ratios[i]:.2%}" + ) + start += alloc + log.info("\n".join(lines), rank0_only=False) + + cumulative = 0 + my_dataset_idx = -1 + for i, alloc in enumerate(allocations): + if rank < cumulative + alloc: + my_dataset_idx = i + break + cumulative += alloc + assert my_dataset_idx >= 0 + + shard_rank = rank - cumulative + shard_world_size = allocations[my_dataset_idx] + + dataset: Any = instantiate(dataset_configs[my_dataset_idx]) + dataset.shard_world_size = shard_world_size + dataset.shard_rank = shard_rank + dataset.shard_id = my_dataset_idx + + merged_kwargs = {**dataloader_kwargs, **per_dataset_kwargs[my_dataset_idx]} + merged_kwargs.setdefault("collate_fn", custom_collate_fn) + self.dataloader = torch.utils.data.DataLoader(dataset, **merged_kwargs) + self.dataset_name = names[my_dataset_idx] + self.dataset = dataset + + def __iter__(self): + return iter(self.dataloader) + + def __len__(self) -> int: + return len(self.dataloader) + + +class 
PackingDataLoader(JointDataLoader): + """Packs multiple samples from a single dataloader into token-budget-constrained batches. + + Unlike the other ``JointDataLoader`` subclasses which manage multiple + dataloaders with configurable ratios, this class wraps a single dataloader + and greedily packs consecutive samples until the token budget + (``max_sequence_length``) or sample count limit (``max_samples_per_batch``) + is reached. + """ + + def __init__( + self, + dataloader: torch.utils.data.DataLoader | webdataset.WebLoader, + tokenizer_spatial_compression_factor: int, + tokenizer_temporal_compression_factor: int, + patch_spatial: int, + max_sequence_length: int | None = None, + max_samples_per_batch: int | None = None, + sound_latent_fps: float = 0, + audio_sample_rate: int = 48000, + dataset_name: str = "default", + lookahead_limit: int = JointDataLoader._DEFAULT_LOOKAHEAD_LIMIT, + ): + """ + Args: + dataloader: A single dataloader (or lazy config) to draw samples from. + tokenizer_spatial_compression_factor: Spatial compression factor of the tokenizer. + tokenizer_temporal_compression_factor: Temporal compression factor of the tokenizer. + patch_spatial: Spatial patchification factor. + max_sequence_length: Max total tokens per packed batch. Mutually exclusive with + ``max_samples_per_batch``. + max_samples_per_batch: Max number of samples per packed batch. Mutually exclusive + with ``max_sequence_length``. + sound_latent_fps: Sound tokenizer latent rate in Hz. If 0, sound tokens are not counted. + audio_sample_rate: Audio sample rate in Hz. + dataset_name: Name tag attached to every sample in the output batch. + lookahead_limit: Packing-loop look-ahead for the wrapped dataloader. 
+ """ + wrapped = {dataset_name: {"dataloader": dataloader, "ratio": 1}} + super().__init__( + dataloaders=wrapped, + tokenizer_spatial_compression_factor=tokenizer_spatial_compression_factor, + tokenizer_temporal_compression_factor=tokenizer_temporal_compression_factor, + patch_spatial=patch_spatial, + max_sequence_length=max_sequence_length, + max_samples_per_batch=max_samples_per_batch, + sound_latent_fps=sound_latent_fps, + audio_sample_rate=audio_sample_rate, + lookahead_limits={dataset_name: int(lookahead_limit)}, + ) + + def __iter__(self): + inner = self.dataloader_list[0] + ds_name = getattr(inner, "dataset_name", self.dataset_name_list[0]) + + while True: + current_sequence_length = 0 + num_samples = 0 + output_batch: dict = {} + + skipped_samples: deque = deque() + # PackingDataLoader wraps a single dataloader, so lookahead_limits has one entry. + lookahead_limit = self.lookahead_limits[0] + lookahead_count = 0 + + while True: + if self.max_samples_per_batch is not None and num_samples >= self.max_samples_per_batch: + break + + if len(output_batch) > 0 and lookahead_count >= lookahead_limit: + break + + try: + output = self._get_next_sample(0) + except StopIteration: + break + + num_tokens_in_current_sample = self._compute_num_tokens_per_sample(output) + + if ( + self.max_sequence_length is not None + and current_sequence_length + num_tokens_in_current_sample >= self.max_sequence_length + ): + if len(output_batch) == 0: + # This case happens when current_sequence_length = 0 and num_tokens_in_current_sample > self.max_sequence_length + # In this case, we should simply discard the current sample and get the next sample. + log.error( + f"PackingDataLoader: Discarding oversized sample with {num_tokens_in_current_sample} tokens. 
Max sequence length: {self.max_sequence_length}", + rank0_only=False, + ) + continue + + skipped_samples.append(output) + lookahead_count += 1 + continue + + current_sequence_length += num_tokens_in_current_sample + num_samples += 1 + output["dataset_name"] = ds_name + self._update_output_batch(output_batch, output) + + for sample in reversed(skipped_samples): + self.buffers[0].appendleft(sample) + + if len(output_batch) == 0: + return + + self.global_id += 1 + yield output_batch + + +class RandomJointDataLoader(JointDataLoader): + r""" + A random joint dataloader that supports loading multiple modalities with stochastic sampling. + + In this dataloader, the modality is randomly selected at each iteration based on the + probability distribution derived from the ratios. Each rank independently samples a + modality, so different ranks may process different modalities at the same iteration. + + For example, with 2 modalities (image and video) and ratio 2:1: + - Each iteration has 66.7% probability of selecting images + - Each iteration has 33.3% probability of selecting videos + - The selection is independent across iterations and ranks + + Note: Unlike IterativeJointDataLoader, this does not guarantee synchronized modality + selection across ranks. 
+ """ + + def __init__( + self, + dataloaders: Dict[str, Dict[str, Union[torch.utils.data.DataLoader, webdataset.WebLoader, int]]], + tokenizer_spatial_compression_factor: int, + tokenizer_temporal_compression_factor: int, + patch_spatial: int, + max_sequence_length: int | None = None, + max_samples_per_batch: int | None = None, + sound_latent_fps: float = 0, + audio_sample_rate: int = 48000, + default_lookahead_limit: int = JointDataLoader._DEFAULT_LOOKAHEAD_LIMIT, + lookahead_limits: Dict[str, int] | None = None, + ): + super().__init__( + dataloaders, + tokenizer_spatial_compression_factor, + tokenizer_temporal_compression_factor, + patch_spatial, + max_sequence_length, + max_samples_per_batch, + sound_latent_fps=sound_latent_fps, + audio_sample_rate=audio_sample_rate, + default_lookahead_limit=default_lookahead_limit, + lookahead_limits=lookahead_limits, + ) + + # Convert data ratios to probabilities + self.data_ratios = np.array([ratio / sum(self.data_ratios) for ratio in self.data_ratios]) + + def __iter__(self): + while True: + index_id = np.random.choice(len(self.dataloader_list), p=self.data_ratios) + + metrics = _PackingMetrics() + output_batch = dict() + skipped_samples = deque() + lookahead_limit = self.lookahead_limits[index_id] + lookahead_count = 0 + + while True: + # Check max samples limit first + if self.max_samples_per_batch is not None and metrics.num_samples >= self.max_samples_per_batch: + break + + # If we have started packing and tried lookahead_limit times to find a fitting sample but failed, stop. 
+ if len(output_batch) > 0 and lookahead_count >= lookahead_limit: + break + + had_buffer = len(self.buffers[index_id]) > 0 + try: + output = self._get_next_sample(index_id) + except StopIteration: + break # No more data in this dataloader + + if had_buffer: + metrics.from_buffer += 1 + else: + metrics.from_workers += 1 + + num_tokens_in_current_sample = self._compute_num_tokens_per_sample(output) + + if ( + self.max_sequence_length is not None + and metrics.current_sequence_length + num_tokens_in_current_sample >= self.max_sequence_length + ): + if len(output_batch) == 0: + # This case happens when current_sequence_length = 0 and num_tokens_in_current_sample > self.max_sequence_length + # In this case, we should simply discard the current sample and get the next sample. + log.info( + f"Discarding oversized sample with {num_tokens_in_current_sample} tokens. Max sequence length: {self.max_sequence_length}", + rank0_only=False, + ) + metrics.dropped_count += 1 + continue + + # current_sequence_length > 0 and selected sample is too large to fit in the remaining space. + # Instead of stopping immediately (creating large padding), we buffer this large sample + # and try to find a smaller one that fits in the remaining space. + skipped_samples.append(output) + lookahead_count += 1 + continue + + metrics.current_sequence_length += num_tokens_in_current_sample + metrics.num_samples += 1 + output["dataset_name"] = self.dataset_name_list[index_id] + self._update_output_batch(output_batch, output) + + # Add back skipped samples to the buffer for the next batch. + # appendleft puts item at HEAD. So we insert S3, then S2, then S1. 
+ for sample in reversed(skipped_samples): + self.buffers[index_id].appendleft(sample) + + if len(output_batch) == 0: + return + + metrics.attach_to(output_batch, buffer_size=len(self.buffers[index_id])) + yield output_batch diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/local_datasets/__init__.py b/cosmos-inference/cosmos3/_src/vfm/datasets/local_datasets/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/local_datasets/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/local_datasets/helper.py b/cosmos-inference/cosmos3/_src/vfm/datasets/local_datasets/helper.py new file mode 100644 index 00000000..00cc5c59 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/local_datasets/helper.py @@ -0,0 +1,265 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Shared helpers for local datasets (S3, video decoding, aspect ratio).""" + +import io +import json +import subprocess +import time +from collections.abc import Generator +from pathlib import Path +from typing import Any + +import numpy as np +from boto3.s3.transfer import TransferConfig +from botocore.config import Config + +from cosmos3._src.imaginaire.utils import log + +client_config = Config( + response_checksum_validation="when_required", + request_checksum_calculation="when_required", + connect_timeout=10, + read_timeout=5, +) +transfer_config = TransferConfig(use_threads=True, max_concurrency=8, multipart_chunksize=8 * 1024 * 1024) + + +def parse_s3_url(s3_url: str) -> tuple[str, str]: + s3_url = s3_url.removeprefix("s3://") + bucket, key = s3_url.split("/", 1) + return bucket, key + + +def download_from_s3(s3_client: Any, s3_url: str, max_tries: int = 20) -> bytes | None: + """Download a file from S3.""" + if not s3_url.startswith("s3://"): + return Path(s3_url).read_bytes() + tries = 0 + while True: + tries += 1 + try: + bucket, key = parse_s3_url(s3_url) + buffer = io.BytesIO() + s3_client.download_fileobj(Bucket=bucket, Key=key, Fileobj=buffer, Config=transfer_config) + data = buffer.getvalue() + return data + except Exception as e: + log.error(f"Error downloading from S3 (try {tries}): {e}\n{s3_url}") + if tries >= max_tries: + return None + time.sleep(1) + + +def get_video_metadata(video_path: str) -> dict: + """ + Get video metadata using ffprobe. 
+ + Args: + video_path: Path to the video file + + Returns: + Dictionary containing width, height, fps, and total_frames + """ + cmd = [ + "ffprobe", + "-v", + "quiet", + "-print_format", + "json", + "-show_streams", + "-select_streams", + "v:0", + video_path, + ] + result = subprocess.run(cmd, stdin=subprocess.DEVNULL, capture_output=True, check=True, text=True) + probe_data = json.loads(result.stdout) + + # Decode output + stream = probe_data["streams"][0] + width = int(stream["width"]) + height = int(stream["height"]) + fps_parts = stream["r_frame_rate"].split("/") + video_fps = float(fps_parts[0]) / float(fps_parts[1]) + if "nb_frames" in stream: + total_frames = int(stream["nb_frames"]) + else: + duration = float(stream.get("duration") or 0) + total_frames = int(duration * video_fps) + + return dict(width=width, height=height, fps=video_fps, total_frames=total_frames) + + +def ffmpeg_decode_video( + video_path: str, scale_hw: tuple[int, int] | None = None, num_threads: int = 1 +) -> Generator[np.ndarray, None, None]: + """ + Decode video frames using ffmpeg and yield HWC uint8 RGB frames. 
+ + Args: + video_path: Path to the video file + scale_hw: (height, width) to scale the video to; if None, the native resolution is used + + Yields: + np.ndarray: HWC uint8 RGB frames + """ + if scale_hw is None: + metadata = get_video_metadata(video_path) + out_width = metadata["width"] + out_height = metadata["height"] + else: + out_height, out_width = scale_hw + + # Calculate frame size in bytes + frame_size = out_width * out_height * 3 # 3 channels (RGB) + + # Build ffmpeg command to decode and output raw RGB frames + ffmpeg_cmd = [ + "ffmpeg", + "-loglevel", + "quiet", + "-threads", + str(num_threads), + "-filter_threads", + str(num_threads), + "-filter_complex_threads", + str(num_threads), + "-i", + video_path, + "-threads", + str(num_threads), + "-filter_threads", + str(num_threads), + "-filter_complex_threads", + str(num_threads), + "-pix_fmt", + "rgb24", + "-sws_flags", + "bicubic+accurate_rnd", # lanczos produces too much ringing on graphics + *(["-vf", f"scale={scale_hw[1]}:{scale_hw[0]}"] if scale_hw else []), # ffmpeg expects W:H + "-f", + "rawvideo", + "-vsync", + "0", + "-", + ] + + process = subprocess.Popen( + ffmpeg_cmd, + stdin=subprocess.DEVNULL, + stdout=subprocess.PIPE, + stderr=subprocess.DEVNULL, # Set to None to print errors + bufsize=-1, + ) + + try: + while True: + raw_frame = process.stdout.read(frame_size) + + if len(raw_frame) != frame_size: + assert len(raw_frame) == 0, f"Incomplete frame: {len(raw_frame)} bytes" + break + + frame = np.frombuffer(raw_frame, dtype=np.uint8) + frame = frame.reshape((out_height, out_width, 3)) + + yield frame + finally: + process.stdout.close() + process.wait() + + +def get_aspect_ratio(width: int, height: int) -> str: + """Compute aspect ratio bucket from width and height.""" + ratio = width / height + + if ratio < 0.65: + return "9,16" # 0.5625 + elif ratio < 0.88: + return "3,4" # 0.75 + elif ratio < 1.16: + return "1,1" # 1.0 + elif ratio < 1.55: + return "4,3" # 1.3333 + else: + return "16,9" # 1.7778 + + +def save_video_frames_to_mp4( + 
frames: np.ndarray | Any, + output_path: str, + fps: float = 24.0, + overlay_frame_id: bool = False, + fps_to_show: float | None = None, +) -> None: + """Encode video frames to MP4 using FFmpeg. + + Args: + frames: Video frames as numpy (T, H, W, 3) or torch tensor (C, T, H, W), uint8. + output_path: Path for the output .mp4 file. + fps: Output video frame rate. + overlay_frame_id: If True, draw frame index (0, 1, ...) on each frame via FFmpeg drawtext. + fps_to_show: If provided, draw the FPS value on the video instead of the actual FPS. + """ + cpu_fn = getattr(frames, "cpu", None) + if callable(cpu_fn): + frames = cpu_fn().numpy() # type: ignore[union-attr] + frames = np.asarray(frames, dtype=np.uint8) + if frames.ndim == 4 and frames.shape[0] == 3: + # CTHW -> THWC + frames = np.transpose(frames, (1, 2, 3, 0)) + if frames.ndim != 4 or frames.shape[-1] != 3: + raise ValueError("frames must be (T, H, W, 3) or (C, T, H, W) uint8") + t, h, w, _ = frames.shape + cmd = [ + "ffmpeg", + "-y", + "-f", + "rawvideo", + "-pix_fmt", + "rgb24", + "-s", + f"{w}x{h}", + "-r", + str(fps), + "-i", + "pipe:0", + ] + if overlay_frame_id: + # %{n} = frame index (0-based); add fps and resolution as literal text + drawtext_frame = "drawtext=text='%{n}':x=10:y=10:fontsize=24:fontcolor=white:box=1:boxcolor=black@0.6" + drawtext_fps = ( + f"drawtext=text='fps: {fps_to_show or fps}':x=10:y=40:fontsize=24:fontcolor=white:box=1:boxcolor=black@0.6" + ) + drawtext_res = f"drawtext=text='{w}x{h}':x=10:y=70:fontsize=24:fontcolor=white:box=1:boxcolor=black@0.6" + cmd += ["-vf", ",".join([drawtext_frame, drawtext_fps, drawtext_res])] + cmd += [ + "-c:v", + "libx264", + "-pix_fmt", + "yuv420p", + output_path, + ] + process = subprocess.Popen( + cmd, + stdin=subprocess.PIPE, + stdout=subprocess.DEVNULL, + stderr=subprocess.PIPE, + ) + _, stderr = process.communicate(input=frames.tobytes()) + if process.returncode != 0: + log.error(f"FFmpeg failed: {stderr.decode()}") + raise RuntimeError(f"FFmpeg 
exited with {process.returncode}") diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/local_datasets/sft_dataset.py b/cosmos-inference/cosmos3/_src/vfm/datasets/local_datasets/sft_dataset.py new file mode 100644 index 00000000..6e7f9117 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/local_datasets/sft_dataset.py @@ -0,0 +1,692 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# SFT dataset loader — reads video metadata + captions from a JSONL file on S3. 
+import gzip +import hashlib +import io +import json +import os +import random +import tempfile +from pathlib import Path +from typing import Any, Optional + +import boto3 +import numpy as np +import torch + +from cosmos3._src.imaginaire.flags import INTERNAL +from cosmos3._src.imaginaire.lazy_config import instantiate as lazy_instantiate +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.local_datasets.helper import ( + client_config, + download_from_s3, + ffmpeg_decode_video, + get_aspect_ratio, + get_video_metadata, + parse_s3_url, +) +from cosmos3._src.vfm.datasets.sequence_packing import SequencePlan, add_special_tokens +from cosmos3._src.vfm.datasets.utils import VIDEO_RES_SIZE_INFO +from cosmos3._src.vfm.models.vlm.qwen3_vl.utils import tokenize_caption + +_MAX_NUM_TOKENS = 1024 +_DURATION_TEMPLATE = "The video is {duration:.1f} seconds long and is of {fps:.0f} FPS." +_RESOLUTION_TEMPLATE = "This video is of {height}x{width} resolution." + +# Caption types available in the SFT JSONL. 
+# Format: {model}_{style} +# model: qwen3_235b | qwen3_32b | qwen3p5_397b +# style: short | temporal | descriptive | dense +CAPTION_TYPES_AND_WEIGHTS: dict[str, float] = { + # short: 10% total + "qwen3_235b_short": 0.1, + "qwen3_32b_short": 0.1, + "qwen3p5_397b_short": 0.1, + # descriptive: 20% total + "qwen3_235b_descriptive": 0.2, + "qwen3_32b_descriptive": 0.2, + "qwen3p5_397b_descriptive": 0.2, + # dense: 70% total + "qwen3_235b_dense": 0.7, + "qwen3_32b_dense": 0.7, + "qwen3p5_397b_dense": 0.7, + # temporal: 0% total + "qwen3_235b_temporal": 0.0, + "qwen3_32b_temporal": 0.0, + "qwen3p5_397b_temporal": 0.0, +} +CAPTION_TYPES = list(CAPTION_TYPES_AND_WEIGHTS.keys()) +CAPTION_WEIGHTS = list(CAPTION_TYPES_AND_WEIGHTS.values()) + + +class SFTDataset(torch.utils.data.IterableDataset): + """Dataset for loading SFT video clips with captions from JSONL metadata on S3.""" + + def __init__( + self, + metadata: list[dict], + num_video_frames: int, + resolution: str, + s3_credentials: dict, + temporal_interval_mode: str = "entire_chunk", + frame_selection_mode: str = "center", + tokenizer_config: Optional[Any] = None, + cfg_dropout_rate: float = 0.0, + use_system_prompt: bool = False, + append_duration_fps_timestamps: bool = True, + append_resolution_info: bool = True, + cfg_dropout_keep_metadata: bool = False, + caption_suffix: str = "", + conditioning_fps: float = 24, + conditioning_fps_noise_std: float = 0.0, + conditioning_config: dict[int, float] | None = None, + temporal_compression_factor: int = 4, + ): + assert temporal_interval_mode in ("force_one", "max_30fps", "entire_chunk"), ( + f"Unknown temporal_interval_mode={temporal_interval_mode!r}" + ) + assert frame_selection_mode in ("center", "first", "random"), ( + f"Unknown frame_selection_mode={frame_selection_mode!r}" + ) + assert temporal_compression_factor >= 1, "temporal_compression_factor must be >= 1" + self.metadata = metadata + self.num_video_frames = num_video_frames + self.resolution = resolution + 
self.s3_credentials = s3_credentials + self.temporal_interval_mode = temporal_interval_mode + self.frame_selection_mode = frame_selection_mode + self.tokenizer_config = tokenizer_config + self.cfg_dropout_rate = cfg_dropout_rate + self.use_system_prompt = use_system_prompt + self.append_duration_fps_timestamps = append_duration_fps_timestamps + self.append_resolution_info = append_resolution_info + self.cfg_dropout_keep_metadata = cfg_dropout_keep_metadata + self.caption_suffix = caption_suffix.strip() + self.conditioning_fps = conditioning_fps + self.conditioning_fps_noise_std = conditioning_fps_noise_std + + self.temporal_compression_factor = temporal_compression_factor + self.conditioning_config: dict[int, float] | None = None + if conditioning_config is not None: + total_prob = sum(conditioning_config.values()) + assert total_prob > 0, "conditioning_config probabilities must sum to a positive number" + self.conditioning_config = {k: v / total_prob for k, v in conditioning_config.items()} + log.info(f"Conditioning config: {self.conditioning_config}") + # They will be set by the RankPartitionedDataLoader + self.shard_world_size = None + self.shard_rank = None + self.shard_id = 0 + self.is_initialized = False + self.output_sizes = VIDEO_RES_SIZE_INFO[resolution] + + _vlm_proc = lazy_instantiate(self.tokenizer_config) + self.vlm_tokenizer = _vlm_proc.tokenizer + self.vlm_tokenizer, _ = add_special_tokens(self.vlm_tokenizer) + + def __len__(self): + return len(self.metadata) + + def _tokenize_caption(self, caption: str) -> tuple[list[int], str]: + text_ids = tokenize_caption( + caption, + self.vlm_tokenizer, + is_video=True, + use_system_prompt=self.use_system_prompt, + ) + if len(text_ids) > _MAX_NUM_TOKENS: + log.warning(f"Text ids are too long, truncating: {len(text_ids)} > {_MAX_NUM_TOKENS}") + text_ids = text_ids[:_MAX_NUM_TOKENS] + return text_ids, caption + + def process_one_sample(self, metadata: dict) -> dict | None: + """Process a single SFT sample: 
download, decode, and prepare for training. + + A random t2w_window is picked from the video's list of windows each time. + """ + windows = metadata["t2w_windows"] + win_idx = random.randrange(len(windows)) + t2w_window = windows[win_idx] + window_start = t2w_window["start_frame"] + window_end = t2w_window["end_frame"] + + # Compute output resolution + input_w, input_h = metadata["width"], metadata["height"] + target_w, target_h = self.output_sizes[metadata["aspect_ratio"]] + resize_ratio = max(target_w / input_w, target_h / input_h) + resize_h, resize_w = (round(input_h * resize_ratio), round(input_w * resize_ratio)) + crop_y, crop_x = (round((resize_h - target_h) / 2), round((resize_w - target_w) / 2)) + + video_bytes = download_from_s3(self.s3_client, metadata["vision_path"]) + if video_bytes is None: + log.warning(f"Failed to download video from S3: {metadata['vision_path']}") + return None + + # Decode all frames to (T, H, W, 3) + with tempfile.NamedTemporaryFile(suffix=".mp4", delete=True) as tmp_input: + tmp_input.write(video_bytes) + tmp_input.flush() + input_video_path = tmp_input.name + video_info = get_video_metadata(input_video_path) + original_fps = video_info["fps"] + total_frames = video_info["total_frames"] + + # Constrain to the t2w window + actual_end = min(window_end, total_frames - 1) + frames_in_window = actual_end - window_start + 1 + + if self.num_video_frames == -1: + # Native chunk mode: use start/end/interval directly from the window + temporal_interval = t2w_window["temporal_interval"] + start_frame = window_start + end_frame = actual_end + else: + if frames_in_window < self.num_video_frames: + log.warning( + f"Not enough frames in window: {metadata['uuid']}, " + f"frames_in_window: {frames_in_window}, required: {self.num_video_frames}" + ) + return None + + # Compute temporal interval + if self.temporal_interval_mode == "force_one": + temporal_interval = 1 + elif self.temporal_interval_mode == "max_30fps": + temporal_interval = max(1, 
int(original_fps / 30.0)) + elif self.temporal_interval_mode == "entire_chunk": + temporal_interval = frames_in_window // self.num_video_frames + temporal_interval = max(1, temporal_interval) + else: + raise ValueError(f"Unknown temporal_interval_mode: {self.temporal_interval_mode}") + + num_frames_before_downsample = (self.num_video_frames - 1) * temporal_interval + 1 + if self.frame_selection_mode == "first": + start_frame = window_start + elif self.frame_selection_mode == "center": + start_frame = window_start + (frames_in_window - num_frames_before_downsample) // 2 + elif self.frame_selection_mode == "random": + max_offset = frames_in_window - num_frames_before_downsample + start_frame = window_start + random.randint(0, max(0, max_offset)) + else: + raise ValueError(f"Unknown frame_selection_mode: {self.frame_selection_mode}") + end_frame = start_frame + num_frames_before_downsample - 1 + + fps = original_fps / temporal_interval + + video_chunk = [] + for idx, frame in enumerate( + ffmpeg_decode_video(input_video_path, scale_hw=(resize_h, resize_w), num_threads=2) + ): + if idx < start_frame: + continue + elif idx <= end_frame: + if (idx - start_frame) % temporal_interval == 0: + video_chunk.append(frame) + else: + break + + if not video_chunk: + log.warning( + f"No frames decoded for sample: {metadata['uuid']} " + f"(start={start_frame}, end={end_frame}, path={metadata['vision_path']})" + ) + return None + + video_chunk = np.stack(video_chunk, axis=0) # [T,H,W,3] + + # Truncate temporally to temporal_compression_factor * N + 1 + target_t = (video_chunk.shape[0] - 1) // self.temporal_compression_factor * self.temporal_compression_factor + 1 + + # Apply spatial center crop and temporal truncation + video_chunk = video_chunk[:target_t, crop_y : crop_y + target_h, crop_x : crop_x + target_w] # [T,H,W,3] + + # THWC -> CTHW + video_chunk = np.transpose(video_chunk, (3, 0, 1, 2)) # [3,T,H,W] + video = 
torch.from_numpy(np.ascontiguousarray(video_chunk)).to(torch.uint8) # [3,T,H,W] + padding_mask = torch.zeros((1, target_h, target_w), dtype=torch.float32) + # image_size: [target_h, target_w, orig_h, orig_w] in pixel space, for the model to crop the video + image_size = torch.tensor([target_h, target_w, target_h, target_w], dtype=torch.float32) + + available_types = [ct for ct in CAPTION_TYPES if ct in t2w_window] + if "qwen3_32b_rewrite-dense" in t2w_window: + caption_key = "qwen3_32b_rewrite-dense" + elif "caption" in t2w_window: + caption_key = "caption" + elif available_types: + available_weights = [CAPTION_TYPES_AND_WEIGHTS[ct] for ct in available_types] + caption_key = random.choices(available_types, weights=available_weights, k=1)[0] + else: + log.warning( + f"No known caption key found in t2w_window for sample {metadata['uuid']}. " + f"Keys: {list(t2w_window)}. Skipping sample." + ) + return None + caption = t2w_window[caption_key] + caption = caption.strip().rstrip(".") + "." + + num_decoded_frames = video.shape[1] + cond_fps = fps if self.conditioning_fps < 0 else self.conditioning_fps + if self.conditioning_fps_noise_std > 0: + noise_factor = np.exp(np.random.randn() * self.conditioning_fps_noise_std) + cond_fps = cond_fps * noise_factor + + if self.caption_suffix: + caption = (caption + " " + self.caption_suffix).strip() + + # CFG dropout: when cfg_dropout_keep_metadata is True, dropout fires + # before appending resolution/duration/FPS so that metadata text is + # preserved even under unconditional guidance. 
+ if self.cfg_dropout_keep_metadata and self.cfg_dropout_rate > 0: + if random.random() < self.cfg_dropout_rate: + caption = "" + + if self.append_duration_fps_timestamps: + duration = num_decoded_frames / cond_fps + suffix = _DURATION_TEMPLATE.format(duration=duration, fps=cond_fps) + caption = caption + " " + suffix + if self.append_resolution_info: + suffix = _RESOLUTION_TEMPLATE.format(height=target_h, width=target_w) + caption = caption + " " + suffix + caption = caption.strip() + + if not self.cfg_dropout_keep_metadata and self.cfg_dropout_rate > 0: + if random.random() < self.cfg_dropout_rate: + caption = "" + text_ids, caption = self._tokenize_caption(caption) + + ret = dict( + __key__=f"{metadata['uuid']}_w{win_idx}", + __url__=metadata["vision_path"], + fps=original_fps, + n_orig_video_frames=total_frames, + chunk_index=win_idx, + frame_start=start_frame, + frame_end=end_frame, + num_frames=video.shape[1], + video=video, + num_multiplier=temporal_interval, + conditioning_fps=cond_fps, + padding_mask=padding_mask, + image_size=image_size, + ai_caption=caption, + sampled_caption_style=caption_key, + text_token_ids=torch.tensor(text_ids), + ) + + if self.conditioning_config is not None: + num_frames_pixel = video.shape[1] + t_latent = 1 + (num_frames_pixel - 1) // self.temporal_compression_factor + frames_options = list(self.conditioning_config.keys()) + weights = list(self.conditioning_config.values()) + num_cond = random.choices(frames_options, weights=weights, k=1)[0] + num_cond = min(num_cond, t_latent - 1) + ret["sequence_plan"] = SequencePlan( + has_text=True, + has_vision=True, + condition_frame_indexes_vision=list(range(num_cond)), + ) + + return ret + + def __iter__(self): + assert not self.is_initialized, "Dataset can only be initialized once." + assert len(self.metadata) > 0, "Did not find any data." 
+ + self.s3_client = boto3.client( + "s3", + **self.s3_credentials, + config=client_config, + ) + # Ranks of the same pp/tp/cp group will have the same dp rank and thus share the same group id. + # zhao: Cosmos3 does not support TP/SP/CP + if self.shard_world_size is not None: + train_world_size = self.shard_world_size + train_rank = self.shard_rank + log.info(f"Using shard_world_size: {train_world_size} and shard_rank: {train_rank}", rank0_only=False) + else: + train_world_size = torch.distributed.get_world_size() + train_rank = torch.distributed.get_rank() + train_dp_rank = train_rank + train_num_dp_groups = train_world_size + train_dp_group_size = 1 + + # Get the data worker rank. Each trainer rank has multiple dataloader workers + worker_info = torch.utils.data.get_worker_info() + if worker_info is not None: + worker_rank = worker_info.id + total_data_ranks = worker_info.num_workers * train_num_dp_groups + data_rank = worker_rank + train_dp_rank * worker_info.num_workers + seed = worker_info.seed + else: + log.warning("No data worker info found. 
Using default worker rank and number of workers.", rank0_only=False) + total_data_ranks = train_num_dp_groups + data_rank = train_dp_rank + seed = 42 + + log.info( + f"train_world_size: {train_world_size}; " + f"train_rank: {train_rank}; " + f"train_dp_rank: {train_dp_rank}; " + f"train_num_dp_groups: {train_num_dp_groups}; " + f"train_dp_group_size: {train_dp_group_size}; " + f"worker_info: {worker_info}; " + f"total_data_ranks: {total_data_ranks}; " + f"data_rank: {data_rank}; " + f"seed: {seed}; " + f"shard_id: {self.shard_id}; " + f"shard_world_size: {self.shard_world_size}; " + f"shard_rank: {self.shard_rank}", + rank0_only=False, + ) + + # Repeat and pad the metadata so that len(self.metadata) is divisible by total_data_ranks + multiplier = max(1, total_data_ranks * 50 // len(self.metadata)) + log.info(f"Dataset multiplier: {multiplier}", rank0_only=False) + self.metadata = self.metadata * multiplier # reduce bias caused by sharding + num_pad = total_data_ranks - len(self.metadata) % total_data_ranks + self.metadata = self.metadata + self.metadata[:num_pad] + # Deterministic ordering: the active branch shuffles with a Random seeded by shard_id; + # the alternative branch sorts by the sha256 hash of vision_path, which groups repeated samples together. + # Then split the list to keep only the data for this rank + if True: # This gives more diversity + random.Random(self.shard_id).shuffle(self.metadata) + log.info(f"Shuffled metadata for shard {self.shard_id}", rank0_only=False) + self.metadata = self.metadata[data_rank::total_data_ranks] + else: + # Keep the repeated samples together to aid cache hits. 
+ self.metadata.sort(key=lambda x: hashlib.sha256(x["vision_path"].encode("utf-8")).hexdigest()) + # Equally chunk the list (guaranteed to be divisible by total_data_ranks) + chunk_size = len(self.metadata) // total_data_ranks + start = data_rank * chunk_size + end = (data_rank + 1) * chunk_size + log.info( + f"DRank {data_rank} has got a chunk {start}-{end} from {len(self.metadata)} data.", rank0_only=False + ) + self.metadata = self.metadata[start:end] + num_unique_vision_paths = len(set(metadata["vision_path"] for metadata in self.metadata)) + log.info( + f"DRank {data_rank} has {len(self.metadata)} data with {num_unique_vision_paths} unique vision_paths.", + rank0_only=False, + ) + + self.is_initialized = True + + # Make sure the data within a DRank is identical + rng = random.Random(data_rank + self.shard_id * 12345) + while True: + rng.shuffle(self.metadata) + for metadata in self.metadata: + sample = self.process_one_sample(metadata) + if sample is None: + log.warning(f"Failed to process sample {metadata['uuid']}, skipping...") + continue + yield sample + + +def _flatten_metadata_by_window(metadata_list: list[dict]) -> list[dict]: + """Expand metadata so each entry maps to exactly one t2w_window. + + Each output dict is a shallow copy of the original whose ``t2w_windows`` + list contains a single window. The ``uuid`` is suffixed with ``_w{idx}`` + so every entry has a unique identifier. + """ + flat: list[dict] = [] + for entry in metadata_list: + for win_idx, window in enumerate(entry["t2w_windows"]): + flat.append( + { + **entry, + "uuid": f"{entry['uuid']}_w{win_idx}", + "t2w_windows": [window], + } + ) + return flat + + +def _load_sft_metadata_from_s3( + s3_client, + jsonl_url: str, + min_frames: int, + uuid_prefix: str = "", + min_short_edge: int = 0, +) -> list[dict]: + """Load SFT metadata from a single JSONL file on S3. + + Returns one entry per video. 
Each entry keeps only the windows whose frame + span is at least *min_frames*; videos with no qualifying windows are dropped. + + Args: + s3_client: Boto3 S3 client + jsonl_url: S3 URL to the JSONL metadata file + min_frames: Minimum number of frames required per window + uuid_prefix: Prefix prepended to each uuid for disambiguation when + multiple JSONL files are loaded + min_short_edge: Drop videos whose shortest spatial edge (min of width, + height) is below this value. 0 disables the filter. + """ + log.info(f"Downloading SFT metadata from {jsonl_url}", rank0_only=False) + metadata_list: list[dict] = [] + num_raw_records = 0 + num_raw_windows = 0 + num_filtered_duration = 0 + num_filtered_windows = 0 + num_filtered_short_edge = 0 + + with io.BytesIO() as buffer: + if jsonl_url.startswith("s3://"): + bucket, key = parse_s3_url(jsonl_url) + s3_client.download_fileobj(Bucket=bucket, Key=key, Fileobj=buffer) + else: + path = Path(jsonl_url).absolute() + jsonl_url = str(path) + buffer.write(path.read_bytes()) + buffer.seek(0) + log.info("Finished downloading. 
Decoding...", rank0_only=False) + + line_iter = gzip.open(buffer, "rb") if jsonl_url.endswith(".gz") else buffer + for line in line_iter: + num_raw_records += 1 + record = json.loads(line.decode("utf-8")) + uuid = f"{uuid_prefix}{record['uuid']}" if uuid_prefix else record["uuid"] + if record["duration"] > 61.0: + log.warning( + f"Skipping video with too long duration: {uuid}, {record['duration']} > 61.0", rank0_only=False + ) + num_filtered_duration += 1 + continue + if min_short_edge > 0 and min(record["width"], record["height"]) < min_short_edge: + num_filtered_short_edge += 1 + continue + + windows = record.get("t2w_windows") + if not windows: + continue + + kept_windows = [] + for window in windows: + num_raw_windows += 1 + frames_in_window = window["end_frame"] - window["start_frame"] + 1 + if frames_in_window < min_frames: + num_filtered_windows += 1 + else: + kept_windows.append(window) + + if not kept_windows: + continue + + vision_path = record["vision_path"] + if "://" not in vision_path and not vision_path.startswith("/"): + # Resolve a relative path against the directory of the JSONL file + vision_path = f"{os.path.dirname(jsonl_url)}/{vision_path}" + + aspect_ratio = get_aspect_ratio(record["width"], record["height"]) + metadata_list.append( + { + "uuid": uuid, + "vision_path": vision_path, + "width": record["width"], + "height": record["height"], + "nb_frames": record.get("nb_frames"), + "framerate": record.get("framerate"), + "aspect_ratio": aspect_ratio, + "t2w_windows": kept_windows, + } + ) + + log.info( + f"Finished decoding SFT metadata from {jsonl_url}. 
" + f"Records: {num_raw_records}, " + f"Duration > 61s: {num_filtered_duration}, " + f"Short edge < {min_short_edge}: {num_filtered_short_edge}, " + f"Windows: {num_raw_windows}, Windows < {min_frames}f: {num_filtered_windows}, " + f"Videos kept: {len(metadata_list)}" + ) + return metadata_list + + +def get_sft_dataset( + jsonl_paths: str | list[str] = "s3://nv-00-10206-vfm/cosmos3_video_sft/human_1k/captions_full.jsonl", + resolution: str = "720", + num_video_frames: int = 93, + temporal_interval_mode: str = "entire_chunk", + frame_selection_mode: str = "center", + tokenizer_config: Optional[Any] = None, + cfg_dropout_rate: float = 0.1, + use_system_prompt: bool = False, + append_duration_fps_timestamps: bool = True, + append_resolution_info: bool = True, + cfg_dropout_keep_metadata: bool = False, + sample_by_window: bool = False, + min_short_edge: int = 0, + caption_suffix: str = "", + conditioning_fps: float = 24, + conditioning_fps_noise_std: float = 0.0, + conditioning_config: dict[int, float] | None = None, + temporal_compression_factor: int = 4, + **kwargs, +) -> SFTDataset: + """Create SFT video dataset from one or more JSONL files on S3. + + Args: + jsonl_paths: S3 URL(s) or local path(s) to JSONL metadata file(s). A single string or + a list of strings. When multiple files are given, their samples are + concatenated and each file's uuids are prefixed with ``{idx}/`` + (the file's index in the list) to avoid collisions. + resolution: Output resolution (e.g., "720", "480") + num_video_frames: Number of frames to extract from each video. + Windows with fewer frames are skipped at decode time. + Use -1 to take native chunks from the t2w_window metadata. + temporal_interval_mode: How to compute the temporal interval between sampled frames. + "force_one": always 1 (consecutive frames at original fps). + "max_30fps": ``max(1, int(original_fps / 30))``, which downsamples + high-fps video toward 30 fps (the effective fps can still exceed 30). + "entire_chunk": spread num_video_frames evenly across the whole window. + frame_selection_mode: Where to select frames within the window. 
+ "center": center-crop temporally (default). + "first": take the first num_video_frames from the window start. + "random": start at a random offset within the window. + tokenizer_config: Config for the tokenizer + cfg_dropout_rate: Probability of replacing the caption with an empty + string (classifier-free guidance dropout) + use_system_prompt: Whether to use the system prompt during tokenization + append_duration_fps_timestamps: If True, appends duration/FPS text to captions + append_resolution_info: If True, appends resolution text to captions + cfg_dropout_keep_metadata: If True, CFG dropout fires before appending + duration/FPS/resolution text so that metadata is preserved during + unconditional guidance. If False (default), dropout fires after + and clears the entire caption including metadata. + sample_by_window: If True, each t2w_window is treated as a separate + sample (the dataset length equals the total number of windows). + If False (default), each video uuid is one sample and a random + window is chosen on every access. + min_short_edge: Drop videos whose shortest spatial edge (min of width, + height) is below this value. 0 (default) disables the filter. + caption_suffix: Text appended to every caption before the + duration/FPS/resolution templates, e.g. + ``"Overall, the video is of poor quality."``. Empty string + (default) disables the suffix. + conditioning_fps: FPS value used for duration/FPS conditioning. + A positive value is used directly (default 24). A negative + value (e.g. ``-1``) means the actual effective FPS + (``original_fps / temporal_interval``) is used instead. + conditioning_fps_noise_std: Standard deviation of log-normal + multiplicative noise applied to ``conditioning_fps``. The FPS + is multiplied by ``exp(std * N(0, 1))``. 0.0 (default) disables + the noise. + conditioning_config: Weighted distribution mapping latent-frame counts + to unnormalized probabilities for image-to-video conditioning. + Example: ``{0: 0.7, 1: 0.2, 2: 0.1}``. ``None`` disables + conditioning (all frames are generation targets). 
+ temporal_compression_factor: VAE temporal compression factor used to + convert pixel frame count to latent frame count. + Returns: + SFTDataset instance + """ + log.info(f"Unknown kwargs for get_sft_dataset: {kwargs}") + assert resolution in VIDEO_RES_SIZE_INFO.keys(), "The provided resolution cannot be found in VIDEO_RES_SIZE_INFO." + + if isinstance(jsonl_paths, str): + jsonl_paths = [jsonl_paths] + + if INTERNAL: + with open("credentials/gcs.secret", "r") as f: + credentials = json.load(f) + else: + credentials = {} + + s3_client = boto3.client("s3", **credentials) + + metadata_list: list[dict] = [] + for idx, jsonl_url in enumerate(jsonl_paths): + prefix = f"{idx}/" if len(jsonl_paths) > 1 else "" + metadata_list.extend( + _load_sft_metadata_from_s3( + s3_client, + jsonl_url, + min_frames=61, + uuid_prefix=prefix, + min_short_edge=min_short_edge, + ) + ) + + total_windows = sum(len(m["t2w_windows"]) for m in metadata_list) + log.info( + f"Finished loading metadata from {len(jsonl_paths)} file(s). 
" + f"Total videos: {len(metadata_list)}, total windows: {total_windows}" + ) + + if sample_by_window: + metadata_list = _flatten_metadata_by_window(metadata_list) + log.info(f"sample_by_window=True: flattened to {len(metadata_list)} samples (one per window)") + + # Deterministic shuffle based on the sha256 hash of uuid + metadata_list.sort(key=lambda x: hashlib.sha256(x["uuid"].encode("utf-8")).hexdigest()) + + dataset = SFTDataset( + metadata=metadata_list, + num_video_frames=num_video_frames, + resolution=resolution, + s3_credentials=credentials, + temporal_interval_mode=temporal_interval_mode, + frame_selection_mode=frame_selection_mode, + tokenizer_config=tokenizer_config, + cfg_dropout_rate=cfg_dropout_rate, + use_system_prompt=use_system_prompt, + append_duration_fps_timestamps=append_duration_fps_timestamps, + append_resolution_info=append_resolution_info, + cfg_dropout_keep_metadata=cfg_dropout_keep_metadata, + caption_suffix=caption_suffix, + conditioning_fps=conditioning_fps, + conditioning_fps_noise_std=conditioning_fps_noise_std, + conditioning_config=conditioning_config, + temporal_compression_factor=temporal_compression_factor, + ) + return dataset diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/sequence_packing.py b/cosmos-inference/cosmos3/_src/vfm/datasets/sequence_packing.py new file mode 100644 index 00000000..db6d5940 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/sequence_packing.py @@ -0,0 +1,2987 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Functions for implementing sequence packing with flexible attention modes. + +This module provides utilities for packing text and image sequences together +with support for different attention patterns (causal, full, noise). + +Key Components: +--------------- +1. Attention Mask Creation: + - create_sparse_mask(): Creates sparse masks for flex attention + - prepare_attention_mask_per_sample(): Creates dense attention masks + +2. Position ID Generation: + - get_flattened_position_ids_extrapolate(): Extrapolation-based position encoding + - get_flattened_position_ids_interpolate(): Interpolation-based position encoding + +3. Tokenizer Setup: + - add_special_tokens(): Adds image boundary tokens to tokenizer + +4. Sequence Packing: + - pack_input_sequence(): Main function for packing text and image sequences + - Helper functions: _pack_text_tokens(), _pack_image_tokens(), _finalize_packed_data() + +Sequence Format: +--------------- +Each sample consists of alternating text and image sections: + [text_tokens] [image_tokens] ... 
+ +Attention Modes: +--------------- +- 'causal': Standard causal/autoregressive attention for text +- 'full': Bidirectional attention for images +- 'noise': Special mode for noise conditioning +""" + +import math +from collections.abc import Mapping, Sequence +from dataclasses import dataclass, field +from typing import Any, Dict, List, Tuple + +import torch +from torch.nn.attention.flex_attention import and_masks, or_masks + +from cosmos3._src.imaginaire.attention.checks import check_valid_tuple_or_element +from cosmos3._src.imaginaire.attention.varlen import generate_multi_dim_varlen_parameters +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.models.mot.unified_3dmrope_utils import ( + get_3d_mrope_ids_text_tokens, + get_3d_mrope_ids_vae_tokens, +) +from cosmos3._src.vfm.models.utils.data_and_condition import GenerationDataClean +from cosmos3._src.vfm.tokenizers.tokenization_qwen2 import Qwen2Tokenizer + +MAX_CAUSAL_LEN_IMAGE_BATCH = 0 +MAX_FULL_LEN_IMAGE_BATCH = 0 +MAX_CAUSAL_LEN_VIDEO_BATCH = 0 +MAX_FULL_LEN_VIDEO_BATCH = 0 + + +# ============================================================================ +# Attention mask creation +# ============================================================================ + + +def create_sparse_mask(document_lens, split_lens, attn_modes, device): + """Create a sparse attention mask combining multiple attention patterns. 
+ + Args: + document_lens: List of document lengths + split_lens: List of split lengths within documents + attn_modes: List of attention modes ('causal', 'full', 'noise') for each split + device: Device to place tensors on + + Returns: + Combined mask using flex attention API + """ + + # Build sequence ID tensors for tracking full/noise attention regions + full_and_noise_seq_ids = [] + noise_seq_ids = [] + + for seq_idx, (length, attn_mode) in enumerate(zip(split_lens, attn_modes)): + # Assign sequence ID for full/noise regions, -1 for causal regions + seq_id = seq_idx if attn_mode in ["full", "noise"] else -1 + full_and_noise_seq_ids.extend([seq_id] * length) + + # Assign sequence ID only for noise regions + noise_seq_id = seq_idx if attn_mode == "noise" else -1 + noise_seq_ids.extend([noise_seq_id] * length) + + full_and_noise_seq_id = torch.tensor(full_and_noise_seq_ids, device=device) # [seq_len] + noise_seq_id = torch.tensor(noise_seq_ids, device=device) # [seq_len] + document_id = torch.cat([torch.full((l,), i) for i, l in enumerate(document_lens, start=1)]).to(device) # [seq_len] + + # Define component mask functions + def causal_mask(b, h, q_idx, kv_idx): + """Standard causal attention: query can only attend to prior keys.""" + return q_idx >= kv_idx + + def full_and_noise_mask(b, h, q_idx, kv_idx): + """Allow attention within same full/noise sequence.""" + return (full_and_noise_seq_id[q_idx] == full_and_noise_seq_id[kv_idx]) & (full_and_noise_seq_id[q_idx] >= 0) + + def remove_noise_mask(b, h, q_idx, kv_idx): + """Prevent attending to noise tokens from different sequences.""" + return ~((noise_seq_id[kv_idx] >= 0) & (noise_seq_id[q_idx] != noise_seq_id[kv_idx])) + + def sample_mask(b, h, q_idx, kv_idx): + """Ensure attention stays within same document/sample.""" + return document_id[q_idx] == document_id[kv_idx] + + # Combine all masks: (causal OR full_and_noise) AND remove_noise AND sample + return and_masks(or_masks(causal_mask, full_and_noise_mask), 
remove_noise_mask, sample_mask) + + +def prepare_attention_mask_per_sample(split_lens, attn_modes, device="cpu"): + """Prepare dense attention mask for a single sample with multiple splits. + + Args: + split_lens: List of integers indicating length of each split within the sample + attn_modes: List of attention modes for each split ('causal', 'full', or 'noise') + device: Device to place the attention mask tensor on + + Returns: + Attention mask tensor of shape (sample_len, sample_len) with -inf for masked positions + """ + sample_len = sum(split_lens) + attention_mask = torch.zeros((sample_len, sample_len), dtype=torch.bool, device=device) # [sample_len,sample_len] + + # First pass: Set up basic attention patterns for each split + current_pos = 0 + for split_len, attn_mode in zip(split_lens, attn_modes): + assert attn_mode in ["causal", "full", "noise"], f"Invalid attention mode: {attn_mode}" + + split_start = current_pos + split_end = current_pos + split_len + + if attn_mode == "causal": + # Causal: lower triangular within split + full attention to previous splits + attention_mask[split_start:split_end, split_start:split_end] = torch.ones( + (split_len, split_len), device=device + ).tril() # [split_len,split_len] + attention_mask[split_start:split_end, :split_start] = 1 + else: # "full" or "noise" + # Full attention within split and to previous splits + attention_mask[split_start:split_end, split_start:split_end] = torch.ones( + (split_len, split_len), device=device + ) # [split_len,split_len] + attention_mask[split_start:split_end, :split_start] = 1 + + current_pos += split_len + + # Second pass: Handle noise mode - mask out noise columns except within same split + current_pos = 0 + for split_len, attn_mode in zip(split_lens, attn_modes): + if attn_mode == "noise": + split_start = current_pos + split_end = current_pos + split_len + + # Zero out the entire column for noise tokens + attention_mask[:, split_start:split_end] = 0 + # But allow self-attention within 
the noise split + attention_mask[split_start:split_end, split_start:split_end] = 1 + + current_pos += split_len + + # Convert boolean mask to float with -inf for masked positions + attention_mask = torch.zeros_like(attention_mask, dtype=torch.float).masked_fill_( + ~attention_mask, float("-inf") + ) # [sample_len,sample_len] + + return attention_mask + + +# ============================================================================ +# Tokenizer utilities +# ============================================================================ + + +def add_special_tokens(tokenizer): + """Add image-related special tokens to tokenizer if not already present. + + Args: + tokenizer: Tokenizer to add special tokens to + + Returns: + Tuple of (modified tokenizer, dict of new token IDs) + """ + # Collect existing special tokens + existing_special_tokens = [] + for key, value in tokenizer.special_tokens_map.items(): + if isinstance(value, str): + existing_special_tokens.append(value) + elif isinstance(value, list): + existing_special_tokens.extend(value) + + # Define image boundary tokens to add if missing + tokens_to_add = [] + if "<|vision_start|>" not in existing_special_tokens: + tokens_to_add.append("<|vision_start|>") + if "<|vision_end|>" not in existing_special_tokens: + tokens_to_add.append("<|vision_end|>") + + # Add new tokens to tokenizer vocabulary + if tokens_to_add: + tokenizer.add_tokens(tokens_to_add) + + # Get token IDs for image boundary tokens + new_token_ids = { + "start_of_generation": tokenizer.convert_tokens_to_ids("<|vision_start|>"), + "end_of_generation": tokenizer.convert_tokens_to_ids("<|vision_end|>"), + } + + return tokenizer, new_token_ids + + +# ============================================================================ +# Data structures +# ============================================================================ + + +@dataclass +class ModalityData: + """Unified container for a single generation modality's data. 
+ + This dataclass serves dual purposes: + 1. During packing: Acts as a builder, accumulating data in lists + 2. After finalize(): Holds finalized tensors ready for model consumption + + Attributes: + sequence_indexes: Indices in the packed sequence where this modality's tokens appear. + List during building, Tensor after finalize(). + timesteps: Diffusion timesteps for each noised token. + List during building, Tensor after finalize(). + mse_loss_indexes: Indices where MSE loss should be computed (noised tokens only). + List during building, Tensor after finalize(). + token_shapes: Shape metadata for each sample's tokens. + For vision: list of (T, H, W) tuples. + For action: list of (T,) tuples. + tokens: The actual latent tokens. List during build, Tensor after finalize(). + condition_mask: Mask indicating clean frames (1=clean, 0=noised). Only after finalize(). + noisy_frame_indexes: Indices of noised frames. Constructed from condition_mask during + sequence packing to reduce GPU->CPU synchronization later. Only after finalize(). + domain_id: Domain ID for multi-domain training. Only after finalize(). NOTE: only used for action modality. + raw_action_dim: Raw action dimension. Only after finalize(). NOTE: only used for action modality. 
+ """ + + # Core tracking (list during build, tensor after finalize) + sequence_indexes: list[int] | torch.Tensor = field(default_factory=list) + timesteps: list[float] | torch.Tensor = field(default_factory=list) + mse_loss_indexes: list[int] | torch.Tensor = field(default_factory=list) + # list[tuple[int,int,int]] for vision, list[tuple[int]] for action, list[tuple[int,int,int]] for sound + token_shapes: list = field(default_factory=list) + + # Populated during finalization (from GenerationDataClean / noise path) + tokens: list[torch.Tensor] = field(default_factory=list) + condition_mask: list[torch.Tensor] = field(default_factory=list) + noisy_frame_indexes: list[torch.Tensor] = field(default_factory=list) + domain_id: list[torch.Tensor] = field(default_factory=list) + raw_action_dim: list[torch.Tensor | None] | None = field(default_factory=list) + + def to_cuda(self) -> None: + """Move all tensor fields to CUDA in-place.""" + if isinstance(self.sequence_indexes, torch.Tensor): + self.sequence_indexes = self.sequence_indexes.cuda() + if isinstance(self.timesteps, torch.Tensor): + self.timesteps = self.timesteps.cuda() + if isinstance(self.mse_loss_indexes, torch.Tensor): + self.mse_loss_indexes = self.mse_loss_indexes.cuda() + self.tokens = [token.cuda() for token in self.tokens] + self.condition_mask = [cm.cuda() for cm in self.condition_mask] + self.noisy_frame_indexes = [ni.cuda() for ni in self.noisy_frame_indexes] + self.domain_id = [d.cuda() for d in self.domain_id] + # raw_action_dim is optional (e.g., when action-channel masking is disabled). + if self.raw_action_dim is not None: + self.raw_action_dim = [d.cuda() if d is not None else None for d in self.raw_action_dim] + + +@dataclass +class PackedSequence: + """Unified sequence container - works as builder during packing and final output. 
+ + This dataclass replaces the old SequenceStatus + PackedSequence pattern: + - Build phase: Accumulate data using lists, modalities use ModalityData builders + - After finalize(): Ready for model consumption with tensors + + Attributes: + # Sequence structure + sample_lens: Length of each sample in the packed sequence. + split_lens: Length of each split (text/vision/action sections). + attn_modes: Attention mode for each split ('causal', 'full'). + is_image_batch: Whether this batch contains images (vs videos). + sequence_length: Total length of packed sequence. Computed during finalize(). + + # Build-time tracking (not used after finalize) + curr: Current position in the packed sequence during building. + + # Text modality (list during build, tensor after finalize) + text_ids: All text token IDs (including special tokens). + text_indexes: Indices where text tokens appear in sequence. + position_ids: RoPE position IDs for all tokens. + + # Loss computation - Cross Entropy (text) + label_ids: Label IDs for cross-entropy loss. + ce_loss_indexes: Indices for computing cross-entropy loss. + ce_loss_weights: Weights for cross-entropy loss. + + # Generation modalities - named fields for type safety + vision: Vision modality data (images/videos). None if no vision in batch. + action: Action modality data (robotics). None if no actions in batch. + sound: Sound modality data (audio). None if no sound in batch. 
+ """ + + # Sequence structure + sample_lens: list[int] = field(default_factory=list) + split_lens: list[int] = field(default_factory=list) + attn_modes: list[str] = field(default_factory=list) + is_image_batch: bool = False + sequence_length: int = 0 + + # Build-time tracking (used during packing, not after finalize) + curr: int = 0 + + # Text modality (list during build, tensor after finalize) + text_ids: list[int] | torch.Tensor = field(default_factory=list) + text_indexes: list[int] | torch.Tensor = field(default_factory=list) + position_ids: list[int] | torch.Tensor = field(default_factory=list) + + # Loss computation - Cross Entropy (text) + label_ids: list[int] | torch.Tensor | None = field(default_factory=list) + ce_loss_indexes: list[int] | torch.Tensor | None = field(default_factory=list) + ce_loss_weights: list[float] | torch.Tensor | None = field(default_factory=list) + + # Build-time mRoPE tracking (used during packing, not after finalize) + # When _use_mrope=True, position_ids accumulates (3, N) tensors instead of ints, + # and finalize() produces a (3, total_seq_len) tensor instead of (total_seq_len,). + _use_mrope: bool = False + # Running temporal index for mRoPE position ID generation within a single sample. + # Reset to 0 at the start of each sample, then advanced by text and vision helpers + # as segments are packed. Action reuses the pre-vision snapshot (parallel temporal + # range) without advancing it. Float when FPS modulation is enabled. + # E.g. offset=0 -> text(4 tokens) -> offset=4 -> vision(3 frames) -> offset=7. + _mrope_temporal_offset: int | float = 0 + _mrope_reset_spatial: bool = True + + # Temporal causal: whether supertoken 0's action slot contains null tokens. + # True for all training calls and AR frame 0; False for AR frame N>0 (real actions). + # Used by three_way_attention to zero out V for null action tokens (inline when attention_meta.null_action_supertokens=True). 
+ null_action_supertokens: bool = False + + # Temporal causal: number of action tokens prefixing each vision supertoken. + # Equals temporal_compression_factor when actions are packed inline; 0 when + # action_gen=False or for non-temporal-causal layouts. Single source of truth + # for downstream attention/KV-cache code (per-supertoken layout is + # num_action_tokens_per_supertoken + H_p * W_p). + num_action_tokens_per_supertoken: int = 0 + + # Generation modalities - NAMED FIELDS for type safety + vision: ModalityData | None = None + action: ModalityData | None = None + sound: ModalityData | None = None + + def finalize( + self, + gen_data_clean: GenerationDataClean, + ) -> "PackedSequence": + """Convert all lists to tensors and compute derived values. + + Args: + gen_data_clean: GenerationDataClean for metadata (e.g., action domain IDs). + + Returns: + New PackedSequence instance with tensors instead of lists. + """ + # Compute sequence length + sequence_length = sum(self.sample_lens) + sample_lens = self.sample_lens.copy() + split_lens = self.split_lens.copy() + attn_modes = self.attn_modes.copy() + + # Prepare loss-related tensors (cross-entropy) + label_ids: torch.Tensor | None = None + ce_loss_indexes: torch.Tensor | None = None + ce_loss_weights: torch.Tensor | None = None + if self.label_ids and len(self.label_ids) > 0: + label_ids = torch.tensor(self.label_ids) # [N_ce_tokens] + ce_loss_indexes = torch.tensor(self.ce_loss_indexes) # [N_ce_tokens] + ce_loss_weights = torch.tensor(self.ce_loss_weights) # [N_ce_tokens] + + # The condition_mask and noisy_frame_indexes are kept as lists to support variable shapes. 
+ + # Finalize vision modality + vision: ModalityData | None = None + if self.vision is not None and len(self.vision.sequence_indexes) > 0: + vision = ModalityData( + sequence_indexes=torch.tensor(self.vision.sequence_indexes, dtype=torch.long), # [N_vision_tokens] + timesteps=torch.tensor(self.vision.timesteps), # [N_vision_noisy_tokens] + mse_loss_indexes=torch.tensor( + self.vision.mse_loss_indexes, dtype=torch.long + ), # [N_vision_noisy_tokens] + token_shapes=list(self.vision.token_shapes), + tokens=self.vision.tokens, + condition_mask=list(self.vision.condition_mask), + noisy_frame_indexes=list(self.vision.noisy_frame_indexes), + ) + + # Finalize action modality + action: ModalityData | None = None + if self.action is not None and len(self.action.sequence_indexes) > 0: + action = ModalityData( + sequence_indexes=torch.tensor(self.action.sequence_indexes, dtype=torch.long), # [N_action_tokens] + timesteps=torch.tensor(self.action.timesteps), # [N_action_noisy_tokens] + mse_loss_indexes=torch.tensor( + self.action.mse_loss_indexes, dtype=torch.long + ), # [N_action_noisy_tokens] + token_shapes=list(self.action.token_shapes), + tokens=self.action.tokens, + condition_mask=list(self.action.condition_mask), # Keep as list to support variable shapes + noisy_frame_indexes=list(self.action.noisy_frame_indexes), + domain_id=( + gen_data_clean.action_domain_id + if gen_data_clean.action_domain_id is not None + else [torch.zeros(1, dtype=torch.long)] * len(self.action.token_shapes) + ), + raw_action_dim=gen_data_clean.raw_action_dim, + ) + + # Finalize sound modality (placeholder for future) + sound: ModalityData | None = None + if self.sound is not None and len(self.sound.sequence_indexes) > 0: + sound = ModalityData( + sequence_indexes=torch.tensor(self.sound.sequence_indexes, dtype=torch.long), # [N_sound_tokens] + timesteps=torch.tensor(self.sound.timesteps), # [N_sound_noisy_tokens] + mse_loss_indexes=torch.tensor(self.sound.mse_loss_indexes, dtype=torch.long), # 
[N_sound_noisy_tokens] + token_shapes=list(self.sound.token_shapes), + tokens=self.sound.tokens, + condition_mask=list(self.sound.condition_mask), + noisy_frame_indexes=list(self.sound.noisy_frame_indexes), + ) + + # Finalize position IDs: 3D mRoPE (3, seq_len) or 1D RoPE (seq_len,) + if self._use_mrope and len(self.position_ids) > 0 and isinstance(self.position_ids[0], torch.Tensor): + mrope_tensors: list[torch.Tensor] = self.position_ids # type: ignore[assignment] + position_ids = torch.cat(mrope_tensors, dim=1) # [3,actual_seq_len] + else: # Original 1D RoPE from Bagel, where all the media tokens share the same 1D position ID + position_ids = torch.tensor(self.position_ids) # [seq_len] + + return PackedSequence( + # Sequence structure + sequence_length=sequence_length, + sample_lens=sample_lens, + split_lens=split_lens, + attn_modes=attn_modes, + is_image_batch=gen_data_clean.is_image_batch, + # Text modality (converted to tensors) + text_ids=torch.tensor(self.text_ids, dtype=torch.long), # [N_text_tokens] + text_indexes=torch.tensor(self.text_indexes, dtype=torch.long), # [N_text_tokens] + position_ids=position_ids, # [seq_len] or [3,seq_len] + # Loss computation - Cross Entropy + label_ids=label_ids, + ce_loss_indexes=ce_loss_indexes, + ce_loss_weights=ce_loss_weights, + # Generation modalities + vision=vision, + action=action, + sound=sound, + # Temporal causal + null_action_supertokens=self.null_action_supertokens, + num_action_tokens_per_supertoken=self.num_action_tokens_per_supertoken, + ) + + def to_cuda(self) -> None: + """Move all tensor fields to CUDA in-place.""" + if isinstance(self.text_ids, torch.Tensor): + self.text_ids = self.text_ids.cuda() + if isinstance(self.text_indexes, torch.Tensor): + self.text_indexes = self.text_indexes.cuda() + if isinstance(self.position_ids, torch.Tensor): + self.position_ids = self.position_ids.cuda() + if isinstance(self.label_ids, torch.Tensor): + self.label_ids = self.label_ids.cuda() + if 
isinstance(self.ce_loss_indexes, torch.Tensor): + self.ce_loss_indexes = self.ce_loss_indexes.cuda() + if isinstance(self.ce_loss_weights, torch.Tensor): + self.ce_loss_weights = self.ce_loss_weights.cuda() + if self.vision is not None: + self.vision.to_cuda() + if self.action is not None: + self.action.to_cuda() + if self.sound is not None: + self.sound.to_cuda() + + +@dataclass +class SequencePlan: + """Plan describing which modalities are present in a sample. + + This dataclass tracks the presence of different modalities (text, vision, action, sound) + and their conditioning configurations for a dataset sample. Unlike PackedSequence, + which holds the actual tensor data, this class provides a lightweight summary + of what modalities exist and how they should be conditioned. + + Attributes: + has_text: Whether text/caption tokens are present for this sample. + Used for text-conditioned generation (e.g., text-to-image/video). + has_vision: Whether vision input (image or video latents) is present. + Defaults to False. + condition_frame_indexes_vision: Indexes of latent vision frames that are clean/conditioning. + [] means all frames are noised/supervised. + All frames specified means all frames are clean (no MSE supervision). + For multi-item samples (e.g. image editing where each sample has multiple + separately-encoded images), this applies to each vision item individually. + The number of items per sample is tracked by + ``GenerationDataClean.num_vision_items_per_sample``. + has_action: Whether action input is present for robotics/embodied AI tasks. + Defaults to False. + condition_frame_indexes_action: Indexes of action steps that are clean/conditioning. + [] means all steps are noised/supervised. + All steps specified means all steps are clean (no MSE supervision). 
+ """ + + # -- understanding (text conditioning) -- + has_text: bool + + # -- vision modality -- + has_vision: bool = False + condition_frame_indexes_vision: list[int] = field(default_factory=list) + # If True, all vision items in this sample share the same temporal mRoPE grid + # (controlnet-style transfer: target frame i is spatio-temporally aligned with + # control frame i). Each item gets the same temporal_offset; spatial reset + # behavior is unchanged. Requires num_vision_items_per_sample > 1, equal latent_t, + # and equal fps across items. Default False preserves single-clip and + # image-editing semantics where items represent distinct time states. + share_vision_temporal_positions: bool = False + + # -- action modality -- + has_action: bool = False + condition_frame_indexes_action: list[int] = field(default_factory=list) + action_start_frame_offset: int = 1 + + # -- sound modality -- + has_sound: bool = False + condition_frame_indexes_sound: list[int] = field(default_factory=list) + + def as_dict(self) -> dict: + return { + "has_text": self.has_text, + "has_vision": self.has_vision, + "has_action": self.has_action, + "has_sound": self.has_sound, + "condition_frame_indexes_vision": self.condition_frame_indexes_vision, + "condition_frame_indexes_action": self.condition_frame_indexes_action, + "condition_frame_indexes_sound": self.condition_frame_indexes_sound, + "share_vision_temporal_positions": self.share_vision_temporal_positions, + } + + +# ============================================================================ +# Helper functions for packing sequences +# ============================================================================ + + +def compute_text_split_length( + num_caption_tokens: int, + special_tokens: Dict[str, int], + has_generation: bool = True, +) -> int: + """Compute the total text split length without mutating any state. 
+ + This is the number of token positions occupied by the text split in a + packed sequence: caption tokens + optional BOS + EOS + optional BOV. + + Args: + num_caption_tokens: Number of raw caption token IDs (before special tokens). + special_tokens: Dictionary of special token IDs (checked for ``"bos_token_id"``). + has_generation: Whether a start-of-generation (BOV) token follows text. + + Returns: + Total text split length (positions consumed in the packed sequence). + """ + n = num_caption_tokens + if "bos_token_id" in special_tokens: + n += 1 + n += 1 # EOS + if has_generation: + n += 1 # start-of-generation / BOV + return n + + +def _pack_text_tokens( + packed_seq: PackedSequence, + text_ids: List[int], + special_tokens: Dict[str, int], + curr_rope_id: int, + has_generation: bool, + use_float_positions: bool = False, +) -> Tuple[int, int, int]: + """Pack text tokens into the sequence. + + Args: + packed_seq: PackedSequence instance to accumulate data into. + text_ids: List of text token IDs (integers). + special_tokens: Dictionary of special token IDs. + curr_rope_id: Current RoPE position ID. + has_generation: Whether there's media/action after text. + use_float_positions: If True, generate float position IDs for 3D mRoPE + (for consistency with FPS-modulated vision tokens). + + Returns: + Tuple of (updated curr_rope_id, split_length, sample_length). 
+ """ + # Ensure we're in build mode (fields are lists, not tensors) + assert isinstance(packed_seq.text_ids, list), "PackedSequence must be in build mode" + assert isinstance(packed_seq.text_indexes, list) + assert isinstance(packed_seq.position_ids, list) + assert isinstance(packed_seq.label_ids, list) + assert isinstance(packed_seq.ce_loss_indexes, list) + assert isinstance(packed_seq.ce_loss_weights, list) + + curr = packed_seq.curr + + # Prepend BOS token if available + if "bos_token_id" in special_tokens: + shifted_text_ids = [special_tokens["bos_token_id"]] + text_ids + else: + shifted_text_ids = text_ids + + split_len = 0 + + # Add text tokens to sequence + packed_seq.text_ids.extend(shifted_text_ids) + packed_seq.text_indexes.extend(range(curr, curr + len(shifted_text_ids))) + + # Configure loss computation for text tokens + packed_seq.ce_loss_indexes.extend(range(curr, curr + len(shifted_text_ids))) + packed_seq.ce_loss_weights.extend([1.0] * len(shifted_text_ids)) + packed_seq.label_ids.extend(text_ids[1:] + [special_tokens["eos_token_id"]]) + + curr += len(shifted_text_ids) + split_len += len(shifted_text_ids) + + # Add EOS token + packed_seq.text_ids.append(special_tokens["eos_token_id"]) + packed_seq.text_indexes.append(curr) + curr += 1 + split_len += 1 + + # Add start-of-generation token, but only if there's media/action present. + if has_generation: + packed_seq.text_ids.append(special_tokens["start_of_generation"]) + packed_seq.text_indexes.append(curr) + curr += 1 + split_len += 1 + + # Sanity check -- compute_text_split_length() is called elsewhere. 
 + assert split_len == compute_text_split_length(len(text_ids), special_tokens, has_generation) + + # Update position IDs and attention mode for text split + if packed_seq._use_mrope: + text_mrope_ids, packed_seq._mrope_temporal_offset = get_3d_mrope_ids_text_tokens( + num_tokens=split_len, + temporal_offset=packed_seq._mrope_temporal_offset, + use_float_positions=use_float_positions, + ) # text_mrope_ids: [3,split_len] + packed_seq.position_ids.append(text_mrope_ids) + else: + packed_seq.position_ids.extend(range(curr_rope_id, curr_rope_id + split_len)) + packed_seq.attn_modes.append("causal") + packed_seq.split_lens.append(split_len) + + packed_seq.curr = curr + return curr_rope_id + split_len, split_len, split_len + + +def _pack_vision_tokens( + packed_seq: PackedSequence, + input_vision_tokens: torch.Tensor, + condition_frame_indexes_vision: list[int], + input_timestep: float | torch.Tensor, + curr_rope_id: int, + latent_patch_size: int = 1, + vision_fps: float | None = None, + enable_fps_modulation: bool = False, + base_fps: float = 24.0, + temporal_compression_factor: int = 4, +) -> int: + """Pack vision tokens into the sequence. + + Args: + packed_seq: PackedSequence instance to accumulate data into. + input_vision_tokens: Vision latent tokens as a 5-D tensor whose last three dims are + (T, H, W); the function unpacks the shape as (_, _, T, H, W). + condition_frame_indexes_vision: Indexes of conditioning frames. + input_timestep: Diffusion timestep. Either a float (teacher_forcing/none: all frames + share the same sigma) or a Tensor(T_max,) (diffusion_forcing: per-frame sigma, + indexed as input_timestep[frame_idx] for each noisy frame). + curr_rope_id: Current RoPE position ID. + latent_patch_size: Patch size for latent patchification. + vision_fps: Frames per second of the video. Used when enable_fps_modulation=True. + enable_fps_modulation: If True, scale temporal position IDs based on video FPS. + base_fps: Base FPS for normalization (default 24.0). + temporal_compression_factor: VAE temporal compression factor (default 4). 
+ Returns: + Vision split length. + """ + # Ensure we're in build mode + assert isinstance(packed_seq.position_ids, list), "PackedSequence must be in build mode" + + curr = packed_seq.curr + vision_split_len = 0 + + # Initialize vision modality if not present. + if packed_seq.vision is None: + packed_seq.vision = ModalityData() + + # Ensure vision modality is in build mode + assert isinstance(packed_seq.vision.sequence_indexes, list) + assert isinstance(packed_seq.vision.mse_loss_indexes, list) + assert isinstance(packed_seq.vision.timesteps, list) + assert isinstance(packed_seq.vision.tokens, list) + + # Compute position IDs for image patches + _, _, latent_t, latent_h, latent_w = input_vision_tokens.shape + if latent_patch_size < 1: + raise ValueError(f"latent_patch_size must be >= 1, got {latent_patch_size}") + # Use ceil to support latent dims not divisible by patch size (padding handled in network) + patch_h = math.ceil(latent_h / latent_patch_size) + patch_w = math.ceil(latent_w / latent_patch_size) + packed_seq.vision.token_shapes.append((latent_t, patch_h, patch_w)) + packed_seq.vision.tokens.append(input_vision_tokens) + + # Add image token indexes and loss information + num_vision_tokens = latent_t * patch_h * patch_w + packed_seq.vision.sequence_indexes.extend(range(curr, curr + num_vision_tokens)) + + # Supervise vision tokens based on conditioning frames + condition_set = {idx for idx in condition_frame_indexes_vision if 0 <= idx < latent_t} + assert isinstance(packed_seq.vision.condition_mask, list) + + vision_condition_mask = torch.zeros( + (latent_t, 1, 1), device=input_vision_tokens.device, dtype=input_vision_tokens.dtype + ) # [T,1,1] + for frame_idx in condition_set: + vision_condition_mask[frame_idx, 0, 0] = 1.0 + packed_seq.vision.condition_mask.append(vision_condition_mask) + + vision_noisy_frame_indexes = torch.tensor( + [idx for idx in range(latent_t) if idx not in condition_set], + device=input_vision_tokens.device, + dtype=torch.long, + ) 
# [N_noisy_frames] + assert isinstance(packed_seq.vision.noisy_frame_indexes, list) + packed_seq.vision.noisy_frame_indexes.append(vision_noisy_frame_indexes) + + frame_token_stride = patch_h * patch_w + for frame_idx in range(latent_t): + if frame_idx in condition_set: + continue + frame_start = curr + frame_idx * frame_token_stride + frame_end = frame_start + frame_token_stride + packed_seq.vision.mse_loss_indexes.extend(range(frame_start, frame_end)) + if isinstance(input_timestep, torch.Tensor): + frame_ts = input_timestep[frame_idx].item() + else: + frame_ts = input_timestep + packed_seq.vision.timesteps.extend([frame_ts] * frame_token_stride) + + curr += num_vision_tokens + vision_split_len += num_vision_tokens + + # Update position IDs for image split + if packed_seq._use_mrope: + # Determine FPS for this vision segment (None disables FPS modulation) + effective_fps = vision_fps if enable_fps_modulation else None + + vision_mrope_ids, packed_seq._mrope_temporal_offset = get_3d_mrope_ids_vae_tokens( + grid_t=latent_t, + grid_h=patch_h, + grid_w=patch_w, + temporal_offset=packed_seq._mrope_temporal_offset, + reset_spatial_indices=packed_seq._mrope_reset_spatial, + fps=effective_fps, + base_fps=base_fps, + temporal_compression_factor=temporal_compression_factor, + ) # vision_mrope_ids: [3,N_vision_tokens] + packed_seq.position_ids.append(vision_mrope_ids) + else: + # All image tokens share the same RoPE position ID + packed_seq.position_ids.extend([curr_rope_id] * vision_split_len) + + packed_seq.curr = curr + return vision_split_len + + +def _pack_action_tokens( + packed_seq: PackedSequence, + input_action_tokens: torch.Tensor, + condition_frame_indexes_action: list[int], + input_timestep: float, + curr_rope_id: int, + action_temporal_offset: int | float = 0, + enable_fps_modulation: bool = False, + base_fps: float = 24.0, + action_fps: float | None = None, + base_temporal_compression_factor: int | None = None, + action_start_frame_offset: int = 1, +) -> int: 
+ """Pack action tokens into the sequence. + + Args: + packed_seq: PackedSequence instance to accumulate data into. + input_action_tokens: Action latent tokens (T, D). + condition_frame_indexes_action: Indexes of conditioning action steps. + input_timestep: Diffusion timestep. + curr_rope_id: Current RoPE position ID. + action_temporal_offset: Temporal offset for action mRoPE IDs (typically + the vision start offset so action aligns temporally with vision). + enable_fps_modulation: If True, scale temporal position IDs based on FPS. + base_fps: Base FPS for normalization (default 24.0). + action_fps: Frames per second of the action data. Used when enable_fps_modulation=True. + base_temporal_compression_factor: Base temporal compression factor for FPS scaling. + Should be set to the vision temporal compression factor (e.g. 4) so that action + tokens advance at frame rate (4x finer) relative to vision latent frames. + Only affects behavior when FPS modulation is enabled. + action_start_frame_offset: Frame offset for aligning action[0] with the + corresponding vision frame. Default 1 aligns action[0] with vision frame 1. + Returns: + Number of action tokens added. 
+ """ + # Ensure we're in build mode + assert isinstance(packed_seq.position_ids, list), "PackedSequence must be in build mode" + + curr = packed_seq.curr + action_split_len = input_action_tokens.shape[0] + + # Initialize action modality if not present + if packed_seq.action is None: + packed_seq.action = ModalityData() + + # Ensure action modality is in build mode + assert isinstance(packed_seq.action.sequence_indexes, list) + assert isinstance(packed_seq.action.mse_loss_indexes, list) + assert isinstance(packed_seq.action.timesteps, list) + assert isinstance(packed_seq.action.tokens, list) + + # Add token indexes and loss information + action_indexes = list(range(curr, curr + action_split_len)) + packed_seq.action.sequence_indexes.extend(action_indexes) + packed_seq.action.token_shapes.append((action_split_len,)) + packed_seq.action.tokens.append(input_action_tokens) + + + condition_set = {idx for idx in condition_frame_indexes_action if 0 <= idx < action_split_len} + assert isinstance(packed_seq.action.condition_mask, list) + + action_condition_mask = torch.zeros( + (action_split_len, 1), device=input_action_tokens.device, dtype=input_action_tokens.dtype + ) # [T_action,1] + for frame_idx in condition_set: + action_condition_mask[frame_idx, 0] = 1.0 + packed_seq.action.condition_mask.append(action_condition_mask) + + action_noisy_frame_indexes = torch.tensor( + [idx for idx in range(action_split_len) if idx not in condition_set], + device=input_action_tokens.device, + dtype=torch.long, + ) # [N_noisy_action_frames] + assert isinstance(packed_seq.action.noisy_frame_indexes, list) + packed_seq.action.noisy_frame_indexes.append(action_noisy_frame_indexes) + + frame_token_stride = 1 # Action has 1 token per frame (no spatial dimension) + for frame_idx in range(action_split_len): + if frame_idx in condition_set: + continue + frame_start = curr + frame_idx * frame_token_stride + frame_end = frame_start + frame_token_stride + 
packed_seq.action.mse_loss_indexes.extend(range(frame_start, frame_end)) + packed_seq.action.timesteps.extend([input_timestep] * frame_token_stride) + + # Update RoPE position IDs for action tokens. + if packed_seq._use_mrope: + # 3D mRoPE: action tokens use a 1x1 spatial grid with start_frame_offset=1 + # so action[0] (null token) aligns with vision frame 1, not frame 0. + effective_fps = action_fps if enable_fps_modulation else None + + action_mrope_ids, _ = get_3d_mrope_ids_vae_tokens( + grid_t=action_split_len, + grid_h=1, + grid_w=1, + temporal_offset=action_temporal_offset, + reset_spatial_indices=packed_seq._mrope_reset_spatial, + fps=effective_fps, + base_fps=base_fps, + temporal_compression_factor=1, # Action is at frame rate (no temporal compression) + base_temporal_compression_factor=base_temporal_compression_factor, + start_frame_offset=action_start_frame_offset, # Align action[0] with vision frame action_start_frame_offset + ) # action_mrope_ids: [3,N_action_tokens] + packed_seq.position_ids.append(action_mrope_ids) + # Note: we don't update _mrope_temporal_offset here because action tokens + # share the temporal space with vision tokens (they run in parallel). + else: + # All action tokens share the SAME RoPE position as vision tokens (see docs/sequence_packing.md). + packed_seq.position_ids.extend([curr_rope_id] * action_split_len) + + packed_seq.curr = curr + action_split_len + return action_split_len + + +def _pack_sound_tokens( + packed_seq: PackedSequence, + input_sound_tokens: torch.Tensor, + condition_frame_indexes_sound: list[int], + input_timestep: float, + curr_rope_id: int, + sound_temporal_offset: int | float = 0, + enable_fps_modulation: bool = False, + base_fps: float = 24.0, + sound_fps: float | None = None, +) -> int: + """Pack sound/audio tokens into the sequence. + + Sound latents have shape [C, T] where C is channels and T is temporal frames. 
+ Sound tokens are added to the unified generation split to maintain FactoredSequencePack's + 2-split invariant (causal + full). + + Args: + packed_seq: PackedSequence instance to accumulate data into. + input_sound_tokens: Sound latent tokens (C, T). + condition_frame_indexes_sound: Indexes of conditioning frames. + [] means all frames are noised/supervised. + All frames specified means all frames are clean (no MSE supervision). + input_timestep: Diffusion timestep. + curr_rope_id: Current RoPE position ID. + sound_temporal_offset: Temporal offset for m-RoPE position IDs (aligned with vision start). + enable_fps_modulation: If True, scale temporal positions by FPS ratio. + base_fps: Base FPS for normalization (default 24.0). + sound_fps: Sound latent FPS (e.g., 25.0). Used for FPS-aware m-RoPE positions. + + Returns: + Number of sound tokens added. + """ + # Ensure we're in build mode + assert isinstance(packed_seq.position_ids, list), "PackedSequence must be in build mode" + + curr = packed_seq.curr + + # Sound latent shape: [C, T] → T tokens + _, sound_split_len = input_sound_tokens.shape + + # Initialize sound modality if not present + if packed_seq.sound is None: + packed_seq.sound = ModalityData() + + # Ensure sound modality is in build mode + assert isinstance(packed_seq.sound.sequence_indexes, list) + assert isinstance(packed_seq.sound.mse_loss_indexes, list) + assert isinstance(packed_seq.sound.timesteps, list) + assert isinstance(packed_seq.sound.tokens, list) + + # Add token indexes - sound uses (T, 1, 1) shape for compatibility with 3D RoPE + packed_seq.sound.token_shapes.append((sound_split_len, 1, 1)) + packed_seq.sound.sequence_indexes.extend(range(curr, curr + sound_split_len)) + packed_seq.sound.tokens.append(input_sound_tokens) + + # Supervise sound tokens based on conditioning frames + condition_set = {idx for idx in condition_frame_indexes_sound if 0 <= idx < sound_split_len} + assert isinstance(packed_seq.sound.condition_mask, list) + + # 
Condition mask: shape (T, 1) — 1 = clean/conditioning, 0 = noised/supervised + sound_condition_mask = torch.zeros( + (sound_split_len, 1), device=input_sound_tokens.device, dtype=input_sound_tokens.dtype + ) # [T_sound,1] + for frame_idx in condition_set: + sound_condition_mask[frame_idx, 0] = 1.0 + packed_seq.sound.condition_mask.append(sound_condition_mask) + + sound_noisy_frame_indexes = torch.tensor( + [idx for idx in range(sound_split_len) if idx not in condition_set], + device=input_sound_tokens.device, + dtype=torch.long, + ) # [N_noisy_sound_frames] + assert isinstance(packed_seq.sound.noisy_frame_indexes, list) + packed_seq.sound.noisy_frame_indexes.append(sound_noisy_frame_indexes) + + # Add to MSE loss indexes and timesteps for non-conditioning frames + for frame_idx in range(sound_split_len): + if frame_idx in condition_set: + continue + # Sound has 1 token per frame (no spatial dimension) + frame_start = curr + frame_idx + frame_end = frame_start + 1 + packed_seq.sound.mse_loss_indexes.extend(range(frame_start, frame_end)) + packed_seq.sound.timesteps.extend([input_timestep]) + + # Update RoPE position IDs for sound tokens. + if packed_seq._use_mrope: + # 3D mRoPE: sound tokens use a 1x1 spatial grid, aligned with vision temporal positions. + # sound[0] aligns with vision frame 0 (start_frame_offset=0, unlike action which offsets by 1). 
+ effective_fps = sound_fps if enable_fps_modulation else None + + sound_mrope_ids, _ = get_3d_mrope_ids_vae_tokens( + grid_t=sound_split_len, + grid_h=1, + grid_w=1, + temporal_offset=sound_temporal_offset, + reset_spatial_indices=packed_seq._mrope_reset_spatial, + fps=effective_fps, + base_fps=base_fps, + temporal_compression_factor=1, # Sound latent is already at sound_latent_fps (no further compression) + start_frame_offset=0, # Sound[0] aligns with vision frame 0 + ) # sound_mrope_ids: [3,N_sound_tokens] + packed_seq.position_ids.append(sound_mrope_ids) + # Note: we don't update _mrope_temporal_offset here because sound tokens + # share the temporal space with vision tokens (they run in parallel). + else: + # All sound tokens share the SAME RoPE position as vision/action tokens (unified generation split). + packed_seq.position_ids.extend([curr_rope_id] * sound_split_len) + + packed_seq.curr = curr + sound_split_len + return sound_split_len + + +def _pack_supertokens_temporal_causal( + packed_seq: "PackedSequence", + input_vision_tokens: torch.Tensor, + input_action_tokens: torch.Tensor | None, + condition_frame_indexes_vision: list[int], + input_timestep: float | torch.Tensor, + curr_rope_id: int, + latent_patch_size: int, + temporal_compression_factor: int, + action_dim: int, + vision_fps: float | None = None, + action_fps: float | None = None, + enable_fps_modulation: bool = False, + base_fps: float = 24.0, + pack_action_tokens: bool = True, +) -> tuple[int, bool]: + """Pack vision and (optionally) action tokens in supertoken order for temporal causal attention. 
+ + Buffer layout per frame: + pack_action_tokens=True: [action_t (tcf), vision_t (H*W)]; supertoken size tcf + H*W + pack_action_tokens=False: [vision_t (H*W)]; supertoken size H*W + + Use ``pack_action_tokens=False`` when ``config.action_gen=False``; the resulting + ``num_action_tokens_per_supertoken=0`` is stamped on the pack and read by the + attention builder so NATTEN metadata stays in sync automatically. + + mRoPE layout (with actions, unified_3d_mrope only): + - Null actions (frame 0): all tcf tokens at ``temporal_offset``. + - Real training actions (frames 1..T-1): ``start_frame_offset=1`` so the + last action in group i co-locates with vision frame i. + - KV-cache continuation (real actions in every supertoken): ``start_frame_offset=0``. + - AR real actions (single supertoken): ``start_frame_offset=1``, matching training; + the caller seeds ``temporal_offset`` to compensate. + - Interleaved per frame as cat([action_ids, vision_ids]). + + ``input_timestep`` is float (TF/none) or Tensor(T_max,) (DF, per-frame sigma). + Conditioning frames are excluded from mse_loss_indexes either way. + + Returns (total_split_len, null_action_flag); null_action_flag is False when + pack_action_tokens=False. 
+ """ + assert isinstance(packed_seq.position_ids, list), "PackedSequence must be in build mode" + + _, _, latent_t, latent_h, latent_w = input_vision_tokens.shape + patch_h = math.ceil(latent_h / latent_patch_size) + patch_w = math.ceil(latent_w / latent_patch_size) + tcf = temporal_compression_factor + patches_per_frame = patch_h * patch_w + supertoken_len = tcf + patches_per_frame if pack_action_tokens else patches_per_frame # S + + # Initialize modalities if needed + if packed_seq.vision is None: + packed_seq.vision = ModalityData() + if pack_action_tokens and packed_seq.action is None: + packed_seq.action = ModalityData() + + assert isinstance(packed_seq.vision.sequence_indexes, list) + assert isinstance(packed_seq.vision.mse_loss_indexes, list) + assert isinstance(packed_seq.vision.timesteps, list) + assert isinstance(packed_seq.vision.tokens, list) + assert isinstance(packed_seq.vision.condition_mask, list) + if pack_action_tokens: + assert isinstance(packed_seq.action.sequence_indexes, list) + assert isinstance(packed_seq.action.mse_loss_indexes, list) + assert isinstance(packed_seq.action.timesteps, list) + assert isinstance(packed_seq.action.tokens, list) + assert isinstance(packed_seq.action.condition_mask, list) + + device = input_vision_tokens.device + dtype = input_vision_tokens.dtype + + null_action_flag: bool + if pack_action_tokens: + # Build all_action_tokens: shape (latent_t * tcf, action_dim) + # + # Cases: + # 1. Training with conditioning frame (latent_t > 1, real_actions < latent_t*tcf): + # Prepend tcf null tokens for frame 0, then real actions for frames 1..T-1. + # 2. KV-cache continuation (latent_t > 1, real_actions == latent_t*tcf): all supertokens + # carry real actions (no conditioning frame in-segment). + # 3. AR frame N>0 (latent_t == 1, action provided): real actions, no null prefix. + # 4. AR frame 0 / image2video (action is None): all null tokens. 
+ if input_action_tokens is not None: + # input_action_tokens shape: (1, T*tcf, D) or (T*tcf, D) for training; (tcf, D) for AR frame N>0 + if input_action_tokens.dim() == 3: + real_actions = input_action_tokens.squeeze(0) # [T*tcf,action_dim] or [N,action_dim] + else: + real_actions = input_action_tokens # [N,action_dim] + null_tokens = torch.zeros(tcf, action_dim, device=device, dtype=real_actions.dtype) # [tcf,action_dim] + if latent_t == 1: + # AR frame N>0: single supertoken with real actions, no null prefix + all_action_tokens = real_actions # [tcf,action_dim] + null_action_flag = False + elif real_actions.shape[0] == latent_t * tcf: + # All frames have real actions (e.g. KV-cache continuation segments) + all_action_tokens = real_actions + null_action_flag = False + else: + # Conditioning frame present: null for supertoken 0, real for 1..T-1 + all_action_tokens = torch.cat([null_tokens, real_actions], dim=0) # [T*tcf,action_dim] + null_action_flag = True + else: + # AR frame 0 or image2video: all action tokens are null + all_action_tokens = torch.zeros( + latent_t * tcf, action_dim, device=device, dtype=dtype + ) # [T*tcf,action_dim] + null_action_flag = True + else: + # pack_action_tokens=False: action tokens must not be supplied. + assert input_action_tokens is None, ( + "pack_action_tokens=False requires input_action_tokens=None; got a non-None tensor." 
+ ) + null_action_flag = False + + # Record vision token shapes and tokens + packed_seq.vision.token_shapes.append((latent_t, patch_h, patch_w)) + packed_seq.vision.tokens.append(input_vision_tokens) + + # Vision conditioning mask: (T, 1, 1) + condition_set_vision = {idx for idx in condition_frame_indexes_vision if 0 <= idx < latent_t} + vision_condition_mask = torch.zeros((latent_t, 1, 1), device=device, dtype=dtype) # [T,1,1] + for fidx in condition_set_vision: + vision_condition_mask[fidx, 0, 0] = 1.0 + packed_seq.vision.condition_mask.append(vision_condition_mask) + + vision_noisy_frame_indexes = torch.tensor( + [idx for idx in range(latent_t) if idx not in condition_set_vision], + device=device, + dtype=torch.long, + ) # [N_noisy_frames] + packed_seq.vision.noisy_frame_indexes.append(vision_noisy_frame_indexes) + + if pack_action_tokens: + # Action token shapes: latent_t * tcf total (including null tokens) + packed_seq.action.token_shapes.append((latent_t * tcf,)) + packed_seq.action.tokens.append(all_action_tokens) + + # Action conditioning mask: all action tokens are conditioning (not supervised) + # Null tokens are always conditioning; real actions are conditioning too (they are inputs) + action_condition_mask = torch.ones((latent_t * tcf, 1), device=device, dtype=dtype) # [T*tcf,1] + packed_seq.action.condition_mask.append(action_condition_mask) + + # Pack in interleaved supertoken order: [action_t, vision_t] for each frame t + # (or just [vision_t] per frame when pack_action_tokens=False) + curr = packed_seq.curr + total_split_len = 0 + + # mRoPE: snapshot offset before this sample, compute IDs + if packed_seq._use_mrope: + temporal_offset = packed_seq._mrope_temporal_offset + effective_vision_fps = vision_fps if enable_fps_modulation else None + + # AR frame N>=1 with action_gen=True (latent_t==1 and real actions supplied): + # shift both vision and action by start_frame_offset=1 so the last action in + # the group co-locates with vision frame N, 
mirroring training's layout. + # All other cases (training latent_t>1, AR action_gen=False, AR frame 0 null) + # keep start_frame_offset=0. The caller in pack_input_sequence_autoregressive + # seeds temporal_offset accordingly (N-1 frames back when this shift applies). + ar_with_real_actions = latent_t == 1 and pack_action_tokens and input_action_tokens is not None + vision_sfo = 1 if ar_with_real_actions else 0 + + vision_ids_flat, new_offset = get_3d_mrope_ids_vae_tokens( + grid_t=latent_t, + grid_h=patch_h, + grid_w=patch_w, + temporal_offset=temporal_offset, + reset_spatial_indices=packed_seq._mrope_reset_spatial, + fps=effective_vision_fps, + base_fps=base_fps, + temporal_compression_factor=tcf, + start_frame_offset=vision_sfo, + ) # vision_ids_flat: [3,T*patch_h*patch_w] + + if pack_action_tokens: + effective_action_fps = action_fps if enable_fps_modulation else None + + # Action IDs: null for frame 0 (all tcf tokens share temporal_offset, + # co-located with vision frame 0), real for frames 1..T-1. + # Real tokens (training and AR) use start_frame_offset=1 so the last + # action in a group co-locates with vision frame i. 
+ fps_active = effective_action_fps is not None + t_dtype = torch.float32 if fps_active else torch.long + t_offset = float(temporal_offset) if fps_active else int(temporal_offset) + null_t = torch.full((tcf,), t_offset, dtype=t_dtype) # [tcf] + null_hw = torch.zeros(tcf, dtype=t_dtype) # [tcf] + null_ids = torch.stack([null_t, null_hw, null_hw]) # [3,tcf] + + def _real_action_ids(n_frames: int, start_frame_offset: int) -> torch.Tensor: + flat, _ = get_3d_mrope_ids_vae_tokens( + grid_t=n_frames * tcf, + grid_h=1, + grid_w=1, + temporal_offset=temporal_offset, + reset_spatial_indices=packed_seq._mrope_reset_spatial, + fps=effective_action_fps, + base_fps=base_fps, + temporal_compression_factor=1, + base_temporal_compression_factor=tcf, + start_frame_offset=start_frame_offset, + ) + return flat.reshape(3, n_frames, tcf) # [3,n_frames,tcf] + + if latent_t > 1 and input_action_tokens is not None: + if real_actions.shape[0] == latent_t * tcf: + # KV continuation: real action in every supertoken (including frame 0) + action_ids_3d = _real_action_ids(latent_t, start_frame_offset=0) + else: + # Training with conditioning frame: supertoken 0 = null, 1..T-1 = real + null_ids_3d = null_ids.reshape(3, 1, tcf) # [3,1,tcf] + real_ids_3d = _real_action_ids(latent_t - 1, start_frame_offset=1) # [3,T-1,tcf] + action_ids_3d = torch.cat([null_ids_3d, real_ids_3d], dim=1) # [3,T,tcf] + elif latent_t > 1: + # No action tensor (all-null layout): same ID structure as training w/ conditioning frame. + null_ids_3d = null_ids.reshape(3, 1, tcf) # [3,1,tcf] + real_ids_3d = _real_action_ids(latent_t - 1, start_frame_offset=1) # [3,T-1,tcf] + action_ids_3d = torch.cat([null_ids_3d, real_ids_3d], dim=1) # [3,T,tcf] + elif input_action_tokens is None: + # AR frame 0 / image2video: only null + action_ids_3d = null_ids.reshape(3, 1, tcf) # [3,1,tcf] + else: + # AR frame N>=1: single supertoken with real actions. 
start_frame_offset=1 + # matches training (last action co-locates with vision frame N); caller + # seeds temporal_offset to (N-1) frame-strides back to compensate. + action_ids_3d = _real_action_ids(1, start_frame_offset=1) # [3,1,tcf] + + # (3, T*H*W) → (3, T, H*W) + vision_ids_3d = vision_ids_flat.reshape(3, latent_t, patches_per_frame) # [3,T,patch_h*patch_w] + + # Interleave per frame: (3, T, tcf+H*W) → (3, T*S) + interleaved_ids = torch.cat([action_ids_3d, vision_ids_3d], dim=2).reshape( + 3, latent_t * supertoken_len + ) # [3,T*S] + packed_seq.position_ids.append(interleaved_ids) + else: + # No action tokens: just vision IDs, already in (3, T*H*W) order. + packed_seq.position_ids.append(vision_ids_flat) + + packed_seq._mrope_temporal_offset = new_offset + + for frame_t in range(latent_t): + if pack_action_tokens: + # Pack action tokens for this frame (indexes only; tokens already stored in packed_seq.action.tokens) + action_indexes = list(range(curr, curr + tcf)) + packed_seq.action.sequence_indexes.extend(action_indexes) + # Action tokens are never in MSE loss (always conditioning) + curr += tcf + total_split_len += tcf + + if not packed_seq._use_mrope: + packed_seq.position_ids.extend([curr_rope_id] * tcf) + + # Pack vision tokens for this frame + frame_indexes = list(range(curr, curr + patches_per_frame)) + packed_seq.vision.sequence_indexes.extend(frame_indexes) + curr += patches_per_frame + total_split_len += patches_per_frame + + if not packed_seq._use_mrope: + packed_seq.position_ids.extend([curr_rope_id] * patches_per_frame) + + # Vision MSE loss: supervise non-conditioning frames + if frame_t not in condition_set_vision: + packed_seq.vision.mse_loss_indexes.extend(frame_indexes) + frame_ts = input_timestep[frame_t].item() if isinstance(input_timestep, torch.Tensor) else input_timestep + packed_seq.vision.timesteps.extend([frame_ts] * patches_per_frame) + + packed_seq.curr = curr + return total_split_len, null_action_flag + + +# 
============================================================================ +# Main packing function +# ============================================================================ + + +def pack_input_sequence( + sequence_plans: list[SequencePlan], + input_text_indexes: list[list[int]], + gen_data_clean: GenerationDataClean, + input_timesteps: torch.Tensor, + special_tokens: dict[str, int], + max_num_tokens: int | None = None, + latent_patch_size: int = 1, + skip_text_tokens: bool = False, + include_end_of_generation_token: bool = False, + position_embedding_type: str = "3d_rope", + unified_3d_mrope_reset_spatial_ids: bool = True, + unified_3d_mrope_temporal_modality_margin: int = 0, + enable_fps_modulation: bool = False, + base_fps: float = 24.0, + temporal_compression_factor: int = 4, + video_temporal_causal: bool = False, + action_dim: int = 32, + initial_mrope_temporal_offset: int | float = 0, +) -> PackedSequence: + """ + Pack a sequence of input strings and VAE latents into a packed tensor format. + Uses SequencePlan to determine which modalities are present for each sample, + and maintains separate indices for text, vision, action, and sound to handle variable modality presence. + + Args: + sequence_plans: List of SequencePlan items describing which modalities are present. + input_text_indexes: List of text token ID sequences (only for samples where has_text=True). + gen_data_clean: GenerationDataClean containing vision, action, and sound tensors. + - x0_tokens_vision: Vision tensors for samples where has_vision=True + - x0_tokens_action: Action tensors for samples where has_action=True + - x0_tokens_sound: Sound tensors (list of [C, T]) for samples where has_sound=True + input_timesteps: Diffusion timesteps for each sample. Shape (B,) or (B, 1) for + teacher_forcing/none (all frames share the same sigma), or (B, T_max) for + diffusion_forcing (per-frame independent sigma). 
Entries are extracted per + sample as a float (numel==1) or Tensor(T_max,) for per-frame indexing. + special_tokens: Dictionary containing special token IDs (eos_token_id, start_of_generation, end_of_generation) + max_num_tokens: Maximum number of tokens in the packed sequence. Currently unused (discarded on entry). + latent_patch_size: Patch size used by the network to pack latents + skip_text_tokens: If True, skip packing text tokens + include_end_of_generation_token: If True, append end-of-generation token + position_embedding_type: Position embedding type for vision tokens: + - "3d_rope": Additive 3D RoPE embeddings + 1D position IDs for attention + - "flattened_sin_cos": Additive flattened sin/cos embeddings + 1D position IDs + - "unified_3d_mrope": No additive embedding + 3D position IDs for Qwen3VL-style mRoPE + unified_3d_mrope_reset_spatial_ids: If True (default), spatial (H, W) indices + start from 0 for each vision segment. If False, spatial indices are offset + by the temporal offset (Qwen2VL-style). Only used when position_embedding_type="unified_3d_mrope". + unified_3d_mrope_temporal_modality_margin: Offset added to the mRoPE temporal position after the + text tokens, as a boundary between text and vision (default 0). + enable_fps_modulation: If True, scale temporal position IDs based on video FPS + to reflect real time. Requires fps_vision in gen_data_clean. + Uses the same flag as diffusion_expert_config.enable_fps_modulation. + base_fps: Base FPS for normalization (default 24.0). + Uses the same value as diffusion_expert_config.base_fps. + temporal_compression_factor: VAE temporal compression factor (default 4). + Obtained from the VAE tokenizer at runtime. + video_temporal_causal: If True, pack vision (and action, when present) in interleaved supertoken + order for temporal causal attention. Requires position_embedding_type="unified_3d_mrope". + action_dim: Action token feature dimension, used to build null action tokens on the + temporal causal path (default 32). + initial_mrope_temporal_offset: mRoPE temporal offset each sample starts from (default 0). + Non-zero only for AR inference. + Returns: + PackedSequence containing all packed tensors and metadata. See PackedSequence for field details. 
+ """ + del max_num_tokens + + assert special_tokens is not None, "Special tokens must be provided" + assert isinstance(input_timesteps, torch.Tensor), "input_timesteps must be a tensor" + if input_timesteps.is_cuda: + raise ValueError("input_timesteps must be on CPU, not CUDA") + if isinstance(input_text_indexes, torch.Tensor): + raise ValueError("input_text_indexes must be a list, not a tensor") + + # Initialize packed sequence (acts as builder during packing) + packed_seq = PackedSequence() + + # Configure 3D mRoPE on the builder (enabled when position_embedding_type is unified_3d_mrope) + packed_seq._use_mrope = position_embedding_type == "unified_3d_mrope" + packed_seq._mrope_reset_spatial = unified_3d_mrope_reset_spatial_ids + + # Maintain separate indices for each modality + idx_text = 0 + idx_vision = 0 + idx_action = 0 + idx_sound = 0 + null_action_flags: list[bool] = [] # collected from TC path; asserted consistent after the loop + + # Validate: all samples must have text (causal split is always required for two-way attention). + # CFG dropout only drops text *content*, not the structural text split. + if not skip_text_tokens: + for plan in sequence_plans: + assert plan.has_text, "All sequence plans must have has_text=True when skip_text_tokens=False" + + # Pack each sample based on its sequence plan + for sample_idx, sequence_plan in enumerate(sequence_plans): + curr_rope_id = 0 + sample_len = 0 + + # mRoPE temporal offset resets per sample. + # initial_mrope_temporal_offset is non-zero only for AR inference (frame N seeds at N*tcf). 
+ packed_seq._mrope_temporal_offset = initial_mrope_temporal_offset + + _ts = input_timesteps[sample_idx] + input_timestep = _ts.item() if _ts.numel() == 1 else _ts # float (TF) or Tensor(T_max,) (DF) + + # Pack text tokens if has_text=True and not skipped + if sequence_plan.has_text and not skip_text_tokens: + text_ids = input_text_indexes[idx_text] + idx_text += 1 + + has_generation_for_sample = sequence_plan.has_vision or sequence_plan.has_action or sequence_plan.has_sound + curr_rope_id, _, text_sample_len = _pack_text_tokens( + packed_seq, + text_ids, + special_tokens, + curr_rope_id, + has_generation=has_generation_for_sample, + use_float_positions=enable_fps_modulation, + ) + sample_len += text_sample_len + + # End of text modality, add an offset as the boundary between text and vision. + packed_seq._mrope_temporal_offset += unified_3d_mrope_temporal_modality_margin + + # Save temporal offset before vision for action tokens (action uses same offset as vision start) + vision_start_temporal_offset = packed_seq._mrope_temporal_offset + + # Pack vision (and optionally action) tokens + if video_temporal_causal and sequence_plan.has_vision: + # Temporal causal path: when sequence_plan.has_action=True, interleaved supertokens + # [action_t, vision_t]; when False, supertokens are just vision patches. 
+ assert position_embedding_type == "unified_3d_mrope", ( + "video_temporal_causal=True requires position_embedding_type='unified_3d_mrope'" + ) + input_vision_tokens = gen_data_clean.x0_tokens_vision[idx_vision] + idx_vision += 1 + + vision_fps = None + if ( + enable_fps_modulation + and gen_data_clean.fps_vision is not None + and idx_vision - 1 < len(gen_data_clean.fps_vision) + ): + vision_fps = float(gen_data_clean.fps_vision[idx_vision - 1].item()) + + input_action_tokens_tc: torch.Tensor | None = None + action_fps_tc: float | None = None + if sequence_plan.has_action: + input_action_tokens_tc = gen_data_clean.x0_tokens_action[idx_action] + if ( + enable_fps_modulation + and gen_data_clean.fps_action is not None + and idx_action < len(gen_data_clean.fps_action) + ): + action_fps_tc = float(gen_data_clean.fps_action[idx_action].item()) + idx_action += 1 + + supertoken_split_len, null_flag = _pack_supertokens_temporal_causal( + packed_seq=packed_seq, + input_vision_tokens=input_vision_tokens, + input_action_tokens=input_action_tokens_tc, + condition_frame_indexes_vision=sequence_plan.condition_frame_indexes_vision, + input_timestep=input_timestep, + curr_rope_id=curr_rope_id, + latent_patch_size=latent_patch_size, + temporal_compression_factor=temporal_compression_factor, + action_dim=action_dim, + vision_fps=vision_fps, + action_fps=action_fps_tc, + enable_fps_modulation=enable_fps_modulation, + base_fps=base_fps, + pack_action_tokens=sequence_plan.has_action, + ) + null_action_flags.append(null_flag) + # We assume all samples in a batch share the same has_action layout, so + # stamp the supertoken layout constant directly here. This is the + # single source of truth read by downstream attention / KV-cache + # code (no recomputation in the network). 
+ packed_seq.num_action_tokens_per_supertoken = temporal_compression_factor if sequence_plan.has_action else 0 + sample_len += supertoken_split_len + vision_split_len = supertoken_split_len + action_split_len = 0 # Already absorbed into supertoken_split_len + + else: + # Standard path: vision and action packed separately + if sequence_plan.has_vision: + # Determine how many vision items this sample owns. + # For multi-item samples (e.g. image editing), num_vision_items_per_sample + # records [2, 2, ...]; for standard T2I/T2V it is None (1 item per sample). + num_vis = ( + gen_data_clean.num_vision_items_per_sample[sample_idx] + if gen_data_clean.num_vision_items_per_sample is not None + else 1 + ) + + vision_split_len = 0 + # Controlnet-style transfer: when set, all vision items share the same + # temporal mRoPE grid. We snapshot the offset before the loop and + # rewind to it before each item, so every item produces identical + # temporal IDs. Each _pack_vision_tokens call still advances the + # offset by latent_t internally; in shared-grid mode the post-loop + # offset equals snapshot + latent_t (single-clip semantics for + # downstream EOV / next-modality tokens). + shared_grid = sequence_plan.share_vision_temporal_positions and num_vis > 1 + items_temporal_offset_snapshot = packed_seq._mrope_temporal_offset + shared_latent_t: int | None = None + shared_patch_h: int | None = None + shared_patch_w: int | None = None + # FPS is recorded per-sample (shape [B]); for multi-item samples + # (transfer / image-edit) every vision item in this sample shares + # the same conditioning FPS, so we read by sample_idx, not by the + # flat idx_vision counter (which would alias to a neighbor sample's + # fps and corrupt RoPE FPS modulation). 
+ sample_vision_fps: float | None = None + if ( + enable_fps_modulation + and gen_data_clean.fps_vision is not None + and sample_idx < len(gen_data_clean.fps_vision) + ): + sample_vision_fps = float(gen_data_clean.fps_vision[sample_idx].item()) + + for item_idx in range(num_vis): + input_vision_tokens = gen_data_clean.x0_tokens_vision[idx_vision] + vision_fps = sample_vision_fps + idx_vision += 1 + + # Determine conditioning for this vision item. + # For multi-item mode: all items except the last are fully conditioned + # (all frames are clean); the last item uses the SequencePlan's + # condition_frame_indexes_vision (typically [] = fully generated). + if num_vis > 1 and item_idx < num_vis - 1: + # Conditioning item (e.g. source image): mark all frames as clean + latent_t = input_vision_tokens.shape[2] + item_condition_frames = list(range(latent_t)) + else: + # Generation item (single-item mode or last item in multi-item) + item_condition_frames = sequence_plan.condition_frame_indexes_vision + + if shared_grid: + item_latent_t = input_vision_tokens.shape[2] + item_latent_h = input_vision_tokens.shape[3] + item_latent_w = input_vision_tokens.shape[4] + if shared_latent_t is None: + shared_latent_t = item_latent_t + shared_patch_h = item_latent_h + shared_patch_w = item_latent_w + else: + assert item_latent_t == shared_latent_t, ( + f"share_vision_temporal_positions requires equal latent_t across items, " + f"got item {item_idx} latent_t={item_latent_t} vs first={shared_latent_t}" + ) + assert item_latent_h == shared_patch_h and item_latent_w == shared_patch_w, ( + f"share_vision_temporal_positions requires equal spatial grid across items, " + f"got item {item_idx} (H,W)=({item_latent_h},{item_latent_w}) " + f"vs first=({shared_patch_h},{shared_patch_w})" + ) + # Rewind so this item starts at the same temporal offset as item 0. 
+ packed_seq._mrope_temporal_offset = items_temporal_offset_snapshot + + item_split_len = _pack_vision_tokens( + packed_seq=packed_seq, + input_vision_tokens=input_vision_tokens, + condition_frame_indexes_vision=item_condition_frames, + input_timestep=input_timestep, + curr_rope_id=curr_rope_id, + latent_patch_size=latent_patch_size, + vision_fps=vision_fps, + enable_fps_modulation=enable_fps_modulation, + base_fps=base_fps, + temporal_compression_factor=temporal_compression_factor, + ) + vision_split_len += item_split_len + sample_len += vision_split_len + + else: + vision_split_len = 0 + + # Pack action tokens if has_action=True + if sequence_plan.has_action: + input_action_tokens = gen_data_clean.x0_tokens_action[idx_action] + + # Get FPS for action (action may have its own FPS independent of vision) + action_fps: float | None = None + if ( + enable_fps_modulation + and gen_data_clean.fps_action is not None + and idx_action < len(gen_data_clean.fps_action) + ): + action_fps = float(gen_data_clean.fps_action[idx_action].item()) + + idx_action += 1 + + action_split_len = _pack_action_tokens( + packed_seq=packed_seq, + input_action_tokens=input_action_tokens, + condition_frame_indexes_action=sequence_plan.condition_frame_indexes_action, + input_timestep=input_timestep, + curr_rope_id=curr_rope_id, + action_temporal_offset=vision_start_temporal_offset, + enable_fps_modulation=enable_fps_modulation, + base_fps=base_fps, + action_fps=action_fps, + base_temporal_compression_factor=temporal_compression_factor, + action_start_frame_offset=sequence_plan.action_start_frame_offset, + ) + sample_len += action_split_len + else: + action_split_len = 0 + + # Pack sound tokens if has_sound=True + if sequence_plan.has_sound: + input_sound_tokens = gen_data_clean.x0_tokens_sound[idx_sound] + + # Get FPS for sound (from gen_data_clean, like vision and action) + sound_fps: float | None = None + if ( + enable_fps_modulation + and gen_data_clean.fps_sound is not None + and idx_sound < 
len(gen_data_clean.fps_sound) + ): + sound_fps = float(gen_data_clean.fps_sound[idx_sound].item()) + + idx_sound += 1 + + sound_split_len = _pack_sound_tokens( + packed_seq=packed_seq, + input_sound_tokens=input_sound_tokens, + condition_frame_indexes_sound=sequence_plan.condition_frame_indexes_sound, + input_timestep=input_timestep, + curr_rope_id=curr_rope_id, + sound_temporal_offset=vision_start_temporal_offset, + enable_fps_modulation=enable_fps_modulation, + base_fps=base_fps, + sound_fps=sound_fps, + ) + sample_len += sound_split_len + else: + sound_split_len = 0 + + # Add end-of-generation token if needed + eov_len = 0 + has_any_generation = sequence_plan.has_vision or sequence_plan.has_action or sequence_plan.has_sound + if include_end_of_generation_token and has_any_generation: + # Type narrowing: we're in build mode, fields are lists + assert isinstance(packed_seq.text_ids, list) + assert isinstance(packed_seq.text_indexes, list) + assert isinstance(packed_seq.position_ids, list) + + packed_seq.text_ids.append(special_tokens["end_of_generation"]) + packed_seq.text_indexes.append(packed_seq.curr) + + # EOV position IDs: 3D mRoPE or 1D RoPE + if packed_seq._use_mrope: + # Use float dtype when FPS modulation is enabled for consistency + eov_dtype = torch.float32 if enable_fps_modulation else torch.long + eov_mrope_ids = torch.full((3, 1), packed_seq._mrope_temporal_offset, dtype=eov_dtype) # [3,1] + packed_seq.position_ids.append(eov_mrope_ids) # type: ignore[arg-type] + packed_seq._mrope_temporal_offset += 1 + else: + packed_seq.position_ids.append(curr_rope_id) # type: ignore[arg-type] + + packed_seq.curr += 1 + eov_len = 1 + sample_len += 1 + + combined_split_len = vision_split_len + action_split_len + sound_split_len + eov_len + packed_seq.attn_modes.append("full") + packed_seq.split_lens.append(combined_split_len) + packed_seq.sample_lens.append(sample_len) + + # Assert consistent null_action_supertokens across all TC samples, then set once + if 
null_action_flags: + assert len(set(null_action_flags)) == 1, ( + f"Inconsistent null_action_supertokens across samples: {null_action_flags}. " + "All samples in a batch must have the same structure (all training or all AR inference)." + ) + packed_seq.null_action_supertokens = null_action_flags[0] + + # Finalize and return packed data + return packed_seq.finalize( + gen_data_clean=gen_data_clean, + ) + + +# ============================================================================ +# SequencePack: Operations on packed sequences +# ============================================================================ + +""" +SequencePack is a dictionary-based container for packed sequences. +We provide two implementations: + +JointSequencePack: Stores all sub-sequences for all sequences in a single tensor. + It is more flexible but less performant. In this implementation, understanding tokens + can be placed in either causal or full-attention sub-sequences. +FactoredSequencePack: + Stores causal/understanding and full/generation sub-sequences as separate tensors. + It is less flexible but more performant. In this implementation, understanding tokens + must be in the causal sub-sequence, and generation tokens must be in the full-attention sub-sequence. + +NOTES: + - We are aiming to deprecate and remove JointSequencePack; keeping it available for backwards compatibility at the moment. + - The reason we're implementing them via dict instead of python classes is to make torch.compile + activation checkpointing work. + +is_sharded (bool): + This flag indicates whether the sequence pack contains global data or a local shard for Context Parallelism (CP). + - When True, tensors represent only the local slice (Global_Length / CP_World_Size). + - Padding and reconstruction logic is skipped in `from_joint`. + - Operations requiring global context (e.g., `get_all_seq`, position ID reconstruction) are not allowed when is_sharded is True. 
+""" + + +# "Fake" types for readability; everything is plain dict at runtime. +FactoredSequencePack = dict[str, Any] +JointSequencePack = dict[str, Any] +SequencePack = FactoredSequencePack | JointSequencePack + +# ------------------------------------ +# SequencePack: internal helpers +# ------------------------------------ + + +def _find_non_causal_text_token_idx( + attn_modes: List[str], split_lens: List[int], und_token_indexes: List[int] +) -> List[int]: + """ + Find the indexes of the "und" tokens that fall inside "full"-mode splits. + These are indices into full_only_seq. + """ + # Return indexes *into* full_only_seq, not into the original packed sequence. + # The order within full_only_seq is the concatenation of each "full" split in order. + out = [] + full_offset = 0 + packed_idx = 0 + und_token_set = set(und_token_indexes) + for attn_mode, split_len in zip(attn_modes, split_lens): + if attn_mode == "full": + split_indices = range(packed_idx, packed_idx + split_len) + # For this "full" split, find the und tokens within it and map them to their full_only_seq offsets + for local_idx, split_idx in enumerate(split_indices): + if split_idx in und_token_set: + out.append(full_offset + local_idx) + full_offset += split_len + packed_idx += split_len + return out + + +def _compute_mode_indices_and_offsets( + split_lens: torch.Tensor | List[int], attn_modes: List[str], mode: str, device: torch.device +) -> tuple[torch.Tensor, torch.Tensor]: + """ + Compute the indices into a joint tensor that belong to splits of the given mode, + together with the cumulative offsets of those splits. 
+ """ + indices = [] + offsets = [0] + next_offset = 0 + start = 0 + + if isinstance(split_lens, torch.Tensor): + split_lens = split_lens.tolist() + + for split_len, attn_mode in zip(split_lens, attn_modes): + if attn_mode == mode: + indices.extend(range(start, start + split_len)) + next_offset += split_len + offsets.append(next_offset) + start += split_len + return torch.tensor(indices, dtype=torch.int32, device=device), torch.tensor( # [N_mode_tokens], [N_mode_splits+1] + offsets, dtype=torch.int32, device=device + ) + + +# Zero-pad x along dim 0 to length N (x must not already be longer than N) +def _pad_to_N(N: int, x: torch.Tensor) -> torch.Tensor: + assert x.shape[0] <= N + padded = x.new_zeros((N, *x.shape[1:])) + padded[: x.shape[0]] = x + return padded + + +def _round_up_to_N(n: int, cp_world_size: int = 1, pad_for_cuda_graphs: bool = False) -> int: + if pad_for_cuda_graphs: + # Reduce recompilations / CUDA graph re-captures by bucketing lengths. + # <= 2K: 128, <= 4K: 256, <= 8K: 512, <= 16K: 1024, > 16K: 2048 + if n <= 2048: + alignment = 128 + elif n <= 4096: + alignment = 256 + elif n <= 8192: + alignment = 512 + elif n <= 16384: + alignment = 1024 + else: + alignment = 2048 + n = ((n + alignment - 1) // alignment) * alignment + + # Ensure the length is divisible by cp_world_size + if cp_world_size > 1: + remainder = n % cp_world_size + if remainder != 0: + n += cp_world_size - remainder + + return n + + +def _pad( + causal_seq: torch.Tensor, full_only_seq: torch.Tensor, max_causal_len: int, max_full_len: int +) -> tuple[torch.Tensor, torch.Tensor]: + causal_seq = _pad_to_N(max_causal_len, causal_seq) + full_only_seq = _pad_to_N(max_full_len, full_only_seq) + return causal_seq, full_only_seq + + +def _ensure_core_metadata(pack: SequencePack) -> None: + required = [ + "sample_offsets", + "max_sample_len", + "max_causal_len", + "max_full_len", + "_causal_indices", + "_full_indices", + "_causal_seq_offsets", + "_full_only_seq_offsets", + 
"is_sharded", + ] + for key in required: + if key not in pack: + raise KeyError(f"Missing required pack field: {key}") + + +def _init_sequence_pack( + sample_lens: List[int], + split_lens: List[int], + attn_modes: List[str], + device: torch.device, +) -> dict[str, Any]: + _max_sample_len = max(sample_lens) + _max_causal_len = max((split_lens[i] for i in range(len(split_lens)) if attn_modes[i] == "causal"), default=0) + _max_full_len = max((split_lens[i] for i in range(len(split_lens)) if attn_modes[i] == "full"), default=0) + + sample_lens_cu = torch.tensor([0] + sample_lens, device=device, dtype=torch.int32) # [N_samples+1] + _sample_offsets = torch.cumsum(sample_lens_cu, dim=0, dtype=torch.int32) # [N_samples+1] + + _causal_indices, _causal_seq_offsets = _compute_mode_indices_and_offsets(split_lens, attn_modes, "causal", device) + _full_indices, _full_only_seq_offsets = _compute_mode_indices_and_offsets(split_lens, attn_modes, "full", device) + + return dict( + sample_offsets=_sample_offsets, + max_sample_len=_max_sample_len, + max_causal_len=_max_causal_len, + max_full_len=_max_full_len, + _causal_indices=_causal_indices, + _full_indices=_full_indices, + _causal_seq_offsets=_causal_seq_offsets, + _full_only_seq_offsets=_full_only_seq_offsets, + _num_causal_tokens=len(_causal_indices), + _num_full_tokens=len(_full_indices), + split_lens=split_lens, + attn_modes=attn_modes, + ) + + +# ------------------------------------ +# SequencePack constructors +# ------------------------------------ + + +def _round_up_for_cuda_graphs_or_cp( + causal_seq: torch.Tensor, + full_only_seq: torch.Tensor, + need_causal: int, + need_full: int, + is_image_batch: bool, + pad_for_cuda_graphs: bool, +) -> tuple[torch.Tensor, torch.Tensor]: + """Pad causal/full sequences to the required lengths, growing global bounds for CUDA graphs.""" + if pad_for_cuda_graphs: + global \ + MAX_CAUSAL_LEN_IMAGE_BATCH, \ + MAX_FULL_LEN_IMAGE_BATCH, \ + MAX_CAUSAL_LEN_VIDEO_BATCH, \ + 
MAX_FULL_LEN_VIDEO_BATCH + if is_image_batch: + if need_causal > MAX_CAUSAL_LEN_IMAGE_BATCH: + MAX_CAUSAL_LEN_IMAGE_BATCH = need_causal + log.info(f"Growing MAX_CAUSAL_LEN_IMAGE_BATCH to {MAX_CAUSAL_LEN_IMAGE_BATCH}", rank0_only=False) + if need_full > MAX_FULL_LEN_IMAGE_BATCH: + MAX_FULL_LEN_IMAGE_BATCH = need_full + log.info(f"Growing MAX_FULL_LEN_IMAGE_BATCH to {MAX_FULL_LEN_IMAGE_BATCH}", rank0_only=False) + causal_seq, full_only_seq = _pad( + causal_seq, + full_only_seq, + max_causal_len=MAX_CAUSAL_LEN_IMAGE_BATCH, + max_full_len=MAX_FULL_LEN_IMAGE_BATCH, + ) + else: + if need_causal > MAX_CAUSAL_LEN_VIDEO_BATCH: + MAX_CAUSAL_LEN_VIDEO_BATCH = need_causal + log.info(f"Growing MAX_CAUSAL_LEN_VIDEO_BATCH to {MAX_CAUSAL_LEN_VIDEO_BATCH}", rank0_only=False) + if need_full > MAX_FULL_LEN_VIDEO_BATCH: + MAX_FULL_LEN_VIDEO_BATCH = need_full + log.info(f"Growing MAX_FULL_LEN_VIDEO_BATCH to {MAX_FULL_LEN_VIDEO_BATCH}", rank0_only=False) + causal_seq, full_only_seq = _pad( + causal_seq, + full_only_seq, + max_causal_len=MAX_CAUSAL_LEN_VIDEO_BATCH, + max_full_len=MAX_FULL_LEN_VIDEO_BATCH, + ) + elif need_causal != int(causal_seq.shape[0]) or need_full != int(full_only_seq.shape[0]): + causal_seq, full_only_seq = _pad(causal_seq, full_only_seq, need_causal, need_full) + return causal_seq, full_only_seq + + +def factored_from_joint_sequence( + packed_sequence: torch.Tensor, + attn_modes: List[str], + split_lens: List[int], + sample_lens: List[int], + packed_und_token_indexes: torch.Tensor, + packed_gen_token_indexes: torch.Tensor, + is_image_batch: bool = False, + cp_world_size: int = 1, + pad_for_cuda_graphs: bool = False, +) -> FactoredSequencePack: + """ + Create a factored sequence pack from a packed sequence and metadata. + NOTE: Some arguments seem redundant because they in principle support more flexible sequence setups. + This constructor checks that the required invariants for FactoredSequencePack are satisfied. 
+ NOTE: This constructor checks that there are no "und" tokens under "full" mode, and no "gen" tokens under "causal" mode, + since this is a requirement for FactoredSequencePack. + Args: + packed_sequence (torch.Tensor): Tensor containing all tokens in the batch of sequences. + attn_modes (List[str]): List of attention modes. Must be alternating ["causal", "full", ... "causal", "full"] + split_lens (List[int]): Length of each subsequence. len(split_lens) == len(attn_modes) + sample_lens (List[int]): Length of each sequence. len(sample_lens) == number of samples. + packed_und_token_indexes (torch.Tensor): The indexes of the understanding tokens in the packed sequence. + packed_gen_token_indexes (torch.Tensor): The indexes of the generating tokens in the packed sequence. + """ + del packed_gen_token_indexes + + non_causal_text_idxs = _find_non_causal_text_token_idx(attn_modes, split_lens, packed_und_token_indexes.tolist()) + assert len(non_causal_text_idxs) == 0, "non_causal_text_idxs should be empty" + + assert sum(sample_lens) == packed_sequence.shape[0], ( + "sum(sample_lens) must be equal to the length of the packed sequence" + ) + + meta = _init_sequence_pack(sample_lens, split_lens, attn_modes, packed_sequence.device) + causal_seq = packed_sequence[meta["_causal_indices"]] # [N_causal_tokens,D] + full_only_seq = packed_sequence[meta["_full_indices"]] # [N_full_tokens,D] + + need_causal = _round_up_to_N(int(causal_seq.shape[0]), cp_world_size, pad_for_cuda_graphs) + need_full = _round_up_to_N(int(full_only_seq.shape[0]), cp_world_size, pad_for_cuda_graphs) + + causal_seq, full_only_seq = _round_up_for_cuda_graphs_or_cp( + causal_seq, + full_only_seq, + need_causal, + need_full, + is_image_batch, + pad_for_cuda_graphs, + ) + + pack: FactoredSequencePack = { + **meta, + "max_num_tokens": sum(sample_lens), + "causal_seq": causal_seq, + "full_only_seq": full_only_seq, + "is_sharded": False, + } + return pack + + +def _validate_single_dim_params(params: Mapping, 
layer_idx: int, num_dims: int | None) -> dict: + """ + Helper function to validate NATTEN parameters for a dimensionality profile. + + Args: + params (Mapping): parameter dict with window_size/window_size_float and other params + layer_idx (int): layer index for error messages + num_dims (int | None): 1, 2, 3, or None (for single-profile format) + + Returns: + dict: validated parameter dict with proper types + """ + if not isinstance(params, Mapping): + dim_str = f" ({num_dims}-D)" if num_dims else "" + raise ValueError(f"Parameters for layer {layer_idx}{dim_str} must be a dict or None, got {params=}.") + + is_causal = False if "is_causal" not in params else params["is_causal"] + + if "window_size_float" in params: + window_size_float = params["window_size_float"] + if ( + not isinstance(window_size_float, Sequence) + or len(window_size_float) not in [1, 2, 3] + or any(not isinstance(x, float) for x in window_size_float) + ): + raise ValueError(f"'window_size_float' must be a float tuple of size 1, 2, or 3, got {window_size_float=}") + window_size_float = tuple(k for k in window_size_float) + + num_dims = len(window_size_float) + + def check_stride_dilation(x): + if isinstance(x, float): + if 0.0 <= x <= 1.0: + return tuple(x for _ in range(num_dims)) + elif ( + isinstance(x, Sequence) + and len(x) == num_dims + and all(isinstance(y, float) and 0.0 <= y <= 1.0 for y in x) + ): + return tuple(y for y in x) + else: + raise ValueError(f"Invalid natten float parameter: {x=}") + + stride_float = 0.0 if "stride_float" not in params else params["stride_float"] + dilation_float = 0.0 if "dilation_float" not in params else params["dilation_float"] + + stride_float = check_stride_dilation(stride_float) + dilation_float = check_stride_dilation(dilation_float) + is_causal = check_valid_tuple_or_element( + is_causal, num_dims=num_dims, typename=bool, raise_error=True, param_name="is_causal" + ) + + if any(x in params for x in ["window_size", "stride", "dilation"]): + raise 
ValueError( + f"Please either use _float parameters, or integer ones, and not mix the two. Got {params=}." + ) + + return { + "window_size_float": window_size_float, + "stride_float": stride_float, + "dilation_float": dilation_float, + "is_causal": is_causal, + } + + elif "window_size" in params: + window_size = params["window_size"] + num_dims = len(window_size) + + stride = 1 if "stride" not in params else params["stride"] + dilation = 1 if "dilation" not in params else params["dilation"] + + if any("_float" in x for x in params.keys()): + raise ValueError( + f"Please either use _float parameters, or integer ones, and not mix the two. Got {params=}." + ) + + window_size = check_valid_tuple_or_element( + window_size, num_dims=num_dims, typename=int, raise_error=True, param_name="window_size" + ) + stride = check_valid_tuple_or_element( + stride, num_dims=num_dims, typename=int, raise_error=True, param_name="stride" + ) + dilation = check_valid_tuple_or_element( + dilation, num_dims=num_dims, typename=int, raise_error=True, param_name="dilation" + ) + is_causal = check_valid_tuple_or_element( + is_causal, num_dims=num_dims, typename=bool, raise_error=True, param_name="is_causal" + ) + + return {"window_size": window_size, "stride": stride, "dilation": dilation, "is_causal": is_causal} + else: + raise ValueError( + "Sparse parameters for a layer must have key 'window_size' or 'window_size_float', " + f"got {params=} in layer index {layer_idx}." + ) + + +def verify_natten_parameter_list( + natten_parameter_list: list | None, + num_layers: int, +) -> list | None: + """ + Converts list of NATTEN parameters into expected types, and assigns defaults to unset + parameters. + This needs to be done separately during model initialization, and not forward pass. + There are no torch operations in this function. + + Args: + natten_parameter_list (list | None): list of NATTEN parameters. Must be either None, or a + list of mappings, one for each layer. 
Each list element must be either None, + representing no sparsity / masking (full dense attention), or a mapping of NATTEN + parameters. + + Parameters can be specified directly with integer or float format: + - 'window_size_float' (required), 'stride_float', 'dilation_float' + - 'window_size' (required), 'stride', 'dilation' + + Or, parameters can be specified for multiple dimensionality profiles in case of + mixed-training (i.e. image and video training) using keys "1d", "2d", "3d": + - Each key maps to either None (dense attention) or a parameter dict + + Integer and float parameters cannot be used together in the same layer! + Additionally, you can specify 'is_causal'. + + Examples: + ``` + # 50 percent sparsity along each dimension in a 2-D token layout + {'window_size_float': (0.5, 0.5)} # valid + + # 50 percent sparsity along each dimension in a 2-D token layout + # Maximum dilation along first dimension, no dilation along second dimension + {'window_size_float': (0.5, 0.5), 'dilation_float': (1.0, 0.0)} # valid + + # Fixed window size of 8x8, dilation of 2x1. + + {'window_size': (8, 8), 'dilation': (2, 1)} # valid + + # Multi-profile: different parameters for 2D (images) and 3D (videos) + { + "2d": {"window_size_float": (0.5, 0.5)}, + "3d": {"window_size_float": (1.0, 0.5, 0.5)} + } # valid + + # Multi-profile: 2D uses dense attention, 3D uses sparse + { + "2d": None, + "3d": {"window_size_float": (1.0, 0.5, 0.5)} + } # valid + + # Invalid: + {'window_size_float': (0.5, 0.5), 'dilation': (2, 1)} + ``` + + num_layers (int): number of layers in the model. Just used to verify list length. + + Returns: + output_parameter_list (list | None): verified and type-checked NATTEN parameters, or None if + no parameters passed. 
+ """ + + if natten_parameter_list is not None: + parameter_list_out = [] + if not isinstance(natten_parameter_list, Sequence): + raise ValueError(f"Argument 'natten_parameter_list' must be a list or None, got {natten_parameter_list=}.") + + if len(natten_parameter_list) != num_layers: + raise ValueError( + "Number of elements in 'natten_parameter_list' must match number of layers " + f"in the model, got {num_layers=}, {len(natten_parameter_list)=}." + ) + + for i, layer_parameters in enumerate(natten_parameter_list): + if layer_parameters is None: + log.debug(f"Layer {i} will use DENSE attention.") + parameter_list_out.append(None) + continue + + if not isinstance(layer_parameters, Mapping): + raise ValueError( + f"Sparse parameters for a layer must be a dict or None, got {layer_parameters=} in layer index {i}." + ) + + # Detect format: multi-profile if has keys "1d", "2d", or "3d" + dim_keys = {"1d", "2d", "3d"} + has_dim_keys = any(k in layer_parameters for k in dim_keys) + + if has_dim_keys: + # Multi-profile format: validate each explicitly defined dimensionality profile + validated_multi_profile = {} + for dim_str, dim_int in [("1d", 1), ("2d", 2), ("3d", 3)]: + if dim_str in layer_parameters: + dim_params = layer_parameters[dim_str] + if dim_params is None: + validated_multi_profile[dim_int] = None + else: + validated_multi_profile[dim_int] = _validate_single_dim_params(dim_params, i, dim_int) + else: + # Single-profile format: validate and convert to multi-profile format + # Infer dimensionality from parameter tuple length + validated_params = _validate_single_dim_params(layer_parameters, i, None) + if "window_size_float" in validated_params: + num_dims = len(validated_params["window_size_float"]) + else: # "window_size" + num_dims = len(validated_params["window_size"]) + validated_multi_profile = {num_dims: validated_params} + + # If all explicitly defined profiles are None, treat as fully dense layer + if all(v is None for v in 
validated_multi_profile.values()): + log.debug(f"Layer {i} will use DENSE attention (all profiles None).") + parameter_list_out.append(None) + else: + parameter_list_out.append(validated_multi_profile) + log.info(f"Layer {i} NATTEN parameters: {validated_multi_profile}") + + return parameter_list_out + + return None + + +def generate_natten_metadata( + token_shapes: list[tuple[int, int, int]], + head_dim: int, + num_layers: int, + device: torch.device, + dtype: torch.dtype, + requires_grad: bool, + natten_parameter_list: list | None = None, +) -> list | None: + """ + Generates list of metadata required by Variable-Sized (variable-length) operations in NATTEN. + Required when training with three_way attention and NATTEN (multi-dimensional / sparse + attention). + + Args: + token_shapes (list[tuple]): list of integer tuples corresponding to the + post-tokenization/patchify token layout shapes in the packed sequence. Must strictly be + integer tuples with the same profile (all 1D, 2D, or 3D). 1s will be automatically + stripped (i.e. [(1, 8, 8), (1, 16, 16)] is interpreted as [(8, 8), (16, 16)]). + + head_dim (int): Attention head dimension (used to select NATTEN kernel configurations). + + num_layers (int): number of layers in the model. Just used to verify list length. + + device (torch.device): PyTorch device for offset tensors (should match QKV device). + + dtype (torch.dtype): Expected QKV dtype. + + requires_grad (bool): Determines whether backprop is expected, and sets up metadata for + backward pass as well. + + natten_parameter_list (list | None): list of NATTEN parameters. Must be either None, or a + list of mappings, one for each layer. 
Each list element must be either None, + representing no sparsity / masking (full dense attention), or a mapping of NATTEN + parameters in either integer or float format: + - 'window_size_float' (required), 'stride_float', 'dilation_float' + - 'window_size' (required), 'stride', 'dilation' + + Integer and float parameters cannot be used together in the same layer! + Additionally, you can specify 'is_causal'. + + Examples: + ``` + # 50 percent sparsity along each dimension in a 2-D token layout + {'window_size_float': (0.5, 0.5)} # valid + + # 50 percent sparsity along each dimension in a 2-D token layout + # Maximum dilation along first dimension, no dilation along second dimension + {'window_size_float': (0.5, 0.5), 'dilation_float': (1.0, 0.0)} # valid + + # Fixed window size of 8x8, dilation of 2x1. + + {'window_size': (8, 8), 'dilation': (2, 1)} # valid + + # Invalid: + {'window_size_float': (0.5, 0.5), 'dilation': (2, 1)} + ``` + + Returns: + natten_metadata_list (list | None): list of NATTEN varlen metadata, or Nones (dense layers). + Each non-None element will be a dictionary containing final parameters, and varlen + metadata (offset and size tensors, max lengths). + NOTE: to avoid excessive recompilations in torch.compile, we must carefully index into + this list during model.forward, and ideally using the iteration counter from the loop + over layers (nn.ModuleList). 
+ """ + + + if token_shapes is None or len(token_shapes) < 1: + raise ValueError("'token_shapes' is required for 'three_way' attention.") + + natten_metadata = None + + if natten_parameter_list is not None: + natten_metadata = [] + if not isinstance(natten_parameter_list, list): + raise ValueError(f"Argument 'natten_parameter_list' must be a list or None, got {natten_parameter_list=}.") + + if len(natten_parameter_list) != num_layers: + raise ValueError( + "Number of elements in 'natten_parameter_list' must match number of layers " + f"in the model, got {num_layers=}, {len(natten_parameter_list)=}." + ) + + # We need to filter out 1s from shapes + def filter_shape(shape: tuple) -> tuple: + return tuple(x for x in shape if x > 1) + + # Infer token layout rank (dimensionality) + num_dims = max([len(filter_shape(token_shape)) for token_shape in token_shapes]) + + # Single pass: check if all layers support this dimensionality and if any need processing + needs_processing = False + for i, layer_parameters in enumerate(natten_parameter_list): + if layer_parameters is None: + continue + + # Fail fast if this dimensionality is not defined + if num_dims not in layer_parameters: + raise ValueError( + f"Layer {i}: batch has {num_dims}D data but parameters are not defined for {num_dims}D. 
" + f"Defined dimensionalities: {sorted(layer_parameters.keys())}" + ) + + # Check if this layer needs processing for this dimensionality + if layer_parameters[num_dims] is not None: + needs_processing = True + + # Early exit if all layers are dense for this dimensionality profile + if not needs_processing: + log.debug(f"All layers use DENSE attention for {num_dims}D data.") + return None + + # We actually need to process, so validate and filter all shapes + token_layout_list = [] + for shape in token_shapes: + assert isinstance(shape, tuple) + shape_filtered = filter_shape(shape) + assert len(shape_filtered) == num_dims, ( + f"All data in batch must have same dimensionality, got {num_dims}D and {len(shape_filtered)}D" + ) + token_layout_list.append(shape_filtered) + + log.debug(f"Batch dimensionality: {num_dims}D, token_layout_list={token_layout_list}") + + for i, layer_parameters in enumerate(natten_parameter_list): + if layer_parameters is None: + natten_metadata.append(None) + continue + + # Get parameters for this dimensionality (already validated above) + dim_params = layer_parameters[num_dims] + + if dim_params is None: + # Dense attention for this dimensionality + natten_metadata.append(None) + continue + + # Use dim_params (parameters for this specific dimensionality) + window_size_list = [] + stride_list = [] + dilation_list = [] + + if "window_size_float" in dim_params: + window_size_float = dim_params["window_size_float"] + stride_float = dim_params["stride_float"] + dilation_float = dim_params["dilation_float"] + + for token_layout in token_layout_list: + window_size_ = tuple( + min(x, max(2, int(k * float(x)))) for k, x in zip(window_size_float, token_layout) + ) + stride_ = tuple(min(k, max(1, int(s * float(k)))) for s, k in zip(stride_float, window_size_)) + max_dilation = tuple(x // k for k, x in zip(window_size_, token_layout)) + dilation_ = tuple(min(m, max(1, int(d * float(m)))) for d, m in zip(dilation_float, max_dilation)) + + 
window_size_list.append(window_size_) + stride_list.append(stride_) + dilation_list.append(dilation_) + + assert len(window_size_list) == len(stride_list) == len(dilation_list) == len(token_layout_list) + + log.debug(f"Layer {i}: {window_size_list=}") + log.debug(f"Layer {i}: {stride_list=}") + log.debug(f"Layer {i}: {dilation_list=}") + + elif "window_size" in dim_params: + window_size = dim_params["window_size"] + stride = dim_params["stride"] + dilation = dim_params["dilation"] + + window_size_list = [window_size for _ in range(len(token_layout_list))] + stride_list = [stride for _ in range(len(token_layout_list))] + dilation_list = [dilation for _ in range(len(token_layout_list))] + else: + raise ValueError( + "Sparse parameters for a layer must have key 'window_size' or 'window_size_float', " + f"got {dim_params=} in layer index {i}." + ) + + is_causal = dim_params["is_causal"] + + # Create varlen metadata for natten varlen/varsized ops + + # full size, that's why constant window sizes aren't allowed. + + natten_metadata.append( + generate_multi_dim_varlen_parameters( + token_layout_list=token_layout_list, + head_dim=head_dim, + device=device, + dtype=dtype, + requires_grad=requires_grad, + # + window_size_list=window_size_list, + stride_list=stride_list, + dilation_list=dilation_list, + # + is_causal=is_causal, + ) + ) + + return natten_metadata + + +def generate_temporal_causal_natten_metadata( + vision_token_shapes: list[tuple[int, int, int]], + num_action_tokens_per_supertoken: int, + num_layers: int, + head_dim: int, + device: torch.device, + dtype: torch.dtype, + requires_grad: bool, +) -> list: + """Generate per-layer varlen metadata for temporal causal attention on supertokens. + + Each sample's generation tokens are laid out as T_i supertokens of size + S_i = num_action_tokens_per_supertoken + H_i*W_i. Metadata encodes + is_causal=(True, False): causal across T, full within S. All layers share + the same metadata (full window, no spatial sparsity). 
+ + Unlike generate_natten_metadata, this function does not apply filter_shape — (T, S) layouts + are passed directly even when T=1. NATTEN handles T=1 causal masking correctly (trivially + full attention within S). + + Args: + vision_token_shapes: List of (T, H, W) per sample. + num_action_tokens_per_supertoken: Number of action tokens prefixing each + supertoken (0 when actions are not packed inline). + num_layers: Number of transformer layers. + head_dim: Attention head dimension. + device: Target device. + dtype: Target dtype. + requires_grad: Whether metadata tensors require gradient. + + Returns: + List of length num_layers, each element the same NATTEN varlen metadata dict. + """ + # T=1: NATTEN requires kernel_size >= 2 and kernel_size <= token_layout, which are mutually + # exclusive when T=1. Fall back to full dense attention (None) — a single supertoken trivially + # attends to only itself, so temporal causality is already satisfied. + # Mixed T=1/T>1 batches are rejected: NATTEN can't mask T=1 samples, and falling back to dense + # attention for the whole batch would break temporal causality for the T>1 samples. + # Ensure min_frames >= 5 in the dataloader so that T_latent = 1 + (N-1)//tcf >= 2 always. + has_short = any(t < 2 for t, h, w in vision_token_shapes) + if has_short: + if not all(t < 2 for t, h, w in vision_token_shapes): + raise ValueError( + "Mixed T=1 and T>1 samples in causal training batch: NATTEN cannot apply " + "causal masking when any sample has T=1 (kernel_size constraint), and falling " + "back to dense attention would break temporal causality for T>1 samples. " + "Ensure all samples have T_latent >= 2 (set min_frames >= 5 in the dataloader)." 
+ ) + return [None] * num_layers + token_layout_list = [(t, num_action_tokens_per_supertoken + h * w) for t, h, w in vision_token_shapes] + metadata = generate_multi_dim_varlen_parameters( + token_layout_list=token_layout_list, + head_dim=head_dim, + device=device, + dtype=dtype, + requires_grad=requires_grad, + is_causal=(True, False), + ) + return [metadata] * num_layers + + +def joint_from_joint_sequence( + packed_sequence: torch.Tensor, + attn_modes: List[str], + split_lens: List[int], + sample_lens: List[int], + packed_und_token_indexes: torch.Tensor, + packed_gen_token_indexes: torch.Tensor, + is_image_batch: bool = False, + cp_world_size: int = 1, + pad_for_cuda_graphs: bool = False, +) -> JointSequencePack: + """ + Create a JointSequencePack from a packed sequence and metadata. + This is in order to support the legacy joint flex-attention implementation. + Differently from FactoredSequencePack, it has less strict requirements on the packed sequence. + + Args: + packed_sequence (torch.Tensor): Tensor containing all tokens in the batch of sequences. + attn_modes (List[str]): List of attention modes. Supports any sequence of {"causal", "full", "noise"} + split_lens (List[int]): Length of each subsequence. len(split_lens) == len(attn_modes) + sample_lens (List[int]): Length of each sequence. In this mode, sequences may have different number of splits, + as opposed to FactoredSequencePack where each sequence has exactly two splits. + packed_und_token_indexes (torch.Tensor): The indexes of the understanding tokens in the packed sequence. + packed_gen_token_indexes (torch.Tensor): The indexes of the generating tokens in the packed sequence.
+ """ + assert sum(sample_lens) == packed_sequence.shape[0], ( + "sum(sample_lens) must be equal to the length of the packed sequence" + ) + meta = _init_sequence_pack(sample_lens, split_lens, attn_modes, packed_sequence.device) + pack: JointSequencePack = { + **meta, + "max_num_tokens": sum(sample_lens), + "packed_sequence": packed_sequence, + "packed_und_token_indexes": packed_und_token_indexes, + "packed_gen_token_indexes": packed_gen_token_indexes, + "is_sharded": False, + } + return pack + + +def zeros_like(orig: FactoredSequencePack | JointSequencePack, shape: Tuple[int, ...] | torch.Size | None = None): + """ + Create a new sequence pack with the same metadata as the original, but with all tokens set to zero. + Args: + orig (FactoredSequencePack | JointSequencePack): The original sequence pack to copy metadata from. + shape (Tuple[int, ...] | torch.Size | None): The shape of the new sequence pack. If None, the shape will be the same as the original. + """ + _ensure_core_metadata(orig) + if "packed_sequence" in orig: + if shape is None: + shape_ = orig["packed_sequence"].shape + else: + assert len(shape) >= 1 and shape[0] == -1 + shape_ = (orig["packed_sequence"].shape[0],) + tuple(shape)[1:] + packed_sequence = torch.zeros( + shape_, device=orig["packed_sequence"].device, dtype=orig["packed_sequence"].dtype + ) # [seq_len,D] + return from_joint(packed_sequence, orig) + else: + if shape is None: + shape_causal = orig["causal_seq"].shape + shape_full = orig["full_only_seq"].shape + else: + assert len(shape) >= 1 and shape[0] == -1 + shape_causal = (orig["causal_seq"].shape[0],) + tuple(shape)[1:] + shape_full = (orig["full_only_seq"].shape[0],) + tuple(shape)[1:] + causal_seq = torch.zeros( + shape_causal, device=orig["causal_seq"].device, dtype=orig["causal_seq"].dtype + ) # [N_causal_tokens,D] + full_only_seq = torch.zeros( + shape_full, device=orig["full_only_seq"].device, dtype=orig["full_only_seq"].dtype + ) # [N_full_tokens,D] + return 
from_mode_splits(causal_seq, full_only_seq, orig) + + +def from_joint(packed_sequence: torch.Tensor, metadata_source: FactoredSequencePack | JointSequencePack): + """ + Create a new sequence pack from a packed sequence and another sequence pack with the same metadata. + Args: + packed_sequence (torch.Tensor): Tensor containing all tokens in the batch of sequences. + metadata_source (FactoredSequencePack | JointSequencePack): The metadata source to copy from. + """ + _ensure_core_metadata(metadata_source) + if "packed_sequence" in metadata_source: + out = dict(metadata_source) + out["packed_sequence"] = packed_sequence + return out + else: + if metadata_source["is_sharded"]: + # Use sharded sequences as is when is_sharded is True (used in Context Parallel) + causal_seq = packed_sequence[: len(metadata_source["causal_seq"])] # [N_causal_tokens,D] + full_only_seq = packed_sequence[len(metadata_source["causal_seq"]) :] # [N_full_tokens,D] + else: + causal_seq = packed_sequence[metadata_source["_causal_indices"]] # [N_causal_tokens,D] + full_only_seq = packed_sequence[metadata_source["_full_indices"]] # [N_full_tokens,D] + causal_seq, full_only_seq = _pad( + causal_seq, + full_only_seq, + max_causal_len=metadata_source["causal_seq"].shape[0], + max_full_len=metadata_source["full_only_seq"].shape[0], + ) + + return from_mode_splits(causal_seq, full_only_seq, metadata_source) + + +def from_mode_splits( + causal_seq: torch.Tensor, + full_only_seq: torch.Tensor, + orig: FactoredSequencePack | JointSequencePack, + is_sharded: bool | None = None, +): + """ + Create a new sequence pack from two mode splits. + Args: + causal_seq (torch.Tensor): The causal sequence. + full_only_seq (torch.Tensor): The full-only sequence. + orig (FactoredSequencePack | JointSequencePack): The metadata source to copy from. + is_sharded (bool | None): If True, create a local pack for context parallel. + If None, inherits from orig. 
+ """ + _ensure_core_metadata(orig) + if is_sharded is None: + is_sharded = orig.get("is_sharded", False) + + if "packed_sequence" in orig: + all_len = int(orig["_causal_indices"].shape[0] + orig["_full_indices"].shape[0]) + packed_sequence = causal_seq.new_zeros((all_len, *causal_seq.shape[1:])) # [seq_len,D] + packed_sequence[orig["_causal_indices"]] = causal_seq + packed_sequence[orig["_full_indices"]] = full_only_seq + return from_joint(packed_sequence, orig) + else: + out = dict(orig) + out["causal_seq"] = causal_seq + out["full_only_seq"] = full_only_seq + out["is_sharded"] = is_sharded + return out + + +def from_und_gen_splits(und_seq: torch.Tensor, gen_seq: torch.Tensor, orig: FactoredSequencePack | JointSequencePack): + """ + Create a new sequence pack from two und/gen splits. + Args: + und_seq (torch.Tensor): The understanding sequence. + gen_seq (torch.Tensor): The generating sequence. + orig (FactoredSequencePack | JointSequencePack): The metadata source to copy from. + """ + # If we have a joint pack (single packed_sequence), place by und/gen indexes. + if "packed_sequence" in orig and "packed_und_token_indexes" in orig and "packed_gen_token_indexes" in orig: + all_len = int(und_seq.shape[0] + gen_seq.shape[0]) + packed_sequence = und_seq.new_zeros((all_len, *und_seq.shape[1:])) # [seq_len,D] + packed_sequence[orig["packed_und_token_indexes"]] = und_seq + packed_sequence[orig["packed_gen_token_indexes"]] = gen_seq + return from_joint(packed_sequence, orig) + # Otherwise, treat und/gen as mode splits (und == causal; gen == full). + return from_mode_splits(und_seq, gen_seq, orig) + + +# ------------------------------------ +# Getters and setters for SequencePack +# ------------------------------------ +def get_und_seq(pack: SequencePack) -> torch.Tensor: + """ + Get all understanding tokens in a sequence pack in a single tensor. + + Args: + pack (FactoredSequencePack | JointSequencePack): The sequence pack to get the understanding sequence from. 
+ Returns: + torch.Tensor: All understanding tokens concatenated over all sequences in the batch. + """ + if "causal_seq" in pack: + return pack["causal_seq"] + if "packed_sequence" in pack and "packed_und_token_indexes" in pack: + return pack["packed_sequence"][pack["packed_und_token_indexes"]] + raise KeyError("Cannot derive und_seq from provided pack") + + +def set_und_seq(pack: SequencePack, value: torch.Tensor) -> None: + """ + Override the understanding tokens in a sequence pack. + The order of tokens passed in must correspond to the order of tokens returned by get_und_seq. + + Args: + pack (FactoredSequencePack | JointSequencePack): The sequence pack to set the understanding sequence in. + value (torch.Tensor): The understanding sequence to set. + """ + if "packed_sequence" in pack and "packed_und_token_indexes" in pack: + pack["packed_sequence"][pack["packed_und_token_indexes"]] = value + elif "causal_seq" in pack: + pack["causal_seq"] = value + else: + raise KeyError("Cannot set und_seq from provided pack") + + +def get_gen_seq(pack: SequencePack) -> torch.Tensor: + """ + Get all generating tokens in a sequence pack in a single tensor. + Args: + pack (FactoredSequencePack | JointSequencePack): The sequence pack to get the generating sequence from. + Returns: + torch.Tensor: All generating tokens concatenated over all sequences in the batch. + """ + if "full_only_seq" in pack: + return pack["full_only_seq"] + if "packed_sequence" in pack and "packed_gen_token_indexes" in pack: + return pack["packed_sequence"][pack["packed_gen_token_indexes"]] + raise KeyError("Cannot derive gen_seq from provided pack") + + +def set_gen_seq(pack: SequencePack, value: torch.Tensor) -> None: + """ + Override the generating tokens in a sequence pack. + The order of tokens passed in must correspond to the order of tokens returned by get_gen_seq. + Args: + pack (FactoredSequencePack | JointSequencePack): The sequence pack to set the generating sequence in. 
+ value (torch.Tensor): The generating sequence to set. + """ + if "packed_sequence" in pack and "packed_gen_token_indexes" in pack: + pack["packed_sequence"][pack["packed_gen_token_indexes"]] = value + elif "full_only_seq" in pack: + pack["full_only_seq"] = value + else: + raise KeyError("Cannot set gen_seq from provided pack") + + +def get_all_seq(pack: SequencePack) -> torch.Tensor: + """ + Get all tokens in a sequence pack in a single tensor. + Args: + pack (FactoredSequencePack | JointSequencePack): The sequence pack to get the all sequence from. + Returns: + torch.Tensor: All tokens concatenated over all sequences in the batch. + """ + if "all_seq" in pack: + return pack["all_seq"] + if "packed_sequence" in pack: + return pack["packed_sequence"] + if "causal_seq" in pack and "full_only_seq" in pack: + _ensure_core_metadata(pack) + if pack["is_sharded"]: + assert False, "get_all_seq is not supported in context parallel sharded mode" + else: + out = pack["causal_seq"].new_zeros( + int(pack["_causal_indices"].shape[0] + pack["_full_indices"].shape[0]), *pack["causal_seq"].shape[1:] + ) # [seq_len,D] + if pack["causal_seq"].shape[0] > 0: + out[pack["_causal_indices"]] = pack["causal_seq"][: pack["_causal_indices"].shape[0]] + if pack["full_only_seq"].shape[0] > 0: + out[pack["_full_indices"]] = pack["full_only_seq"][: pack["_full_indices"].shape[0]] + return out + raise KeyError("Cannot derive all_seq from provided pack") + + +def set_all_seq(pack: SequencePack, value: torch.Tensor) -> None: + """ + Override the all tokens in a sequence pack. + The order of tokens passed in must correspond to the order of tokens returned by get_all_seq. + Args: + pack (FactoredSequencePack | JointSequencePack): The sequence pack to set the all sequence in. + value (torch.Tensor): The all sequence to set. 
+ """ + if "packed_sequence" in pack: + pack["packed_sequence"] = value + elif "causal_seq" in pack and "full_only_seq" in pack: + _ensure_core_metadata(pack) + pack["causal_seq"][: pack["_causal_indices"].shape[0]] = value[pack["_causal_indices"]] + pack["full_only_seq"][: pack["_full_indices"].shape[0]] = value[pack["_full_indices"]] + else: + pack["all_seq"] = value + + +def get_causal_seq(pack: SequencePack) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Get the causal sequence and its offsets in a sequence pack. + Args: + pack (FactoredSequencePack | JointSequencePack): The sequence pack to get the causal sequence from. + Returns: + Tuple[torch.Tensor, torch.Tensor]: The concatenated causal sub-sequences and the starting offset for each sub-sequence. + """ + _ensure_core_metadata(pack) + if "causal_seq" in pack: + return pack["causal_seq"], pack["_causal_seq_offsets"] + assert "packed_sequence" in pack + return pack["packed_sequence"][pack["_causal_indices"]], pack["_causal_seq_offsets"] + + +def get_full_only_seq(pack: SequencePack) -> Tuple[torch.Tensor, torch.Tensor]: + """ + Get the full-only sequence and its offsets in a sequence pack. + Args: + pack (FactoredSequencePack | JointSequencePack): The sequence pack to get the full-only sequence from. + Returns: + Tuple[torch.Tensor, torch.Tensor]: The concatenated full-only sub-sequences and the starting offset for each sub-sequence. + """ + _ensure_core_metadata(pack) + if "full_only_seq" in pack: + return pack["full_only_seq"], pack["_full_only_seq_offsets"] + assert "packed_sequence" in pack + return pack["packed_sequence"][pack["_full_indices"]], pack["_full_only_seq_offsets"] + + +def get_device_and_dtype(pack: SequencePack) -> Tuple[torch.device, torch.dtype]: + """ + Get the device and dtype of a sequence pack. + Args: + pack (FactoredSequencePack | JointSequencePack): The sequence pack to get the device and dtype from. 
+ Returns: + Tuple[torch.device, torch.dtype]: The device and dtype of the sequence pack. + """ + if "packed_sequence" in pack: + return pack["packed_sequence"].device, pack["packed_sequence"].dtype + if "causal_seq" in pack and "full_only_seq" in pack: + return pack["causal_seq"].device, pack["causal_seq"].dtype + raise KeyError("Cannot derive device and dtype from provided pack") + + +def build_sequence_plans_from_data_batch( + data_batch: dict, + input_video_key, + input_image_key: str, +) -> list[SequencePlan]: + """Build or retrieve sequence plans from a data batch dictionary. + + This function extracts sequence plans from the data batch if they exist, + otherwise creates default SequencePlan objects for each sample + in the batch. + + Args: + data_batch: Dictionary containing the data batch from the dataloader. + Expected keys include 'video' or other tensors to determine batch size. + If 'sequence_plan' key exists, those plans are returned directly. + + Returns: + List of SequencePlan objects, one per sample in the batch. + """ + + # For new modalities, please generate the sequence_plan in the dataset class!!!! + + # If sequence_plan already exists in data_batch, return it + if "sequence_plan" in data_batch: + return data_batch["sequence_plan"] + + assert "action" not in data_batch or data_batch["action"] is None, "Action data SHOULD have sequence_plans!" + assert "sound" not in data_batch or data_batch["sound"] is None, "Sound data SHOULD have sequence_plans!" + + + # Determine batch size from available tensors + batch_size = 0 + for key in [input_video_key, input_image_key]: + if key in data_batch: + val = data_batch[key] + if isinstance(val, torch.Tensor): + batch_size = val.shape[0] + break + elif isinstance(val, list): + batch_size = len(val) + break + + if batch_size == 0: + raise ValueError( + f"Cannot determine batch size from data_batch. Expected {input_video_key}, {input_image_key}, or similar key." 
+ ) + + # Build default SequencePlan objects + return [ + SequencePlan( + has_text=True, # Has text prompt! + has_vision=True, + condition_frame_indexes_vision=[], # No conditioning frames! + ) + for _ in range(batch_size) + ] + + +# ============================================================================ +# Demo/Test function +# ============================================================================ + + +def main(): + """Demonstrate sequence packing with sample text and images.""" + # Initialize tokenizer and add special tokens + tokenizer = Qwen2Tokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct") + tokenizer, _ = add_special_tokens(tokenizer) + + # Define special tokens (Note: Qwen models don't have bos_token_id) + special_tokens = { + "eos_token_id": tokenizer.eos_token_id, + "start_of_generation": tokenizer.convert_tokens_to_ids("<|vision_start|>"), + "end_of_generation": tokenizer.convert_tokens_to_ids("<|vision_end|>"), + } + + # Sample text inputs + input_strings = ["Hello world", "How are you?", "I am fine"] + + # Tokenize input strings + input_text_tokens = [tokenizer.encode(text, add_special_tokens=False) for text in input_strings] + + # Create sample images (in practice, these would be VAE latents) + input_images = torch.stack([torch.randn(3, 1, 64, 64) for _ in range(3)]) # [B, C, T, H, W] format + + # Diffusion timesteps for each image + input_timesteps = torch.tensor([0.0, 0.5, 0.9]) + + # Create GenerationDataClean for images + gen_data_clean_images = GenerationDataClean( + batch_size=3, + is_image_batch=True, + raw_state_vision=input_images, + x0_tokens_vision=torch.randn(3, 16, 8, 8), # dummy tokenized latents + raw_state_action=None, + ) + + # Create SequencePlan for each sample (all have text and vision) + sequence_plans = [ + SequencePlan( + has_text=True, + has_vision=True, + has_action=False, + condition_frame_indexes_vision=[], + condition_frame_indexes_action=[], + ) + for _ in range(3) + ] + + # Pack sequences + packed_data = 
pack_input_sequence( + sequence_plans=sequence_plans, + input_text_indexes=input_text_tokens, + gen_data_clean=gen_data_clean_images, + input_timesteps=input_timesteps, + special_tokens=special_tokens, + include_end_of_generation_token=True, + ) + + # Display results (after finalize, fields are tensors) + print(f"Packed sequence length: {packed_data.sequence_length}") + assert isinstance(packed_data.text_ids, torch.Tensor) + print(f"Packed text IDs shape: {packed_data.text_ids.shape}") + if packed_data.vision: + assert isinstance(packed_data.vision.sequence_indexes, torch.Tensor) + print(f"VAE token indexes shape: {packed_data.vision.sequence_indexes.shape}") + print(f"Packed position_ids: {packed_data.position_ids}") + + ################## + ## Video data + input_videos = torch.stack([torch.randn(3, 5, 64, 64) for _ in range(2)]) # [B, C, T, H, W] format + + # Diffusion timesteps for each video + input_timesteps_video = torch.tensor([0.5, 0.9]) + + # Create GenerationDataClean for videos + gen_data_clean_videos = GenerationDataClean( + batch_size=2, + is_image_batch=False, + raw_state_vision=input_videos, + x0_tokens_vision=torch.randn(2, 16, 2, 8, 8), # dummy tokenized latents + raw_state_action=None, + ) + + # Create SequencePlan for video samples + sequence_plans_video = [ + SequencePlan( + has_text=True, + has_vision=True, + has_action=False, + condition_frame_indexes_vision=[], + condition_frame_indexes_action=[], + ) + for _ in range(2) + ] + + # Pack sequences + packed_data = pack_input_sequence( + sequence_plans=sequence_plans_video, + input_text_indexes=input_text_tokens[0:2], + gen_data_clean=gen_data_clean_videos, + input_timesteps=input_timesteps_video, + special_tokens=special_tokens, + include_end_of_generation_token=True, + ) + + # Display results (after finalize, fields are tensors) + print(f"Packed sequence length: {packed_data.sequence_length}") + assert isinstance(packed_data.text_ids, torch.Tensor) + print(f"Packed text IDs shape: 
{packed_data.text_ids.shape}") + if packed_data.vision: + assert isinstance(packed_data.vision.sequence_indexes, torch.Tensor) + print(f"VAE token indexes shape: {packed_data.vision.sequence_indexes.shape}") + print(f"Packed position_ids: {packed_data.position_ids}") + + +def get_und_position_ids(position_ids: torch.Tensor, meta: dict[str, Any]) -> torch.Tensor: + """ + Get the understanding position ids in a sequence pack. + Args: + position_ids (torch.Tensor): The position ids. Shape (seq_len,) for 1D RoPE + or (3, seq_len) for 3D mRoPE. + meta (dict[str, Any]): The metadata. + Returns: + torch.Tensor: The understanding position ids. + """ + assert not meta["is_sharded"], "get_und_position_ids is not supported in context parallel sharded mode" + if position_ids.dim() == 2: + # 3D mRoPE: position_ids is (3, seq_len) + return position_ids[:, meta["_causal_indices"]] # [3,N_causal_tokens] + return position_ids[meta["_causal_indices"]] # [N_causal_tokens] + + +def get_gen_position_ids(position_ids: torch.Tensor, meta: dict[str, Any]) -> torch.Tensor: + """ + Get the generating position ids in a sequence pack. + Args: + position_ids (torch.Tensor): The position ids. Shape (seq_len,) for 1D RoPE + or (3, seq_len) for 3D mRoPE. + meta (dict[str, Any]): The metadata. + Returns: + torch.Tensor: The generating position ids. 
+ """ + assert not meta["is_sharded"], "get_gen_position_ids is not supported in context parallel sharded mode" + if position_ids.dim() == 2: + # 3D mRoPE: position_ids is (3, seq_len) + return position_ids[:, meta["_full_indices"]] # [3,N_full_tokens] + return position_ids[meta["_full_indices"]] # [N_full_tokens] + + +if __name__ == "__main__": + main() diff --git a/cosmos-inference/cosmos3/_src/vfm/datasets/utils.py b/cosmos-inference/cosmos3/_src/vfm/datasets/utils.py new file mode 100644 index 00000000..3d2bca39 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/datasets/utils.py @@ -0,0 +1,185 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import re +from typing import List, Tuple + +IMAGE_RES_SIZE_INFO: dict[str, dict[str, tuple[int, int]]] = { + # Our desired 256 resolution is the one below (commented). 
+ + # Desired: "256": {"1,1": (336, 336), "4,3": (384, 288), "3,4": (288, 384), "16,9": (448, 256), "9,16": (256, 448)}, + "256": { + "1,1": (256, 256), + "4,3": (320, 256), + "3,4": (256, 320), + "16,9": (320, 192), + "9,16": (192, 320), + }, + "480": {"1,1": (640, 640), "4,3": (736, 544), "3,4": (544, 736), "16,9": (832, 480), "9,16": (480, 832)}, + # 704 resolutions are nicely divisible by 32 + "704": {"1,1": (960, 960), "4,3": (1088, 832), "3,4": (832, 1088), "16,9": (1280, 704), "9,16": (704, 1280)}, + "720": {"1,1": (960, 960), "4,3": (1104, 832), "3,4": (832, 1104), "16,9": (1280, 720), "9,16": (720, 1280)}, + # 768 for arena.ai + "768": {"1,1": (1024, 1024), "4,3": (1184, 880), "3,4": (880, 1184), "16,9": (1360, 768), "9,16": (768, 1360)}, + "1080": {"1,1": (1440, 1440), "4,3": (1664, 1248), "3,4": (1248, 1664), "16,9": (1920, 1080), "9,16": (1080, 1920)}, + "1280": {"1,1": (1712, 1712), "4,3": (1968, 1472), "3,4": (1472, 1968), "16,9": (2272, 1280), "9,16": (1280, 2272)}, + "2048": { + "1,1": (2728, 2728), + "4,3": (3160, 2368), + "3,4": (2368, 3160), + "16,9": (3640, 2048), + "9,16": (2048, 3640), + }, + "gt_2048": { + "1,1": (5464, 5464), + "4,3": (6304, 4728), + "3,4": (4728, 6304), + "16,9": (7280, 4096), + "9,16": (4096, 7280), + }, +} + +VIDEO_RES_SIZE_INFO: dict[str, dict[str, tuple[int, int]]] = { + # Our desired 256 resolution is the one below (commented). 
+ + # Desired: "256": {"1,1": (336, 336), "4,3": (384, 288), "3,4": (288, 384), "16,9": (448, 256), "9,16": (256, 448)}, + "256": { + "1,1": (256, 256), + "4,3": (320, 256), + "3,4": (256, 320), + "16,9": (320, 192), + "9,16": (192, 320), + }, + "480": {"1,1": (640, 640), "4,3": (736, 544), "3,4": (544, 736), "16,9": (832, 480), "9,16": (480, 832)}, + # 704 resolutions are nicely divisible by 32 + "704": {"1,1": (960, 960), "4,3": (1088, 832), "3,4": (832, 1088), "16,9": (1280, 704), "9,16": (704, 1280)}, + "720": {"1,1": (960, 960), "4,3": (1104, 832), "3,4": (832, 1104), "16,9": (1280, 720), "9,16": (720, 1280)}, + # 768 for arena.ai + "768": {"1,1": (1024, 1024), "4,3": (1184, 880), "3,4": (880, 1184), "16,9": (1360, 768), "9,16": (768, 1360)}, + "1080": {"1,1": (1440, 1440), "4,3": (1664, 1248), "3,4": (1248, 1664), "16,9": (1920, 1080), "9,16": (1080, 1920)}, + "1280": {"1,1": (1712, 1712), "4,3": (1968, 1472), "3,4": (1472, 1968), "16,9": (2272, 1280), "9,16": (1280, 2272)}, + "2048": { + "1,1": (2728, 2728), + "4,3": (3160, 2368), + "3,4": (2368, 3160), + "16,9": (3640, 2048), + "9,16": (2048, 3640), + }, + "gt_2048": { + "1,1": (5464, 5464), + "4,3": (6304, 4728), + "3,4": (4728, 6304), + "16,9": (7280, 4096), + "9,16": (4096, 7280), + }, +} + + +def get_aspect_ratios_from_wdinfos(wdinfos: list[str]) -> list[str]: + aspect_ratios = [] + for wdinfo in wdinfos: + aspect_ratio_match = re.search(r"aspect_ratio_(\d+_\d+)", wdinfo) + aspect_ratios.append(aspect_ratio_match.group(1)) + + return aspect_ratios + + +def get_wdinfos_w_aspect_ratio(wdinfos: list[str]) -> List[Tuple[str, str]]: + aspect_ratios = get_aspect_ratios_from_wdinfos(wdinfos) + + # return a list of (wdinfo_path, aspect_ratio) pairs + return [(wdinfo, aspect_ratio.replace("_", ",")) for wdinfo, aspect_ratio in zip(wdinfos, aspect_ratios)] + + +def parse_frame_range_from_wdinfo(wdinfo: str) -> tuple[int, int] | None: + """ + Parse frame range from wdinfo path. 
+ + Args: + wdinfo: wdinfo path string containing frames_X_Y pattern + + Returns: + Tuple of (min_frames, max_frames) if found, None otherwise + + Example: + >>> parse_frame_range_from_wdinfo("wdinfo/v4/tv_drama/resolution_720/aspect_ratio_16_9/frames_300_400/wdinfo.json") + (300, 400) + """ + match = re.search(r"frames_(\d+)_(\d+)", wdinfo) + if match: + return (int(match.group(1)), int(match.group(2))) + return None + + +def filter_wdinfos_by_frame_range( + wdinfos: list[str], + min_frames: int | None = None, + max_frames: int | None = None, +) -> list[str]: + """ + Filter wdinfo files based on frame range. + + The frame range in wdinfo path (e.g., frames_300_400) represents videos + with frames between those values. This function filters wdinfo files + based on the wdinfo's upper bound (wdinfo_max): + - min_frames is EXCLUSIVE: wdinfo_max must be > min_frames + - max_frames is INCLUSIVE: wdinfo_max must be <= max_frames + + Args: + wdinfos: List of wdinfo paths + min_frames: Minimum number of frames (exclusive). If None, no lower bound. + max_frames: Maximum number of frames (inclusive). If None, no upper bound. + + Returns: + Filtered list of wdinfo paths + + Example: + >>> wdinfos = [ + ... "wdinfo/frames_400_500/wdinfo.json", + ... "wdinfo/frames_500_600/wdinfo.json", + ... "wdinfo/frames_600_700/wdinfo.json", + ... 
] + >>> filter_wdinfos_by_frame_range(wdinfos, min_frames=500, max_frames=600) + ['wdinfo/frames_500_600/wdinfo.json'] + # frames_400_500 excluded because wdinfo_max (500) <= min_frames (500) + # frames_500_600 included because wdinfo_max (600) > min_frames (500) AND <= max_frames (600) + # frames_600_700 excluded because wdinfo_max (700) > max_frames (600) + """ + if min_frames is None and max_frames is None: + return wdinfos + + filtered = [] + for wdinfo in wdinfos: + frame_range = parse_frame_range_from_wdinfo(wdinfo) + if frame_range is None: + # If no frame range in path, include by default + filtered.append(wdinfo) + continue + + wdinfo_min, wdinfo_max = frame_range + + # Filter based on wdinfo's upper bound (wdinfo_max): + # - min_frames is exclusive: wdinfo_max must be > min_frames + # - max_frames is inclusive: wdinfo_max must be <= max_frames + include = True + if min_frames is not None and wdinfo_max <= min_frames: + include = False + if max_frames is not None and wdinfo_max > max_frames: + include = False + + if include: + filtered.append(wdinfo) + + return filtered diff --git a/cosmos-inference/cosmos3/_src/vfm/diffusion/__init__.py b/cosmos-inference/cosmos3/_src/vfm/diffusion/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/diffusion/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/diffusion/rectified_flow.py b/cosmos-inference/cosmos3/_src/vfm/diffusion/rectified_flow.py new file mode 100644 index 00000000..0cb18e41 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/diffusion/rectified_flow.py @@ -0,0 +1,174 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Callable + +import torch +import torch.distributed +from diffusers import FlowMatchEulerDiscreteScheduler + +from cosmos3._src.vfm.algorithm.loss.time_weight import TrainTimeWeight + + +class TrainTimeSampler: + _WAVER_MODE_S = 1.29 + + def __init__( + self, + distribution: str = "uniform", + ): + self.distribution = distribution + + @torch.no_grad() + def __call__( + self, + batch_size: int, + device: torch.device = torch.device("cpu"), + dtype: torch.dtype = torch.float32, + generator: torch.Generator | None = None, + ) -> torch.Tensor: + """ + Sample time tensor for training + + Returns: + torch.Tensor: Time tensor, shape (batch_size,) + """ + if self.distribution == "uniform": + t = torch.rand((batch_size,), generator=generator).to(device=device, dtype=dtype) # [B] + elif self.distribution == "logitnormal": + t = torch.sigmoid(torch.randn((batch_size,), generator=generator)).to(device=device, dtype=dtype) # [B] + elif self.distribution == "waver": + u = torch.rand((batch_size,), dtype=torch.float32, generator=generator) # [B] + t = 1.0 - u - self._WAVER_MODE_S * (torch.cos(torch.pi / 2.0 * u) ** 2 - 1 + u) # [B] + t = t.to(device=device, dtype=dtype) # [B] + else: + raise NotImplementedError(f"Time distribution '{self.distribution}' is not implemented.") + + return t # [B] + + +class RectifiedFlow: + def __init__( + self, + velocity_field: Callable, + train_time_distribution: TrainTimeSampler | str = "uniform", + train_time_weight_method: str = "uniform", + use_dynamic_shift: bool = False, + shift: int = 3, + device: torch.device = torch.device("cpu"), + dtype: torch.dtype = torch.float32, + ): + r"""Initialize the RectifiedFlow class. + + Args: + velocity_field (`Callable`): + A function that predicts the velocity given the current state and time. + train_time_distribution (`TrainTimeSampler` or `str`, *optional*, defaults to `"uniform"`): + Distribution for sampling training times.
+ Can be an instance of `TrainTimeSampler` or a string specifying the distribution type. + train_time_weight_method (`str`, *optional*, defaults to `"uniform"`): + Weighting method applied to training times. + A string specifying the weight type; it is passed to `TrainTimeWeight`. + """ + self.velocity_field = velocity_field + self.train_time_sampler: TrainTimeSampler = ( + train_time_distribution + if isinstance(train_time_distribution, TrainTimeSampler) + else TrainTimeSampler(train_time_distribution) + ) + + if use_dynamic_shift: + self.noise_scheduler = FlowMatchEulerDiscreteScheduler(use_dynamic_shifting=use_dynamic_shift) + else: + self.noise_scheduler = FlowMatchEulerDiscreteScheduler(shift=shift) + self.train_time_weight = TrainTimeWeight(self.noise_scheduler, train_time_weight_method) + + self.device = torch.device(device) if isinstance(device, str) else device + self.dtype = getattr(torch, dtype) if isinstance(dtype, str) else dtype # torch.dtype is not constructible from a str + + def sample_train_time(self, batch_size: int, iteration: int | None = None) -> torch.Tensor: + r"""This method calls the `TrainTimeSampler` to sample training times. + + Args: + batch_size: Number of time values to sample. + iteration: When provided and deterministic algorithms are enabled, sampling uses a local generator seeded from + ``(iteration, rank)`` so results are identical across independent runs + regardless of prior global RNG state. + + Returns: + t (`torch.Tensor`): + A tensor of sampled training times with shape `(batch_size,)`, + matching the class specified `device` and `dtype`.
+ """ + generator = None + if iteration is not None and torch.are_deterministic_algorithms_enabled(): + rank = torch.distributed.get_rank() if torch.distributed.is_initialized() else 0 + generator = torch.Generator() + generator.manual_seed(iteration * 65536 + rank) + time = self.train_time_sampler(batch_size, device=self.device, dtype=self.dtype, generator=generator) + return time + + def get_discrete_timestamp(self, u, tensor_kwargs): + r"""This method maps time from [0, 1] to discrete steps.""" + + indices = (u.squeeze() * self.noise_scheduler.config.num_train_timesteps).long() # [B] + timesteps = self.noise_scheduler.timesteps.to(**tensor_kwargs)[indices] # [B] + return timesteps.unsqueeze(0) if timesteps.ndim == 0 else timesteps # [B] + + def get_sigmas(self, timesteps, tensor_kwargs): # timesteps: [B], returns [B] + sigmas = self.noise_scheduler.sigmas.to(**tensor_kwargs) # [N_timesteps+1] + schedule_timesteps = self.noise_scheduler.timesteps.to(**tensor_kwargs) # [N_timesteps] + step_indices = [(schedule_timesteps == t).nonzero().squeeze().tolist() for t in timesteps] + assert len(step_indices) == timesteps.shape[0], "Number of indices does not match the given timesteps." + sigma = sigmas[step_indices].flatten() # [B] + + return sigma # [B] + + def get_interpolation( + self, + x_0: list[torch.Tensor], # each element: [B,C,T,H,W] or [B,D1,...,Dn] + x_1: list[torch.Tensor], # each element: [B,C,T,H,W] or [B,D1,...,Dn] + t: list[torch.Tensor], # each element: [B] or [B,1,1,1,1] + ): + r""" + This method computes the interpolations `X_t` and their time derivatives `dotX_t` at the specified time points `t`. + Note that `x_0` is the noise, and `x_1` is the clean data. This is aligned with the notation in the rectified flow community, + but different from the notation in the diffusion community. + + Args: + x_0 (`torch.Tensor`): + noise, shape `(B, D1, D2, ..., Dn)`, where `B` is the batch size, and `D1, D2, ..., Dn` are the data dimensions.
+ x_1 (`torch.Tensor`): + clean data, with the same shape as `x_0` + t (`torch.Tensor`): + A tensor of time steps with values in `[0, 1]`. Can be shape `(B,)` or + pre-broadcast to `(B, 1, T, ..., 1)` matching `x_1`'s dimensionality along batch and temporal dimension. + + Returns: + (x_t, dot_x_t) (`Tuple[torch.Tensor, torch.Tensor]`): + - x_t (`torch.Tensor`): The interpolated state, with shape `(B, D1, D2, ..., Dn)`. + - dot_x_t (torch.Tensor): The time derivative of the interpolated state, with the same shape as `x_t`. + """ + assert len(x_0) == len(x_1), "x_0 and x_1 must have the same length." + assert len(x_0) == len(t), "x_0 and t must have the same length." + assert len(t) == len(x_1), "t and x_1 must have the same length." + + x_t = [] + dot_x_t = [] + for i in range(len(x_0)): + x_t.append(x_0[i] * t[i] + x_1[i] * (1 - t[i])) # [B,C,T,H,W]; t[i] broadcasts [B] or [B,1,1,1,1] + dot_x_t.append(x_0[i] - x_1[i]) # [B,C,T,H,W] + + return x_t, dot_x_t # each list element: [B,C,T,H,W] diff --git a/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/__init__.py b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License.
+ diff --git a/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/edm.py b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/edm.py new file mode 100644 index 00000000..3fd89190 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/edm.py @@ -0,0 +1,295 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +A general framework for various sampling algorithms for a diffusion model. +Implementation based on +* Refined Exponential Solver (RES) in https://arxiv.org/pdf/2308.02157 +* also includes other implementations: DDIM, DEIS, DPM-Solver, EDM sampler. +Most sampling algorithms (Runge-Kutta, multi-step, etc.) can be implemented in this framework by \ + adding a new step function in get_runge_kutta_fn or get_multi_step_fn.
+""" + +import math +from typing import Any, Callable, List, Literal, Optional, Tuple, Union + +import attrs +import torch + +from cosmos3._src.imaginaire.config import make_freezable +from cosmos3._src.imaginaire.functional.multi_step import get_multi_step_fn, is_multi_step_fn_supported +from cosmos3._src.imaginaire.functional.runge_kutta import get_runge_kutta_fn, is_runge_kutta_fn_supported +from cosmos3._src.imaginaire.utils import log + +COMMON_SOLVER_OPTIONS = Literal["2ab", "2mid", "1euler"] + + +@make_freezable +@attrs.define(slots=False) +class SolverConfig: + is_multi: bool = False + rk: str = "2mid" + multistep: str = "2ab" + # The following parameters control stochasticity; see the EDM paper. + # By default, sampling is deterministic (no stochasticity). + s_churn: float = 0.0 + s_t_max: float = float("inf") + s_t_min: float = 0.05 + s_noise: float = 1.0 + + +@make_freezable +@attrs.define(slots=False) +class SolverTimestampConfig: + nfe: int = 50 + t_min: float = 0.002 + t_max: float = 80.0 + order: float = 7.0 + is_forward: bool = False # whether to generate forward or backward timestamps + + +@make_freezable +@attrs.define(slots=False) +class EDMSamplerConfig: + solver: SolverConfig = attrs.field(factory=SolverConfig) + timestamps: SolverTimestampConfig = attrs.field(factory=SolverTimestampConfig) + sample_clean: bool = True # whether to run one last step to generate a clean image + convert_sigmas_to_rf: bool = True # whether to convert sigmas to RF sigmas + + +def get_rev_ts( + t_min: float, t_max: float, num_steps: int, ts_order: Union[int, float], is_forward: bool = False +) -> torch.Tensor: + """ + Generate a sequence of reverse time steps. + + Args: + t_min (float): The minimum time value. + t_max (float): The maximum time value. + num_steps (int): The number of steps; the returned tensor has num_steps + 1 entries. + ts_order (Union[int, float]): The order of the time step progression. + is_forward (bool, optional): If True, returns the sequence in forward order. Defaults to False.
+ + Returns: + torch.Tensor: A tensor containing the generated time steps in reverse or forward order. + + Raises: + ValueError: If `t_min` is not less than `t_max`. + TypeError: If `ts_order` is not an integer or float. + """ + if t_min >= t_max: + raise ValueError("t_min must be less than t_max") + + if not isinstance(ts_order, (int, float)): + raise TypeError("ts_order must be an integer or float") + + step_indices = torch.arange(num_steps + 1, dtype=torch.float64) # [num_steps+1] + time_steps = ( + t_max ** (1 / ts_order) + step_indices / num_steps * (t_min ** (1 / ts_order) - t_max ** (1 / ts_order)) + ) ** ts_order # [num_steps+1] + + if is_forward: + return time_steps.flip(dims=(0,)) # [num_steps+1] + + return time_steps # [num_steps+1] + + +class EDMSampler(torch.nn.Module): + def __init__(self, cfg: Optional[EDMSamplerConfig] = None): + super().__init__() + if cfg is None: + cfg = EDMSamplerConfig() + self.cfg = cfg + + @torch.no_grad() + def forward( + self, + x0_fn: Callable, + x_sigma_max: torch.Tensor, # [B,StateShape] + num_steps: int = 35, + sigma_min: float = 0.002, + sigma_max: float = 80, + rho: float = 7, + S_churn: float = 0, + S_min: float = 0, + S_max: float = float("inf"), + S_noise: float = 1, + solver_option: str = "2ab", + ) -> torch.Tensor: # [B,StateShape] + in_dtype = x_sigma_max.dtype + + def float64_x0_fn(x_B_StateShape: torch.Tensor, t_B: torch.Tensor) -> torch.Tensor: + return x0_fn(x_B_StateShape.to(in_dtype), t_B.to(in_dtype)).to(torch.float64) + + is_multistep = is_multi_step_fn_supported(solver_option) + is_rk = is_runge_kutta_fn_supported(solver_option) + assert is_multistep or is_rk, f"Only support multistep or Runge-Kutta method, got {solver_option}" + + solver_cfg = SolverConfig( + s_churn=S_churn, + s_t_max=S_max, + s_t_min=S_min, + s_noise=S_noise, + is_multi=is_multistep, + rk=solver_option, + multistep=solver_option, + ) + timestamps_cfg = SolverTimestampConfig(nfe=num_steps, t_min=sigma_min, t_max=sigma_max, order=rho) 
+ sampler_cfg = EDMSamplerConfig(solver=solver_cfg, timestamps=timestamps_cfg, sample_clean=True) + + return self._forward_impl(float64_x0_fn, x_sigma_max, sampler_cfg).to(in_dtype) + + @torch.no_grad() + def _forward_impl( + self, + denoiser_fn: Callable[[torch.Tensor, torch.Tensor], torch.Tensor], + noisy_input_B_StateShape: torch.Tensor, + sampler_cfg: Optional[EDMSamplerConfig] = None, + callback_fns: Optional[List[Callable]] = None, + ) -> torch.Tensor: + """ + Internal implementation of the forward pass. + + Args: + denoiser_fn: Function to denoise the input. + noisy_input_B_StateShape: Input tensor with noise. + sampler_cfg: Configuration for the sampler. + callback_fns: List of callback functions to be called during sampling. + + Returns: + torch.Tensor: Denoised output tensor. + """ + sampler_cfg = self.cfg if sampler_cfg is None else sampler_cfg + solver_order = 1 if sampler_cfg.solver.is_multi else int(sampler_cfg.solver.rk[0]) + num_timestamps = sampler_cfg.timestamps.nfe // solver_order + + sigmas_L = get_rev_ts( + sampler_cfg.timestamps.t_min, sampler_cfg.timestamps.t_max, num_timestamps, sampler_cfg.timestamps.order + ).to(noisy_input_B_StateShape.device) # [L] + + if self.cfg.convert_sigmas_to_rf: + sigmas_L = sigmas_L / (1 + sigmas_L) # [L] + + denoised_output = differential_equation_solver( + denoiser_fn, sigmas_L, sampler_cfg.solver, callback_fns=callback_fns + )(noisy_input_B_StateShape) # [B,StateShape] + + if sampler_cfg.sample_clean: + # Override denoised_output with fully denoised version + ones = torch.ones( + denoised_output.size(0), device=denoised_output.device, dtype=denoised_output.dtype + ) # [B] + denoised_output = denoiser_fn(denoised_output, sigmas_L[-1] * ones) # [B,StateShape] + + return denoised_output # [B,StateShape] + + +def fori_loop(lower: int, upper: int, body_fun: Callable[[int, Any], Any], init_val: Any) -> Any: + """ + Implements a for loop with a function. + + Args: + lower: Lower bound of the loop (inclusive). 
+ upper: Upper bound of the loop (exclusive). + body_fun: Function to be applied in each iteration. + init_val: Initial value for the loop. + + Returns: + The final result after all iterations. + """ + val = init_val + for i in range(lower, upper): + # Add log during sampling to meet APS job health requirement of one log every 2mins + if i % 10 == 0: + log.info(f"fori_loop: {i}") + val = body_fun(i, val) + return val + + +def differential_equation_solver( + x0_fn: Callable[[torch.Tensor, torch.Tensor], torch.Tensor], + sigmas_L: torch.Tensor, # [L] + solver_cfg: SolverConfig, + callback_fns: Optional[List[Callable]] = None, +) -> Callable[[torch.Tensor], torch.Tensor]: + """ + Creates a differential equation solver function. + + Args: + x0_fn: Function to compute x0 prediction. + sigmas_L: Tensor of sigma values with shape [L,]. + solver_cfg: Configuration for the solver. + callback_fns: Optional list of callback functions. + + Returns: + A function that solves the differential equation. + """ + num_step = len(sigmas_L) - 1 + + if solver_cfg.is_multi: + update_step_fn = get_multi_step_fn(solver_cfg.multistep) + else: + update_step_fn = get_runge_kutta_fn(solver_cfg.rk) + + eta = min(solver_cfg.s_churn / (num_step + 1), math.sqrt(1.2) - 1) + + def sample_fn(input_xT_B_StateShape: torch.Tensor) -> torch.Tensor: + """ + Samples from the differential equation. + + Args: + input_xT_B_StateShape: Input tensor with shape [B, StateShape]. + + Returns: + Output tensor with shape [B, StateShape]. 
+ """ + ones_B = torch.ones( + input_xT_B_StateShape.size(0), device=input_xT_B_StateShape.device, dtype=torch.float64 + ) # [B] + + def step_fn( + i_th: int, state: Tuple[torch.Tensor, Optional[List[torch.Tensor]]] + ) -> Tuple[torch.Tensor, Optional[List[torch.Tensor]]]: + input_x_B_StateShape, x0_preds = state # [B,StateShape] + sigma_cur_0, sigma_next_0 = sigmas_L[i_th], sigmas_L[i_th + 1] # scalar, scalar + + # algorithm 2: line 4-6 + if solver_cfg.s_t_min < sigma_cur_0 < solver_cfg.s_t_max: + hat_sigma_cur_0 = sigma_cur_0 + eta * sigma_cur_0 # scalar + input_x_B_StateShape = input_x_B_StateShape + ( + hat_sigma_cur_0**2 - sigma_cur_0**2 + ).sqrt() * solver_cfg.s_noise * torch.randn_like(input_x_B_StateShape) # [B,StateShape] + sigma_cur_0 = hat_sigma_cur_0 # scalar + + if solver_cfg.is_multi: + x0_pred_B_StateShape = x0_fn(input_x_B_StateShape, sigma_cur_0 * ones_B) # [B,StateShape]; sigma: [B] + output_x_B_StateShape, x0_preds = update_step_fn( + input_x_B_StateShape, sigma_cur_0 * ones_B, sigma_next_0 * ones_B, x0_pred_B_StateShape, x0_preds + ) # [B,StateShape] + else: + output_x_B_StateShape, x0_preds = update_step_fn( + input_x_B_StateShape, sigma_cur_0 * ones_B, sigma_next_0 * ones_B, x0_fn + ) # [B,StateShape] + + if callback_fns: + for callback_fn in callback_fns: + callback_fn(**locals()) + + return output_x_B_StateShape, x0_preds + + x_at_eps, _ = fori_loop(0, num_step, step_fn, [input_xT_B_StateShape, None]) + return x_at_eps # [B,StateShape] + + return sample_fn diff --git a/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/fixed_step.py b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/fixed_step.py new file mode 100644 index 00000000..7b05b720 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/fixed_step.py @@ -0,0 +1,143 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Fixed-step sampler for DMD2-distilled student models. + +Uses an explicit, fixed sigma schedule (t_list) baked in at construction time. +Each step predicts x0 via a single velocity forward pass, then either: + - ODE: Euler step x_next = x_t + (sigma_next - sigma_cur) * v + - SDE: re-noise x0 to sigma_next with fresh noise + +This is incompatible with multi-step solvers (UniPC, EDM) because DMD2 students +are trained as one-shot denoisers at specific discrete sigmas, not as smooth +score functions. + +When ``shift`` is passed at call time, the schedule is derived dynamically via +the flow-matching shift formula (same as UniPC): + sigmas = shift * s / (1 + (shift - 1) * s), s = linspace(sigma_max, sigma_min, num_steps) +In this case ``num_steps`` is required. Otherwise ``self.t_list`` is used. 
+""" + +import torch + +from cosmos3._src.vfm.diffusion.samplers.utils import run_multiseed + + +class FixedStepSampler: + def __init__( + self, + t_list: list[float], + sample_type: str = "ode", + num_train_timesteps: float = 1000.0, + ) -> None: + assert len(t_list) >= 1, "t_list must have at least 1 entry" + assert sample_type in ("ode", "sde"), f"sample_type must be 'ode' or 'sde', got {sample_type}" + # Auto-append 0.0 if not present (convention: t_list in config excludes final step) + self.t_list = t_list if t_list[-1] == 0.0 else t_list + [0.0] + assert len(self.t_list) >= 2, "t_list must have at least 2 entries after appending 0.0" + self.sample_type = sample_type + self.num_train_timesteps = num_train_timesteps + + def _build_t_list(self, num_steps: int, shift: float, device: torch.device) -> list[float]: + """Compute a shifted sigma schedule with ``num_steps`` integration steps.""" + sigma_max = 1.0 + sigma_min = 1.0 / self.num_train_timesteps + sigmas = torch.linspace(sigma_max, sigma_min, num_steps, device=device) + sigmas = shift * sigmas / (1 + (shift - 1) * sigmas) + return sigmas.tolist() + [0.0] + + def __call__( + self, + velocity_fn, + noise: torch.Tensor | list[torch.Tensor], + num_steps: int | None = None, + shift: float | None = None, + seed: int | list[int] | None = None, + ) -> torch.Tensor | list[torch.Tensor]: + """Run the fixed-step sampling loop. + + Matches the UniPC sampler call signature so both can be used + interchangeably in ``generate_samples_from_batch``. + + ``noise`` and ``seed`` must both be single values or both be lists + (of the same length). When lists are provided, each element + corresponds to one independent sample; the return value is then a + list of denoised tensors. When single values are provided, a + single tensor is returned. + + Args: + velocity_fn: ``velocity_fn(noise=..., timestep=...) -> velocity``. + noise: Initial noise. 
Either a single ``torch.Tensor`` of shape + ``(D,)`` or a ``list[torch.Tensor]`` where each element has + shape ``(D,)``. + seed: RNG seed for SDE mode. Either a single ``int`` or a + ``list[int]`` with the same length as ``noise``. + num_steps: Number of denoising steps. Required when ``shift`` is + given; optional otherwise (asserted to equal + ``len(t_list) - 1`` when provided). + shift: When set, derive the sigma schedule dynamically using the + flow-matching shift formula instead of ``self.t_list``. + + Returns: + Denoised sample(s). A single ``torch.Tensor`` when ``noise`` is a + tensor, or a ``list[torch.Tensor]`` when ``noise`` is a list. + """ + if isinstance(noise, list): + device = noise[0].device + else: + device = noise.device + + if shift is not None: + assert num_steps is not None, "num_steps is required when shift is provided" + t_list = self._build_t_list(num_steps, shift, device) + else: + if num_steps is not None: + assert num_steps == len(self.t_list) - 1, ( + f"num_steps={num_steps} must match the schedule length len(t_list)-1={len(self.t_list) - 1}" + ) + t_list = self.t_list + + latent = noise + + for step_idx, (sigma_cur, sigma_next) in enumerate( + zip(t_list[:-1], t_list[1:]), + ): + timestep = torch.tensor(sigma_cur * self.num_train_timesteps, device=device) + v_pred = velocity_fn(latent, timestep.reshape(1, 1)) + + def _sde_step(seed: int | None, latent: torch.Tensor, v_pred: torch.Tensor) -> torch.Tensor: + x0_pred = latent - sigma_cur * v_pred + + if sigma_next > 0: + if self.sample_type == "ode": + # Euler ODE step + latent = latent + (sigma_next - sigma_cur) * v_pred + else: + if seed is not None: + torch.manual_seed(seed + step_idx) + eps_fresh = torch.randn_like(x0_pred) + latent = (1.0 - sigma_next) * x0_pred + sigma_next * eps_fresh + else: + latent = x0_pred + return latent + + latent = run_multiseed( + _sde_step, + seed=seed, + latent=latent, + v_pred=v_pred, + ) + + return latent diff --git 
a/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/fm_solvers_unipc.py b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/fm_solvers_unipc.py new file mode 100644 index 00000000..606b9bbd --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/fm_solvers_unipc.py @@ -0,0 +1,768 @@ +# Copied from https://github.com/huggingface/diffusers/blob/v0.31.0/src/diffusers/schedulers/scheduling_unipc_multistep.py +# Convert unipc for flow matching +# Copyright 2024-2025 The Alibaba Wan Team Authors. All rights reserved. + +import math +from typing import List, Optional, Tuple, Union + +import numpy as np +import torch +from diffusers.configuration_utils import ConfigMixin, register_to_config +from diffusers.schedulers.scheduling_utils import KarrasDiffusionSchedulers, SchedulerMixin, SchedulerOutput +from diffusers.utils import deprecate + + +class FlowUniPCMultistepScheduler(SchedulerMixin, ConfigMixin): + """ + `UniPCMultistepScheduler` is a training-free framework designed for the fast sampling of diffusion models. + + This model inherits from [`SchedulerMixin`] and [`ConfigMixin`]. Check the superclass documentation for the generic + methods the library implements for all schedulers such as loading and saving. + + Args: + num_train_timesteps (`int`, defaults to 1000): + The number of diffusion steps to train the model. + solver_order (`int`, default `2`): + The UniPC order which can be any positive integer. The effective order of accuracy is `solver_order + 1` + due to the UniC. It is recommended to use `solver_order=2` for guided sampling, and `solver_order=3` for + unconditional sampling. + prediction_type (`str`, defaults to "flow_prediction"): + Prediction type of the scheduler function; must be `flow_prediction` for this scheduler, which predicts + the flow of the diffusion process. + thresholding (`bool`, defaults to `False`): + Whether to use the "dynamic thresholding" method. 
This is unsuitable for latent-space diffusion models such + as Stable Diffusion. + dynamic_thresholding_ratio (`float`, defaults to 0.995): + The ratio for the dynamic thresholding method. Valid only when `thresholding=True`. + sample_max_value (`float`, defaults to 1.0): + The threshold value for dynamic thresholding. Valid only when `thresholding=True` and `predict_x0=True`. + predict_x0 (`bool`, defaults to `True`): + Whether to use the updating algorithm on the predicted x0. + solver_type (`str`, default `bh2`): + Solver type for UniPC. It is recommended to use `bh1` for unconditional sampling when steps < 10, and `bh2` + otherwise. + lower_order_final (`bool`, default `True`): + Whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps. This can + stabilize the sampling of DPMSolver for steps < 15, especially for steps <= 10. + disable_corrector (`list`, default `[]`): + Decides which step to disable the corrector to mitigate the misalignment between `epsilon_theta(x_t, c)` + and `epsilon_theta(x_t^c, c)` which can influence convergence for a large guidance scale. Corrector is + usually disabled during the first few steps. + solver_p (`SchedulerMixin`, default `None`): + Any other scheduler that if specified, the algorithm becomes `solver_p + UniC`. + use_karras_sigmas (`bool`, *optional*, defaults to `False`): + Whether to use Karras sigmas for step sizes in the noise schedule during the sampling process. If `True`, + the sigmas are determined according to a sequence of noise levels {σi}. + use_exponential_sigmas (`bool`, *optional*, defaults to `False`): + Whether to use exponential sigmas for step sizes in the noise schedule during the sampling process. + timestep_spacing (`str`, defaults to `"linspace"`): + The way the timesteps should be scaled. Refer to Table 2 of the [Common Diffusion Noise Schedules and + Sample Steps are Flawed](https://huggingface.co/papers/2305.08891) for more information. 
+ steps_offset (`int`, defaults to 0): + An offset added to the inference steps, as required by some model families. + final_sigmas_type (`str`, defaults to `"zero"`): + The final `sigma` value for the noise schedule during the sampling process. If `"sigma_min"`, the final + sigma is the same as the last sigma in the training schedule. If `zero`, the final sigma is set to 0. + """ + + _compatibles = [e.name for e in KarrasDiffusionSchedulers] + order = 1 + + @register_to_config + def __init__( + self, + num_train_timesteps: int = 1000, + solver_order: int = 2, + prediction_type: str = "flow_prediction", + shift: Optional[float] = 1.0, + use_dynamic_shifting=False, + thresholding: bool = False, + dynamic_thresholding_ratio: float = 0.995, + sample_max_value: float = 1.0, + predict_x0: bool = True, + solver_type: str = "bh2", + lower_order_final: bool = True, + disable_corrector: List[int] = [], + solver_p: SchedulerMixin = None, + timestep_spacing: str = "linspace", + steps_offset: int = 0, + final_sigmas_type: Optional[str] = "zero", # "zero", "sigma_min" + ): + if solver_type not in ["bh1", "bh2"]: + if solver_type in ["midpoint", "heun", "logrho"]: + self.register_to_config(solver_type="bh2") + else: + raise NotImplementedError(f"{solver_type} is not implemented for {self.__class__}") + + self.predict_x0 = predict_x0 + # setable values + self.num_inference_steps = None + alphas = np.linspace(1, 1 / num_train_timesteps, num_train_timesteps)[::-1].copy() + sigmas = 1.0 - alphas + sigmas = torch.from_numpy(sigmas).to(dtype=torch.float32) # [num_train_timesteps] + + if not use_dynamic_shifting: + # when use_dynamic_shifting is True, we apply the timestep shifting on the fly based on the image resolution + sigmas = shift * sigmas / (1 + (shift - 1) * sigmas) # [num_train_timesteps] # pyright: ignore + + self.sigmas = sigmas # [num_train_timesteps] + self.timesteps = sigmas * num_train_timesteps # [num_train_timesteps] + + self.model_outputs = [None] * solver_order + 
self.timestep_list = [None] * solver_order + self.lower_order_nums = 0 + self.disable_corrector = disable_corrector + self.solver_p = solver_p + self.last_sample = None + self._step_index = None + self._begin_index = None + + self.sigmas = self.sigmas.to("cpu") # to avoid too much CPU/GPU communication + self.sigma_min = self.sigmas[-1].item() + self.sigma_max = self.sigmas[0].item() + + @property + def step_index(self): + """ + The index counter for current timestep. It will increase 1 after each scheduler step. + """ + return self._step_index + + @property + def begin_index(self): + """ + The index for the first timestep. It should be set from pipeline with `set_begin_index` method. + """ + return self._begin_index + + # Copied from diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler.set_begin_index + def set_begin_index(self, begin_index: int = 0): + """ + Sets the begin index for the scheduler. This function should be run from pipeline before the inference. + + Args: + begin_index (`int`): + The begin index for the scheduler. + """ + self._begin_index = begin_index + + # Modified from diffusers.schedulers.scheduling_flow_match_euler_discrete.FlowMatchEulerDiscreteScheduler.set_timesteps + def set_timesteps( + self, + num_inference_steps: Union[int, None] = None, + device: Union[str, torch.device] = None, + sigmas: Optional[List[float]] = None, + mu: Optional[Union[float, None]] = None, + shift: Optional[Union[float, None]] = None, + use_kerras_sigma: bool = False, + ): + """ + Sets the discrete timesteps used for the diffusion chain (to be run before inference). + Args: + num_inference_steps (`int`): + Total number of the spacing of the time steps. + device (`str` or `torch.device`, *optional*): + The device to which the timesteps should be moved to. If `None`, the timesteps are not moved. 
+ """ + if self.config.use_dynamic_shifting and mu is None: + raise ValueError(" you have to pass a value for `mu` when `use_dynamic_shifting` is set to be `True`") + + if use_kerras_sigma: + # force to use the exact sigma used in edm sampler + sigma_max = 200 + sigma_min = 0.01 + rho = 7 + sigmas = np.arange(num_inference_steps + 1) / num_inference_steps + min_inv_rho = sigma_min ** (1 / rho) + max_inv_rho = sigma_max ** (1 / rho) + sigmas = (max_inv_rho + sigmas * (min_inv_rho - max_inv_rho)) ** rho + sigmas = sigmas / (1 + sigmas) + else: + if sigmas is None: + sigmas = np.linspace(self.sigma_max, self.sigma_min, num_inference_steps + 1).copy()[:-1] # pyright: ignore + + if self.config.use_dynamic_shifting: + sigmas = self.time_shift(mu, 1.0, sigmas) # pyright: ignore + else: + if shift is None: + shift = self.config.shift + sigmas = shift * sigmas / (1 + (shift - 1) * sigmas) # pyright: ignore + + if self.config.final_sigmas_type == "sigma_min": + sigma_last = ((1 - self.alphas_cumprod[0]) / self.alphas_cumprod[0]) ** 0.5 + elif self.config.final_sigmas_type == "zero": + sigma_last = 0 + else: + raise ValueError( + f"`final_sigmas_type` must be one of 'zero', or 'sigma_min', but got {self.config.final_sigmas_type}" + ) + + timesteps = sigmas * self.config.num_train_timesteps + sigmas = np.concatenate([sigmas, [sigma_last]]).astype(np.float32) # pyright: ignore + + self.sigmas = torch.from_numpy(sigmas) # [num_inference_steps+1] + self.timesteps = torch.from_numpy(timesteps).to(device=device, dtype=torch.int64) # [num_inference_steps] + + self.num_inference_steps = len(timesteps) + + self.model_outputs = [ + None, + ] * self.config.solver_order + self.lower_order_nums = 0 + self.last_sample = None + if self.solver_p: + self.solver_p.set_timesteps(self.num_inference_steps, device=device) + + # add an index counter for schedulers that allow duplicated timesteps + self._step_index = None + self._begin_index = None + self.sigmas = self.sigmas.to("cpu") # to avoid 
too much CPU/GPU communication + + # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample + def _threshold_sample(self, sample: torch.Tensor) -> torch.Tensor: + """ + "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the + prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by + s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing + pixels from saturation at each step. We find that dynamic thresholding results in significantly better + photorealism as well as better image-text alignment, especially when using very large guidance weights." + + https://arxiv.org/abs/2205.11487 + """ + dtype = sample.dtype + batch_size, channels, *remaining_dims = sample.shape + + if dtype not in (torch.float32, torch.float64): + sample = sample.float() # upcast for quantile calculation, and clamp not implemented for cpu half + + # Flatten sample for doing quantile calculation along each image + sample = sample.reshape(batch_size, channels * np.prod(remaining_dims)) # [B,C*spatial] + + abs_sample = sample.abs() # "a certain percentile absolute pixel value" # [B,C*spatial] + + s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1) # [B] + s = torch.clamp( + s, min=1, max=self.config.sample_max_value + ) # When clamped to min=1, equivalent to standard clipping to [-1, 1] # [B] + s = s.unsqueeze(1) # [B,1] + sample = ( + torch.clamp(sample, -s, s) / s + ) # "we threshold xt0 to the range [-s, s] and then divide by s" # [B,C*spatial] + + sample = sample.reshape(batch_size, channels, *remaining_dims) # [B,C,...] 
+ sample = sample.to(dtype) + + return sample + + # Copied from diffusers.schedulers.scheduling_flow_match_euler_discrete.FlowMatchEulerDiscreteScheduler._sigma_to_t + def _sigma_to_t(self, sigma): + return sigma * self.config.num_train_timesteps + + def _sigma_to_alpha_sigma_t(self, sigma): + return 1 - sigma, sigma + + # Copied from diffusers.schedulers.scheduling_flow_match_euler_discrete.set_timesteps + def time_shift(self, mu: float, sigma: float, t: torch.Tensor): + return math.exp(mu) / (math.exp(mu) + (1 / t - 1) ** sigma) + + def convert_model_output( + self, + model_output: torch.Tensor, + *args, + sample: torch.Tensor = None, + **kwargs, + ) -> torch.Tensor: + r""" + Convert the model output to the corresponding type the UniPC algorithm needs. + + Args: + model_output (`torch.Tensor`): + The direct output from the learned diffusion model. + timestep (`int`): + The current discrete timestep in the diffusion chain. + sample (`torch.Tensor`): + A current instance of a sample created by the diffusion process. + + Returns: + `torch.Tensor`: + The converted model output. 
+ """ + timestep = args[0] if len(args) > 0 else kwargs.pop("timestep", None) + if sample is None: + if len(args) > 1: + sample = args[1] + else: + raise ValueError("missing `sample` as a required keyword argument") + if timestep is not None: + deprecate( + "timesteps", + "1.0.0", + "Passing `timesteps` is deprecated and has no effect as model output conversion is now handled via an internal counter `self.step_index`", + ) + + sigma = self.sigmas[self.step_index] + alpha_t, sigma_t = self._sigma_to_alpha_sigma_t(sigma) + + # print("sigma_t ==>", self.step_index, sigma, sigma_t, alpha_t, sample.shape, model_output.shape) + if self.predict_x0: + if self.config.prediction_type == "flow_prediction": + sigma_t = self.sigmas[self.step_index] + x0_pred = sample - sigma_t * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`," + " `v_prediction` or `flow_prediction` for the UniPCMultistepScheduler." + ) + + if self.config.thresholding: + x0_pred = self._threshold_sample(x0_pred) + # print("self.config.thresholding", self.config.thresholding) + return x0_pred + else: + if self.config.prediction_type == "flow_prediction": + sigma_t = self.sigmas[self.step_index] + # x_t = (1 - sigma_t) * x0 + sigma_t * eps and v = eps - x0, so eps = x_t + (1 - sigma_t) * v + epsilon = sample + (1 - sigma_t) * model_output + else: + raise ValueError( + f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample`," + " `v_prediction` or `flow_prediction` for the UniPCMultistepScheduler." + ) + + if self.config.thresholding: + sigma_t = self.sigmas[self.step_index] + x0_pred = sample - sigma_t * model_output + x0_pred = self._threshold_sample(x0_pred) + epsilon = model_output + x0_pred + + return epsilon + + def multistep_uni_p_bh_update( + self, + model_output: torch.Tensor, + *args, + sample: torch.Tensor = None, + order: int = None, # pyright: ignore + **kwargs, + ) -> torch.Tensor: + """ + One step for the UniP (B(h) version). 
Alternatively, `self.solver_p` is used if it is specified. + + Args: + model_output (`torch.Tensor`): + The direct output from the learned diffusion model at the current timestep. + prev_timestep (`int`): + The previous discrete timestep in the diffusion chain. + sample (`torch.Tensor`): + A current instance of a sample created by the diffusion process. + order (`int`): + The order of UniP at this timestep (corresponds to the *p* in UniPC-p). + + Returns: + `torch.Tensor`: + The sample tensor at the previous timestep. + """ + prev_timestep = args[0] if len(args) > 0 else kwargs.pop("prev_timestep", None) + if sample is None: + if len(args) > 1: + sample = args[1] + else: + raise ValueError("missing `sample` as a required keyword argument") + if order is None: + if len(args) > 2: + order = args[2] + else: + raise ValueError("missing `order` as a required keyword argument") + if prev_timestep is not None: + deprecate( + "prev_timestep", + "1.0.0", + "Passing `prev_timestep` is deprecated and has no effect as model output conversion is now handled via an internal counter `self.step_index`", + ) + model_output_list = self.model_outputs + + s0 = self.timestep_list[-1] + m0 = model_output_list[-1] + x = sample + + if self.solver_p: + x_t = self.solver_p.step(model_output, s0, x).prev_sample + return x_t + + sigma_t, sigma_s0 = self.sigmas[self.step_index + 1], self.sigmas[self.step_index] # pyright: ignore + alpha_t, sigma_t = self._sigma_to_alpha_sigma_t(sigma_t) + alpha_s0, sigma_s0 = self._sigma_to_alpha_sigma_t(sigma_s0) + + lambda_t = torch.log(alpha_t) - torch.log(sigma_t) + lambda_s0 = torch.log(alpha_s0) - torch.log(sigma_s0) + + h = lambda_t - lambda_s0 + device = sample.device + + rks = [] + D1s = [] + for i in range(1, order): + si = self.step_index - i # pyright: ignore + mi = model_output_list[-(i + 1)] + alpha_si, sigma_si = self._sigma_to_alpha_sigma_t(self.sigmas[si]) + lambda_si = torch.log(alpha_si) - torch.log(sigma_si) + rk = (lambda_si - lambda_s0) / 
h + rks.append(rk) + D1s.append((mi - m0) / rk) # pyright: ignore + + rks.append(1.0) + rks = torch.tensor(rks, device=device) # [order] + + R = [] + b = [] + + hh = -h if self.predict_x0 else h + h_phi_1 = torch.expm1(hh) # h\phi_1(h) = e^h - 1 + h_phi_k = h_phi_1 / hh - 1 + + factorial_i = 1 + + if self.config.solver_type == "bh1": + B_h = hh + elif self.config.solver_type == "bh2": + B_h = torch.expm1(hh) + else: + raise NotImplementedError() + + for i in range(1, order + 1): + R.append(torch.pow(rks, i - 1)) # [order] + b.append(h_phi_k * factorial_i / B_h) + factorial_i *= i + 1 + h_phi_k = h_phi_k / hh - 1 / factorial_i + + R = torch.stack(R) # [order,order] + b = torch.tensor(b, device=device) # [order] + + if len(D1s) > 0: + D1s = torch.stack(D1s, dim=1) # [B,order-1,C,T,H,W] + # for order 2, we use a simplified version + if order == 2: + rhos_p = torch.tensor([0.5], dtype=x.dtype, device=device) # [1] + else: + rhos_p = torch.linalg.solve(R[:-1, :-1], b[:-1]).to(device).to(x.dtype) # [order-1] + else: + D1s = None + + if self.predict_x0: + x_t_ = sigma_t / sigma_s0 * x - alpha_t * h_phi_1 * m0 # [B,C,T,H,W] + if D1s is not None: + pred_res = torch.einsum("k,bkc...->bc...", rhos_p, D1s) # [B,C,T,H,W] # pyright: ignore + else: + pred_res = 0 + x_t = x_t_ - alpha_t * B_h * pred_res # [B,C,T,H,W] + else: + x_t_ = alpha_t / alpha_s0 * x - sigma_t * h_phi_1 * m0 # [B,C,T,H,W] + if D1s is not None: + pred_res = torch.einsum("k,bkc...->bc...", rhos_p, D1s) # [B,C,T,H,W] # pyright: ignore + else: + pred_res = 0 + x_t = x_t_ - sigma_t * B_h * pred_res # [B,C,T,H,W] + + x_t = x_t.to(x.dtype) # [B,C,T,H,W] + return x_t # [B,C,T,H,W] + + def multistep_uni_c_bh_update( + self, + this_model_output: torch.Tensor, + *args, + last_sample: torch.Tensor = None, + this_sample: torch.Tensor = None, + order: int = None, # pyright: ignore + **kwargs, + ) -> torch.Tensor: + """ + One step for the UniC (B(h) version). 
+ + Args: + this_model_output (`torch.Tensor`): + The model outputs at `x_t`. + this_timestep (`int`): + The current timestep `t`. + last_sample (`torch.Tensor`): + The generated sample before the last predictor `x_{t-1}`. + this_sample (`torch.Tensor`): + The generated sample after the last predictor `x_{t}`. + order (`int`): + The `p` of UniC-p at this step. The effective order of accuracy should be `order + 1`. + + Returns: + `torch.Tensor`: + The corrected sample tensor at the current timestep. + """ + this_timestep = args[0] if len(args) > 0 else kwargs.pop("this_timestep", None) + if last_sample is None: + if len(args) > 1: + last_sample = args[1] + else: + raise ValueError("missing `last_sample` as a required keyword argument") + if this_sample is None: + if len(args) > 2: + this_sample = args[2] + else: + raise ValueError("missing `this_sample` as a required keyword argument") + if order is None: + if len(args) > 3: + order = args[3] + else: + raise ValueError("missing `order` as a required keyword argument") + if this_timestep is not None: + deprecate( + "this_timestep", + "1.0.0", + "Passing `this_timestep` is deprecated and has no effect as model output conversion is now handled via an internal counter `self.step_index`", + ) + + model_output_list = self.model_outputs + + m0 = model_output_list[-1] + x = last_sample + x_t = this_sample + model_t = this_model_output + + sigma_t, sigma_s0 = self.sigmas[self.step_index], self.sigmas[self.step_index - 1] # pyright: ignore + alpha_t, sigma_t = self._sigma_to_alpha_sigma_t(sigma_t) + alpha_s0, sigma_s0 = self._sigma_to_alpha_sigma_t(sigma_s0) + + lambda_t = torch.log(alpha_t) - torch.log(sigma_t) + lambda_s0 = torch.log(alpha_s0) - torch.log(sigma_s0) + + h = lambda_t - lambda_s0 + device = this_sample.device + + rks = [] + D1s = [] + for i in range(1, order): + si = self.step_index - (i + 1) # pyright: ignore + mi = model_output_list[-(i + 1)] + alpha_si, sigma_si = 
self._sigma_to_alpha_sigma_t(self.sigmas[si]) + lambda_si = torch.log(alpha_si) - torch.log(sigma_si) + rk = (lambda_si - lambda_s0) / h + rks.append(rk) + D1s.append((mi - m0) / rk) # pyright: ignore + + rks.append(1.0) + rks = torch.tensor(rks, device=device) # [order] + + R = [] + b = [] + + hh = -h if self.predict_x0 else h + h_phi_1 = torch.expm1(hh) # h\phi_1(h) = e^h - 1 + h_phi_k = h_phi_1 / hh - 1 + + factorial_i = 1 + + if self.config.solver_type == "bh1": + B_h = hh + elif self.config.solver_type == "bh2": + B_h = torch.expm1(hh) + else: + raise NotImplementedError() + + for i in range(1, order + 1): + R.append(torch.pow(rks, i - 1)) # [order] + b.append(h_phi_k * factorial_i / B_h) + factorial_i *= i + 1 + h_phi_k = h_phi_k / hh - 1 / factorial_i + + R = torch.stack(R) # [order,order] + b = torch.tensor(b, device=device) # [order] + + if len(D1s) > 0: + D1s = torch.stack(D1s, dim=1) # [B,order-1,C,T,H,W] + else: + D1s = None + + # for order 1, we use a simplified version + if order == 1: + rhos_c = torch.tensor([0.5], dtype=x.dtype, device=device) # [1] + else: + rhos_c = torch.linalg.solve(R, b).to(device).to(x.dtype) # [order] + + if self.predict_x0: + x_t_ = sigma_t / sigma_s0 * x - alpha_t * h_phi_1 * m0 # [B,C,T,H,W] + if D1s is not None: + corr_res = torch.einsum("k,bkc...->bc...", rhos_c[:-1], D1s) # [B,C,T,H,W] + else: + corr_res = 0 + D1_t = model_t - m0 # [B,C,T,H,W] + x_t = x_t_ - alpha_t * B_h * (corr_res + rhos_c[-1] * D1_t) # [B,C,T,H,W] + else: + x_t_ = alpha_t / alpha_s0 * x - sigma_t * h_phi_1 * m0 # [B,C,T,H,W] + if D1s is not None: + corr_res = torch.einsum("k,bkc...->bc...", rhos_c[:-1], D1s) # [B,C,T,H,W] + else: + corr_res = 0 + D1_t = model_t - m0 # [B,C,T,H,W] + x_t = x_t_ - sigma_t * B_h * (corr_res + rhos_c[-1] * D1_t) # [B,C,T,H,W] + x_t = x_t.to(x.dtype) # [B,C,T,H,W] + return x_t # [B,C,T,H,W] + + def index_for_timestep(self, timestep, schedule_timesteps=None): + if schedule_timesteps is None: + schedule_timesteps = 
self.timesteps + + indices = (schedule_timesteps == timestep).nonzero() + + # The sigma index that is taken for the **very** first `step` + # is always the second index (or the last index if there is only 1) + # This way we can ensure we don't accidentally skip a sigma in + # case we start in the middle of the denoising schedule (e.g. for image-to-image) + pos = 1 if len(indices) > 1 else 0 + + return indices[pos].item() + + # Copied from diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler._init_step_index + def _init_step_index(self, timestep): + """ + Initialize the step_index counter for the scheduler. + """ + + if self.begin_index is None: + if isinstance(timestep, torch.Tensor): + timestep = timestep.to(self.timesteps.device) + self._step_index = self.index_for_timestep(timestep) + else: + self._step_index = self._begin_index + + def step( + self, + model_output: torch.Tensor, + timestep: Union[int, torch.Tensor], + sample: torch.Tensor, + return_dict: bool = True, + generator=None, + ) -> Union[SchedulerOutput, Tuple]: + """ + Predict the sample from the previous timestep by reversing the SDE. This function propagates the sample with + the multistep UniPC. + + Args: + model_output (`torch.Tensor`): + The direct output from learned diffusion model. + timestep (`int`): + The current discrete timestep in the diffusion chain. + sample (`torch.Tensor`): + A current instance of a sample created by the diffusion process. + return_dict (`bool`): + Whether or not to return a [`~schedulers.scheduling_utils.SchedulerOutput`] or `tuple`. + + Returns: + [`~schedulers.scheduling_utils.SchedulerOutput`] or `tuple`: + If return_dict is `True`, [`~schedulers.scheduling_utils.SchedulerOutput`] is returned, otherwise a + tuple is returned where the first element is the sample tensor. 
+ + """ + if self.num_inference_steps is None: + raise ValueError( + "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler" + ) + + if self.step_index is None: + self._init_step_index(timestep) + + # print("self.step_index ==> ", self.step_index) + + use_corrector = ( + self.step_index > 0 and self.step_index - 1 not in self.disable_corrector and self.last_sample is not None # pyright: ignore + ) + + model_output_convert = self.convert_model_output(model_output, sample=sample) + + if use_corrector: + sample = self.multistep_uni_c_bh_update( + this_model_output=model_output_convert, + last_sample=self.last_sample, + this_sample=sample, + order=self.this_order, + ) + + for i in range(self.config.solver_order - 1): + self.model_outputs[i] = self.model_outputs[i + 1] + self.timestep_list[i] = self.timestep_list[i + 1] + + self.model_outputs[-1] = model_output_convert + self.timestep_list[-1] = timestep # pyright: ignore + + if self.config.lower_order_final: + this_order = min(self.config.solver_order, len(self.timesteps) - self.step_index) # pyright: ignore + else: + this_order = self.config.solver_order + + self.this_order = min(this_order, self.lower_order_nums + 1) # warmup for multistep + assert self.this_order > 0 + + self.last_sample = sample + prev_sample = self.multistep_uni_p_bh_update( + model_output=model_output, # pass the original non-converted model output, in case solver-p is used + sample=sample, + order=self.this_order, + ) + + if self.lower_order_nums < self.config.solver_order: + self.lower_order_nums += 1 + + # upon completion increase step index by one + self._step_index += 1 # pyright: ignore + + if not return_dict: + return (prev_sample, model_output_convert) + + return SchedulerOutput(prev_sample=prev_sample) + + def scale_model_input(self, sample: torch.Tensor, *args, **kwargs) -> torch.Tensor: + """ + Ensures interchangeability with schedulers that need to scale the denoising model input depending 
on the + current timestep. + + Args: + sample (`torch.Tensor`): + The input sample. + + Returns: + `torch.Tensor`: + A scaled input sample. + """ + return sample + + # Copied from diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler.add_noise + def add_noise( + self, + original_samples: torch.Tensor, + noise: torch.Tensor, + timesteps: torch.IntTensor, + ) -> torch.Tensor: + # Make sure sigmas and timesteps have the same device and dtype as original_samples + sigmas = self.sigmas.to(device=original_samples.device, dtype=original_samples.dtype) + if original_samples.device.type == "mps" and torch.is_floating_point(timesteps): + # mps does not support float64 + schedule_timesteps = self.timesteps.to(original_samples.device, dtype=torch.float32) + timesteps = timesteps.to(original_samples.device, dtype=torch.float32) + else: + schedule_timesteps = self.timesteps.to(original_samples.device) + timesteps = timesteps.to(original_samples.device) + + # begin_index is None when the scheduler is used for training or pipeline does not implement set_begin_index + if self.begin_index is None: + step_indices = [self.index_for_timestep(t, schedule_timesteps) for t in timesteps] + elif self.step_index is not None: + # add_noise is called after first denoising step (for inpainting) + step_indices = [self.step_index] * timesteps.shape[0] + else: + # add noise is called before first denoising step to create initial latent(img2img) + step_indices = [self.begin_index] * timesteps.shape[0] + + sigma = sigmas[step_indices].flatten() # [B] + while len(sigma.shape) < len(original_samples.shape): + sigma = sigma.unsqueeze(-1) # [B,1,...] 
broadcast-ready + + alpha_t, sigma_t = self._sigma_to_alpha_sigma_t(sigma) + noisy_samples = alpha_t * original_samples + sigma_t * noise # [B,C,T,H,W] + return noisy_samples + + def __len__(self): + return self.config.num_train_timesteps diff --git a/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/unipc.py b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/unipc.py new file mode 100644 index 00000000..f166c49e --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/unipc.py @@ -0,0 +1,124 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Callable, Optional + +import attrs +import torch + +from cosmos3._src.imaginaire.config import make_freezable +from cosmos3._src.imaginaire.utils.progress_bar import progress_bar +from cosmos3._src.vfm.diffusion.samplers.fm_solvers_unipc import FlowUniPCMultistepScheduler +from cosmos3._src.vfm.diffusion.samplers.utils import run_multiseed + + +@make_freezable +@attrs.define(slots=False) +class UniPCSamplerConfig: + num_train_timesteps: int = 1000 + shift: float = 1.0 + use_dynamic_shifting: bool = False + + +class UniPCSampler(torch.nn.Module): + def __init__(self, cfg: Optional[UniPCSamplerConfig] = None, tensor_kwargs: Optional[dict] = None): + super().__init__() + if cfg is None: + cfg = UniPCSamplerConfig() + self.cfg = cfg + self.tensor_kwargs = tensor_kwargs + + @torch.no_grad() + def forward( + self, + velocity_fn: Callable, + noise: torch.Tensor | list[torch.Tensor], + num_steps: int = 35, + shift: float | None = None, + seed: int | list[int] | None = None, + ) -> torch.Tensor | list[torch.Tensor]: + """Run the UniPC multi-step sampling loop. + + ``noise`` and ``seed`` must both be single values or both be lists + (of the same length). When lists are provided, each element + corresponds to one independent sample with its own RNG generator + and scheduler; the return value is then a list of denoised tensors. + When single values are provided, a single tensor is returned. + + Args: + velocity_fn: Called positionally as ``velocity_fn(latent, timestep) -> velocity``, with ``timestep`` reshaped to ``(1, 1)``. + noise: Initial noise. Either a single ``torch.Tensor`` of shape + ``(C, T, H, W)`` or a ``list[torch.Tensor]`` where each + element has shape ``(C, T, H, W)``. + num_steps: Number of denoising steps. + shift: Flow-matching shift factor. Defaults to ``self.cfg.shift``. + seed: RNG seed. Either a single ``int`` or a ``list[int]`` with + the same length as ``noise``. + + Returns: + Denoised sample(s). 
A single ``torch.Tensor`` when ``noise`` is a + tensor, or a ``list[torch.Tensor]`` when ``noise`` is a list. + """ + if shift is None: + shift = self.cfg.shift + assert isinstance(shift, float), "Shift must be a float" + + def _init_sample_scheduler(seed: int | None) -> tuple[torch.Generator, FlowUniPCMultistepScheduler]: + seed_g = torch.Generator(device=self.tensor_kwargs["device"]) + if seed is not None: + seed_g.manual_seed(seed) + sample_scheduler = FlowUniPCMultistepScheduler( + num_train_timesteps=self.cfg.num_train_timesteps, + shift=self.cfg.shift, + use_dynamic_shifting=self.cfg.use_dynamic_shifting, + ) + sample_scheduler.set_timesteps(num_steps, device=self.tensor_kwargs["device"], shift=shift) + return seed_g, sample_scheduler + + seed_g, sample_scheduler = run_multiseed(_init_sample_scheduler, seed=seed) + + timesteps = sample_scheduler[0].timesteps if isinstance(sample_scheduler, list) else sample_scheduler.timesteps + latent = noise + + for timestep in progress_bar(timesteps, desc="Sampling", total=len(timesteps)): + velocity_pred = velocity_fn(latent, timestep.reshape(1, 1)) + + def _scheduler_step( + seed_g: torch.Generator, + sample_scheduler: FlowUniPCMultistepScheduler, + velocity_pred: torch.Tensor, + latent: torch.Tensor, + ) -> torch.Tensor: + # multistep_uni_p_bh_update and multistep_uni_c_bh_update both use einsum patterns + # like "k,bkc...->bc...", which expect the tensor to have at least shape + # [B, C, ...] — where b is the batch dimension. Therefore, we need to unsqueeze + # the latent tensor to [B, C, ...] before passing it to the scheduler. 
+ return sample_scheduler.step( + model_output=velocity_pred, + timestep=timestep, + sample=latent.unsqueeze(0), + return_dict=False, + generator=seed_g, + )[0].squeeze(0) + + latent = run_multiseed( + _scheduler_step, + seed_g=seed_g, + sample_scheduler=sample_scheduler, + velocity_pred=velocity_pred, + latent=latent, + ) + + return latent diff --git a/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/utils.py b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/utils.py new file mode 100644 index 00000000..9e4b74c0 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/diffusion/samplers/utils.py @@ -0,0 +1,82 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, Callable + + +def run_multiseed(fn: Callable, **kwargs: list[Any] | Any) -> Any: + """Run a callable once per seed, indexing all list kwargs in lockstep. + + All keyword arguments must be **either** all lists or all non-lists. + Mixing is not allowed. + + - **All non-list**: ``fn`` is called once with the kwargs as-is, and its + return value is passed through directly. + - **All list**: every list must have the same length *N*. ``fn`` is called + *N* times — call *i* receives ``{k: v[i] for k, v in kwargs}``. + Results are collected into a list. If ``fn`` returns a tuple, the + results are transposed into a tuple of lists. 
+ + Args: + fn: Callable to invoke per seed. + **kwargs: Keyword arguments for ``fn``. Must be **all lists** (one + element per seed, all the same length) or **all non-lists** (single + call). + + Returns: + - All non-list kwargs: the raw return value of ``fn``. + - All list kwargs, ``fn`` returns a tuple: a ``tuple`` of lists, + transposed across calls. + - All list kwargs, ``fn`` returns non-tuple: a ``list`` of return + values. + + Raises: + AssertionError: If kwargs mix lists and non-lists, or if list kwargs + have differing lengths. + + Examples: + Single call (no lists):: + + run_multiseed(lambda x, y: x + y, x=1, y=2) # returns 3 + + Multiple calls with all-list kwargs:: + + run_multiseed(lambda x, y: x * y, x=[1, 2, 3], y=[10, 20, 30]) + # returns [10, 40, 90] + + Tuple return transposition:: + + run_multiseed(lambda x: (x, -x), x=[1, 2]) + # returns ([1, 2], [-1, -2]) + """ + all_list = all(isinstance(v, list) for v in kwargs.values()) + all_non_list = all(not isinstance(v, list) for v in kwargs.values()) + assert all_list or all_non_list, "All kwargs must be lists or all must be non-lists, cannot mix" + + if all_non_list: + return fn(**kwargs) + + lengths = {len(v) for v in kwargs.values()} + assert len(lengths) == 1, f"All list arguments must have the same length, got {lengths}" + num_calls = lengths.pop() + + results = [] + for i in range(num_calls): + kwargs_i = {k: v[i] for k, v in kwargs.items()} + results.append(fn(**kwargs_i)) + + if results and isinstance(results[0], tuple): + return tuple(list(items) for items in zip(*results)) + return results diff --git a/cosmos-inference/cosmos3/_src/vfm/evaluation/__init__.py b/cosmos-inference/cosmos3/_src/vfm/evaluation/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/evaluation/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/evaluation/action/__init__.py b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/evaluation/action/libero/__init__.py b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/libero/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/libero/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. 
All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/evaluation/action/libero/closed_loop_eval.py b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/libero/closed_loop_eval.py new file mode 100644 index 00000000..25699bd6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/libero/closed_loop_eval.py @@ -0,0 +1,1015 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Closed-loop evaluation for LIBERO using the Action HTTP inference server. + +# Single-view example (agentview camera): +PYTHONPATH=. 
python cosmos3/_src/vfm/evaluation/action/libero/closed_loop_eval.py \ + --server_url http://localhost:8000 \ + --task_suite libero_10 \ + --num_trials_per_task 10 \ + --action_horizon 16 \ + --camera agentview \ + --save_gifs --gif_fps 20 \ + --action_space frame_wise_relative \ + --rotation_space 6d \ + --action_dim 10 \ + --output_dir results/libero_closed_loop_10_single_view + +# Multi-view example (agentview + wrist cameras): +PYTHONPATH=. python cosmos3/_src/vfm/evaluation/action/libero/closed_loop_eval.py \ + --server_url http://localhost:8000 \ + --task_suite libero_goal \ + --num_trials_per_task 2 \ + --action_horizon 16 \ + --camera agentview,wrist \ + --save_gifs --gif_fps 20 \ + --action_space frame_wise_relative \ + --rotation_space 6d \ + --action_dim 10 \ + --output_dir results/libero_closed_loop_goal_multiview +""" + +from __future__ import annotations + +import argparse +import base64 +import io +import json +import os +import random +import sys +import time +from dataclasses import dataclass +from pathlib import Path +from typing import Any + +import numpy as np +import requests +from PIL import Image +from scipy.spatial.transform import Rotation as R + +from cosmos3._src.vfm.datasets.action.libero_pose_utils import ( + libero_rotation_format, + libero_rotation_space_from_action_dim, +) +from cosmos3._src.vfm.datasets.action.pose_utils import convert_rotation +from cosmos3._src.vfm.datasets.action.viewpoint_utils import DEFAULT_VIEWPOINT_TEMPLATES + +benchmark: Any +get_libero_path: Any +OffScreenRenderEnv: Any + + +TASK_MAX_STEPS: dict[str, int] = { + "libero_spatial": 220, + "libero_object": 280, + "libero_goal": 300, + "libero_10": 520, + "libero_90": 400, +} + + +_CAMERA_PROMPT_NAMES: dict[str, str] = { + "agentview": "third-person view", + "wrist": "wrist-mounted camera", +} + + +def _append_prompt_sentence(prompt: str, sentence: str) -> str: + """Append one metadata sentence using the same separator convention as training augmentors.""" + if 
sentence in prompt: + return prompt + prompt = prompt.rstrip() + if not prompt: + return sentence.rstrip() + separator = " " if prompt.rstrip().endswith(".") else ". " + return prompt + separator + sentence.rstrip() + + +def _concat_view_layout_description(cameras: list[str]) -> str: + """Describe the horizontal camera layout sent by ``ActionEnvironmentClient``.""" + camera_names = [_CAMERA_PROMPT_NAMES[camera] for camera in cameras] + if len(camera_names) == 2: + return f"The left half shows the {camera_names[0]}; the right half shows the {camera_names[1]}." + layout = ", ".join(camera_names) + return f"The views are concatenated horizontally from left to right as: {layout}." + + +def _augment_task_prompt_with_viewpoint(task_description: str, cameras: list[str]) -> str: + """Mirror DROID-style concat-view caption augmentation for closed-loop LIBERO eval.""" + if len(cameras) <= 1: + return task_description + prompt = _append_prompt_sentence(task_description, DEFAULT_VIEWPOINT_TEMPLATES["concat_view"]) + return _append_prompt_sentence(prompt, _concat_view_layout_description(cameras)) + + +def _rotation_repr_to_mat(rotation: np.ndarray, rotation_space: str) -> np.ndarray: + """Convert a single LIBERO rotation block to a 3x3 rotation matrix.""" + matrix = convert_rotation( + rotation, + libero_rotation_format(rotation_space), + "matrix", + normalize_matrix=rotation_space != "3d", + ) + if not isinstance(matrix, np.ndarray): + raise TypeError(f"Expected NumPy rotation matrix, got {type(matrix)!r}") + return matrix + + +@dataclass +class EpisodeResult: + success: bool + steps: int + error: str | None + actions: list[list[float]] + + +class ActionEnvironmentClient: + """Client for interacting with the Action model server.""" + + server_url: str + domain_name: str + prompt: str + image_size: int + timeout: float + + def __init__( + self, + server_url: str, + domain_name: str, + prompt: str, + image_size: int, + timeout: float, + ) -> None: + self.server_url = 
server_url.rstrip("/") + self.domain_name = domain_name + self.prompt = prompt + self.image_size = image_size + self.timeout = timeout + + def check_health(self) -> bool: + """Check if the model server is healthy.""" + try: + resp = requests.get(f"{self.server_url}/", timeout=5.0) + return resp.status_code == 200 + except requests.RequestException: + return False + + def get_info(self) -> dict[str, str]: + """Get model server info.""" + resp = requests.get(f"{self.server_url}/info", timeout=5.0) + resp.raise_for_status() + return resp.json() + + def notify_next_episode(self) -> None: + """Notify server to advance to next episode (used with dataset action server).""" + try: + requests.post( + f"{self.server_url}/next_episode", + json={"prompt": self.prompt}, + timeout=5.0, + ) + except requests.RequestException: + pass + + def encode_image(self, image: np.ndarray) -> str: + """Encode a numpy image (H, W, 3) uint8 to base64 PNG, resizing to image_size.""" + if image.dtype != np.uint8: + if image.max() <= 1.0: + image = (image * 255.0).round().astype(np.uint8) + else: + image = image.astype(np.uint8) + pil_img = Image.fromarray(image) + if pil_img.size != (self.image_size, self.image_size): + pil_img = pil_img.resize( + (self.image_size, self.image_size), + resample=Image.Resampling.BILINEAR, + ) + buf = io.BytesIO() + pil_img.save(buf, format="PNG") + return base64.b64encode(buf.getvalue()).decode("ascii") + + def encode_image_raw(self, image: np.ndarray) -> str: + """Encode a numpy image (H, W, 3) uint8 to base64 PNG without resizing.""" + if image.dtype != np.uint8: + if image.max() <= 1.0: + image = (image * 255.0).round().astype(np.uint8) + else: + image = image.astype(np.uint8) + pil_img = Image.fromarray(image) + buf = io.BytesIO() + pil_img.save(buf, format="PNG") + return base64.b64encode(buf.getvalue()).decode("ascii") + + def resize_image(self, image: np.ndarray) -> np.ndarray: + """Resize image to model input size.""" + if image.dtype != np.uint8: + if 
image.max() <= 1.0: + image = (image * 255.0).round().astype(np.uint8) + else: + image = image.astype(np.uint8) + pil_img = Image.fromarray(image) + if pil_img.size != (self.image_size, self.image_size): + pil_img = pil_img.resize( + (self.image_size, self.image_size), + resample=Image.Resampling.BILINEAR, + ) + return np.array(pil_img) + + def concatenate_images(self, images: list[np.ndarray]) -> np.ndarray: + """Resize each image and concatenate horizontally (side-by-side). + + Args: + images: List of images with shape (H, W, 3). + + Returns: + Concatenated image with shape (image_size, image_size*num_views, 3). + """ + resized = [self.resize_image(img) for img in images] + return np.concatenate(resized, axis=1) + + def predict(self, observation: np.ndarray | list[np.ndarray]) -> dict[str, Any]: + """Send observation(s) to model server and get predicted actions. + + Args: + observation: Single image as np.ndarray or list of images for multi-view. + For multi-view, images are resized and concatenated horizontally before sending. 
+ """ + if isinstance(observation, list): + # Multi-view: resize each, concatenate horizontally, and send as single image + concatenated = self.concatenate_images(observation) + encoded = self.encode_image_raw(concatenated) + else: + # Single view: send single image + encoded = self.encode_image(observation) + + payload = { + "image": encoded, + "prompt": self.prompt, + "domain_name": self.domain_name, + "image_size": self.image_size, + } + + resp = requests.post( + f"{self.server_url}/predict", + json=payload, + headers={"Content-Type": "application/json"}, + timeout=self.timeout, + ) + resp.raise_for_status() + + result = resp.json() + if "error" in result and result["error"]: + raise RuntimeError(f"Model server error: {result['error']}") + return result + + +def _find_accessible_dri_nodes() -> list[Path]: + dri_path = Path("/dev/dri") + if not dri_path.exists(): + return [] + nodes = list(dri_path.glob("renderD*")) + list(dri_path.glob("card*")) + return [node for node in nodes if os.access(node, os.R_OK | os.W_OK)] + + +def _resolve_mujoco_backend(requested_backend: str) -> tuple[str, str]: + requested_backend = requested_backend.lower() + if requested_backend != "auto": + return requested_backend, "requested" + + env_backend = os.environ.get("MUJOCO_GL") + if env_backend: + return env_backend.lower(), "env" + + if _find_accessible_dri_nodes(): + return "egl", "auto-gpu" + return "osmesa", "auto-cpu" + + +def _configure_mujoco_env(requested_backend: str) -> str: + backend, source = _resolve_mujoco_backend(requested_backend) + if backend not in {"egl", "osmesa", "glfw"}: + raise ValueError(f"Unsupported MuJoCo GL backend: {backend!r}. 
Use auto, egl, osmesa, or glfw.") + + os.environ["MUJOCO_GL"] = backend + if backend == "egl": + os.environ["PYOPENGL_PLATFORM"] = "egl" + elif backend == "osmesa": + os.environ["PYOPENGL_PLATFORM"] = "osmesa" + return f"{backend} ({source})" + + +def _import_libero() -> None: + global benchmark, get_libero_path, OffScreenRenderEnv + try: + from libero.libero import benchmark as libero_benchmark + from libero.libero import get_libero_path as libero_get_libero_path + from libero.libero.envs import OffScreenRenderEnv as libero_offscreen_render_env + except ImportError as exc: # pragma: no cover - environment-specific dependency + raise RuntimeError( + "Failed to import LIBERO. Make sure the LIBERO environment is activated. " + f"python={sys.executable!r}, import_error={exc!r}" + ) from exc + + benchmark = libero_benchmark + get_libero_path = libero_get_libero_path + OffScreenRenderEnv = libero_offscreen_render_env + + +def _wait_for_server(client: ActionEnvironmentClient, timeout_s: float) -> None: + start = time.perf_counter() + while time.perf_counter() - start < timeout_s: + if client.check_health(): + return + time.sleep(1.0) + raise RuntimeError(f"Timed out waiting for server at {client.server_url}") + + +def _get_libero_env( + task: Any, + *, + resolution: int, + seed: int, + render_gpu_device_id: int, +) -> tuple[Any, str]: + task_description = str(task.language) + task_bddl_file = os.path.join(get_libero_path("bddl_files"), task.problem_folder, task.bddl_file) + env_args = { + "bddl_file_name": task_bddl_file, + "camera_heights": resolution, + "camera_widths": resolution, + "render_gpu_device_id": render_gpu_device_id, + } + env = OffScreenRenderEnv(**env_args) + env.seed(seed) + return env, task_description + + +def _get_libero_dummy_action() -> list[float]: + return [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0] + + +def _get_libero_image( + obs: dict[str, Any], + camera: str, + *, + flip_images: bool, + rotate_180: bool, +) -> np.ndarray: + if camera == "agentview": 
+ image = obs["agentview_image"] + elif camera == "wrist": + image = obs["robot0_eye_in_hand_image"] + else: + raise ValueError(f"Unsupported camera={camera!r}. Use 'agentview' or 'wrist'.") + + if rotate_180: + image = image[::-1, ::-1] + if flip_images: + image = np.flipud(image) + return image + + +def _get_libero_images( + obs: dict[str, Any], + cameras: list[str], + *, + flip_images: bool, + rotate_180: bool, +) -> list[np.ndarray]: + """Get images from multiple cameras.""" + return [_get_libero_image(obs, camera, flip_images=flip_images, rotate_180=rotate_180) for camera in cameras] + + +def _ensure_uint8_image(image: np.ndarray) -> np.ndarray: + if image.dtype != np.uint8: + if image.max() <= 1.0: + image = (image * 255.0).round().astype(np.uint8) + else: + image = image.astype(np.uint8) + return image + + +def _save_gif(frames: list[Image.Image], output_path: Path, fps: int) -> None: + if not frames: + return + duration_ms = int(1000 / fps) if fps > 0 else 100 + output_path.parent.mkdir(parents=True, exist_ok=True) + first, *rest = frames + first.save( + output_path, + save_all=True, + append_images=rest, + duration=duration_ms, + loop=0, + ) + + +def _decode_b64_frames(b64_frames: list[str]) -> list[Image.Image]: + """Decode a list of base64-encoded PNG strings into PIL Images.""" + images: list[Image.Image] = [] + for b64 in b64_frames: + raw = base64.b64decode(b64) + images.append(Image.open(io.BytesIO(raw)).convert("RGB")) + return images + + +def _save_comparison_gif( + comparison_windows: list[tuple[list[Image.Image], list[Image.Image]]], + output_path: Path, + fps: int, + target_height: int = 256, + separator_width: int = 4, +) -> None: + """Create and save a side-by-side comparison GIF (Action prediction | env rollout). + + Each window is a (action_frames, env_frames) pair from one prediction call. + Frames are paired index-by-index; the conditioning frame (index 0) of + subsequent windows is skipped to avoid duplicating the boundary frame. 
+ """ + from PIL import ImageDraw + + combined_frames: list[Image.Image] = [] + banner_h = 16 + + for window_idx, (action_frames, env_frames) in enumerate(comparison_windows): + n = min(len(action_frames), len(env_frames)) + start = 1 if window_idx > 0 else 0 + for i in range(start, n): + action_img = action_frames[i] + env_img = env_frames[i] + + action_w = int(action_img.width * target_height / action_img.height) + env_w = int(env_img.width * target_height / env_img.height) + action_resized = action_img.resize((action_w, target_height), Image.Resampling.BILINEAR) + env_resized = env_img.resize((env_w, target_height), Image.Resampling.BILINEAR) + + total_w = action_w + separator_width + env_w + total_h = target_height + banner_h + combined = Image.new("RGB", (total_w, total_h), color=0) + + draw = ImageDraw.Draw(combined) + draw.rectangle([(0, 0), (action_w, banner_h)], fill=(30, 30, 60)) + draw.rectangle([(action_w + separator_width, 0), (total_w, banner_h)], fill=(30, 60, 30)) + draw.text((4, 1), "Action Prediction", fill=(100, 180, 255)) + draw.text((action_w + separator_width + 4, 1), "Environment", fill=(100, 255, 100)) + + combined.paste(action_resized, (0, banner_h)) + combined.paste(env_resized, (action_w + separator_width, banner_h)) + combined_frames.append(combined) + + if combined_frames: + _save_gif(combined_frames, output_path, fps) + + +def _select_action_chunk(actions: list[list[float]], action_horizon: int) -> list[list[float]]: + if action_horizon <= 0 or action_horizon >= len(actions): + return actions + return actions[:action_horizon] + + +def _format_action(action: list[float], action_dim: int) -> list[float]: + if len(action) < action_dim: + raise ValueError(f"Action dimension {len(action)} smaller than expected {action_dim}") + return action[:action_dim] + + +def _remap_gripper_to_neg1_pos1(action: list[float]) -> list[float]: + """Remap gripper value from [0, 1] (training data range) to [-1, 1] (LIBERO env range). 
+ + The training dataset stores gripper in [0, 1], but the LIBERO simulation + environment expects gripper commands in [-1, 1]. This rescales, clamps, and + negates the value: gripper_env = -clamp(gripper_model * 2 - 1, -1, 1). + """ + action = list(action) # avoid mutating the caller's list + action[-1] = max(-1.0, min(1.0, action[-1] * 2.0 - 1.0)) * -1 + return action + + +def _infer_rotation_space(action_dim: int, rotation_space: str) -> str: + if rotation_space != "auto": + return rotation_space + return libero_rotation_space_from_action_dim(action_dim) + + +def _obs_to_pose(obs: dict[str, Any]) -> tuple[np.ndarray, np.ndarray]: + position = np.asarray(obs["robot0_eef_pos"], dtype=np.float32) + quat = np.asarray(obs["robot0_eef_quat"], dtype=np.float32) + rotation = R.from_quat(quat).as_matrix() + return position, rotation + + +def _anchored_action_to_delta( + anchored_action: np.ndarray, + base_pose: tuple[np.ndarray, np.ndarray], + current_pose: tuple[np.ndarray, np.ndarray], + rotation_space: str, +) -> np.ndarray: + anchored_translation = anchored_action[:3] + rotation_dim = anchored_action.shape[0] - 4 + anchored_rotation = anchored_action[3 : 3 + rotation_dim] + gripper = anchored_action[3 + rotation_dim : 4 + rotation_dim] + + base_pos, base_rot = base_pose + current_pos, current_rot = current_pose + + if rotation_space == "3d": + anchored_rot = R.from_rotvec(anchored_rotation).as_matrix() + elif rotation_space == "6d": + anchored_rot = _rotation_repr_to_mat(anchored_rotation, rotation_space) + elif rotation_space == "9d": + anchored_rot = anchored_rotation.reshape(3, 3) + else: + raise ValueError(f"Unsupported rotation_space={rotation_space!r}.
Use 3d/6d/9d.") + target_rot = base_rot @ anchored_rot + target_pos = base_pos + base_rot @ anchored_translation + delta_pos = target_pos - current_pos + delta_rot = target_rot @ current_rot.T + delta_rotvec = R.from_matrix(delta_rot).as_rotvec() + + return np.concatenate([delta_pos, delta_rotvec, gripper], axis=0) + + +def _framewise_action_to_delta( + framewise_action: np.ndarray, + rotation_space: str, +) -> np.ndarray: + """Convert a frame-wise policy action to LIBERO's 7D simulator command. + + Frame-wise actions are already per-step deltas in the LIBERO controller's + convention (see ``LiberoDataset`` with ``action_space='frame_wise_relative'``), + so the only conversion required is decoding the chosen rotation + representation back to a rotation vector. No anchor/current pose is needed. + """ + if rotation_space == "3d": + return framewise_action + + translation = framewise_action[:3] + rotation_dim = framewise_action.shape[0] - 4 + rotation_repr = framewise_action[3 : 3 + rotation_dim] + gripper = framewise_action[3 + rotation_dim : 4 + rotation_dim] + rotation_delta = _rotation_repr_to_mat(rotation_repr, rotation_space) + + delta_pos = translation + delta_rotvec = R.from_matrix(rotation_delta).as_rotvec() + return np.concatenate([delta_pos, delta_rotvec, gripper], axis=0) + + +def _run_episode( + env: Any, + client: ActionEnvironmentClient, + *, + cameras: list[str], + flip_images: bool, + rotate_180: bool, + action_horizon: int, + action_dim: int, + action_space: str, + rotation_space: str, + max_steps: int, + warmup_steps: int, + initial_state: np.ndarray | None, + gif_path: Path | None, + gif_fps: int, + comparison_path: Path | None = None, +) -> EpisodeResult: + env.reset() + if initial_state is not None: + obs = env.set_init_state(initial_state) + else: + obs = env.get_observation() + + action_queue: list[list[float]] = [] + base_pose: tuple[np.ndarray, np.ndarray] | None = None + step = 0 + success = False + gif_frames: list[Image.Image] = [] + 
action_log: list[list[float]] = [] + is_multi_view = len(cameras) > 1 + resolved_rotation_space = _infer_rotation_space(action_dim, rotation_space) + + comparison_windows: list[tuple[list[Image.Image], list[Image.Image]]] = [] + + def record_frame(current_obs: dict[str, Any]) -> None: + if gif_path is None: + return + image = _get_libero_image( + current_obs, + cameras[0], + flip_images=flip_images, + rotate_180=rotate_180, + ) + image = _ensure_uint8_image(image) + gif_frames.append(Image.fromarray(image).convert("RGB")) + + def capture_comparison_frame(current_obs: dict[str, Any]) -> Image.Image: + """Capture an env frame matching Action's input view (multi-view concatenated if applicable).""" + if is_multi_view: + imgs = _get_libero_images(current_obs, cameras, flip_images=flip_images, rotate_180=rotate_180) + concat = client.concatenate_images(imgs) + return Image.fromarray(_ensure_uint8_image(concat)).convert("RGB") + img = _get_libero_image(current_obs, cameras[0], flip_images=flip_images, rotate_180=rotate_180) + return Image.fromarray(_ensure_uint8_image(img)).convert("RGB") + + record_frame(obs) + + while step < max_steps: + if step < warmup_steps: + dummy = _get_libero_dummy_action() + obs, _, _, _ = env.step(dummy) + action_log.append(dummy) + step += 1 + record_frame(obs) + continue + + if not action_queue: + if is_multi_view: + observation_imgs = _get_libero_images( + obs, + cameras, + flip_images=flip_images, + rotate_180=rotate_180, + ) + result = client.predict(observation_imgs) + else: + observation_img = _get_libero_image( + obs, + cameras[0], + flip_images=flip_images, + rotate_180=rotate_180, + ) + result = client.predict(observation_img) + actions = result.get("action", []) + if not actions: + return EpisodeResult(False, step, "Empty action chunk from server", action_log) + action_queue = _select_action_chunk(actions, action_horizon) + + if comparison_path is not None: + action_video_b64 = result.get("video", []) + if action_video_b64: + 
action_frames = _decode_b64_frames(action_video_b64) + env_comparison_frames = [capture_comparison_frame(obs)] + comparison_windows.append((action_frames, env_comparison_frames)) + + if action_space == "relative": + base_pose = _obs_to_pose(obs) + + raw_action = _format_action(action_queue.pop(0), action_dim) + if action_space == "relative": + if base_pose is None: + raise RuntimeError("Missing base pose for relative action conversion") + current_pose = _obs_to_pose(obs) + action = _anchored_action_to_delta( + np.asarray(raw_action, dtype=np.float32), + base_pose, + current_pose, + resolved_rotation_space, + ) + action_list = action.tolist() + else: + action = _framewise_action_to_delta( + np.asarray(raw_action, dtype=np.float32), + resolved_rotation_space, + ) + action_list = action.tolist() + + # Remap gripper from [0, 1] (model/training range) to [-1, 1] (LIBERO env range) + action_list = _remap_gripper_to_neg1_pos1(action_list) + + action_log.append(action_list) + obs, _, done, info = env.step(action_list) + step += 1 + record_frame(obs) + + if comparison_path is not None and comparison_windows: + comparison_windows[-1][1].append(capture_comparison_frame(obs)) + + if isinstance(info, dict) and info.get("success"): + success = True + break + if done: + success = True if not isinstance(info, dict) else bool(info.get("success", True)) + break + + if gif_path is not None: + _save_gif(gif_frames, gif_path, gif_fps) + if comparison_path is not None and comparison_windows: + _save_comparison_gif(comparison_windows, comparison_path, gif_fps) + return EpisodeResult(success, step, None, action_log) + + +def _load_initial_states( + task_suite: Any, + task_id: int, + *, + task_description: str, + initial_states_path: str, + episode_idx: int, +) -> np.ndarray | None: + default_initial_states = task_suite.get_task_init_states(task_id) + + if initial_states_path == "DEFAULT": + return np.array(default_initial_states[episode_idx]) + + with open(initial_states_path, "r", 
encoding="utf-8") as f: + all_initial_states = json.load(f) + + task_key = task_description.replace(" ", "_") + episode_key = f"demo_{episode_idx}" + if not all_initial_states[task_key][episode_key]["success"]: + return None + return np.array(all_initial_states[task_key][episode_key]["initial_state"]) + + +def _parse_args() -> argparse.Namespace: + parser = argparse.ArgumentParser(description="LIBERO closed-loop evaluation via Action HTTP server") + parser.add_argument( + "--server_url", type=str, required=True, help="Base URL for Action server (e.g., http://host:8000)" + ) + parser.add_argument("--task_suite", type=str, default="libero_spatial", choices=sorted(TASK_MAX_STEPS.keys())) + parser.add_argument("--num_trials_per_task", type=int, default=10) + parser.add_argument("--task_ids", type=str, default="", help="Comma-separated task IDs to evaluate (default: all)") + parser.add_argument("--image_size", type=int, default=256, help="Model input image size") + parser.add_argument("--env_image_size", type=int, default=256, help="Environment render resolution") + parser.add_argument("--action_horizon", type=int, default=0, help="Actions to execute per request (0=full chunk)") + parser.add_argument("--action_dim", type=int, default=10, help="Action dimension for LIBERO") + parser.add_argument( + "--action_space", + type=str, + default="frame_wise_relative", + choices=["relative", "frame_wise_relative"], + help="Action space expected from the model (relative=anchored, frame_wise_relative=framewise deltas).", + ) + parser.add_argument( + "--rotation_space", + type=str, + default="auto", + choices=["auto", "3d", "6d", "9d"], + help="Rotation representation of the predicted actions (auto infers from action_dim).", + ) + parser.add_argument("--domain_name", type=str, default="libero") + parser.add_argument( + "--camera", + type=str, + default="agentview", + help="Camera(s) to use. Single camera: 'agentview' or 'wrist'.
Multiple cameras: comma-separated, e.g., 'agentview,wrist'.", + ) + parser.add_argument("--flip_images", action="store_true", help="Flip images vertically before encoding") + parser.add_argument( + "--rotate_180", + action=argparse.BooleanOptionalAction, + default=True, + help="Rotate images by 180 degrees before encoding (default: True; pass --no-rotate_180 to disable)", + ) + parser.add_argument("--warmup_steps", type=int, default=10, help="Stabilization steps with dummy actions") + parser.add_argument("--max_steps", type=int, default=0, help="Override max steps per episode (0=default)") + parser.add_argument("--timeout", type=float, default=30.0, help="HTTP request timeout in seconds") + parser.add_argument("--wait_timeout", type=float, default=60.0, help="Seconds to wait for server health") + parser.add_argument("--seed", type=int, default=0) + parser.add_argument("--save_gifs", action="store_true", help="Save per-episode GIFs of rendered frames") + parser.add_argument( + "--save_comparison", + action="store_true", + help="Save side-by-side comparison GIFs (Action prediction vs environment rollout)", + ) + parser.add_argument("--gif_fps", type=int, default=20, help="Frames per second for saved GIFs") + parser.add_argument( + "--mujoco_gl", + type=str, + default="auto", + choices=["auto", "egl", "osmesa", "glfw"], + help="MuJoCo GL backend (auto picks egl if /dev/dri is accessible, else osmesa).", + ) + parser.add_argument( + "--render_gpu_device_id", + type=int, + default=-1, + help="GPU device index for EGL rendering (-1 uses default device).", + ) + parser.add_argument( + "--initial_states_path", + type=str, + default="DEFAULT", + help='Path to initial states JSON.
Use "DEFAULT" for benchmark defaults.', + ) + parser.add_argument("--output_dir", type=str, default="", help="Directory to save evaluation summary JSON") + return parser.parse_args() + + +def main() -> None: + args = _parse_args() + random.seed(args.seed) + np.random.seed(args.seed) + + if args.save_gifs and not args.output_dir: + raise ValueError("--save_gifs requires --output_dir to be set") + if args.save_comparison and not args.output_dir: + raise ValueError("--save_comparison requires --output_dir to be set") + + # Parse cameras from comma-separated string + cameras = [c.strip() for c in args.camera.split(",") if c.strip()] + if not cameras: + raise ValueError("At least one camera must be specified") + for cam in cameras: + if cam not in ("agentview", "wrist"): + raise ValueError(f"Unsupported camera={cam!r}. Use 'agentview' or 'wrist'.") + + mujoco_backend = _configure_mujoco_env(args.mujoco_gl) + _import_libero() + + client = ActionEnvironmentClient( + server_url=args.server_url, + domain_name=args.domain_name, + prompt="", + image_size=args.image_size, + timeout=args.timeout, + ) + print(f"MuJoCo GL backend: {mujoco_backend}", flush=True) + print("Waiting for model server...", flush=True) + _wait_for_server(client, args.wait_timeout) + print(f"Connected to model server: {client.get_info()}", flush=True) + + benchmark_dict = benchmark.get_benchmark_dict() + task_suite = benchmark_dict[args.task_suite]() + num_tasks = int(task_suite.n_tasks) + + if args.task_ids: + selected_task_ids = [int(t) for t in args.task_ids.split(",") if t.strip()] + else: + selected_task_ids = list(range(num_tasks)) + + max_steps = args.max_steps if args.max_steps > 0 else TASK_MAX_STEPS[args.task_suite] + + total_episodes = 0 + total_successes = 0 + task_results: list[dict[str, Any]] = [] + + output_dir = Path(args.output_dir) if args.output_dir else None + gif_root = output_dir / "gifs" if output_dir and args.save_gifs else None + comparison_root = output_dir / "comparisons" if 
output_dir and args.save_comparison else None + + for task_id in selected_task_ids: + task = task_suite.get_task(task_id) + env, task_description = _get_libero_env( + task, + resolution=args.env_image_size, + seed=args.seed, + render_gpu_device_id=args.render_gpu_device_id, + ) + + task_episodes = 0 + task_successes = 0 + episode_results: list[dict[str, Any]] = [] + + for episode_idx in range(args.num_trials_per_task): + episode_t0 = time.perf_counter() + client.prompt = _augment_task_prompt_with_viewpoint(task_description, cameras) + initial_state = _load_initial_states( + task_suite, + task_id, + task_description=task_description, + initial_states_path=args.initial_states_path, + episode_idx=episode_idx, + ) + if initial_state is None: + episode_elapsed_s = time.perf_counter() - episode_t0 + episode_results.append( + { + "episode": episode_idx, + "success": False, + "steps": 0, + "error": "Skipped due to failed expert demo", + "elapsed_s": round(episode_elapsed_s, 3), + } + ) + print( + f"Task {task_id} | Episode {episode_idx + 1}/{args.num_trials_per_task} | " + "success=False steps=0 " + f"elapsed_s={episode_elapsed_s:.1f} " + "error='Skipped due to failed expert demo'", + flush=True, + ) + continue + + gif_path = ( + gif_root / f"task_{task_id:03d}" / f"episode_{episode_idx:03d}.gif" if gif_root is not None else None + ) + comparison_path = ( + comparison_root / f"task_{task_id:03d}" / f"episode_{episode_idx:03d}.gif" + if comparison_root is not None + else None + ) + try: + result = _run_episode( + env, + client, + cameras=cameras, + flip_images=args.flip_images, + rotate_180=args.rotate_180, + action_horizon=args.action_horizon, + action_dim=args.action_dim, + action_space=args.action_space, + rotation_space=args.rotation_space, + max_steps=max_steps, + warmup_steps=args.warmup_steps, + initial_state=initial_state, + gif_path=gif_path, + gif_fps=args.gif_fps, + comparison_path=comparison_path, + ) + except Exception as exc: + result = EpisodeResult(False, 0, 
str(exc), []) + episode_elapsed_s = time.perf_counter() - episode_t0 + + task_episodes += 1 + total_episodes += 1 + if result.success: + task_successes += 1 + total_successes += 1 + + episode_results.append( + { + "episode": episode_idx, + "success": result.success, + "steps": result.steps, + "error": result.error, + "elapsed_s": round(episode_elapsed_s, 3), + } + ) + + # Save per-episode action log as JSON + if output_dir is not None and result.actions: + action_log_dir = output_dir / "actions" / f"task_{task_id:03d}" + action_log_dir.mkdir(parents=True, exist_ok=True) + action_log_path = action_log_dir / f"episode_{episode_idx:03d}.json" + action_log_path.write_text( + json.dumps(result.actions, indent=2), + encoding="utf-8", + ) + + client.notify_next_episode() + + print( + f"Task {task_id} | Episode {episode_idx + 1}/{args.num_trials_per_task} | " + f"success={result.success} steps={result.steps} elapsed_s={episode_elapsed_s:.1f}", + flush=True, + ) + + task_success_rate = float(task_successes) / float(task_episodes) if task_episodes > 0 else 0.0 + task_results.append( + { + "task_id": task_id, + "task_description": task_description, + "episodes": task_episodes, + "successes": task_successes, + "success_rate": task_success_rate, + "episode_results": episode_results, + } + ) + print( + f"Task {task_id} summary: {task_successes}/{task_episodes} ({task_success_rate * 100:.1f}%)", + flush=True, + ) + + overall_success_rate = float(total_successes) / float(total_episodes) if total_episodes > 0 else 0.0 + summary = { + "task_suite": args.task_suite, + "total_episodes": total_episodes, + "total_successes": total_successes, + "overall_success_rate": overall_success_rate, + "num_trials_per_task": args.num_trials_per_task, + "selected_task_ids": selected_task_ids, + "action_space": args.action_space, + "rotation_space": _infer_rotation_space(args.action_dim, args.rotation_space), + "action_dim": args.action_dim, + "task_results": task_results, + } + + print( + f"Overall 
success rate: {total_successes}/{total_episodes} ({overall_success_rate * 100:.1f}%)", + flush=True, + ) + + if output_dir is not None: + output_dir.mkdir(parents=True, exist_ok=True) + summary_path = output_dir / "summary.json" + summary_path.write_text(json.dumps(summary, indent=2), encoding="utf-8") + print(f"Saved summary to {summary_path}", flush=True) + + +if __name__ == "__main__": + main() diff --git a/cosmos-inference/cosmos3/_src/vfm/evaluation/action/libero/dataset_reply_action_server.py b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/libero/dataset_reply_action_server.py new file mode 100644 index 00000000..680da660 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/libero/dataset_reply_action_server.py @@ -0,0 +1,665 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +HTTP server that serves ground-truth actions from LIBERO LeRobot datasets. + +Same HTTP interface as `cosmos3.scripts.action_policy_server` (the model-backed +server), enabling drop-in replacement for closed-loop evaluation to verify the +action pipeline with known-good GT actions. 
+ +Endpoints: +- POST /predict: Return next chunk of GT actions for the given task (matched by prompt) +- GET /info: Return dataset info (tasks, episode counts) +- POST /next_episode: Advance to next episode for the task specified in request body +- POST /reset: Reset all per-task episode/step tracking + +Episode advancement: + The server auto-advances to the next episode when the current episode's actions + are exhausted. For early-termination cases (e.g. success before all actions are + consumed), call POST /next_episode with {"prompt": ""} between episodes. + +Example usage: + + +PYTHONPATH=. python cosmos3/_src/vfm/evaluation/action/libero/dataset_reply_action_server.py \ + --repo_id libero_10 \ + --root /path/to/libero_10_no_noops_1.0.0_lerobot_aligned \ + --action_space frame_wise_relative \ + --rotation_space 6d \ + --pose_coordinate_frame opencv \ + --action_chunk_size 16 \ + --send_video \ + --camera_mode agentview \ + --port 8000 + +# Multiple datasets: +PYTHONPATH=. python cosmos3/_src/vfm/evaluation/action/libero/dataset_reply_action_server.py \ + --repo_id libero_10,libero_goal \ + --root /path/to/libero_10,/path/to/libero_goal \ + --action_space relative \ + --rotation_space 6d \ + --pose_coordinate_frame opencv \ + --action_chunk_size 16 \ + --port 8000 +""" + +from __future__ import annotations + +import argparse +import base64 +import datetime +import io +import json +import socket +import threading +import time +from dataclasses import dataclass +from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer +from typing import Any + +import numpy as np +import torch +from PIL import Image + +from cosmos3._src.vfm.datasets.action.libero_pose_utils import ( + libero_rotation_format, +) +from cosmos3._src.vfm.datasets.action.pose_utils import convert_rotation + + +def _ts() -> str: + return datetime.datetime.now(tz=datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ") + + +def _get_local_ip() -> str: + try: + with 
socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s: + s.connect(("8.8.8.8", 80)) + return str(s.getsockname()[0]) + except Exception: + return socket.gethostbyname(socket.gethostname()) + + +# --------------------------------------------------------------------------- +# Action processing (mirrors LIBERODataset.__getitem__ logic) +# --------------------------------------------------------------------------- + + +def _compute_anchored_actions( + state_raw: torch.Tensor, + action_raw: torch.Tensor, +) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]: + """Compute anchored relative actions, same as LIBERODataset._compute_anchored_actions. + + Actions are expressed in state_raw[0]'s local coordinate frame. + + Args: + state_raw: (T+1, 8) states [x, y, z, ax, ay, az, grip1, grip2]. + action_raw: (T+1, 7) actions [dx, dy, dz, dax, day, daz, grip]. + + Returns: + anchored_translation (T, 3), anchored_rotation (T, 3, 3), gripper (T, 1). + """ + p_states = state_raw[:, :3] + rotvec_states = state_raw[:, 3:6] + delta_p = action_raw[:-1, :3] + delta_rotvec = action_raw[:-1, 3:6] + gripper = action_raw[:-1, 6:7] + + R_states = convert_rotation(rotvec_states, "axisangle", "matrix") + R_deltas = convert_rotation(delta_rotvec, "axisangle", "matrix") + + p_0 = p_states[0] + R_0_T = R_states[0].T + + p_t = p_states[:-1] + R_t = R_states[:-1] + + p_target = p_t + delta_p + R_target = torch.bmm(R_deltas, R_t) + + anchored_p = (R_0_T @ (p_target - p_0).T).T + R_0_T_expanded = R_0_T.unsqueeze(0).expand(R_target.shape[0], -1, -1) + anchored_R = torch.bmm(R_0_T_expanded, R_target) + + return anchored_p, anchored_R, gripper + + +def _convert_rotation_to_repr(rotation_matrix: torch.Tensor, rotation_space: str) -> torch.Tensor: + return convert_rotation(rotation_matrix, "matrix", libero_rotation_format(rotation_space)) + + +def _process_action_chunk( + action_raw: torch.Tensor, + state_raw: torch.Tensor, + action_space: str, + rotation_space: str, +) -> torch.Tensor: + """Process a 
chunk of raw actions with the same logic as LIBERODataset.__getitem__. + + Args: + action_raw: (chunk+1, 7) raw actions covering chunk+1 consecutive frames. + state_raw: (chunk+1, 8) raw states covering chunk+1 consecutive frames. + action_space: "relative" or "frame_wise_relative". + rotation_space: "3d", "6d", or "9d". + + Returns: + Processed actions (chunk, D) where D depends on rotation_space. + """ + if action_space == "relative": + translation, rotation_matrix, gripper = _compute_anchored_actions(state_raw, action_raw) + elif action_space == "frame_wise_relative": + action = action_raw[:-1].clone() + translation = action[:, :3] + rotation_matrix = convert_rotation(action[:, 3:6], "axisangle", "matrix") + gripper = action[:, 6:] + else: + raise ValueError(f"Unsupported action_space: {action_space}") + + rotation = _convert_rotation_to_repr(rotation_matrix, rotation_space) + return torch.cat([translation, rotation, gripper], dim=-1) + + +# --------------------------------------------------------------------------- +# Data structures +# --------------------------------------------------------------------------- + + +@dataclass(frozen=True) +class EpisodeData: + action_raw: torch.Tensor # (N, 7) per-frame raw actions for the full episode + state_raw: torch.Tensor # (N, 8) per-frame raw states for the full episode + task_description: str + dataset_ref_idx: int # index into DatasetActionService._hf_datasets + frame_start: int # first global frame index in the HF dataset + frame_end: int # one-past-last global frame index + + +@dataclass(frozen=True) +class DatasetServerConfig: + repo_id: list[str] + root: list[str | None] + action_space: str + rotation_space: str + pose_coordinate_frame: str + action_chunk_size: int + max_action_dim: int + split: str + send_video: bool + camera_mode: str + image_size: int + + +# --------------------------------------------------------------------------- +# Service +# 
--------------------------------------------------------------------------- + + +class DatasetActionService: + """Serves GT actions (and optionally GT video) from pre-loaded LIBERO LeRobot episodes.""" + + def __init__(self, cfg: DatasetServerConfig) -> None: + self.cfg = cfg + self.episodes_by_task: dict[str, list[EpisodeData]] = {} + self._hf_datasets: list[Any] = [] + self._lerobot_datasets: list[Any] = [] + self._task_state: dict[str, dict[str, int]] = {} + self._lock = threading.Lock() + + if cfg.camera_mode in ("concat_view", "both"): + self._image_keys = ["observation.images.image", "observation.images.wrist_image"] + elif cfg.camera_mode == "wrist_image": + self._image_keys = ["observation.images.wrist_image"] + else: + self._image_keys = ["observation.images.image"] + + self._load_datasets() + + def _load_datasets(self) -> None: + from lerobot.datasets.lerobot_dataset import LeRobotDataset + + for repo_id, root in zip(self.cfg.repo_id, self.cfg.root): + print(f"[{_ts()}] [dataset-server] loading repo_id={repo_id} root={root} ...", flush=True) + t0 = time.monotonic() + + dataset = LeRobotDataset(repo_id=repo_id, root=root) + tasks_df = dataset.meta.tasks + hf = dataset.hf_dataset + ds_ref_idx = len(self._hf_datasets) + self._hf_datasets.append(hf) + + if self.cfg.send_video: + delta_ts: dict[str, list[float]] = {k: [0.0] for k in self._image_keys} + video_dataset = LeRobotDataset(repo_id=repo_id, root=root, delta_timestamps=delta_ts) + self._lerobot_datasets.append(video_dataset) + else: + self._lerobot_datasets.append(None) + + for ep_meta in dataset.meta.episodes: + ep_idx = int(ep_meta["episode_index"]) # type: ignore[index] + start = int(ep_meta["dataset_from_index"]) # type: ignore[index] + end = int(ep_meta["dataset_to_index"]) # type: ignore[index] + + ep_slice = hf.select(range(start, end)) + actions = torch.tensor(np.array(ep_slice["action"], dtype=np.float32)) + states = torch.tensor(np.array(ep_slice["observation.state"], dtype=np.float32)) + + 
task_idx = int(ep_slice[0]["task_index"]) + matching = tasks_df[tasks_df["task_index"] == task_idx] + task_desc = str(matching.iloc[0].name) if not matching.empty else f"task_{task_idx}" + + self.episodes_by_task.setdefault(task_desc, []).append( + EpisodeData( + action_raw=actions, + state_raw=states, + task_description=task_desc, + dataset_ref_idx=ds_ref_idx, + frame_start=start, + frame_end=end, + ) + ) + + dt = time.monotonic() - t0 + print( + f"[{_ts()}] [dataset-server] loaded {repo_id}: {dataset.meta.total_episodes} episodes in {dt:.1f}s", + flush=True, + ) + + total_tasks = len(self.episodes_by_task) + total_eps = sum(len(eps) for eps in self.episodes_by_task.values()) + print( + f"[{_ts()}] [dataset-server] ready: {total_tasks} tasks, {total_eps} episodes " + f"send_video={self.cfg.send_video} camera_mode={self.cfg.camera_mode}", + flush=True, + ) + + def _load_video_frames(self, episode: EpisodeData, step: int, num_frames: int) -> list[str]: + """Load GT video frames from the dataset and encode as base64 PNGs. + + Uses the LeRobotDataset wrapper (not the raw HF dataset) so that video-backed + datasets are decoded correctly via the configured video backend. + + Args: + episode: Episode data with dataset reference. + step: Step offset within the episode (0-based). + num_frames: Number of frames to load (typically action_chunk_size + 1). + + Returns: + List of base64-encoded PNG strings. 
+ """ + lr_dataset = self._lerobot_datasets[episode.dataset_ref_idx] + if lr_dataset is None: + return [] + image_size = self.cfg.image_size + b64_frames: list[str] = [] + + for i in range(num_frames): + global_idx = episode.frame_start + step + i + if global_idx >= episode.frame_end: + break + + item = lr_dataset[global_idx] + + pil_images: list[Image.Image] = [] + for key in self._image_keys: + img_tensor = item[key] + if isinstance(img_tensor, torch.Tensor): + # LeRobot returns (T, C, H, W) with delta_timestamps=[0.0] -> (1, C, H, W) + if img_tensor.dim() == 4: + img_tensor = img_tensor[0] + # (C, H, W) float [0, 1] -> PIL + arr = (img_tensor.permute(1, 2, 0).clamp(0, 1) * 255).to(torch.uint8).numpy() + img = Image.fromarray(arr) + elif isinstance(img_tensor, Image.Image): + img = img_tensor + else: + img = Image.fromarray(np.asarray(img_tensor, dtype=np.uint8)) + img = img.convert("RGB").resize((image_size, image_size), Image.Resampling.BILINEAR) + pil_images.append(img) + + if len(pil_images) > 1: + total_w = sum(im.width for im in pil_images) + combined = Image.new("RGB", (total_w, image_size)) + x = 0 + for im in pil_images: + combined.paste(im, (x, 0)) + x += im.width + frame = combined + else: + frame = pil_images[0] + + buf = io.BytesIO() + frame.save(buf, format="PNG") + b64_frames.append(base64.b64encode(buf.getvalue()).decode("ascii")) + + return b64_frames + + # -- state management -- + + def _get_task_state(self, prompt: str) -> dict[str, int]: + if prompt not in self._task_state: + self._task_state[prompt] = {"episode_idx": 0, "step": 0} + return self._task_state[prompt] + + def _resolve_prompt(self, prompt: str) -> str: + """Resolve prompt to a known task description (exact or substring match).""" + if prompt in self.episodes_by_task: + return prompt + prompt_lower = prompt.lower().strip() + for task_desc in self.episodes_by_task: + if task_desc.lower().strip() == prompt_lower: + return task_desc + for task_desc in self.episodes_by_task: + td_lower 
= task_desc.lower().strip() + if prompt_lower in td_lower or td_lower in prompt_lower: + return task_desc + raise ValueError( + f"Task not found for prompt: {prompt!r}. Available tasks: {sorted(self.episodes_by_task.keys())}" + ) + + # -- endpoints -- + + def get_info(self) -> dict[str, Any]: + return { + "type": "dataset_action_server", + "action_space": self.cfg.action_space, + "rotation_space": self.cfg.rotation_space, + "action_chunk_size": self.cfg.action_chunk_size, + "tasks": {k: len(v) for k, v in sorted(self.episodes_by_task.items())}, + } + + def predict(self, req: dict[str, Any]) -> dict[str, Any]: + prompt = req.get("prompt") + if not isinstance(prompt, str): + raise ValueError("'prompt' must be a string") + + resolved_prompt = self._resolve_prompt(prompt) + + with self._lock: + state = self._get_task_state(resolved_prompt) + episodes = self.episodes_by_task[resolved_prompt] + + ep_idx = state["episode_idx"] % len(episodes) + episode = episodes[ep_idx] + step = state["step"] + + # Number of valid actions = num_frames - 1 (need pairs of consecutive frames) + max_actions = len(episode.action_raw) - 1 + + if step >= max_actions: + state["episode_idx"] = (ep_idx + 1) % len(episodes) + state["step"] = 0 + ep_idx = state["episode_idx"] + episode = episodes[ep_idx] + step = 0 + max_actions = len(episode.action_raw) - 1 + + chunk_size = min(self.cfg.action_chunk_size, max_actions - step) + # Slice chunk+1 frames for action computation (needs next-frame state) + raw_slice_end = step + chunk_size + 1 + action_chunk_raw = episode.action_raw[step:raw_slice_end] + state_chunk_raw = episode.state_raw[step:raw_slice_end] + + processed = _process_action_chunk( + action_chunk_raw, + state_chunk_raw, + self.cfg.action_space, + self.cfg.rotation_space, + ) + + # Pad to max_action_dim (same as the Action transform pipeline) + t, d = processed.shape + if d < self.cfg.max_action_dim: + processed = torch.cat( + [processed, torch.zeros(t, self.cfg.max_action_dim - d)], + 
dim=-1, + ) + + state["step"] += chunk_size + + action_list = processed.float().numpy().tolist() + + video_b64: list[str] = [] + if self.cfg.send_video: + video_b64 = self._load_video_frames(episode, step, num_frames=chunk_size + 1) + + print( + f"[{_ts()}] [dataset-server] predict prompt={resolved_prompt!r} " + f"ep={ep_idx} step={step}..{state['step']} actions={len(action_list)} " + f"video_frames={len(video_b64)}", + flush=True, + ) + return {"action": action_list, "video": video_b64} + + def next_episode(self, prompt: str | None = None) -> dict[str, Any]: + with self._lock: + if prompt is not None: + resolved = self._resolve_prompt(prompt) + state = self._get_task_state(resolved) + episodes = self.episodes_by_task[resolved] + state["episode_idx"] = (state["episode_idx"] + 1) % len(episodes) + state["step"] = 0 + print( + f"[{_ts()}] [dataset-server] next_episode task={resolved!r} -> ep={state['episode_idx']}", + flush=True, + ) + return {"task": resolved, "episode_idx": state["episode_idx"]} + + for task in self._task_state: + episodes = self.episodes_by_task.get(task, []) + self._task_state[task]["episode_idx"] = (self._task_state[task]["episode_idx"] + 1) % max( + len(episodes), 1 + ) + self._task_state[task]["step"] = 0 + print(f"[{_ts()}] [dataset-server] next_episode (all tasks)", flush=True) + return {"advanced_all": True} + + def reset(self) -> dict[str, str]: + with self._lock: + self._task_state.clear() + print(f"[{_ts()}] [dataset-server] reset", flush=True) + return {"status": "reset"} + + +# --------------------------------------------------------------------------- +# HTTP handler +# --------------------------------------------------------------------------- + + +class _DatasetHandler(BaseHTTPRequestHandler): + server: ThreadingHTTPServer # type: ignore[assignment] + + def _send_json(self, status_code: int, payload: dict[str, Any]) -> None: + body = json.dumps(payload).encode("utf-8") + self.send_response(status_code) + 
self.send_header("Content-Type", "application/json") + self.send_header("Cache-Control", "no-store") + self.send_header("Content-Length", str(len(body))) + self.end_headers() + try: + self.wfile.write(body) + except (BrokenPipeError, ConnectionResetError): + return + + def _read_json_body(self) -> dict[str, Any] | None: + try: + length = int(self.headers.get("Content-Length") or "0") + except ValueError: + self._send_json(400, {"error": "Invalid Content-Length"}) + return None + body = self.rfile.read(max(0, length)) + if not body: + return {} + try: + req = json.loads(body.decode("utf-8")) + except Exception as e: + self._send_json(400, {"error": f"Invalid JSON: {e}"}) + return None + if not isinstance(req, dict): + self._send_json(400, {"error": "JSON body must be an object"}) + return None + return req + + def do_GET(self) -> None: # noqa: N802 + service: DatasetActionService = getattr(self.server, "service") + if self.path == "/info": + self._send_json(200, service.get_info()) + elif self.path == "/": + self._send_json(200, {"status": "ok"}) + else: + self._send_json(404, {"error": "Not found"}) + + def do_POST(self) -> None: # noqa: N802 + service: DatasetActionService = getattr(self.server, "service") + + if self.path in ("/", "/predict"): + req = self._read_json_body() + if req is None: + return + try: + out = service.predict(req) + except Exception as e: + print(f"[{_ts()}] [dataset-server] predict ERROR: {e}", flush=True) + self._send_json(400, {"action": [], "error": str(e)}) + return + self._send_json(200, out) + + elif self.path == "/next_episode": + req = self._read_json_body() + if req is None: + # _read_json_body already sent a 400 response; do not advance state or reply twice + return + prompt = req.get("prompt") if req else None + try: + out = service.next_episode(prompt) + except Exception as e: + self._send_json(400, {"error": str(e)}) + return + self._send_json(200, out) + + elif self.path == "/reset": + out = service.reset() + self._send_json(200, out) + + else: + self._send_json(404, {"error": "Not found"}) + + def log_message(self, format: str, *args:
Any) -> None: # noqa: A002 + return + + +# --------------------------------------------------------------------------- +# CLI +# --------------------------------------------------------------------------- + + +def main() -> None: + parser = argparse.ArgumentParser( + description="HTTP server serving ground-truth actions from LIBERO LeRobot datasets." + ) + parser.add_argument( + "--repo_id", + type=str, + required=True, + help="Comma-separated LeRobot repo IDs (e.g. libero_10,libero_goal)", + ) + parser.add_argument( + "--root", + type=str, + required=True, + help="Comma-separated local paths to dataset roots (one per repo_id)", + ) + parser.add_argument( + "--action_space", + type=str, + default="frame_wise_relative", + choices=["relative", "frame_wise_relative"], + help="Action space (must match closed-loop eval's --action_space).", + ) + parser.add_argument( + "--rotation_space", + type=str, + default="6d", + choices=["3d", "6d", "9d"], + help="Rotation representation (must match closed-loop eval's action_dim).", + ) + parser.add_argument( + "--pose_coordinate_frame", + type=str, + default="native", + choices=["native", "opencv"], + help="Pose/action coordinate frame. 
Accepted for compatibility with LIBERO eval launchers.", + ) + parser.add_argument("--action_chunk_size", type=int, default=16, help="Number of actions per predict call") + parser.add_argument("--max_action_dim", type=int, default=32, help="Pad actions to this dimension") + parser.add_argument("--split", type=str, default="full", help="Dataset split (train/val/full)") + parser.add_argument( + "--send_video", + action="store_true", + help="Include GT video frames (base64 PNGs) in /predict responses, same format as the Action server.", + ) + parser.add_argument( + "--camera_mode", + type=str, + default="image", + choices=["agentview", "wrist_image", "concat_view", "both"], + help="Camera view(s) to include in video frames.", + ) + parser.add_argument("--image_size", type=int, default=256, help="Resize video frames to this height/width") + parser.add_argument("--host", type=str, default="0.0.0.0") + parser.add_argument("--port", type=int, default=8000) + args = parser.parse_args() + + repo_ids = [r.strip() for r in args.repo_id.split(",") if r.strip()] + roots = [r.strip() for r in args.root.split(",") if r.strip()] + if len(repo_ids) != len(roots): + raise ValueError(f"Number of repo_ids ({len(repo_ids)}) must match number of roots ({len(roots)})") + + cfg = DatasetServerConfig( + repo_id=repo_ids, + root=roots, + action_space=args.action_space, + rotation_space=args.rotation_space, + pose_coordinate_frame=args.pose_coordinate_frame, + action_chunk_size=int(args.action_chunk_size), + max_action_dim=int(args.max_action_dim), + split=args.split, + send_video=bool(args.send_video), + camera_mode=args.camera_mode, + image_size=int(args.image_size), + ) + + service = DatasetActionService(cfg) + local_ip = _get_local_ip() + + print( + f"[{_ts()}] [dataset-server] starting host={args.host} port={args.port} " + f"action_space={cfg.action_space} rotation_space={cfg.rotation_space} " + f"action_chunk_size={cfg.action_chunk_size}", + flush=True, + ) + print(f"[{_ts()}] 
[dataset-server] Server accessible at: http://{local_ip}:{args.port}/", flush=True) + print(f"[{_ts()}] [dataset-server] Endpoints:", flush=True) + print(f" - GET / : Health check", flush=True) + print(f" - GET /info : Dataset info (tasks, episode counts)", flush=True) + print(f" - POST /predict : Get next GT action chunk (same interface as Action server)", flush=True) + print(f" - POST /next_episode : Advance to next episode for a task", flush=True) + print(f" - POST /reset : Reset all per-task state", flush=True) + + httpd = ThreadingHTTPServer((args.host, int(args.port)), _DatasetHandler) + setattr(httpd, "service", service) + httpd.serve_forever() + + +if __name__ == "__main__": + main() diff --git a/cosmos-inference/cosmos3/_src/vfm/evaluation/action/metrics.py b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/metrics.py new file mode 100644 index 00000000..1739bee3 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/evaluation/action/metrics.py @@ -0,0 +1,664 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Evaluation metrics for Action models. + +This module provides standard metrics used for evaluating video prediction +and action prediction quality. + +All metric functions follow a unified structure: +1. Check if gt and pred share the same shape +2. Check if shape is expected +3. 
Detach tensor while keeping the device (if tensor) +4. Compute metric +5. Return float +""" + +from __future__ import annotations + +import warnings + +import numpy as np +import torch + + +def _get_dtype_info( + gt: np.ndarray | torch.Tensor, + pred: np.ndarray | torch.Tensor, +) -> tuple[float, float, float]: + """ + Get dtype info and derive data range. + + Args: + gt: Ground truth array/tensor. + pred: Predicted array/tensor. + + Returns: + Tuple of (data_range, expected_min, expected_max). + + Raises: + ValueError: If dtypes don't match or are unsupported. + """ + if isinstance(gt, torch.Tensor): + gt_is_float = gt.dtype.is_floating_point + gt_is_uint8 = gt.dtype == torch.uint8 + gt_dtype_str = str(gt.dtype) + else: + gt_is_float = np.issubdtype(gt.dtype, np.floating) + gt_is_uint8 = gt.dtype == np.uint8 + gt_dtype_str = str(gt.dtype) + + if isinstance(pred, torch.Tensor): + pred_is_float = pred.dtype.is_floating_point + pred_is_uint8 = pred.dtype == torch.uint8 + pred_dtype_str = str(pred.dtype) + else: + pred_is_float = np.issubdtype(pred.dtype, np.floating) + pred_is_uint8 = pred.dtype == np.uint8 + pred_dtype_str = str(pred.dtype) + + if gt_is_float != pred_is_float or gt_is_uint8 != pred_is_uint8: + raise ValueError(f"Dtype mismatch: gt {gt_dtype_str} vs pred {pred_dtype_str}") + + if gt_is_float: + return 2.0, -1.0, 1.0 + elif gt_is_uint8: + return 255.0, 0.0, 255.0 + else: + raise ValueError(f"Unsupported dtype: {gt_dtype_str}. Expected float or uint8.") + + +def _compute_motion_mask( + video: np.ndarray | torch.Tensor, + threshold_percentile: float, +) -> np.ndarray | torch.Tensor: + """ + Compute per-frame motion mask based on frame differences. + + For each frame t, computes |video[t] - video[t-1]| and thresholds based on + the given percentile. Frame 0 uses the same mask as frame 1. + + Args: + video: Video array/tensor of shape (C, T, H, W) or (B, C, T, H, W). + threshold_percentile: Percentile threshold (0-100). 
Pixels with motion + magnitude above this percentile are marked as dynamic. + E.g., 80.0 means top 20% of motion is considered "dynamic". + + Returns: + Boolean mask of shape (T, H, W) or (B, T, H, W) where True = dynamic pixel. + """ + is_tensor = isinstance(video, torch.Tensor) + + # Handle both (C, T, H, W) and (B, C, T, H, W) shapes + if video.ndim == 4: + # (C, T, H, W) -> time is dim 1 + time_dim = 1 + else: + # (B, C, T, H, W) -> time is dim 2 + time_dim = 2 + + if is_tensor: + video = video.detach().float() + # Compute frame differences: |video[t] - video[t-1]| + if time_dim == 1: + diff = torch.abs(video[:, 1:] - video[:, :-1]) # [C,T-1,H,W] + # Average over channels to get motion magnitude + motion_magnitude = diff.mean(dim=0) # [T-1,H,W] + else: + diff = torch.abs(video[:, :, 1:] - video[:, :, :-1]) # [B,C,T-1,H,W] + motion_magnitude = diff.mean(dim=1) # [B,T-1,H,W] + + # Compute threshold value from percentile + threshold_value = torch.quantile(motion_magnitude.flatten().float(), threshold_percentile / 100.0) + + # Create mask for frames 1..T-1 + mask_after_first = motion_magnitude > threshold_value + + # For frame 0, use the same mask as frame 1 + if time_dim == 1: + first_frame_mask = mask_after_first[0:1] # [1,H,W] + mask = torch.cat([first_frame_mask, mask_after_first], dim=0) # [T,H,W] + else: + first_frame_mask = mask_after_first[:, 0:1] # [B,1,H,W] + mask = torch.cat([first_frame_mask, mask_after_first], dim=1) # [B,T,H,W] + else: + video = np.asarray(video, dtype=np.float32) + # Compute frame differences + if time_dim == 1: + diff = np.abs(video[:, 1:] - video[:, :-1]) # [C,T-1,H,W] + motion_magnitude = diff.mean(axis=0) # [T-1,H,W] + else: + diff = np.abs(video[:, :, 1:] - video[:, :, :-1]) # [B,C,T-1,H,W] + motion_magnitude = diff.mean(axis=1) # [B,T-1,H,W] + + # Compute threshold value from percentile + threshold_value = np.percentile(motion_magnitude.flatten(), threshold_percentile) + + # Create mask for frames 1..T-1 + mask_after_first = 
motion_magnitude > threshold_value + + # For frame 0, use the same mask as frame 1 + if time_dim == 1: + first_frame_mask = mask_after_first[0:1] # [1,H,W] + mask = np.concatenate([first_frame_mask, mask_after_first], axis=0) # [T,H,W] + else: + first_frame_mask = mask_after_first[:, 0:1] # [B,1,H,W] + mask = np.concatenate([first_frame_mask, mask_after_first], axis=1) # [B,T,H,W] + + return mask + + +def compute_psnr( + gt: np.ndarray | torch.Tensor, + pred: np.ndarray | torch.Tensor, +) -> float: + """ + Compute Peak Signal-to-Noise Ratio (PSNR) between ground truth and prediction. + + PSNR is defined as: 20 * log10(MAX / sqrt(MSE)) + + The data range is automatically derived from dtype: + - float dtype: expects values in [-1, 1], data_range = 2.0 + - uint8 dtype: expects values in [0, 255], data_range = 255.0 + + Args: + gt: Ground truth array/tensor of shape (C, T, H, W) or (B, C, T, H, W). + - If float dtype: values must be in [-1, 1] range. + - If uint8 dtype: values must be in [0, 255] range. + pred: Predicted array/tensor of same shape and dtype as gt. + + Returns: + PSNR value in decibels (dB). Higher is better. + + Raises: + ValueError: If shapes don't match, dtypes don't match, dtype is unsupported, + or values are out of expected range. + + Example: + >>> gt_video = torch.randn(3, 16, 96, 96).clamp(-1, 1) # (C, T, H, W), float in [-1, 1] + >>> pred_video = torch.randn(3, 16, 96, 96).clamp(-1, 1) + >>> psnr = compute_psnr(gt_video, pred_video) + """ + # 1. Check if gt and pred share the same shape + if gt.shape != pred.shape: + raise ValueError(f"Shape mismatch: gt {gt.shape} vs pred {pred.shape}") + + # 2. Check if shape is expected: (C, T, H, W) or (B, C, T, H, W) + if gt.ndim not in (4, 5): + raise ValueError( + f"Expected gt to have 4 dims (C, T, H, W) or 5 dims (B, C, T, H, W), got {gt.ndim} dims: {gt.shape}" + ) + + # 3. Check dtype and derive data range + data_range, expected_min, expected_max = _get_dtype_info(gt, pred) + + # 4. 
Validate value range + if isinstance(gt, torch.Tensor): + gt_min, gt_max = gt.min().item(), gt.max().item() + else: + gt_min, gt_max = float(gt.min()), float(gt.max()) + + if isinstance(pred, torch.Tensor): + pred_min, pred_max = pred.min().item(), pred.max().item() + else: + pred_min, pred_max = float(pred.min()), float(pred.max()) + + if gt_min < expected_min or gt_max > expected_max: + raise ValueError(f"gt values out of range: got [{gt_min}, {gt_max}], expected [{expected_min}, {expected_max}]") + if pred_min < expected_min or pred_max > expected_max: + raise ValueError( + f"pred values out of range: got [{pred_min}, {pred_max}], expected [{expected_min}, {expected_max}]" + ) + + # 5. Detach tensor while keeping the device (if tensor) + # 6. Compute metric + if isinstance(gt, torch.Tensor) and isinstance(pred, torch.Tensor): + gt_t = gt.detach().float() + pred_t = pred.detach().float() + mse = torch.mean((gt_t - pred_t) ** 2).item() + else: + gt_np = np.asarray(gt, dtype=np.float32) + pred_np = np.asarray(pred, dtype=np.float32) + mse = float(np.mean((gt_np - pred_np) ** 2)) + + if mse == 0: + psnr = float("inf") + else: + psnr = 20.0 * np.log10(data_range / np.sqrt(mse)) + + # 7. Return float + return float(psnr) + + +def compute_dynamic_psnr( + gt: np.ndarray | torch.Tensor, + pred: np.ndarray | torch.Tensor, + motion_threshold_percentile: float = 80.0, +) -> tuple[float, np.ndarray | torch.Tensor]: + """ + Compute PSNR only on dynamic (moving) pixels, ignoring static background. + + This metric is more meaningful for videos with large static backgrounds, + as it focuses on the quality of dynamic content rather than being dominated + by easy-to-predict static regions. + + Dynamic pixels are detected using frame differences: for each frame t, + pixels where |gt[t] - gt[t-1]| exceeds a percentile threshold are considered + "dynamic". PSNR is computed only on these pixels. + + Args: + gt: Ground truth array/tensor of shape (C, T, H, W) or (B, C, T, H, W). 
+ - If float dtype: values must be in [-1, 1] range. + - If uint8 dtype: values must be in [0, 255] range. + pred: Predicted array/tensor of same shape and dtype as gt. + motion_threshold_percentile: Percentile threshold for motion detection. + Default 80.0 means top 20% of motion magnitude is considered "dynamic". + Higher values = stricter threshold = fewer dynamic pixels. + + Returns: + Tuple of (psnr, motion_mask): + - psnr: PSNR value in decibels (dB) computed only on dynamic regions. + Higher is better. Returns regular PSNR if no dynamic pixels are found. + - motion_mask: Boolean mask of shape (T, H, W) or (B, T, H, W) where + True indicates a dynamic pixel. Same type as input (tensor or ndarray). + + Raises: + ValueError: If shapes don't match, dtypes don't match, dtype is unsupported, + or values are out of expected range. + + Example: + >>> gt_video = torch.randn(3, 16, 96, 96).clamp(-1, 1) # (C, T, H, W) + >>> pred_video = torch.randn(3, 16, 96, 96).clamp(-1, 1) + >>> dynamic_psnr, motion_mask = compute_dynamic_psnr(gt_video, pred_video) + >>> print(f"Dynamic PSNR: {dynamic_psnr:.2f}, Mask shape: {motion_mask.shape}") + """ + + # 1. Check if gt and pred share the same shape + if gt.shape != pred.shape: + raise ValueError(f"Shape mismatch: gt {gt.shape} vs pred {pred.shape}") + + # 2. Check if shape is expected: (C, T, H, W) or (B, C, T, H, W) + if gt.ndim not in (4, 5): + raise ValueError( + f"Expected gt to have 4 dims (C, T, H, W) or 5 dims (B, C, T, H, W), got {gt.ndim} dims: {gt.shape}" + ) + + # 3. Check dtype and derive data range + data_range, expected_min, expected_max = _get_dtype_info(gt, pred) + + # 4. 
Validate value range + if isinstance(gt, torch.Tensor): + gt_min, gt_max = gt.min().item(), gt.max().item() + else: + gt_min, gt_max = float(gt.min()), float(gt.max()) + + if isinstance(pred, torch.Tensor): + pred_min, pred_max = pred.min().item(), pred.max().item() + else: + pred_min, pred_max = float(pred.min()), float(pred.max()) + + if gt_min < expected_min or gt_max > expected_max: + raise ValueError(f"gt values out of range: got [{gt_min}, {gt_max}], expected [{expected_min}, {expected_max}]") + if pred_min < expected_min or pred_max > expected_max: + raise ValueError( + f"pred values out of range: got [{pred_min}, {pred_max}], expected [{expected_min}, {expected_max}]" + ) + + # 5. Compute motion mask from GT + motion_mask = _compute_motion_mask(gt, motion_threshold_percentile) + + # 6. Check if there are any dynamic pixels + if isinstance(motion_mask, torch.Tensor): + num_dynamic = motion_mask.sum().item() + else: + num_dynamic = int(motion_mask.sum()) + + if num_dynamic == 0: + warnings.warn( + "No dynamic pixels found in video. Returning regular PSNR. Consider lowering motion_threshold_percentile.", + stacklevel=2, + ) + return compute_psnr(gt, pred), motion_mask + + # 7. 
Compute MSE only on dynamic pixels + if isinstance(gt, torch.Tensor) and isinstance(pred, torch.Tensor): + gt_t = gt.detach().float() + pred_t = pred.detach().float() + # motion_mask is a tensor when gt is a tensor + mask_t = motion_mask if isinstance(motion_mask, torch.Tensor) else torch.from_numpy(motion_mask) + + # Expand mask to match video shape (add channel dimension) + if gt.ndim == 4: + # (C, T, H, W) - mask is (T, H, W) + expanded_mask = mask_t.unsqueeze(0).expand_as(gt_t) # [C,T,H,W] + else: + # (B, C, T, H, W) - mask is (B, T, H, W) + expanded_mask = mask_t.unsqueeze(1).expand_as(gt_t) # [B,C,T,H,W] + + # Compute squared error only on dynamic pixels + squared_error = (gt_t - pred_t) ** 2 + masked_squared_error = squared_error[expanded_mask] + mse = masked_squared_error.mean().item() + else: + gt_np = np.asarray(gt, dtype=np.float32) + pred_np = np.asarray(pred, dtype=np.float32) + # motion_mask is an ndarray when gt is an ndarray + mask_np = motion_mask if isinstance(motion_mask, np.ndarray) else motion_mask.numpy() + + # Expand mask to match video shape + if gt_np.ndim == 4: + # (C, T, H, W) - mask is (T, H, W) + expanded_mask = np.broadcast_to(mask_np[np.newaxis, :, :, :], gt_np.shape) + else: + # (B, C, T, H, W) - mask is (B, T, H, W) + expanded_mask = np.broadcast_to(mask_np[:, np.newaxis, :, :, :], gt_np.shape) + + # Compute squared error only on dynamic pixels + squared_error = (gt_np - pred_np) ** 2 + masked_squared_error = squared_error[expanded_mask] + mse = float(masked_squared_error.mean()) + + # 8. Convert MSE to PSNR + if mse == 0: + psnr = float("inf") + else: + psnr = 20.0 * np.log10(data_range / np.sqrt(mse)) + + return float(psnr), motion_mask + + +def compute_ssim( + gt: np.ndarray | torch.Tensor, + pred: np.ndarray | torch.Tensor, +) -> float: + """ + Compute Structural Similarity Index (SSIM) between ground truth and prediction. 
+ + SSIM measures the structural similarity between two images, considering + luminance, contrast, and structure. It is computed per-frame and averaged. + + Uses skimage.metrics.structural_similarity for computation, following the + implementation in projects/cosmos/tokenizer/evaluation/metric.py. + + The data range is automatically derived from dtype: + - float dtype: expects values in [-1, 1], data_range = 2.0 + - uint8 dtype: expects values in [0, 255], data_range = 255.0 + + Args: + gt: Ground truth array/tensor of shape (C, T, H, W) or (B, C, T, H, W). + If float dtype: values must be in [-1, 1] range. + If uint8 dtype: values must be in [0, 255] range. + pred: Predicted array/tensor of same shape and dtype as gt. + + Returns: + SSIM value in range [-1, 1]. Higher is better (1.0 = identical). + + Raises: + ValueError: If shapes don't match, dtypes don't match, dtype is unsupported, + or values are out of expected range. + + Example: + >>> gt_video = torch.randn(3, 16, 96, 96).clamp(-1, 1) # (C, T, H, W), float in [-1, 1] + >>> pred_video = torch.randn(3, 16, 96, 96).clamp(-1, 1) + >>> ssim_val = compute_ssim(gt_video, pred_video) + """ + from skimage.metrics import structural_similarity as ssim + + # 1. Check if gt and pred share the same shape + if gt.shape != pred.shape: + raise ValueError(f"Shape mismatch: gt {gt.shape} vs pred {pred.shape}") + + # 2. Check if shape is expected: (C, T, H, W) or (B, C, T, H, W) + if gt.ndim not in (4, 5): + raise ValueError( + f"Expected gt to have 4 dims (C, T, H, W) or 5 dims (B, C, T, H, W), got {gt.ndim} dims: {gt.shape}" + ) + + # 3. Check dtype and derive data range + data_range, expected_min, expected_max = _get_dtype_info(gt, pred) + + # 4. 
Validate value range + if isinstance(gt, torch.Tensor): + gt_min, gt_max = gt.min().item(), gt.max().item() + else: + gt_min, gt_max = float(gt.min()), float(gt.max()) + + if isinstance(pred, torch.Tensor): + pred_min, pred_max = pred.min().item(), pred.max().item() + else: + pred_min, pred_max = float(pred.min()), float(pred.max()) + + if isinstance(gt, torch.Tensor) and gt.dtype == torch.bfloat16: + gt = gt.float() + if isinstance(pred, torch.Tensor) and pred.dtype == torch.bfloat16: + pred = pred.float() + + if gt_min < expected_min or gt_max > expected_max: + raise ValueError(f"gt values out of range: got [{gt_min}, {gt_max}], expected [{expected_min}, {expected_max}]") + if pred_min < expected_min or pred_max > expected_max: + raise ValueError( + f"pred values out of range: got [{pred_min}, {pred_max}], expected [{expected_min}, {expected_max}]" + ) + + # 5. Convert to numpy arrays + if isinstance(gt, torch.Tensor): + gt_np = gt.detach().cpu().numpy() + else: + gt_np = np.asarray(gt) + + if isinstance(pred, torch.Tensor): + pred_np = pred.detach().cpu().numpy() + else: + pred_np = np.asarray(pred) + + # 6. Reshape to (N, C, H, W) where N = number of frames + if gt_np.ndim == 4: + # (C, T, H, W) -> (T, C, H, W) + gt_frames = np.transpose(gt_np, (1, 0, 2, 3)) + pred_frames = np.transpose(pred_np, (1, 0, 2, 3)) + else: + # (B, C, T, H, W) -> (B*T, C, H, W) + b, c, t, h, w = gt_np.shape + gt_frames = np.transpose(gt_np, (0, 2, 1, 3, 4)).reshape(b * t, c, h, w) + pred_frames = np.transpose(pred_np, (0, 2, 1, 3, 4)).reshape(b * t, c, h, w) + + # 7. Compute SSIM per frame using skimage + ssim_values = [] + for gt_frame, pred_frame in zip(gt_frames, pred_frames): + # gt_frame and pred_frame are (C, H, W) + frame_ssim = ssim(gt_frame, pred_frame, channel_axis=0, data_range=data_range) + ssim_values.append(frame_ssim) + + # 8. 
Return average SSIM + return float(np.mean(ssim_values)) + + +def compute_action_mse( + gt_action: np.ndarray | torch.Tensor, + pred_action: np.ndarray | torch.Tensor, +) -> float: + """ + Compute Mean Squared Error (MSE) between ground truth and predicted actions. + + Args: + gt_action: Ground truth array/tensor of shape (T, D) or (B, T, D), + where T is the number of timesteps and D is the action dimension. + pred_action: Predicted array/tensor of same shape as gt_action. + + Returns: + MSE value. Lower is better. + + Example: + >>> gt = np.random.randn(16, 2) # (T, D) - 16 timesteps, 2D actions + >>> pred = np.random.randn(16, 2) + >>> mse = compute_action_mse(gt, pred) + """ + # 1. Check if gt and pred share the same shape + if gt_action.shape != pred_action.shape: + raise ValueError(f"Shape mismatch: gt_action {gt_action.shape} vs pred_action {pred_action.shape}") + + # 2. Check if shape is expected: (T, D) or (B, T, D) + if gt_action.ndim not in (2, 3): + raise ValueError( + f"Expected gt_action to have 2 dims (T, D) or 3 dims (B, T, D), got {gt_action.ndim} dims: {gt_action.shape}" + ) + + # 3. Detach tensor while keeping the device (if tensor) + # 4. Compute metric + if isinstance(gt_action, torch.Tensor) and isinstance(pred_action, torch.Tensor): + gt_t = gt_action.detach() + pred_t = pred_action.detach() + mse = torch.mean((gt_t - pred_t) ** 2).item() + else: + gt_np = np.asarray(gt_action) + pred_np = np.asarray(pred_action) + mse = float(np.mean((gt_np - pred_np) ** 2)) + + # 5. Return float + return float(mse) + + +def compute_grouped_action_mse( + gt_action: np.ndarray | torch.Tensor, + pred_action: np.ndarray | torch.Tensor, +) -> dict[str, float]: + """ + Compute grouped MSE for translation, rotation, and gripper action components. + + This metric is useful for robotics tasks where actions are structured as + [translation(3), rotation(9), gripper(1)] using 9D rotation matrix representation. 
+ + NOTE: All actions must be converted to 9D rotation matrix format (13D total) + before calling this function. Use conversion utilities in the inference stage + to convert from other rotation representations (axis-angle, 6D) to 9D. + + Args: + gt_action: Ground truth array/tensor of shape (T, 13) or (B, T, 13), + where T is the number of timesteps. + Expected format: [translation(3), rotation_matrix(9), gripper(1)] + pred_action: Predicted array/tensor of same shape as gt_action. + + Returns: + Dictionary with MSE values for each component: + - "translation": MSE for x, y, z (dims 0-2) + - "rotation": MSE for flattened rotation matrix (dims 3-11) + - "gripper": MSE for gripper (dim 12) + Values are 0.0 if the corresponding dimensions are not present. + """ + if gt_action.shape != pred_action.shape: + raise ValueError(f"Shape mismatch: gt_action {gt_action.shape} vs pred_action {pred_action.shape}") + + if gt_action.ndim not in (2, 3): + raise ValueError( + f"Expected gt_action to have 2 dims (T, D) or 3 dims (B, T, D), got {gt_action.ndim} dims: {gt_action.shape}" + ) + + if isinstance(gt_action, torch.Tensor): + gt_np = gt_action.detach().cpu().numpy() + else: + gt_np = np.asarray(gt_action) + + if isinstance(pred_action, torch.Tensor): + pred_np = pred_action.detach().cpu().numpy() + else: + pred_np = np.asarray(pred_action) + + result: dict[str, float] = {"translation": 0.0, "rotation": 0.0, "gripper": 0.0} + action_dim = gt_np.shape[-1] + gripper_idx = 12 + + if action_dim >= 3: + result["translation"] = float(np.mean((gt_np[..., :3] - pred_np[..., :3]) ** 2)) + + # Rotation: dimensions 3-11 (flattened 3x3 rotation matrix) + if action_dim >= gripper_idx: + result["rotation"] = float(np.mean((gt_np[..., 3:gripper_idx] - pred_np[..., 3:gripper_idx]) ** 2)) + + # Gripper: dimension 12 + if action_dim > gripper_idx: + result["gripper"] = float(np.mean((gt_np[..., gripper_idx] - pred_np[..., gripper_idx]) ** 2)) + + return result + + +def compute_action_mae( + 
gt_action: np.ndarray | torch.Tensor, + pred_action: np.ndarray | torch.Tensor, +) -> float: + """ + Compute Mean Absolute Error (MAE) between ground truth and predicted actions. + + Args: + gt_action: Ground truth array/tensor of shape (T, D) or (B, T, D), + where T is the number of timesteps and D is the action dimension. + pred_action: Predicted array/tensor of same shape as gt_action. + + Returns: + MAE value. Lower is better. + + Example: + >>> gt = np.random.randn(16, 2) # (T, D) - 16 timesteps, 2D actions + >>> pred = np.random.randn(16, 2) + >>> mae = compute_action_mae(gt, pred) + """ + # 1. Check if gt and pred share the same shape + if gt_action.shape != pred_action.shape: + raise ValueError(f"Shape mismatch: gt_action {gt_action.shape} vs pred_action {pred_action.shape}") + + # 2. Check if shape is expected: (T, D) or (B, T, D) + if gt_action.ndim not in (2, 3): + raise ValueError( + f"Expected gt_action to have 2 dims (T, D) or 3 dims (B, T, D), got {gt_action.ndim} dims: {gt_action.shape}" + ) + + # 3. Detach tensor while keeping the device (if tensor) + # 4. Compute metric + if isinstance(gt_action, torch.Tensor) and isinstance(pred_action, torch.Tensor): + gt_t = gt_action.detach() + pred_t = pred_action.detach() + mae = torch.mean(torch.abs(gt_t - pred_t)).item() + else: + gt_np = np.asarray(gt_action) + pred_np = np.asarray(pred_action) + mae = float(np.mean(np.abs(gt_np - pred_np))) + + # 5. Return float + return float(mae) + + +def compute_geodesic_rotation_error( + gt_rot: np.ndarray, + pred_rot: np.ndarray, +) -> np.ndarray: + """Geodesic angular error between rotation matrices on SO(3). + + Computes ``arccos((tr(R_gt^T @ R_pred) - 1) / 2)`` for each pair, + returning the angular distance in **degrees**. + + Args: + gt_rot: Ground-truth rotation matrices of shape ``(N, 3, 3)``. + pred_rot: Predicted rotation matrices of shape ``(N, 3, 3)``. + + Returns: + Per-element angular errors in degrees, shape ``(N,)``. 
+ """ + if gt_rot.shape != pred_rot.shape or gt_rot.shape[-2:] != (3, 3): + raise ValueError(f"Expected (N,3,3) arrays, got gt={gt_rot.shape}, pred={pred_rot.shape}") + + R_err = np.matmul(np.transpose(gt_rot, (0, 2, 1)), pred_rot) # [N,3,3] + trace = np.trace(R_err, axis1=1, axis2=2) # [N] + cos_angle = np.clip((trace - 1.0) / 2.0, -1.0, 1.0) # [N] + return np.degrees(np.arccos(cos_angle)) # [N] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/models/hf_model.py b/cosmos-inference/cosmos3/_src/vfm/models/hf_model.py new file mode 100644 index 00000000..0aecf0e2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/hf_model.py @@ -0,0 +1,339 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Minimal HFModel for the vfm/ unified VLM training path. + +Responsibilities: + - ``__init__``: meta-init the underlying HF model via the appropriate + AutoClass (``AutoModelForImageTextToText`` / ``AutoModel`` / + ``AutoModelForCausalLM`` — see ``HFModel`` for selection rules); + no weights are loaded. + - ``apply_gradient_checkpointing``: wraps HF's standard + ``gradient_checkpointing_enable`` API. + - ``tie_embeddings``: re-establishes the input/output embedding tie after + FSDP wrapping + meta-materialization. + - ``load_weights``: dispatches to ``load_vlm_model`` (VLM) or + ``load_language_model`` (LLM) from ``safetensors_loader.py`` based on + ``vision_config``; returns the set of checkpoint keys that were loaded. + - ``forward``: pass-through returning logits. + +FSDP wrapping lives in ``vfm/models/parallelize_vlm.py::parallelize()``, +NOT here. +""" + +import torch +import torch.nn as nn +from accelerate import init_on_device +from transformers import AutoConfig, AutoModel, AutoModelForCausalLM, AutoModelForImageTextToText + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.models.utils.safetensors_loader import load_language_model, load_vlm_model +from cosmos3._src.vfm.utils.parallelism import ParallelDims + + +def _tensor_names_to_skip_for(model_type: str) -> list[str]: + """Per-model-type tensor-name regex skip list for load_vlm_model. + + Mirrors the upstream HF-model ``tensor_names_to_skip`` property from the + legacy VLM policy registry. Patterns match the **resolved model key** + (post-name_converter). 
Patterns are used twice inside load_vlm_model: + (1) the per-tensor loop skips copying matched model keys, (2) the + completeness check tolerates matched model keys that are absent from + the checkpoint. + + Registered VLMs (see + cosmos3/_src/vfm/configs/base/vlm/defaults/vlm_policy.py): + - Qwen3-VL dense (2B/4B/8B/32B): no skips needed. + - NemotronH_Nano_VL_V2 (nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16): + RADIO backbone buffers — initialized by the module, not from ckpt. + """ + table: dict[str, list[str]] = { + "NemotronH_Nano_VL_V2": [ + r"vision_model\.radio_model\.summary_idxs", + r"vision_model\.radio_model\.input_conditioner\.norm_mean", + r"vision_model\.radio_model\.input_conditioner\.norm_std", + ], + } + return table.get(model_type, []) + + +class HFModel(nn.Module): + """Minimal HF model wrapper for the vfm/ unified VLM training path. + + Loads any HF causal LM or VL model on the meta device (no GPU memory) + via the appropriate AutoClass — see selection rules below. Weights are NOT + loaded in ``__init__``. Call :meth:`load_weights` after FSDP wrapping + + explicit meta-tensor materialization so each rank only fills its own shard. + + AutoClass selection (by vision_config presence + ``auto_map``): + - VLM with standard transformers registration (e.g. Qwen3-VL) + → ``AutoModelForImageTextToText``. Returns the full conditional-generation + class (e.g. ``Qwen3VLForConditionalGeneration``), which exposes ``.logits`` + on forward output. ``AutoModelForCausalLM`` raises ``ValueError`` for VLM + configs (``Qwen3VLConfig`` is not registered for that auto class), so it + cannot be used here. + - VLM with custom ``auto_map`` (e.g. NemotronVL): the registered entry maps + the full causal-LM class through ``AutoModel`` rather than + ``AutoModelForImageTextToText`` — use ``AutoModel`` for this case only. + - LLM (no ``vision_config``) → ``AutoModelForCausalLM``. Standard causal LM + with ``.logits``. 
+ + Do NOT use ``AutoModel`` for the standard VLM path — it returns the backbone + only (e.g. ``Qwen3VLModel``), which does NOT have ``.logits``. + + FSDP / TP wrapping is applied externally by ``parallelize()`` in + ``vfm/models/parallelize_vlm.py``. + """ + + def __init__( + self, + model_name_or_path: str, + dtype: torch.dtype = torch.bfloat16, + attn_implementation: str = "flash_attention_2", + trust_remote_code: bool = True, + ): + super().__init__() + self.model_name_or_path = model_name_or_path + hf_config = AutoConfig.from_pretrained(model_name_or_path, trust_remote_code=trust_remote_code) + self.hf_config = hf_config + + # AutoClass selection by model type: + # - Standard VLM (Qwen3-VL, etc.): AutoModelForImageTextToText returns the full causal + # LM with .logits (Qwen3VLForConditionalGeneration, etc.). + # - Custom VLM with auto_map (e.g. NemotronVL): AutoModelForImageTextToText is not + # registered; use AutoModel instead which maps to the full causal LM via auto_map. + # - LLM (no vision_config): AutoModelForCausalLM → standard causal LM with .logits. + is_vlm = getattr(hf_config, "vision_config", None) is not None + auto_map = getattr(hf_config, "auto_map", None) or {} + if is_vlm: + if "AutoModelForImageTextToText" in auto_map or not auto_map: + # Standard VLM or no auto_map (rely on registered transformers type) + model_cls = AutoModelForImageTextToText + else: + # Custom VLM: use AutoModel which maps to the full causal-LM class via auto_map + model_cls = AutoModel + else: + model_cls = AutoModelForCausalLM + + # Meta init: allocates no GPU memory. FSDP2's ``fully_shard`` does NOT + # auto-materialize meta tensors; the caller (see ``vlm_model._init_vlm``) + # must explicitly materialize via ``_apply(empty_like, ...)`` between + # ``parallelize()`` and ``load_weights()``. 
+ with init_on_device("meta", include_buffers=False): + self._model = model_cls.from_config( + hf_config, + attn_implementation=attn_implementation, + torch_dtype=dtype, + trust_remote_code=trust_remote_code, + ) + log.info(f"HFModel: {hf_config.model_type} ({'VLM' if is_vlm else 'LLM'}), dtype={dtype}") + + # Normalize floating-point *parameter* dtypes to ``dtype``. HF's + # ``from_config`` installs ``torch.set_default_dtype(dtype)`` around the + # init, but some HF submodules (and vendored remote-code classes) read + # ``config.torch_dtype`` directly or build tensors with an explicit + # ``dtype=`` kwarg, so their params can end up in the checkpoint's dtype + # (typically bf16) while the rest of the model is in ``dtype``. FSDP2's + # ``_init_mp_dtypes`` then asserts "uniform original parameter dtype … + # {bf16, fp32}". Normalize on meta (no GPU memory) so all FSDP units see + # a single original dtype. Buffers are left alone — ``inv_freq`` etc. + # must stay fp32 (enforced by e.g. qwen3_vl.py's inv_freq assertion). + n_cast = 0 + with torch.no_grad(): + for p in self._model.parameters(recurse=True): + if p.is_floating_point() and p.dtype != dtype: + p.data = p.data.to(dtype) + n_cast += 1 + if n_cast: + log.info(f"HFModel: normalized {n_cast} param(s) to {dtype} post-from_config") + + # Patch Qwen3-VL forward for text-only batches (no pixel_values / image_grid_thw). + # Required to avoid errors when a batch contains only text: every FSDP rank must + # call visual() each step for all-gather sync; the patch runs a lightweight dummy + # image and slices the output to [0:0] so it contributes no features. + # Must happen BEFORE parallelize() so FSDP captures the patched forward. 
+ if hf_config.model_type == "qwen3_vl" and hasattr(self._model, "model"): + from cosmos3._src.vfm.utils.monkey_patch import patch_qwen3_vl_forward + + patch_qwen3_vl_forward(self._model.model) + log.info("HFModel: applied patch_qwen3_vl_forward for text-only batch support") + + def apply_gradient_checkpointing(self) -> None: + """Enable gradient checkpointing via HF's standard API.""" + self._model.gradient_checkpointing_enable(gradient_checkpointing_kwargs={"use_reentrant": False}) + log.info("HFModel: gradient checkpointing enabled") + + def tie_embeddings(self) -> None: + """Tie output embedding weight to input embedding, matching post_to_empty_hook behavior. + + Must be called AFTER ``parallelize()`` and AFTER the explicit + meta-tensor materialization step (FSDP2 does not auto-materialize — + see ``vlm_model._init_vlm`` step e), and BEFORE ``load_weights()`` so + the tied pointer survives weight loading. + + Two strategies, matching the HF API split: + 1. ``get_output_embeddings()`` path — standard for most HF models. + 2. ``_tied_weights_keys`` fallback — some VLMs (notably + ``Qwen3VLForConditionalGeneration``) define ``lm_head`` and + ``_tied_weights_keys = ["lm_head.weight"]`` but do NOT override + ``get_output_embeddings``. For those, walk the dotted key to the + owning module and assign its ``.weight`` directly. See spec §8.3. + + Reference: the legacy VLM HF-model tie_embeddings implementation. + """ + if not getattr(self.hf_config, "tie_word_embeddings", False): + return + input_embeddings = self._model.get_input_embeddings() + if input_embeddings is None: + return + output_embeddings = self._model.get_output_embeddings() + if output_embeddings is not None: + output_embeddings.weight = input_embeddings.weight + log.info("HFModel: tied input/output embeddings via get_output_embeddings") + return + # Fallback: HF models that use _tied_weights_keys instead of + # overriding get_output_embeddings (e.g. 
Qwen3VLForConditionalGeneration + # defines _tied_weights_keys = ["lm_head.weight"] but returns None + # from the default get_output_embeddings). Walk the dotted key to + # the owning module and assign the Parameter directly. + tied_keys = getattr(self._model, "_tied_weights_keys", None) or () + if not tied_keys: + return + for key in tied_keys: + parts = key.split(".") + *mod_path, attr = parts + target = self._model + for name in mod_path: + target = getattr(target, name, None) + if target is None: + log.warning( + f"HFModel.tie_embeddings: could not resolve path {key!r} on " + f"{type(self._model).__name__}; skipping tie (weights will " + f"remain untied for this key)." + ) + break + else: + setattr(target, attr, input_embeddings.weight) + log.info(f"HFModel: tied {key} via _tied_weights_keys fallback") + + def load_weights( + self, + checkpoint_path: str, + credential_path: str | None = None, + parallel_dims: ParallelDims | None = None, + extra_skip_patterns: list[str] | None = None, + ) -> set[str]: + r"""Load weights from a HF model directory (safetensors format). + + Dispatches on model type: + - VLM (vision_config present): ``load_vlm_model`` (universal + suffix-lookup loader inherited from the legacy VLM path; MoE VLMs + explicitly blocked — see spec §2.2). + - LLM (no vision_config): ``load_language_model`` — handles VFM-specific + per-family key remapping for Qwen3 / Nemotron (unchanged from today). + + Must be called AFTER ``parallelize()`` so parameters are DTensors with + CUDA local views. For tied-embedding models, ``tie_embeddings()`` must + be called between ``parallelize()`` and this function. + + Args: + checkpoint_path: Path to a directory containing .safetensors files. + Local paths and S3 URIs are tried first; if no safetensors are + found, explicit ``hf://org/model`` Hub URIs and bare + ``org/model`` repo IDs fall back to Hugging Face. + credential_path: S3 credential file, or None for local/HF. 
+ parallel_dims: ``ParallelDims`` instance (from + ``cosmos3._src.vfm.utils.parallelism``). The loader uses + it via :func:`~cosmos3._src.vfm.models.utils.safetensors_loader._get_dp_shard_mesh` + to obtain the 1-D ``dp_shard`` sub-mesh (or None when + ``dp_shard <= 1``) for striping checkpoint reads across + FSDP shard ranks. When non-None, the caller MUST have + called ``parallel_dims.build_meshes()`` first — neither + this method nor ``load_vlm_model`` re-checks this. Pass + ``parallel_dims=None`` for the single-rank fallback used + by single-process / non-distributed runs. + extra_skip_patterns: Optional list of regex patterns appended to + ``tensor_names_to_skip`` inside ``load_vlm_model``. Use when + overlaying an LLM-only checkpoint onto a VLM model (e.g. swapping + the language tower while preserving visual + projector params) + — pass patterns like ``r"model\.visual\."`` so those keys are + skipped during the overlay. Only takes effect on the VLM + dispatch path; ignored when the model is a pure LLM (no + ``vision_config``). + + Returns: + Set of model state-dict keys that were loaded from the checkpoint. + """ + is_vlm = getattr(self.hf_config, "vision_config", None) is not None + if is_vlm: + keys_loaded = load_vlm_model( + model=self._model, + checkpoint_path=checkpoint_path, + credential_path=credential_path, + parallel_dims=parallel_dims, + tensor_names_to_skip=_tensor_names_to_skip_for(self.hf_config.model_type), + extra_skip_patterns=extra_skip_patterns, + ) + else: + keys_loaded = load_language_model( + model=self._model, + checkpoint_path=checkpoint_path, + credential_path=credential_path if credential_path else "", + parallel_dims=parallel_dims, + ) + log.info(f"HFModel: weights loaded from {checkpoint_path} ({len(keys_loaded)} keys)") + return keys_loaded + + # Keys added by the VLM collate_fn (vlm/datasets/collate_fn.py) that are NOT valid + # HF model forward arguments. These must be stripped before calling self._model.forward().
+ # A blocklist (not a whitelist) is used so that legitimate kwargs passed via the model's + # **kwargs — e.g. second_per_grid_ts for Qwen3-VL temporal encoding, output_router_logits + # for MoE load-balancing — are forwarded correctly even when not named in the signature. + _COLLATE_NON_MODEL_KEYS: frozenset[str] = frozenset( + { + "token_mask", + "pad_token_id", + "ignore_index", + "collated", + "raw_image", + "raw_video", + # image_sizes is collected by collate_fn but is NOT a Qwen3-VL forward arg + # (Qwen3-VL uses image_grid_thw instead). Strip it so strict HF signatures + # don't reject it. If a future phase extends support to models that + # accept image_sizes, remove this entry. + "image_sizes", + } + ) + + def forward(self, **kwargs) -> torch.Tensor: + """Pass-through forward. Returns logits (B, T, V). + + Strips collate-added non-model keys (see ``_COLLATE_NON_MODEL_KEYS``: + token_mask, pad_token_id, ignore_index, collated, raw_image, raw_video, + image_sizes) before forwarding. Forces use_cache=False for training. + All remaining keys (including ``**kwargs`` pass-throughs such as + second_per_grid_ts) are forwarded unchanged. + + For nemotron_vl: attention_mask is also dropped. NemotronVLModel.get_rope_index + strips padding positions when attention_mask is present, returning position_ids + shorter than inputs_embeds (padded_len). With right-padding + causal attention, + valid tokens never attend to padding tokens regardless, so dropping attention_mask + is equivalent and avoids the shape mismatch.
+ """ + filtered = {k: v for k, v in kwargs.items() if k not in self._COLLATE_NON_MODEL_KEYS} + if self.hf_config.model_type == "nemotron_vl": + filtered.pop("attention_mask", None) + filtered["use_cache"] = False + out = self._model(**filtered) + return out.logits diff --git a/cosmos-inference/cosmos3/_src/vfm/models/llm/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/llm/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/llm/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/configs/Qwen3-0.6B.json b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/configs/Qwen3-0.6B.json new file mode 100644 index 00000000..5050f2fd --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/configs/Qwen3-0.6B.json @@ -0,0 +1,30 @@ +{ + "architectures": [ + "Qwen3ForCausalLM" + ], + "attention_bias": false, + "attention_dropout": 0.0, + "bos_token_id": 151643, + "eos_token_id": 151645, + "head_dim": 128, + "hidden_act": "silu", + "hidden_size": 1024, + "initializer_range": 0.02, + "intermediate_size": 3072, + "max_position_embeddings": 40960, + "max_window_layers": 28, + "model_type": "qwen3", + "num_attention_heads": 16, + "num_hidden_layers": 28, + "num_key_value_heads": 8, + "rms_norm_eps": 1e-06, + "rope_scaling": null, + "rope_theta": 1000000, + "sliding_window": null, + "tie_word_embeddings": true, + "torch_dtype": "bfloat16", + "transformers_version": "4.51.0", + "use_cache": true, + "use_sliding_window": false, + "vocab_size": 151936 + } diff --git a/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/configs/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/configs/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/configs/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/configuration_qwen3.py b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/configuration_qwen3.py new file mode 100644 index 00000000..4a3bb1c0 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/configuration_qwen3.py @@ -0,0 +1,259 @@ +# Copyright 2024 The Qwen team, Alibaba Group and the HuggingFace Inc. team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# ----------------------------------------------------------------------------- +# Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. +# All rights reserved. +# +# This codebase constitutes NVIDIA proprietary technology and is strictly +# confidential. Any unauthorized reproduction, distribution, or disclosure +# of this code, in whole or in part, outside NVIDIA is strictly prohibited +# without prior written consent. 
+# +# For inquiries regarding the use of this code in other NVIDIA proprietary +# projects, please contact the Deep Imagination Research Team at +# dir@exchange.nvidia.com. +# ----------------------------------------------------------------------------- + +# This is adapted from src/transformers/models/qwen3/configuration_qwen3.py. +"""Qwen3 model configuration""" + +from typing import Optional + +from transformers.configuration_utils import PretrainedConfig +from transformers.modeling_rope_utils import rope_config_validation +from transformers.utils import logging + +logger = logging.get_logger(__name__) + + +# BAGEL-style: Add layer_type_validation function inline for compatibility +ALLOWED_LAYER_TYPES = ( + "full_attention", + "sliding_attention", + "chunked_attention", + "linear_attention", +) + + +def layer_type_validation(layer_types: list[str]) -> None: + """Check that each entry in `layer_types` is allowed.""" + if not all(layer_type in ALLOWED_LAYER_TYPES for layer_type in layer_types): + raise ValueError(f"The `layer_types` entries must be in {ALLOWED_LAYER_TYPES}") + + +class Qwen3Config(PretrainedConfig): + r""" + This is the configuration class to store the configuration of a [`Qwen3Model`]. It is used to instantiate a + Qwen3 model according to the specified arguments, defining the model architecture. Instantiating a configuration + with the defaults will yield a similar configuration to that of + Qwen3-8B [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B). + + Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the + documentation from [`PretrainedConfig`] for more information. + + + Args: + vocab_size (`int`, *optional*, defaults to 151936): + Vocabulary size of the Qwen3 model. Defines the number of different tokens that can be represented by the + `input_ids` passed when calling [`Qwen3Model`] + hidden_size (`int`, *optional*, defaults to 4096): + Dimension of the hidden representations.
+ intermediate_size (`int`, *optional*, defaults to 22016): + Dimension of the MLP representations. + num_hidden_layers (`int`, *optional*, defaults to 32): + Number of hidden layers in the Transformer encoder. + num_attention_heads (`int`, *optional*, defaults to 32): + Number of attention heads for each attention layer in the Transformer encoder. + num_key_value_heads (`int`, *optional*, defaults to 32): + This is the number of key_value heads that should be used to implement Grouped Query Attention. If + `num_key_value_heads=num_attention_heads`, the model will use Multi Head Attention (MHA), if + `num_key_value_heads=1` the model will use Multi Query Attention (MQA) otherwise GQA is used. When + converting a multi-head checkpoint to a GQA checkpoint, each group key and value head should be constructed + by meanpooling all the original heads within that group. For more details, check out [this + paper](https://huggingface.co/papers/2305.13245). If it is not specified, will default to `32`. + head_dim (`int`, *optional*, defaults to 128): + The attention head dimension. + hidden_act (`str` or `function`, *optional*, defaults to `"silu"`): + The non-linear activation function (function or string) in the decoder. + max_position_embeddings (`int`, *optional*, defaults to 32768): + The maximum sequence length that this model might ever be used with. + initializer_range (`float`, *optional*, defaults to 0.02): + The standard deviation of the truncated_normal_initializer for initializing all weight matrices. + rms_norm_eps (`float`, *optional*, defaults to 1e-06): + The epsilon used by the rms normalization layers. + use_cache (`bool`, *optional*, defaults to `True`): + Whether or not the model should return the last key/values attentions (not used by all models). Only + relevant if `config.is_decoder=True`. + tie_word_embeddings (`bool`, *optional*, defaults to `False`): + Whether the model's input and output word embeddings should be tied. 
+ rope_theta (`float`, *optional*, defaults to 10000.0): + The base period of the RoPE embeddings. + rope_scaling (`Dict`, *optional*): + Dictionary containing the scaling configuration for the RoPE embeddings. NOTE: if you apply new rope type + and you expect the model to work on longer `max_position_embeddings`, we recommend you to update this value + accordingly. + Expected contents: + `rope_type` (`str`): + The sub-variant of RoPE to use. Can be one of ['default', 'linear', 'dynamic', 'yarn', 'longrope', + 'llama3'], with 'default' being the original RoPE implementation. + `factor` (`float`, *optional*): + Used with all rope types except 'default'. The scaling factor to apply to the RoPE embeddings. In + most scaling types, a `factor` of x will enable the model to handle sequences of length x * + original maximum pre-trained length. + `original_max_position_embeddings` (`int`, *optional*): + Used with 'dynamic', 'longrope' and 'llama3'. The original max position embeddings used during + pretraining. + `attention_factor` (`float`, *optional*): + Used with 'yarn' and 'longrope'. The scaling factor to be applied on the attention + computation. If unspecified, it defaults to value recommended by the implementation, using the + `factor` field to infer the suggested value. + `beta_fast` (`float`, *optional*): + Only used with 'yarn'. Parameter to set the boundary for extrapolation (only) in the linear + ramp function. If unspecified, it defaults to 32. + `beta_slow` (`float`, *optional*): + Only used with 'yarn'. Parameter to set the boundary for interpolation (only) in the linear + ramp function. If unspecified, it defaults to 1. + `short_factor` (`list[float]`, *optional*): + Only used with 'longrope'. The scaling factor to be applied to short contexts (< + `original_max_position_embeddings`). 
Must be a list of numbers with the same length as the hidden + size divided by the number of attention heads divided by 2 + `long_factor` (`list[float]`, *optional*): + Only used with 'longrope'. The scaling factor to be applied to long contexts (> + `original_max_position_embeddings`). Must be a list of numbers with the same length as the hidden + size divided by the number of attention heads divided by 2 + `low_freq_factor` (`float`, *optional*): + Only used with 'llama3'. Scaling factor applied to low frequency components of the RoPE + `high_freq_factor` (`float`, *optional*): + Only used with 'llama3'. Scaling factor applied to high frequency components of the RoPE + attention_bias (`bool`, *optional*, defaults to `False`): + Whether to use a bias in the query, key, value and output projection layers during self-attention. + use_sliding_window (`bool`, *optional*, defaults to `False`): + Whether to use sliding window attention. + sliding_window (`int`, *optional*, defaults to 4096): + Sliding window attention (SWA) window size. If not specified, will default to `4096`. + max_window_layers (`int`, *optional*, defaults to 28): + The number of layers using full attention. The first `max_window_layers` layers will use full attention, while any + additional layer afterwards will use SWA (Sliding Window Attention). + layer_types (`list`, *optional*): + Attention pattern for each layer. + attention_dropout (`float`, *optional*, defaults to 0.0): + The dropout ratio for the attention probabilities.
+ + ```python + >>> from transformers import Qwen3Model, Qwen3Config + + >>> # Initializing a Qwen3 style configuration + >>> configuration = Qwen3Config() + + >>> # Initializing a model from the Qwen3-8B style configuration + >>> model = Qwen3Model(configuration) + + >>> # Accessing the model configuration + >>> configuration = model.config + ```""" + + model_type = "qwen3" + keys_to_ignore_at_inference = ["past_key_values"] # noqa: RUF012 + + # Default tensor parallel plan for base model `Qwen3` + base_model_tp_plan = { # noqa: RUF012 + "layers.*.self_attn.q_proj": "colwise", + "layers.*.self_attn.k_proj": "colwise", + "layers.*.self_attn.v_proj": "colwise", + "layers.*.self_attn.o_proj": "rowwise", + "layers.*.mlp.gate_proj": "colwise", + "layers.*.mlp.up_proj": "colwise", + "layers.*.mlp.down_proj": "rowwise", + } + base_model_pp_plan = { # noqa: RUF012 + "embed_tokens": (["input_ids"], ["inputs_embeds"]), + "layers": (["hidden_states", "attention_mask"], ["hidden_states"]), + "norm": (["hidden_states"], ["hidden_states"]), + } + + def __init__( + self, + vocab_size: Optional[int] = 151936, + hidden_size: Optional[int] = 4096, + intermediate_size: Optional[int] = 22016, + num_hidden_layers: Optional[int] = 32, + num_attention_heads: Optional[int] = 32, + num_key_value_heads: Optional[int] = 32, + head_dim: Optional[int] = 128, + hidden_act: Optional[str] = "silu", + max_position_embeddings: Optional[int] = 32768, + initializer_range: Optional[float] = 0.02, + rms_norm_eps: Optional[float] = 1e-6, + use_cache: Optional[bool] = True, + tie_word_embeddings: Optional[bool] = False, + rope_theta: Optional[float] = 10000.0, + rope_scaling: Optional[dict] = None, + attention_bias: Optional[bool] = False, + use_sliding_window: Optional[bool] = False, + sliding_window: Optional[int] = 4096, + max_window_layers: Optional[int] = 28, + layer_types: Optional[list] = None, + attention_dropout: Optional[float] = 0.0, + **kwargs, + ) -> None: + self.vocab_size = vocab_size + 
self.max_position_embeddings = max_position_embeddings + self.hidden_size = hidden_size + self.intermediate_size = intermediate_size + self.num_hidden_layers = num_hidden_layers + self.num_attention_heads = num_attention_heads + self.use_sliding_window = use_sliding_window + self.sliding_window = sliding_window if self.use_sliding_window else None + self.max_window_layers = max_window_layers + + # for backward compatibility + if num_key_value_heads is None: + num_key_value_heads = num_attention_heads + + self.num_key_value_heads = num_key_value_heads + self.head_dim = head_dim + self.hidden_act = hidden_act + self.initializer_range = initializer_range + self.rms_norm_eps = rms_norm_eps + self.use_cache = use_cache + self.rope_theta = rope_theta + self.rope_scaling = rope_scaling + self.attention_bias = attention_bias + self.attention_dropout = attention_dropout + # Validate the correctness of rotary position embeddings parameters + # BC: if there is a 'type' field, move it to 'rope_type'. + if self.rope_scaling is not None and "type" in self.rope_scaling: + self.rope_scaling["rope_type"] = self.rope_scaling["type"] + rope_config_validation(self) + + self.layer_types = layer_types + if self.layer_types is None: + self.layer_types = [ + ( + "sliding_attention" + if self.sliding_window is not None and i >= self.max_window_layers + else "full_attention" + ) + for i in range(self.num_hidden_layers) + ] + layer_type_validation(self.layer_types) + + super().__init__( + tie_word_embeddings=tie_word_embeddings, + **kwargs, + ) + + +__all__ = ["Qwen3Config"] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/qwen3.py b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/qwen3.py new file mode 100644 index 00000000..7b22dd05 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/qwen3.py @@ -0,0 +1,751 @@ +# Copyright 2025 The Qwen team, Alibaba Group and the HuggingFace Inc. team. All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# ----------------------------------------------------------------------------- +# Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. +# All rights reserved. +# +# This codebase constitutes NVIDIA proprietary technology and is strictly +# confidential. Any unauthorized reproduction, distribution, or disclosure +# of this code, in whole or in part, outside NVIDIA is strictly prohibited +# without prior written consent. +# +# For inquiries regarding the use of this code in other NVIDIA proprietary +# projects, please contact the Deep Imagination Research Team at +# dir@exchange.nvidia.com. +# ----------------------------------------------------------------------------- + +# This is adapted from src/transformers/models/qwen3/modeling_qwen3.py. 
+"""PyTorch Qwen3 model.""" + +from functools import wraps +from typing import Any, Callable, Optional, Union + +import torch +from torch import nn +from transformers.activations import ACT2FN +from transformers.cache_utils import Cache, DynamicCache +from transformers.generation import GenerationMixin +from transformers.modeling_flash_attention_utils import FlashAttentionKwargs +from transformers.modeling_outputs import BaseModelOutputWithPast, CausalLMOutputWithPast +from transformers.modeling_rope_utils import ROPE_INIT_FUNCTIONS, dynamic_rope_update +from transformers.modeling_utils import ALL_ATTENTION_FUNCTIONS, PreTrainedModel +from transformers.processing_utils import Unpack +from transformers.utils import logging +from transformers.utils.deprecation import deprecate_kwarg + +from cosmos3._src.vfm.models.llm.qwen3.configuration_qwen3 import Qwen3Config + +TransformersKwargs = Any + +# Import masking functions from utils for full transformers compatibility +from cosmos3._src.vfm.models.vlm.qwen3_vl.utils import ( + create_causal_mask, + create_sliding_window_causal_mask, +) + +logger = logging.get_logger(__name__) + + +def can_return_tuple(func): # noqa: ANN001, ANN202 + """ + Decorator to wrap a model method, to call output.to_tuple() if return_dict=False is passed as a kwarg or + use_return_dict=False is set in the config. + + Note: + output.to_tuple() converts the output to a tuple, skipping all `None` values. + """ + + @wraps(func) + def wrapper(self, *args, **kwargs): # noqa: ANN001, ANN202 + return_dict = self.config.return_dict if hasattr(self, "config") else True + return_dict_passed = kwargs.pop("return_dict", return_dict) + if return_dict_passed is not None: + return_dict = return_dict_passed + output = func(self, *args, **kwargs) + if not return_dict and not isinstance(output, tuple): + output = output.to_tuple() + return output + + return wrapper + + +# Documentation strings +QWEN3_START_DOCSTRING = r""" + This model inherits from [`PreTrainedModel`].
Check the superclass documentation for the generic methods the + library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads + etc.) + + This model is also a PyTorch [torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module) subclass. + Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage + and behavior. + + Parameters: + config ([`Qwen3Config`]): + Model configuration class with all the parameters of the model. Initializing with a config file does not + load the weights associated with the model, only the configuration. Check out the + [`~PreTrainedModel.from_pretrained`] method to load the model weights. +""" + +QWEN3_INPUTS_DOCSTRING = r""" + Args: + input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`): + Indices of input sequence tokens in the vocabulary. Padding will be ignored by default should you provide + it. + + Indices can be obtained using [`AutoTokenizer`]. See [`PreTrainedTokenizer.encode`] and + [`PreTrainedTokenizer.__call__`] for details. + + [What are input IDs?](../glossary#input-ids) + attention_mask (`torch.Tensor` of shape `(batch_size, sequence_length)`, *optional*): + Mask to avoid performing attention on padding token indices. Mask values selected in `[0, 1]`: + + - 1 for tokens that are **not masked**, + - 0 for tokens that are **masked**. + + [What are attention masks?](../glossary#attention-mask) + position_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*): + Indices of positions of each input sequence token in the position embeddings. Selected in the range `[0, + config.max_position_embeddings - 1]`.
+ + [What are position IDs?](../glossary#position-ids) + past_key_values (`Cache` or `tuple(tuple(torch.FloatTensor))`, *optional*): + Pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention + blocks) that can be used to speed up sequential decoding. This typically consists in the `past_key_values` + returned by the model at a previous stage of decoding, when `use_cache=True` or `config.use_cache=True`. + + Two formats are allowed: + - a [`~cache_utils.Cache`] instance, see our + [kv cache guide](https://huggingface.co/docs/transformers/en/kv_cache); + - Tuple of `tuple(torch.FloatTensor)` of length `config.n_layers`, with each tuple having 2 tensors of + shape `(batch_size, num_heads, sequence_length, embed_size_per_head)`). This is also known as the legacy + cache format. + + The model will output the same cache format that is fed as input. If no `past_key_values` are passed, the + legacy cache format will be returned. + + If `past_key_values` are used, the user can optionally input only the last `input_ids` (those that don't + have their past key value states given to this model) of shape `(batch_size, 1)` instead of all `input_ids` + of shape `(batch_size, sequence_length)`. + inputs_embeds (`torch.FloatTensor` of shape `(batch_size, sequence_length, hidden_size)`, *optional*): + Optionally, instead of passing `input_ids` you can choose to directly pass an embedded representation. This + is useful if you want more control over how to convert `input_ids` indices into associated vectors than the + model's internal embedding lookup matrix. + use_cache (`bool`, *optional*): + If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see + `past_key_values`). + output_attentions (`bool`, *optional*): + Whether or not to return the attentions tensors of all attention layers. See `attentions` under returned + tensors for more detail. 
+ output_hidden_states (`bool`, *optional*): + Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors for + more detail. + return_dict (`bool`, *optional*): + Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple. + cache_position (`torch.LongTensor` of shape `(sequence_length)`, *optional*): + Indices depicting the position of the input sequence tokens in the sequence. Contrarily to `position_ids`, + this tensor is not affected by padding. It is used to update the cache in the correct position and to infer + the complete sequence length. +""" + + +class Qwen3RMSNorm(nn.Module): + def __init__(self, hidden_size: int, eps: float = 1e-6) -> None: + """ + Qwen3RMSNorm is equivalent to T5LayerNorm + """ + super().__init__() + self.weight = nn.Parameter(torch.ones(hidden_size)) + self.variance_epsilon = eps + + def forward(self, hidden_states: torch.Tensor) -> torch.Tensor: + input_dtype = hidden_states.dtype + hidden_states = hidden_states.to(torch.float32) + variance = hidden_states.pow(2).mean(-1, keepdim=True) + hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon) + return self.weight * hidden_states.to(input_dtype) + + def extra_repr(self) -> str: + return f"{tuple(self.weight.shape)}, eps={self.variance_epsilon}" + + +class Qwen3MLP(nn.Module): + def __init__(self, config: Qwen3Config) -> None: + super().__init__() + self.config = config + self.hidden_size = config.hidden_size + self.intermediate_size = config.intermediate_size + self.gate_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False) + self.up_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False) + self.down_proj = nn.Linear(self.intermediate_size, self.hidden_size, bias=False) + self.act_fn = ACT2FN[config.hidden_act] + + def forward(self, x: torch.Tensor) -> torch.Tensor: # x: [B,N,hidden_size] + down_proj = self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x)) # 
[B,N,hidden_size] + return down_proj + + +def rotate_half(x: torch.Tensor) -> torch.Tensor: # x: [...,head_dim] + """Rotates half the hidden dims of the input.""" + x1 = x[..., : x.shape[-1] // 2] # [...,head_dim//2] + x2 = x[..., x.shape[-1] // 2 :] # [...,head_dim//2] + return torch.cat((-x2, x1), dim=-1) # [...,head_dim] + + +def apply_rotary_pos_emb( + q: torch.Tensor, # [B,num_heads,N,head_dim] + k: torch.Tensor, # [B,num_kv_heads,N,head_dim] + cos: torch.Tensor, # [B,N,head_dim] + sin: torch.Tensor, # [B,N,head_dim] + position_ids: Optional[torch.Tensor] = None, + unsqueeze_dim: int = 1, +) -> tuple[torch.Tensor, torch.Tensor]: # ([B,num_heads,N,head_dim], [B,num_kv_heads,N,head_dim]) + """Applies Rotary Position Embedding to the query and key tensors. + + Args: + q (`torch.Tensor`): The query tensor. + k (`torch.Tensor`): The key tensor. + cos (`torch.Tensor`): The cosine part of the rotary embedding. + sin (`torch.Tensor`): The sine part of the rotary embedding. + position_ids (`torch.Tensor`, *optional*): + Deprecated and unused. + unsqueeze_dim (`int`, *optional*, defaults to 1): + The 'unsqueeze_dim' argument specifies the dimension along which to unsqueeze cos[position_ids] and + sin[position_ids] so that they can be properly broadcasted to the dimensions of q and k. For example, note + that cos[position_ids] and sin[position_ids] have the shape [batch_size, seq_len, head_dim]. Then, if q and + k have the shape [batch_size, heads, seq_len, head_dim], then setting unsqueeze_dim=1 makes + cos[position_ids] and sin[position_ids] broadcastable to the shapes of q and k. Similarly, if q and k have + the shape [batch_size, seq_len, heads, head_dim], then set unsqueeze_dim=2. + Returns: + `tuple(torch.Tensor)` comprising of the query and key tensors rotated using the Rotary Position Embedding. 
+ """ + cos = cos.unsqueeze(unsqueeze_dim) # [B,1,N,head_dim] + sin = sin.unsqueeze(unsqueeze_dim) # [B,1,N,head_dim] + q_embed = (q * cos) + (rotate_half(q) * sin) # [B,num_heads,N,head_dim] + k_embed = (k * cos) + (rotate_half(k) * sin) # [B,num_kv_heads,N,head_dim] + return q_embed, k_embed + + +def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor: # hidden_states: [B,num_kv_heads,N,head_dim] + """ + This is the equivalent of torch.repeat_interleave(x, dim=1, repeats=n_rep). The hidden states go from (batch, + num_key_value_heads, seqlen, head_dim) to (batch, num_attention_heads, seqlen, head_dim) + """ + batch, num_key_value_heads, slen, head_dim = hidden_states.shape + if n_rep == 1: + return hidden_states # [B,num_kv_heads,N,head_dim] + hidden_states = hidden_states[:, :, None, :, :].expand( + batch, num_key_value_heads, n_rep, slen, head_dim + ) # [B,num_kv_heads,n_rep,N,head_dim] + return hidden_states.reshape(batch, num_key_value_heads * n_rep, slen, head_dim) # [B,num_heads,N,head_dim] + + +def eager_attention_forward( + module: nn.Module, + query: torch.Tensor, # [B,num_heads,N_q,head_dim] + key: torch.Tensor, # [B,num_kv_heads,N_kv,head_dim] + value: torch.Tensor, # [B,num_kv_heads,N_kv,head_dim] + attention_mask: Optional[torch.Tensor], + scaling: float, + dropout: float = 0.0, + **kwargs: Unpack[TransformersKwargs], +) -> tuple[torch.Tensor, torch.Tensor]: # ([B,N_q,num_heads,head_dim], [B,num_heads,N_q,N_kv]) + key_states = repeat_kv(key, module.num_key_value_groups) # [B,num_heads,N_kv,head_dim] + value_states = repeat_kv(value, module.num_key_value_groups) # [B,num_heads,N_kv,head_dim] + + attn_weights = torch.matmul(query, key_states.transpose(2, 3)) * scaling # [B,num_heads,N_q,N_kv] + if attention_mask is not None: + causal_mask = attention_mask[:, :, :, : key_states.shape[-2]] # [B,1,N_q,N_kv] + attn_weights = attn_weights + causal_mask # [B,num_heads,N_q,N_kv] + + attn_weights = nn.functional.softmax(attn_weights, dim=-1, 
dtype=torch.float32).to( + query.dtype + ) # [B,num_heads,N_q,N_kv] + attn_weights = nn.functional.dropout(attn_weights, p=dropout, training=module.training) # [B,num_heads,N_q,N_kv] + attn_output = torch.matmul(attn_weights, value_states) # [B,num_heads,N_q,head_dim] + attn_output = attn_output.transpose(1, 2).contiguous() # [B,N_q,num_heads,head_dim] + + return attn_output, attn_weights + + +class Qwen3Attention(nn.Module): + """Multi-headed attention from 'Attention Is All You Need' paper""" + + def __init__(self, config: Qwen3Config, layer_idx: int): + super().__init__() + self.config = config + self.layer_idx = layer_idx + self.head_dim = getattr(config, "head_dim", config.hidden_size // config.num_attention_heads) + self.num_key_value_groups = config.num_attention_heads // config.num_key_value_heads + self.scaling = self.head_dim**-0.5 + self.attention_dropout = config.attention_dropout + self.is_causal = True + + self.q_proj = nn.Linear( + config.hidden_size, config.num_attention_heads * self.head_dim, bias=config.attention_bias + ) + self.k_proj = nn.Linear( + config.hidden_size, config.num_key_value_heads * self.head_dim, bias=config.attention_bias + ) + self.v_proj = nn.Linear( + config.hidden_size, config.num_key_value_heads * self.head_dim, bias=config.attention_bias + ) + self.o_proj = nn.Linear( + config.num_attention_heads * self.head_dim, config.hidden_size, bias=config.attention_bias + ) + self.q_norm = Qwen3RMSNorm(self.head_dim, eps=config.rms_norm_eps) # unlike olmo, only on the head dim! 
+ self.k_norm = Qwen3RMSNorm(self.head_dim, eps=config.rms_norm_eps) # thus post q_norm does not need reshape + self.sliding_window = config.sliding_window if config.layer_types[layer_idx] == "sliding_attention" else None + + @deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58") + def forward( + self, + hidden_states: torch.Tensor, # [B,N,hidden_size] + position_embeddings: tuple[torch.Tensor, torch.Tensor], # ([B,N,head_dim], [B,N,head_dim]) + attention_mask: Optional[torch.Tensor], + past_key_values: Optional[Cache] = None, + cache_position: Optional[torch.LongTensor] = None, + **kwargs: Unpack[FlashAttentionKwargs], + ) -> tuple[torch.Tensor, Optional[torch.Tensor]]: # ([B,N,hidden_size], optional [B,num_heads,N,N_kv]) + input_shape = hidden_states.shape[:-1] + hidden_shape = (*input_shape, -1, self.head_dim) + + query_states = self.q_norm(self.q_proj(hidden_states).view(hidden_shape)).transpose( + 1, 2 + ) # [B,num_heads,N,head_dim] + key_states = self.k_norm(self.k_proj(hidden_states).view(hidden_shape)).transpose( + 1, 2 + ) # [B,num_kv_heads,N,head_dim] + value_states = self.v_proj(hidden_states).view(hidden_shape).transpose(1, 2) # [B,num_kv_heads,N,head_dim] + + cos, sin = position_embeddings + query_states, key_states = apply_rotary_pos_emb( + query_states, key_states, cos, sin + ) # [B,num_heads,N,head_dim], [B,num_kv_heads,N,head_dim] + + if past_key_values is not None: + # sin and cos are specific to RoPE models; cache_position needed for the static cache + cache_kwargs = {"sin": sin, "cos": cos, "cache_position": cache_position} + key_states, value_states = past_key_values.update( + key_states, value_states, self.layer_idx, cache_kwargs + ) # [B,num_kv_heads,N_kv,head_dim] + + attention_interface: Callable = eager_attention_forward + if self.config._attn_implementation != "eager": + attention_interface = ALL_ATTENTION_FUNCTIONS[self.config._attn_implementation] + + attn_output, attn_weights = attention_interface( + self, + 
query_states, + key_states, + value_states, + attention_mask, + dropout=0.0 if not self.training else self.attention_dropout, + scaling=self.scaling, + sliding_window=self.sliding_window, # diff with Llama + **kwargs, + ) + # attn_output: [B,N,num_heads,head_dim] + + attn_output = attn_output.reshape(*input_shape, -1).contiguous() # [B,N,hidden_size] + attn_output = self.o_proj(attn_output) # [B,N,hidden_size] + return attn_output, attn_weights + + +class Qwen3DecoderLayer(nn.Module): + def __init__(self, config: Qwen3Config, layer_idx: int): + super().__init__() + self.hidden_size = config.hidden_size + + self.self_attn = Qwen3Attention(config=config, layer_idx=layer_idx) + + self.mlp = Qwen3MLP(config) + self.input_layernorm = Qwen3RMSNorm(config.hidden_size, eps=config.rms_norm_eps) + self.post_attention_layernorm = Qwen3RMSNorm(config.hidden_size, eps=config.rms_norm_eps) + self.attention_type = config.layer_types[layer_idx] + + @deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58") + def forward( + self, + hidden_states: torch.Tensor, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + output_attentions: Optional[bool] = False, + use_cache: Optional[bool] = False, + cache_position: Optional[torch.LongTensor] = None, + position_embeddings: Optional[tuple[torch.Tensor, torch.Tensor]] = None, # necessary, but kept here for BC + **kwargs: Unpack[TransformersKwargs], + ) -> tuple[torch.Tensor, Optional[torch.Tensor]]: + residual = hidden_states # [B,N,hidden_size] + hidden_states = self.input_layernorm(hidden_states) # [B,N,hidden_size] + # Self Attention + hidden_states, self_attn_weights = self.self_attn( + hidden_states=hidden_states, + attention_mask=attention_mask, + position_ids=position_ids, + past_key_values=past_key_values, + use_cache=use_cache, + cache_position=cache_position, + position_embeddings=position_embeddings, + **kwargs, + ) + # 
hidden_states: [B,N,hidden_size] + hidden_states = residual + hidden_states # [B,N,hidden_size] + + # Fully Connected + residual = hidden_states # [B,N,hidden_size] + hidden_states = self.post_attention_layernorm(hidden_states) # [B,N,hidden_size] + hidden_states = self.mlp(hidden_states) # [B,N,hidden_size] + hidden_states = residual + hidden_states # [B,N,hidden_size] + + outputs = (hidden_states,) + + if output_attentions: + outputs += (self_attn_weights,) + + return outputs + + +class Qwen3PreTrainedModel(PreTrainedModel): + config_class = Qwen3Config + config: Qwen3Config + base_model_prefix = "model" + supports_gradient_checkpointing = True + _no_split_modules = ["Qwen3DecoderLayer"] # noqa: RUF012 + _skip_keys_device_placement = ["past_key_values"] # noqa: RUF012 + _supports_flash_attn = True + _supports_sdpa = True + _supports_flex_attn = True + + _can_compile_fullgraph = True + _supports_attention_backend = True + _can_record_outputs = { # noqa: RUF012 + "hidden_states": Qwen3DecoderLayer, + "attentions": Qwen3Attention, + } + + def _init_weights(self, module: nn.Module) -> None: + std = self.config.initializer_range + if isinstance(module, nn.Linear): + module.weight.data.normal_(mean=0.0, std=std) + if module.bias is not None: + module.bias.data.zero_() + elif isinstance(module, nn.Embedding): + module.weight.data.normal_(mean=0.0, std=std) + if module.padding_idx is not None: + module.weight.data[module.padding_idx].zero_() + elif isinstance(module, Qwen3RotaryEmbedding): + module._init_weights() + + def init_weights(self) -> None: + self.apply(self._init_weights) + + +class Qwen3RotaryEmbedding(nn.Module): + inv_freq: torch.Tensor # fix linting for `register_buffer` + + def __init__(self, config: Qwen3Config, device: Optional[torch.device] = None) -> None: + super().__init__() + # BC: "rope_type" was originally "type" + if hasattr(config, "rope_scaling") and isinstance(config.rope_scaling, dict): + self.rope_type = config.rope_scaling.get("rope_type", 
config.rope_scaling.get("type")) + else: + self.rope_type = "default" + self.max_seq_len_cached = config.max_position_embeddings + self.original_max_seq_len = config.max_position_embeddings + + self.config = config + self.rope_init_fn = ROPE_INIT_FUNCTIONS[self.rope_type] + self.device = device + self._init_weights() + + def _init_weights(self) -> None: + inv_freq, self.attention_scaling = self.rope_init_fn(self.config, self.device) + self.register_buffer("inv_freq", inv_freq, persistent=False) + self.original_inv_freq = self.inv_freq + + @torch.no_grad() + @dynamic_rope_update # power user: used with advanced RoPE types (e.g. dynamic rope) + def forward( + self, x: torch.Tensor, position_ids: torch.Tensor + ) -> tuple[torch.Tensor, torch.Tensor]: # position_ids: [B,N] -> ([B,N,head_dim], [B,N,head_dim]) + inv_freq_expanded = ( + self.inv_freq[None, :, None].float().expand(position_ids.shape[0], -1, 1).to(x.device) + ) # [B,head_dim//2,1] + position_ids_expanded = position_ids[:, None, :].float() # [B,1,N] + + device_type = x.device.type if isinstance(x.device.type, str) and x.device.type != "mps" else "cpu" + with torch.autocast(device_type=device_type, enabled=False): # Force float32 + freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(1, 2) # [B,N,head_dim//2] + emb = torch.cat((freqs, freqs), dim=-1) # [B,N,head_dim] + cos = emb.cos() * self.attention_scaling # [B,N,head_dim] + sin = emb.sin() * self.attention_scaling # [B,N,head_dim] + + return cos.to(dtype=x.dtype), sin.to(dtype=x.dtype) # [B,N,head_dim], [B,N,head_dim] + + +class Qwen3Model(Qwen3PreTrainedModel): + def __init__(self, config: Qwen3Config) -> None: + super().__init__(config) + self.padding_idx = config.pad_token_id + self.vocab_size = config.vocab_size + + self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx) + self.layers = nn.ModuleList( + [Qwen3DecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)] + ) 
+ self.norm = Qwen3RMSNorm(config.hidden_size, eps=config.rms_norm_eps) + self.rotary_emb = Qwen3RotaryEmbedding(config=config) + self.gradient_checkpointing = False + self.has_sliding_layers = "sliding_attention" in self.config.layer_types + + # Initialize weights and apply final processing + self.post_init() + + def forward( + self, + input_ids: Optional[torch.LongTensor] = None, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + inputs_embeds: Optional[torch.FloatTensor] = None, + use_cache: Optional[bool] = None, + output_attentions: Optional[bool] = None, + output_hidden_states: Optional[bool] = None, + return_dict: Optional[bool] = None, + cache_position: Optional[torch.LongTensor] = None, + **kwargs: Unpack[TransformersKwargs], + ) -> BaseModelOutputWithPast: + output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions + output_hidden_states = ( + output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states + ) + use_cache = use_cache if use_cache is not None else self.config.use_cache + return_dict = return_dict if return_dict is not None else self.config.use_return_dict + + if (input_ids is None) ^ (inputs_embeds is not None): + raise ValueError("You must specify exactly one of input_ids or inputs_embeds") + + if inputs_embeds is None: + inputs_embeds = self.embed_tokens(input_ids) # [B,N,hidden_size] + + if use_cache and past_key_values is None: + # Compatibility: DynamicCache constructor changed between transformers versions + try: + past_key_values = DynamicCache(config=self.config) + except TypeError: + # Fallback for older transformers versions (4.51.3) that don't accept config + past_key_values = DynamicCache() + + if cache_position is None: + past_seen_tokens = past_key_values.get_seq_length() if past_key_values is not None else 0 + cache_position = torch.arange( + 
past_seen_tokens, past_seen_tokens + inputs_embeds.shape[1], device=inputs_embeds.device + ) + + if position_ids is None: + position_ids = cache_position.unsqueeze(0) + + # It may already have been prepared by e.g. `generate` + if not isinstance(causal_mask_mapping := attention_mask, dict): + # Prepare mask arguments + mask_kwargs = { + "config": self.config, + "input_embeds": inputs_embeds, + "attention_mask": attention_mask, + "cache_position": cache_position, + "past_key_values": past_key_values, + "position_ids": position_ids, + } + # Create the masks using our minimal implementations + causal_mask_mapping = { + "full_attention": create_causal_mask(**mask_kwargs), + } + # The sliding window alternating layers are not always activated depending on the config + if self.has_sliding_layers: + causal_mask_mapping["sliding_attention"] = create_sliding_window_causal_mask(**mask_kwargs) + + hidden_states = inputs_embeds # [B,N,hidden_size] + + # create position embeddings to be shared across the decoder layers + position_embeddings = self.rotary_emb(hidden_states, position_ids) # ([B,N,head_dim], [B,N,head_dim]) + + # decoder layers + all_hidden_states = () if output_hidden_states else None + all_self_attns = () if output_attentions else None + + for decoder_layer in self.layers[: self.config.num_hidden_layers]: + if output_hidden_states: + all_hidden_states += (hidden_states,) + + layer_outputs = decoder_layer( + hidden_states, + attention_mask=causal_mask_mapping[decoder_layer.attention_type], + position_ids=position_ids, + past_key_values=past_key_values, + output_attentions=output_attentions, + use_cache=use_cache, + cache_position=cache_position, + position_embeddings=position_embeddings, + **kwargs, + ) + + hidden_states = layer_outputs[0] + + if output_attentions: + all_self_attns += (layer_outputs[1],) + + hidden_states = self.norm(hidden_states) # [B,N,hidden_size] + + # add hidden states from the last decoder layer + if output_hidden_states: + 
all_hidden_states += (hidden_states,) + + if not return_dict: + return tuple( + v + for v in [hidden_states, past_key_values if use_cache else None, all_hidden_states, all_self_attns] + if v is not None + ) + + return BaseModelOutputWithPast( + last_hidden_state=hidden_states, + past_key_values=past_key_values if use_cache else None, + hidden_states=all_hidden_states, + attentions=all_self_attns, + ) + + +class Qwen3ForCausalLM(Qwen3PreTrainedModel, GenerationMixin): + _tied_weights_keys = ["lm_head.weight"] # noqa: RUF012 + _tp_plan = {"lm_head": "colwise_rep"} # noqa: RUF012 + _pp_plan = {"lm_head": (["hidden_states"], ["logits"])} # noqa: RUF012 + + def __init__(self, config: Qwen3Config) -> None: + super().__init__(config) + self.model = Qwen3Model(config) + self.vocab_size = config.vocab_size + self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False) + + # Initialize weights and apply final processing + self.post_init() + + def tie_weights(self) -> None: + """ + Tie the weights between the input embeddings and the output embeddings. + + Since lm_head.weight is in _tied_weights_keys, we always tie the weights + to ensure compatibility with checkpoints that have tied embeddings. 
+ """ + # Always tie weights since lm_head.weight is in _tied_weights_keys + # This ensures compatibility with checkpoints that don't have separate lm_head.weight + self._tie_or_clone_weights(self.lm_head, self.model.embed_tokens) + + @can_return_tuple + def forward( + self, + input_ids: Optional[torch.LongTensor] = None, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + inputs_embeds: Optional[torch.FloatTensor] = None, + labels: Optional[torch.LongTensor] = None, + use_cache: Optional[bool] = None, + output_attentions: Optional[bool] = None, + output_hidden_states: Optional[bool] = None, + return_dict: Optional[bool] = None, + cache_position: Optional[torch.LongTensor] = None, + logits_to_keep: Union[int, torch.Tensor] = 0, + **kwargs: Unpack[TransformersKwargs], + ) -> CausalLMOutputWithPast: + r""" + labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*): + Labels for computing the masked language modeling loss. Indices should either be in `[0, ..., + config.vocab_size]` or -100 (see `input_ids` docstring). Tokens with indices set to `-100` are ignored + (masked), the loss is only computed for the tokens with labels in `[0, ..., config.vocab_size]`. + + Example: + + ```python + >>> from transformers import AutoTokenizer, Qwen3ForCausalLM + + >>> model = Qwen3ForCausalLM.from_pretrained("Qwen/Qwen3-8B") + >>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B") + + >>> prompt = "Hey, are you conscious? Can you talk to me?" + >>> inputs = tokenizer(prompt, return_tensors="pt") + + >>> # Generate + >>> generate_ids = model.generate(inputs.input_ids, max_length=30) + >>> tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0] + "Hey, are you conscious? Can you talk to me?\nI'm not conscious, but I can talk to you." 
+ ```""" + output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions + output_hidden_states = ( + output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states + ) + return_dict = return_dict if return_dict is not None else self.config.use_return_dict + + outputs: BaseModelOutputWithPast = self.model( + input_ids=input_ids, + attention_mask=attention_mask, + position_ids=position_ids, + past_key_values=past_key_values, + inputs_embeds=inputs_embeds, + use_cache=use_cache, + output_attentions=output_attentions, + output_hidden_states=output_hidden_states, + return_dict=True, # always get a ModelOutput; tuple conversion is handled below + cache_position=cache_position, + **kwargs, + ) + + hidden_states = outputs.last_hidden_state # [B,N,hidden_size] + # Only compute necessary logits, and do not upcast them to float if we are not computing the loss + slice_indices = slice(-logits_to_keep, None) if isinstance(logits_to_keep, int) else logits_to_keep + logits = self.lm_head(hidden_states[:, slice_indices, :]) # [B,N_keep,vocab_size] + + loss = None + if labels is not None: + loss = self.loss_function(logits=logits, labels=labels, vocab_size=self.config.vocab_size, **kwargs) + + if not return_dict: + output = (logits, *outputs[1:]) + return (loss, *output) if loss is not None else output + + return CausalLMOutputWithPast( + loss=loss, + logits=logits, + past_key_values=outputs.past_key_values, + hidden_states=outputs.hidden_states, + attentions=outputs.attentions, + ) + + +__all__ = [ + # Documentation constants + "QWEN3_INPUTS_DOCSTRING", + "QWEN3_START_DOCSTRING", + "Qwen3ForCausalLM", + "Qwen3Model", + "Qwen3PreTrainedModel", + "Qwen3RMSNorm", + "Qwen3RotaryEmbedding", + # Utility functions + "apply_rotary_pos_emb", + "eager_attention_forward", + "repeat_kv", + "rotate_half", +] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/test_qwen3.py b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/test_qwen3.py new file mode 100644
index 00000000..adef8600 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/llm/qwen3/test_qwen3.py @@ -0,0 +1,976 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Comprehensive test script for Qwen3 LLM with output control integration. + +This script runs all tests: +1. LLM implementation compatibility checks +2. Memory-efficient collection behavior tests +3. Return format control verification +4. Input/output functionality tests +5. HuggingFace model comparison tests +6. 
Pretrained weights tests + +Usage (run from imaginaire4 directory): + pytest -v cosmos3/_src/vfm/models/llm/qwen3/test_qwen3.py --all -s + +Example - Using Qwen3 LLM Model Directly: + import torch + from cosmos3._src.vfm.models.llm.qwen3.qwen3 import Qwen3ForCausalLM + from cosmos3._src.vfm.models.llm.qwen3.configuration_qwen3 import Qwen3Config + from cosmos3._src.vfm.tokenizers.tokenization_qwen2 import Qwen2Tokenizer + + # Option 1: Load from HuggingFace Hub (original) + model_name = "Qwen/Qwen3-0.6B" + config = Qwen3Config.from_pretrained(model_name) + model = Qwen3ForCausalLM.from_pretrained(model_name, config=config, torch_dtype=torch.float32) + tokenizer = Qwen2Tokenizer.from_pretrained(model_name) + + # Option 2: Load from Local Config (like qwen2 pattern) + config = Qwen3Config.from_json_file( + "cosmos3/_src/vfm/models/llm/qwen3/configs/Qwen3-0.6B.json" + ) + model = Qwen3ForCausalLM(config=config) # Create with local config + tokenizer = Qwen2Tokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct") # Remote tokenizer + + # Prepare input + prompt = "Give me a short introduction to large language models."
+ inputs = tokenizer(prompt, return_tensors="pt") + + # Generate with MoT-style output controls + with torch.no_grad(): + outputs = model.generate( + **inputs, + max_new_tokens=50, + do_sample=False, + output_attentions=True, # LLM MoT-style control + output_hidden_states=True, # LLM MoT-style control + return_dict_in_generate=True + ) + + # Decode result + response = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True) + print(f"Generated: {response}") +""" + +import inspect +import os +import sys +import traceback + +import pytest +import torch +from transformers import AutoModelForCausalLM, AutoTokenizer + +# MoT/Qwen3 imports +from cosmos3._src.vfm.models.llm.qwen3.configuration_qwen3 import Qwen3Config, layer_type_validation +from cosmos3._src.vfm.models.llm.qwen3.qwen3 import Qwen3ForCausalLM, Qwen3Model +from cosmos3._src.vfm.tokenizers.tokenization_qwen2 import Qwen2Tokenizer + + +# GPU device detection +def get_device(): + """Get the best available device (GPU if available, otherwise CPU)""" + if torch.cuda.is_available(): + device = torch.device("cuda") + print(f"Using GPU: {torch.cuda.get_device_name()}") + print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB") + else: + device = torch.device("cpu") + print("Using CPU (CUDA not available)") + return device + + +# Validate script is run from the correct directory +# Should be run from imaginaire4 directory with: python -m cosmos3._src.vfm.models.llm.qwen3.test_qwen3 # noqa: E501 +current_working_dir = os.getcwd() # Should be imaginaire4 +language_model_dir = "cosmos3/_src/vfm/models/llm" + +# Validate we're running from the correct directory +if not os.path.exists(language_model_dir): + print("ERROR: This script must be run from the imaginaire4 directory.") + print(f"Current directory: {current_working_dir}") + print(f"Expected to find: {language_model_dir}") + print("Please run: cd /path/to/imaginaire4 && python -m 
cosmos3._src.vfm.models.llm.qwen3.test_qwen3") # noqa: E501 + sys.exit(1) + + +def load_llm_tokenizer(model_name): + """Load the Qwen2Tokenizer for the given model name""" + tokenizer = Qwen2Tokenizer.from_pretrained(model_name) + print(" [OK] Using Qwen2Tokenizer") + return tokenizer + + +def initialize_models_and_tokenizers(model_name, device, is_large_model=False): + """Initialize all models and tokenizers once for reuse across tests""" + print(f"\nINITIALIZING MODELS ({model_name})...") + print("=" * 60) + + # Load configuration + print(f"Loading configuration from {model_name}...") + config = Qwen3Config.from_pretrained(model_name) + print(f" Config: vocab_size={config.vocab_size}, hidden_size={config.hidden_size}") + + # Initialize LLM models with pretrained weights + print("Loading LLM models with pretrained weights...") + if not is_large_model: + llm_model = Qwen3Model.from_pretrained(model_name, config=config).to(device) + llm_model.eval() + else: + llm_model = None + + llm_causal_model = Qwen3ForCausalLM.from_pretrained(model_name, config=config).to(device) + llm_causal_model.eval() + print(f" [OK] LLM models loaded on {device}") + + # Initialize HuggingFace model (for comparison) + print("Loading HuggingFace model...") + hf_tokenizer = AutoTokenizer.from_pretrained(model_name) + hf_model = AutoModelForCausalLM.from_pretrained( + model_name, + torch_dtype=torch.float32, + device_map="auto" if device.type == "cuda" else None, + ).eval() + print(f" [OK] HuggingFace model loaded on {device.type}") + + # Initialize LLM tokenizer + print("Loading LLM tokenizer...") + llm_tokenizer = load_llm_tokenizer(model_name) + + # Memory usage info + if device.type == "cuda": + print(f" Memory: GPU memory allocated: {torch.cuda.memory_allocated(device) / 1024**3:.2f} GB") + print(f" Memory: GPU memory cached: {torch.cuda.memory_reserved(device) / 1024**3:.2f} GB") + + total_params = sum(p.numel() for p in llm_causal_model.parameters()) + print(f" Info: Total parameters: 
{total_params:,}") + print("=" * 60) + + models = { + "config": config, + "llm_model": llm_model, + "llm_causal_model": llm_causal_model, + "hf_model": hf_model, + "llm_tokenizer": llm_tokenizer, + "hf_tokenizer": hf_tokenizer, + "device": device, + } + + return models + + +def test_qwen3_local_config_loading(): + """Test loading Qwen3 config from local JSON file and creating model.""" + + print("=" * 80) + print("TESTING QWEN3 LOCAL CONFIG LOADING") + print("=" * 80) + + try: + # Load config from local JSON file + config_path = "cosmos3/_src/vfm/models/llm/qwen3/configs/Qwen3-0.6B.json" + config = Qwen3Config.from_json_file(config_path) + + # Verify config loaded correctly + assert config.model_type == "qwen3", f"Expected model_type 'qwen3', got '{config.model_type}'" + assert config.hidden_size == 1024, f"Expected hidden_size 1024, got {config.hidden_size}" + assert config.num_hidden_layers == 28, f"Expected 28 layers, got {config.num_hidden_layers}" + assert config.vocab_size == 151936, f"Expected vocab_size 151936, got {config.vocab_size}" + + print(" Config loaded and validated successfully!") + print(f" Model: {config.model_type}") + print(f" Hidden size: {config.hidden_size}") + print(f" Layers: {config.num_hidden_layers}") + print(f" Vocab size: {config.vocab_size}") + + # Test model creation + print("\n Creating models with local config...") + base_model = Qwen3Model(config=config) + causal_model = Qwen3ForCausalLM(config=config) + + assert base_model.config.hidden_size == 1024 + assert len(base_model.layers) == 28 + assert causal_model.config.hidden_size == 1024 + assert hasattr(causal_model, "lm_head") + + print(" Models created successfully with local config") + + # Test basic forward pass + print("\n Testing forward pass...") + batch_size, seq_len = 2, 10 + input_ids = torch.randint(0, config.vocab_size, (batch_size, seq_len)) + + with torch.no_grad(): + outputs = causal_model(input_ids) + logits = outputs.logits + + expected_shape = (batch_size, 
seq_len, config.vocab_size) + assert logits.shape == expected_shape, f"Expected shape {expected_shape}, got {logits.shape}" + + print(" Forward pass working with correct output dimensions") + print(" Local config loading test PASSED!") + + # Clean up + del base_model, causal_model, config + + return True + + except Exception as e: + print(f" Local config loading test FAILED: {e}") + import traceback + + traceback.print_exc() + return False + + +def cleanup_models(models): + """Clean up models and free GPU memory""" + if models["device"].type == "cuda": + print("Cleaning up GPU memory...") + del models["llm_model"] + del models["llm_causal_model"] + del models["hf_model"] + torch.cuda.empty_cache() + print(f"GPU memory after cleanup: {torch.cuda.memory_allocated(models['device']) / 1024**3:.2f} GB") + + +def llm_output_controls_check(models): + """Test MoT-style output control implementation in Qwen3""" + + print("=" * 80) + print("TESTING QWEN3 MoT-STYLE OUTPUT CONTROLS") + print("=" * 80) + + # Use pre-initialized models + config = models["config"] + model = models["llm_model"] + causal_model = models["llm_causal_model"] + device = models["device"] + + print(" Using pre-initialized Qwen3 components...") + print("[PASS] Import successful") + + # Override output defaults for testing + config.output_attentions = False + config.output_hidden_states = False + print(f"[PASS] Configuration ready (vocab_size={config.vocab_size}, hidden_size={config.hidden_size})") + + # Test 1: Forward method signatures + print("\nTEST 1: Forward Method Signatures") + print("-" * 50) + + # Check Qwen3Model signature + sig = inspect.signature(model.forward) + params = list(sig.parameters.keys()) + required_params = ["output_attentions", "output_hidden_states", "return_dict"] + + missing = [p for p in required_params if p not in params] + if not missing: + print("[PASS] Qwen3Model has all required output control parameters") + else: + print(f"[FAIL] Qwen3Model missing parameters: 
{missing}") + + # Check Qwen3ForCausalLM signature + sig_causal = inspect.signature(causal_model.forward) + params_causal = list(sig_causal.parameters.keys()) + + missing_causal = [p for p in required_params if p not in params_causal] + if not missing_causal: + print("[PASS] Qwen3ForCausalLM has all required output control parameters") + else: + print(f"[FAIL] Qwen3ForCausalLM missing parameters: {missing_causal}") + + # Test 2: Memory Efficiency + print("\nTEST 2: Memory Efficiency") + print("-" * 50) + + # Create dummy input (small sequence for testing) + dummy_input = torch.randint(0, min(config.vocab_size, 1000), (1, 8)).to(device) + + print("Testing with output_hidden_states=False, output_attentions=False...") + with torch.no_grad(): + outputs_minimal = model(dummy_input, output_hidden_states=False, output_attentions=False, return_dict=True) + + hidden_states_none = outputs_minimal.hidden_states is None + attentions_none = outputs_minimal.attentions is None + + print(f" hidden_states is None: {hidden_states_none}") + print(f" attentions is None: {attentions_none}") + + if hidden_states_none and attentions_none: + print("[PASS] Memory efficiency: Collections are None when not requested") + else: + print("[FAIL] Memory efficiency failed: Collections should be None") + + print("\nTesting with output_hidden_states=True, output_attentions=True...") + with torch.no_grad(): + outputs_full = model(dummy_input, output_hidden_states=True, output_attentions=True, return_dict=True) + + has_hidden_states = outputs_full.hidden_states is not None + has_attentions = outputs_full.attentions is not None + + print(f" hidden_states collected: {has_hidden_states}") + print(f" attentions collected: {has_attentions}") + + if has_hidden_states and has_attentions: + print(f" hidden_states length: {len(outputs_full.hidden_states)}") + print(f" attentions length: {len(outputs_full.attentions)}") + print("[PASS] Full collection: All intermediate outputs captured") + else: + print("[FAIL] 
Full collection failed: Missing requested outputs") + + # Test 3: Return Format Control + print("\nTEST 3: Return Format Control") + print("-" * 50) + + print("Testing return_dict=False (tuple format)...") + with torch.no_grad(): + tuple_outputs = model(dummy_input, output_hidden_states=True, output_attentions=True, return_dict=False) + + is_tuple = isinstance(tuple_outputs, tuple) + print(f" Returns tuple: {is_tuple}") + print(f" Tuple length: {len(tuple_outputs) if is_tuple else 'N/A'}") + + if is_tuple: + print("[PASS] Tuple format working correctly") + else: + print("[FAIL] Tuple format failed") + + print("\nTesting return_dict=True (dictionary format)...") + with torch.no_grad(): + dict_outputs = model(dummy_input, output_hidden_states=True, output_attentions=True, return_dict=True) + + has_last_hidden = hasattr(dict_outputs, "last_hidden_state") + has_hidden = hasattr(dict_outputs, "hidden_states") + has_attentions = hasattr(dict_outputs, "attentions") + + print(f" Has last_hidden_state: {has_last_hidden}") + print(f" Has hidden_states: {has_hidden}") + print(f" Has attentions: {has_attentions}") + + if has_last_hidden and has_hidden and has_attentions: + print("[PASS] Dictionary format working correctly") + else: + print("[FAIL] Dictionary format missing fields") + + # Test 4: CausalLM Integration + print("\nTEST 4: CausalLM Integration") + print("-" * 50) + + print("Testing Qwen3ForCausalLM output controls...") + with torch.no_grad(): + causal_outputs = causal_model(dummy_input, output_hidden_states=True, output_attentions=True, return_dict=True) + + has_logits = hasattr(causal_outputs, "logits") + has_hidden = causal_outputs.hidden_states is not None + has_attentions = causal_outputs.attentions is not None + + print(f" Has logits: {has_logits}") + print(f" Has hidden_states: {has_hidden}") + print(f" Has attentions: {has_attentions}") + + if has_logits and has_hidden and has_attentions: + print("[PASS] CausalLM output controls working correctly") + else: + 
print("[FAIL] CausalLM output controls failed") + + # Test 5: Configuration Defaults + print("\nTEST 5: Configuration Defaults") + print("-" * 50) + + # Test with config defaults + with torch.no_grad(): + # Config has output_attentions=False, output_hidden_states=False + default_outputs = model(dummy_input) # No explicit parameters + + hidden_default = default_outputs.hidden_states is None + attentions_default = default_outputs.attentions is None + + print(f" Default hidden_states is None: {hidden_default}") + print(f" Default attentions is None: {attentions_default}") + + if hidden_default and attentions_default: + print("[PASS] Configuration defaults respected") + else: + print("[FAIL] Configuration defaults not working") + + # Test 6: HuggingFace Comparison + print("\nTEST 6: HuggingFace Comparison") + print("-" * 50) + + print("Comparing our LLM implementation with official HuggingFace model...") + try: + comparison_passed = compare_with_huggingface_model(models) + + if comparison_passed: + print("[PASS] Our LLM vs HuggingFace comparison successful") + else: + print("[FAIL] Our LLM vs HuggingFace comparison had differences") + except Exception as e: + print(f"[FAIL] HuggingFace comparison failed: {e}") + comparison_passed = False + + # Final Summary + print("\n" + "=" * 80) + print("SUMMARY: Our LLM INTEGRATION TEST SUMMARY") + print("=" * 80) + print("[PASS] Forward method signatures complete") + print("[PASS] Memory-efficient collection implemented") + print("[PASS] Return format control working") + print("[PASS] CausalLM integration successful") + print("[PASS] Configuration defaults respected") + if comparison_passed: + print("[PASS] HuggingFace comparison successful") + else: + print("[FAIL] HuggingFace comparison failed") + + print("\n Qwen3 MoT-style output controls are working perfectly!") + print("Memory usage is optimized and user has full control over outputs.") + + return True + + +def check_llm_implementation(models): + """Check if our LLM-style 
implementation is working correctly""" + + print("\nTEST 1: CHECKING our LLM IMPLEMENTATION...") + print("-" * 50) + + # Use pre-initialized models + config = models["config"] + model = models["llm_model"] + device = models["device"] + + # First, check the actual transformers version and environment + print(f"Python: Python executable: {sys.executable}") + print(f"Python: Python version: {sys.version}") + + try: + import transformers + + print(f"Transformers version: {transformers.__version__}") + print(f"Transformers location: {transformers.__file__}") + except Exception as e: + print(f"[ERROR] Error importing transformers: {e}") + return ["transformers_import_error"] + + implementation_status = [] + print("\nCHECKING: CHECKING our LLM FIXES...") + + # Consolidated implementation check + try: + # Test layer_type_validation + layer_type_validation(["full_attention", "sliding_attention"]) + print("[OK] layer_type_validation function working") + implementation_status.append("layer_validation_ok") + + # Model already instantiated and loaded with pretrained weights + print(f"[OK] Model instantiation successful (vocab_size={config.vocab_size}) on {device}") + implementation_status.append("model_instantiation_ok") + + # Test forward pass with dummy input (smaller batch for memory efficiency) + dummy_input = torch.randint(0, min(config.vocab_size, 1000), (1, 8)).to(device) + with torch.no_grad(): + model(dummy_input) + + print("[OK] Forward pass with masking successful") + implementation_status.append("forward_pass_ok") + + # Custom masking functions are implicitly tested by forward pass + print("[OK] Custom masking functions working") + implementation_status.append("masking_functions_ok") + + except Exception as e: + print(f"[ERROR] LLM implementation check failed: {e}") + # Determine which specific check failed based on how far we got + if "layer_validation_ok" not in implementation_status: + implementation_status.append("layer_validation_failed") + if 
"model_instantiation_ok" not in implementation_status: + implementation_status.append("model_instantiation_failed") + if "forward_pass_ok" not in implementation_status: + implementation_status.append("forward_pass_failed") + if "masking_functions_ok" not in implementation_status: + implementation_status.append("masking_functions_failed") + + print( + f"\nStatus: Implementation Status: " + f"{len([s for s in implementation_status if s.endswith('_ok')])}/{len(implementation_status)} checks passed" + ) + + return implementation_status + + +def check_llm_output_controls(models): + """Check if MoT-style output controls are implemented""" + + print("\nCHECKING MoT-STYLE OUTPUT CONTROLS...") + print("-" * 50) + + # Use pre-initialized models + model = models["llm_model"] + causal_model = models["llm_causal_model"] + + # Check signatures + sig = inspect.signature(model.forward) + params = list(sig.parameters.keys()) + + sig_causal = inspect.signature(causal_model.forward) + params_causal = list(sig_causal.parameters.keys()) + + required_params = ["output_attentions", "output_hidden_states", "return_dict"] + missing = [p for p in required_params if p not in params] + missing_causal = [p for p in required_params if p not in params_causal] + + if missing or missing_causal: + print("[WARNING] Missing MoT output control parameters:") + if missing: + print(f" - Qwen3Model: {missing}") + if missing_causal: + print(f" - Qwen3ForCausalLM: {missing_causal}") + print("\nTesting: NOTE: MoT-style output controls are not yet implemented.") + print("This is the next step after compatibility fixes.") + return False + else: + print("[OK] All MoT output control parameters present") + return True + + +def run_input_output_test(models): + """Run a simple input/output test to verify the model works""" + print("TESTING BASIC INPUT/OUTPUT...") + print("-" * 40) + + # Use pre-initialized models + config = models["config"] + model = models["llm_model"] + causal_model = models["llm_causal_model"] + 
device = models["device"] + + print(f"Config ready: vocab_size={config.vocab_size}, hidden_size={config.hidden_size}") + + # Test Qwen3Model + print("Testing: Testing Qwen3Model...") + + # Simple input (use smaller batch for memory efficiency with large models) + batch_size, seq_len = 1, 8 + input_ids = torch.randint(0, min(config.vocab_size, 1000), (batch_size, seq_len)).to(device) + + # Test forward pass + with torch.no_grad(): + outputs = model(input_ids) + + print(f" [OK] Input shape: {input_ids.shape}") + print(f" [OK] Output shape: {outputs.last_hidden_state.shape}") + print(f" [OK] Expected: ({batch_size}, {seq_len}, {config.hidden_size})") + + # Test Qwen3ForCausalLM + print("Testing Qwen3ForCausalLM...") + + with torch.no_grad(): + causal_outputs = causal_model(input_ids) + + print(f" [OK] Logits shape: {causal_outputs.logits.shape}") + print(f" [OK] Expected: ({batch_size}, {seq_len}, {config.vocab_size})") + + # Test with attention mask + print("Testing: Testing with attention mask...") + attention_mask = torch.ones_like(input_ids).to(device) + attention_mask[:, -2:] = 0 # Mask last 2 tokens + + with torch.no_grad(): + masked_outputs = causal_model(input_ids, attention_mask=attention_mask) + + print(f" [OK] Masked logits shape: {masked_outputs.logits.shape}") + + # Test generation-like scenario + print("Testing: Testing generation-like scenario...") + with torch.no_grad(): + # Simulate generating one token + next_token_logits = causal_outputs.logits[:, -1, :] # Last position + next_token_probs = torch.softmax(next_token_logits, dim=-1) + next_token = torch.argmax(next_token_probs, dim=-1) + + print(f" [OK] Next token shape: {next_token.shape}") + print(f" [OK] Next tokens: {next_token.tolist()}") + + # HuggingFace comparison is now handled in Test 6 of MoT output controls + + print("[OK] INPUT/OUTPUT TEST PASSED!") + return True + + +def compare_with_huggingface_model(models): + """Compare our LLM Qwen3 implementation with official HuggingFace model""" + 
print(" Comparing our LLM vs HuggingFace implementations...") + + # Use pre-initialized models + hf_model = models["hf_model"] + llm_model = models["llm_causal_model"] + hf_tokenizer = models["hf_tokenizer"] + llm_tokenizer = models["llm_tokenizer"] + device = models["device"] + + # Verify we're using our LLM implementation + print(f" Our LLM model class: {llm_model.__class__}") + print(f" Our LLM model module: {llm_model.__class__.__module__}") + print(f" HF model class: {hf_model.__class__}") + print(f" HF model module: {hf_model.__class__.__module__}") + + # Check if our custom masking functions are present (our LLM-specific) + llm_module = sys.modules.get("qwen3.modeling_qwen3") + if llm_module and hasattr(llm_module, "create_causal_mask"): + print(" [OK] Our LLM-specific masking functions detected in module") + else: + print(" [WARNING] Our LLM-specific functions not found - may be using HF implementation") + + # Verify module paths + if "qwen3.qwen3" in str(llm_model.__class__.__module__): + print(" [OK] Using our LLM implementation (qwen3.qwen3)") + else: + print(" [WARNING] Not using our LLM implementation!") + + # Models and tokenizers already loaded + + # Prepare test input as specified by user + prompt = '"Give me a short introduction to large language model."' + messages = [{"role": "user", "content": prompt}] + + # Apply chat template (using HF tokenizer for consistency) + text = hf_tokenizer.apply_chat_template( + messages, + tokenize=False, + add_generation_prompt=True, + enable_thinking=True, # Switches between thinking and non-thinking modes + ) + + print(f" Chat template applied: {text[:100]}...") + + # Tokenize input and move to device + hf_inputs = hf_tokenizer([text], return_tensors="pt") + llm_inputs = llm_tokenizer([text], return_tensors="pt") + + # Move inputs to device + hf_inputs = {k: v.to(device) for k, v in hf_inputs.items()} + llm_inputs = {k: v.to(device) for k, v in llm_inputs.items()} + + print(f" HF input length: 
{hf_inputs['input_ids'].shape[1]} tokens") + print(f" Our LLM input length: {llm_inputs['input_ids'].shape[1]} tokens") + + # Compare tokenization + if torch.equal(hf_inputs["input_ids"], llm_inputs["input_ids"]): + print(" [OK] Tokenization identical") + tokenization_matches = True + else: + print(" [WARNING] Tokenization differs between HF and our LLM tokenizers") + print(f" HF tokens: {hf_inputs['input_ids'].tolist()}") + print(f" Our LLM tokens: {llm_inputs['input_ids'].tolist()}") + # Continue with comparison using respective tokenizations + tokenization_matches = False + + # Test forward pass comparison + print(" Comparing forward pass outputs...") + with torch.no_grad(): + hf_outputs = hf_model(**hf_inputs) + llm_outputs = llm_model(**llm_inputs) + + # Compare logits only if tokenization matches + if tokenization_matches: + # Compare logits + logits_close = torch.allclose(hf_outputs.logits, llm_outputs.logits, atol=1e-4, rtol=1e-3) + max_diff = torch.max(torch.abs(hf_outputs.logits - llm_outputs.logits)).item() + + print(f" Logits close (atol=1e-4, rtol=1e-3): {logits_close}") + print(f" Max logits difference: {max_diff:.6f}") + + if not logits_close: + print(" [WARNING] Logits differ significantly") + return False + else: + print(" [SKIP] Logits comparison skipped due to different tokenization") + print(" This is expected if our LLM and HF tokenizers produce different tokens") + + # Test generation comparison (shorter version due to computational cost) + print(" Comparing generation (max 50 tokens)...") + with torch.no_grad(): + # HF generation + hf_generated = hf_model.generate( + **hf_inputs, + max_new_tokens=50, + do_sample=False, # Deterministic + temperature=None, # Clear conflicting params + top_p=None, + top_k=None, + pad_token_id=hf_tokenizer.eos_token_id, + ) + + # Our LLM generation + llm_generated = llm_model.generate( + **llm_inputs, + max_new_tokens=50, + do_sample=False, # Deterministic + temperature=None, # Clear conflicting params + 
top_p=None, + top_k=None, + pad_token_id=llm_tokenizer.eos_token_id, + ) + + # Extract new tokens only + hf_new_tokens = hf_generated[0][len(hf_inputs["input_ids"][0]) :].tolist() + our_llm_new_tokens = llm_generated[0][len(llm_inputs["input_ids"][0]) :].tolist() + + print(f" HF generated {len(hf_new_tokens)} tokens") + print(f" Our LLM generated {len(our_llm_new_tokens)} tokens") + + # Decode and display the generated text + hf_generated_text = hf_tokenizer.decode(hf_new_tokens, skip_special_tokens=True) + our_llm_generated_text = llm_tokenizer.decode(our_llm_new_tokens, skip_special_tokens=True) + + print(f" HF generated text: '{hf_generated_text}'") + print(f" Our LLM generated text: '{our_llm_generated_text}'") + + # Also show the full conversation (prompt + response) + hf_full_text = hf_tokenizer.decode(hf_generated[0], skip_special_tokens=True) + our_llm_full_text = llm_tokenizer.decode(llm_generated[0], skip_special_tokens=True) + + print(f" HF full conversation:\n{hf_full_text}") + print(f" Our LLM full conversation:\n{our_llm_full_text}") + + # Compare first few tokens + min_len = min(len(hf_new_tokens), len(our_llm_new_tokens), 10) + first_tokens_match = hf_new_tokens[:min_len] == our_llm_new_tokens[:min_len] + + print(f" First {min_len} tokens match: {first_tokens_match}") + + if tokenization_matches: + if first_tokens_match: + print(" [OK] Generation outputs are consistent") + else: + print(" [WARNING] Generation outputs differ") + print(f" HF first tokens: {hf_new_tokens[:min_len]}") + print(f" Our LLM first tokens: {our_llm_new_tokens[:min_len]}") + else: + print(" [INFO] Generation comparison with different input tokenizations") + print(f" HF generated: {hf_new_tokens[:min_len]}") + print(f" Our LLM generated: {our_llm_new_tokens[:min_len]}") + print(" Different outputs expected due to different input tokens") + + # Summary + if tokenization_matches: + print(" [OK] Complete comparison successful - identical tokenization and behavior") + else: + 
print(" [INFO] Partial comparison successful - Our LLM tokenizer differs but works correctly") + + return True + + +def run_pretrained_weights_test(models): + """Test using actual pretrained weights (already loaded)""" + print("TESTING WITH PRETRAINED WEIGHTS...") + print("-" * 50) + + # Use pre-initialized models with pretrained weights + config = models["config"] + model = models["llm_causal_model"] + tokenizer = models["llm_tokenizer"] + device = models["device"] + + print("Using pre-loaded model with pretrained weights") + print(f" [OK] Vocab size: {config.vocab_size}") + print(f" [OK] Hidden size: {config.hidden_size}") + print(f" [OK] Num layers: {config.num_hidden_layers}") + print(f" [OK] Num heads: {config.num_attention_heads}") + + # Test with a simple prompt + print("Testing: Testing text generation...") + prompt = "The quick brown fox" + print(f" Loading: Input: '{prompt}'") + + # Tokenize input and move to device + inputs = tokenizer(prompt, return_tensors="pt") + input_ids = inputs.input_ids.to(device) + attention_mask = inputs.attention_mask.to(device) + + print(f" Token IDs: Token IDs: {input_ids.tolist()}") + print(f" Length: Input length: {input_ids.shape[1]} tokens") + + # Generate with our LLM implementation + with torch.no_grad(): + # Test basic forward pass first + outputs = model(input_ids, attention_mask=attention_mask) + + print(f" [OK] Logits shape: {outputs.logits.shape}") + + # Get next token probabilities + next_token_logits = outputs.logits[0, -1, :] + next_token_probs = torch.softmax(next_token_logits, dim=-1) + top_tokens = torch.topk(next_token_probs, 5) + + print(" Top 5 next token predictions:") + for i, (prob, token_id) in enumerate(zip(top_tokens.values, top_tokens.indices, strict=False)): + token = tokenizer.decode([token_id]) + print(f" {i + 1}. 
'{token}' (prob: {prob:.4f})") + + # Test generation + print(" Testing generation...") + with torch.no_grad(): + generated = model.generate( + input_ids, + attention_mask=attention_mask, + max_new_tokens=10, + do_sample=False, # Use greedy decoding for reproducibility + temperature=None, # Clear conflicting params + top_p=None, + top_k=None, + pad_token_id=tokenizer.eos_token_id, + ) + + generated_text = tokenizer.decode(generated[0], skip_special_tokens=True) + print(f" [OK] Generated: '{generated_text}'") + + # Test memory usage info + if device.type == "cuda": + print(f" Status: GPU memory allocated: {torch.cuda.memory_allocated(device) / 1024**3:.2f} GB") + print(f" Status: GPU memory cached: {torch.cuda.memory_reserved(device) / 1024**3:.2f} GB") + + total_params = sum(p.numel() for p in model.parameters()) + print(f" Status: Total parameters: {total_params:,}") + + print("[OK] PRETRAINED WEIGHTS TEST PASSED!") + return True + + +@pytest.mark.L1 +def test_qwen3_llm_implementation(): + print("Qwen3 LLM Integration Test Suite") + print("This script tests our LLM implementation in Qwen3") + print(f"Running from: {os.getcwd()}") + print() + + model_name = "Qwen/Qwen3-0.6B" + + print(f" Testing Model: {model_name}") + print(" Running all tests: compatibility, I/O, HuggingFace comparison, and pretrained weights") + + # Show device info early + device = get_device() + print(f" Device: {device}") + print() + + is_large_model = "4B" in model_name + + try: + # Initialize all models and tokenizers once + models = initialize_models_and_tokenizers(model_name, device, is_large_model=is_large_model) + + if is_large_model: + print(" Info: Large model detected, skipping input/output test") + print(" Info: Large model detected, skipping pretrained weights test") + print(" Info: Large model detected, skipping output controls test") + print(" Info: Large model detected, skipping comprehensive output control tests") + huggingface_passed = compare_with_huggingface_model(models) + 
if not huggingface_passed: + print("\n[ERROR] HUGGINGFACE COMPARISON FAILED!") + raise Exception("There may be differences between our LLM and HuggingFace implementations.") + else: + print("\n[OK] HUGGINGFACE COMPARISON PASSED!") + + # Phase 0.5: Local config loading test + print("\n" + "=" * 50) + print("TESTING LOCAL CONFIG LOADING") + local_config_success = test_qwen3_local_config_loading() + + if not local_config_success: + print("\n[ERROR] LOCAL CONFIG LOADING FAILED!") + print("Please check the config file and fix any issues.") + cleanup_models(models) + raise Exception("Local config loading test failed.") + + print("\n[OK] LOCAL CONFIG LOADING PASSED!") + + # Phase 1: Check our LLM implementation (compatibility fixes) + implementation_status = check_llm_implementation(models) + + implementation_ok = all(status.endswith("_ok") for status in implementation_status) + + if not implementation_ok: + raise Exception("\n[ERROR] Our LLM implementation has issues!") + + print("\n[OK] Our LLM implementation is working!") + + # Phase 1.5: Run input/output test + print("\n" + "=" * 50) + io_test_passed = run_input_output_test(models) + + if not io_test_passed: + raise Exception("\n[ERROR] INPUT/OUTPUT TEST FAILED!") + + # Phase 1.6: Pretrained weights test + print("\n" + "=" * 50) + print(f"TESTING WITH PRETRAINED WEIGHTS: {model_name}") + print("Note: Pretrained weights are already loaded in the initialized models") + + pretrained_passed = run_pretrained_weights_test(models) + if not pretrained_passed: + print("\n[ERROR] PRETRAINED WEIGHTS TEST FAILED!") + print("There may be compatibility issues with the pretrained model.") + print("However, this doesn't block the LLM integration.") + else: + print("\nSUMMARY: PRETRAINED WEIGHTS TEST PASSED!") + print(f"Our LLM implementation is compatible with official {model_name} weights!") + + # Phase 2: Check if output controls are implemented + output_controls_ok = check_llm_output_controls(models) + + if not output_controls_ok: 
+ print("\nTesting: MoT-style output controls not yet implemented.") + print("This is the next step in the MoT integration process.") + print("\nCurrent Status:") + print("[OK] Compatibility fixes completed") + print("[OK] Masking functions implemented") + print("[OK] Model instantiation working") + print("[PENDING] Output controls (next step)") + + # Phase 3: Run comprehensive output control tests (if implemented) + print("\n Running comprehensive MoT-style LLM output control tests...") + success = llm_output_controls_check(models) + + if success: + print("\nALL TESTS PASSED!") + print(f"Qwen3 successfully implements full MoT-style LLM integration for {model_name}!") + + # Cleanup + cleanup_models(models) + else: + raise Exception("\nOUTPUT CONTROL TESTS FAILED!") + + except KeyboardInterrupt: + print("\n\n[INTERRUPTED] Test suite interrupted by user") + # Try to cleanup if models were initialized + if "models" in locals(): + cleanup_models(models) + raise Exception("Test suite interrupted by user.") + except Exception as e: + print(f"\n[FATAL ERROR] Unexpected error during test execution: {e}") + traceback.print_exc() + # Try to cleanup if models were initialized + if "models" in locals(): + cleanup_models(models) + raise Exception(f"Unexpected error during test execution: {e}") + + +if __name__ == "__main__": + test_qwen3_llm_implementation() diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/attention.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/attention.py new file mode 100644 index 00000000..1bf2278b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/attention.py @@ -0,0 +1,445 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import torch +from torch.nn.attention.flex_attention import BlockMask, create_block_mask, flex_attention + +from cosmos3._src.imaginaire.attention import ( + attention, + merge_attentions, + multi_dimensional_attention_varlen, +) +from cosmos3._src.imaginaire.attention.masks import CausalType +from cosmos3._src.vfm.models.utils.memory import KVToStore, MemoryValue + +flex_attention = torch.compile(flex_attention) + + +class SplitInfo: + def __init__( + self, + split_lens: list[int], + attn_modes: list[str], + sample_lens: list[int], + actual_len: int, + is_three_way: bool = False, + vision_token_shapes: list[tuple[int, int, int]] | None = None, + action_token_shapes: list[tuple[int, ...]] | None = None, + num_action_tokens_per_supertoken: int = 0, + null_action_supertokens: bool = False, + ): + """ + Actual len is the actual non-padded length of the packed sequence. + It's used to trim split_lens, attn_modes and sample_lens, which were + originally padded to max sequence length (likely for flex attention). 
+ """ + assert sum(sample_lens) == sum(split_lens), ( + f"Sum of new sample lens {sum(sample_lens)} is not equal to sum of new split lens {sum(split_lens)}" + ) + + max_causal_len = 0 + max_full_len = 0 + for split_len, attn_mode in zip(split_lens, attn_modes): + if attn_mode == "causal": + max_causal_len = max(max_causal_len, split_len) + elif attn_mode == "full": + max_full_len = max(max_full_len, split_len) + + self.max_causal_len = max_causal_len + self.max_full_len = max_full_len + self.max_sample_len = max(sample_lens) + + self.split_lens = split_lens + self.attn_modes = attn_modes + self.sample_lens = sample_lens + + self.is_three_way = is_three_way + self.vision_token_shapes = vision_token_shapes + self.action_token_shapes = action_token_shapes + self.num_action_tokens_per_supertoken = num_action_tokens_per_supertoken + self.null_action_supertokens = null_action_supertokens + + +AttentionMaskType = BlockMask | SplitInfo + + +_dotproduct_attention_cache = {} + + +from cosmos3._src.vfm.datasets.sequence_packing import ( + FactoredSequencePack, + JointSequencePack, + create_sparse_mask, + factored_from_joint_sequence, + from_joint, + from_mode_splits, + generate_natten_metadata, + generate_temporal_causal_natten_metadata, + get_all_seq, + get_causal_seq, + get_full_only_seq, + joint_from_joint_sequence, +) + + +def two_way_attention( + packed_query_states: FactoredSequencePack | JointSequencePack, + packed_key_states: FactoredSequencePack | JointSequencePack, + packed_value_states: FactoredSequencePack | JointSequencePack, +): + """ + Performs two-way attention with causal and full attention. 
+ """ + + causal_q, causal_q_offsets = get_causal_seq(packed_query_states) + causal_k, causal_k_offsets = get_causal_seq(packed_key_states) + causal_v, _ = get_causal_seq(packed_value_states) + full_q, full_q_offsets = get_full_only_seq(packed_query_states) + + sample_offsets = packed_query_states["sample_offsets"] + + use_dont_care_mask = causal_q_offsets is causal_k_offsets + + + causal_res = attention( + causal_q.unsqueeze(0), # [1,N_und,heads,head_dim] + causal_k.unsqueeze(0), # [1,N_und,heads,head_dim] + causal_v.unsqueeze(0), # [1,N_und,heads,head_dim] + cumulative_seqlen_Q=causal_q_offsets, + cumulative_seqlen_KV=causal_k_offsets, + max_seqlen_Q=packed_query_states["max_causal_len"], + max_seqlen_KV=packed_query_states["max_causal_len"], + is_causal=True, + causal_type=CausalType.DontCare if use_dont_care_mask else CausalType.TopLeft, + ) # [1,N_und,heads,head_dim] + + # [1,N_und,heads,head_dim] -> [N_und,heads,head_dim] -> [N_und,heads*head_dim] + causal_out = causal_res.squeeze(0).flatten(-2, -1) # type: ignore # [N_und,heads*head_dim] + + full_res = attention( + full_q.unsqueeze(0), # [1,N_full,heads,head_dim] + get_all_seq(packed_key_states).unsqueeze(0), # [1,N_all,heads,head_dim] + get_all_seq(packed_value_states).unsqueeze(0), # [1,N_all,heads,head_dim] + cumulative_seqlen_Q=full_q_offsets, + cumulative_seqlen_KV=sample_offsets, + max_seqlen_Q=packed_query_states["max_full_len"], + max_seqlen_KV=packed_query_states["max_sample_len"], + ) # [1,N_full,heads,head_dim] + + # [1,N_full,heads,head_dim] -> [N_full,heads,head_dim] -> [N_full,heads*head_dim] + full_out = full_res.squeeze(0).flatten(-2, -1) # type: ignore # [N_full,heads*head_dim] + + out_all = from_mode_splits(causal_out, full_out, packed_query_states) + return out_all + + +def three_way_attention( + packed_query_states: FactoredSequencePack | JointSequencePack, + packed_key_states: FactoredSequencePack | JointSequencePack, + packed_value_states: FactoredSequencePack | JointSequencePack, + 
natten_metadata: dict | None, + attention_meta: SplitInfo | None = None, +): + """ + Performs three-way attention, with understanding and generation attention fully decomposed, + and allows sparsity / multi-dimensional masking in the generation tower. + + When attention_meta is provided with null_action_supertokens=True, zeros V for the first + num_action_tokens_per_supertoken tokens of each sample's GEN sequence (null action + supertokens for temporal causal training). The metadata encodes is_causal=(True, False): + causal across T supertokens, full within each supertoken S. + + NOTE: the three-way decomposition is only done so we can handle sparsity in the gen tower, + but a KEY assumption is that the "full" tokens all correspond to the same modality! + We should be careful when extending this beyond t2i and t2v. + """ + + causal_q, causal_q_offsets = get_causal_seq(packed_query_states) + causal_k, causal_k_offsets = get_causal_seq(packed_key_states) + causal_v, _ = get_causal_seq(packed_value_states) + full_q, full_q_offsets = get_full_only_seq(packed_query_states) + full_k, full_k_offsets = get_full_only_seq(packed_key_states) + full_v, _ = get_full_only_seq(packed_value_states) + + sample_offsets = packed_query_states["sample_offsets"] + + if attention_meta is not None and attention_meta.null_action_supertokens: + # Zero V for the first num_action_tokens_per_supertoken tokens of each + # sample's GEN sequence (null action supertokens at t=0). + # out_i = Σ_j softmax(QKᵀ/√d)_j · V_j — terms with V_j=0 contribute exactly 0 to the output, + # regardless of attention weights. Softmax mass is still allocated to these positions (not + # redistributed), so this differs from hard key masking, but the output contribution is 0. 
+ full_v = full_v.clone() + starts = full_q_offsets[:-1].long() # [B] + null_positions = ( + starts.unsqueeze(1) + torch.arange(attention_meta.num_action_tokens_per_supertoken, device=starts.device) + ).reshape(-1) + full_v[null_positions] = 0 + + use_dont_care_mask = causal_q_offsets is causal_k_offsets + + + causal_res = attention( + causal_q.unsqueeze(0), # [1,N_und,heads,head_dim] + causal_k.unsqueeze(0), # [1,N_und,heads,head_dim] + causal_v.unsqueeze(0), # [1,N_und,heads,head_dim] + cumulative_seqlen_Q=causal_q_offsets, + cumulative_seqlen_KV=causal_k_offsets, + max_seqlen_Q=packed_query_states["max_causal_len"], + max_seqlen_KV=packed_query_states["max_causal_len"], + is_causal=True, + causal_type=CausalType.DontCare if use_dont_care_mask else CausalType.TopLeft, + ) # [1,N_und,heads,head_dim] + # [1,N_und,heads,head_dim] -> [N_und,heads,head_dim] -> [N_und,heads*head_dim] + causal_out = causal_res.squeeze(0).flatten(-2, -1) # type: ignore # [N_und,heads*head_dim] + + # If there's no metadata, it's a dense layer + if natten_metadata is None: + full_sa, full_sa_lse = attention( + full_q.unsqueeze(0), # [1,N_full,heads,head_dim] + full_k.unsqueeze(0), # [1,N_full,heads,head_dim] + full_v.unsqueeze(0), # [1,N_full,heads,head_dim] + cumulative_seqlen_Q=full_q_offsets, + cumulative_seqlen_KV=full_k_offsets, + max_seqlen_Q=packed_query_states["max_full_len"], + max_seqlen_KV=packed_query_states["max_full_len"], + return_lse=True, + ) # full_sa: [1,N_full,heads,head_dim], full_sa_lse: [1,N_full,heads] + else: + assert natten_metadata is not None + full_sa, full_sa_lse = multi_dimensional_attention_varlen( + full_q.unsqueeze(0), # [1,N_full,heads,head_dim] + full_k.unsqueeze(0), # [1,N_full,heads,head_dim] + full_v.unsqueeze(0), # [1,N_full,heads,head_dim] + metadata=natten_metadata, + return_lse=True, + ) # full_sa: [1,N_full,heads,head_dim], full_sa_lse: [1,N_full,heads] + + full_ca, full_ca_lse = attention( + full_q.unsqueeze(0), # [1,N_full,heads,head_dim] + 
causal_k.unsqueeze(0), # [1,N_und,heads,head_dim] + causal_v.unsqueeze(0), # [1,N_und,heads,head_dim] + cumulative_seqlen_Q=full_q_offsets, + cumulative_seqlen_KV=causal_k_offsets, + max_seqlen_Q=packed_query_states["max_full_len"], + max_seqlen_KV=packed_query_states["max_causal_len"], + return_lse=True, + ) # full_ca: [1,N_full,heads,head_dim], full_ca_lse: [1,N_full,heads] + + assert full_sa.shape == full_ca.shape + full_res, _ = merge_attentions( + outputs=[full_sa, full_ca], lse_tensors=[full_sa_lse, full_ca_lse], torch_compile=False + ) # [1,N_full,heads,head_dim] + + # [1,N_full,heads,head_dim] -> [N_full,heads,head_dim] -> [N_full,heads*head_dim] + full_out = full_res.squeeze(0).flatten(-2, -1) # type: ignore # [N_full,heads*head_dim] + + out_all = from_mode_splits(causal_out, full_out, packed_query_states) + return out_all + + +def pad_sequence(tensor, pad_size): + """ + Pad a tensor along the second-to-last dimension. + + Args: + tensor: Input tensor to pad + pad_size: Number of padding elements to add + + Returns: + Padded tensor with zeros added along dim=-2 + """ + if pad_size <= 0: + return tensor + pad_shape = list(tensor.shape) + pad_shape[-2] = pad_size + padding = torch.zeros(pad_shape, dtype=tensor.dtype, device=tensor.device) + return torch.cat([tensor, padding], dim=-2) # [...,S+pad_size,...] 
+ + +def block_flex_attention( + packed_query_states: FactoredSequencePack | JointSequencePack, + packed_key_states: FactoredSequencePack | JointSequencePack, + packed_value_states: FactoredSequencePack | JointSequencePack, + attention_mask: BlockMask, + block_size: int | None = None, +): + packed_queries = get_all_seq(packed_query_states) # [N,heads,head_dim] + packed_keys = get_all_seq(packed_key_states) # [N,heads,head_dim] + packed_values = get_all_seq(packed_value_states) # [N,heads,head_dim] + max_num_tokens = packed_query_states["max_num_tokens"] + + num_attention_heads = packed_queries.shape[1] + head_dim = packed_queries.shape[2] + + # Handle block mask attention with flex_attention + pad_size = max_num_tokens - packed_queries.shape[0] + packed_queries_padded = pad_sequence(packed_queries.permute(1, 0, 2), pad_size) # [heads,max_num_tokens,head_dim] + packed_keys_padded = pad_sequence(packed_keys.permute(1, 0, 2), pad_size) # [heads,max_num_tokens,head_dim] + packed_values_padded = pad_sequence(packed_values.permute(1, 0, 2), pad_size) # [heads,max_num_tokens,head_dim] + + packed_attn_output = flex_attention( + packed_queries_padded.unsqueeze(0), # [1,heads,max_num_tokens,head_dim] + packed_keys_padded.unsqueeze(0), # [1,heads,max_num_tokens,head_dim] + packed_values_padded.unsqueeze(0), # [1,heads,max_num_tokens,head_dim] + enable_gqa=True, + block_mask=attention_mask, + ) # [1,heads,max_num_tokens,head_dim] + assert isinstance(packed_attn_output, torch.Tensor) + + end_index = packed_attn_output.shape[2] - pad_size + packed_attn_output = packed_attn_output[0, :, :end_index, :] # [heads,N,head_dim] + packed_attn_output = packed_attn_output.transpose(0, 1).reshape( + -1, num_attention_heads * head_dim + ) # [N,heads*head_dim] + + return from_joint(packed_attn_output, packed_query_states) + + +def dispatch_attention( + packed_query_states: FactoredSequencePack | JointSequencePack, + packed_key_states: FactoredSequencePack | JointSequencePack, + 
packed_value_states: FactoredSequencePack | JointSequencePack, + attention_mask: BlockMask | SplitInfo, + natten_metadata: dict | None = None, + memory_value: MemoryValue | None = None, +) -> tuple[FactoredSequencePack | JointSequencePack, KVToStore | None]: + assert memory_value is None, "Base dispatch_attention does not handle MemoryValue" + if isinstance(attention_mask, SplitInfo) and attention_mask.is_three_way: + output = three_way_attention( + packed_query_states, + packed_key_states, + packed_value_states, + natten_metadata=natten_metadata, + attention_meta=attention_mask, + ) + elif isinstance(attention_mask, SplitInfo): + output = two_way_attention(packed_query_states, packed_key_states, packed_value_states) + else: + output = block_flex_attention(packed_query_states, packed_key_states, packed_value_states, attention_mask) + return output, None + + +def build_packed_sequence( + joint_attn_implementation: str, + *, + packed_sequence: torch.Tensor, + attn_modes: list[str], + split_lens: list[int], + sample_lens: list[int], + packed_und_token_indexes: torch.LongTensor, + packed_gen_token_indexes: torch.LongTensor, + num_heads: int, + head_dim: int, + num_layers: int, + token_shapes: list[tuple[int, int, int]] | None = None, + natten_parameter_list: list | None = None, + block_size: int = 128, + is_image_batch: bool = False, + cp_world_size: int = 1, + video_temporal_causal: bool = False, + use_rolling_kv_cache: bool = False, + vision_token_shapes: list[tuple[int, int, int]] | None = None, + action_token_shapes: list[tuple[int, ...]] | None = None, + num_action_tokens_per_supertoken: int = 0, + null_action_supertokens: bool = False, + pad_for_cuda_graphs: bool = False, +) -> tuple[FactoredSequencePack | JointSequencePack, AttentionMaskType, list | None]: + """ + Build the model input pack and attention meta for joint attention. + Returns a tuple: (input_pack, attention_meta, natten_metadata_list). 
+ """ + device = packed_sequence.device + natten_metadata_list = None + if joint_attn_implementation == "flex": + sparse_mask = create_sparse_mask(sample_lens, split_lens, attn_modes, device) + seqlen = sum(sample_lens) + attention_meta = create_block_mask( + sparse_mask, + B=1, + H=num_heads, + Q_LEN=seqlen, + KV_LEN=seqlen, + device=device, + BLOCK_SIZE=block_size, + _compile=True, + ) + make_pack = joint_from_joint_sequence + elif joint_attn_implementation == "two_way": + attention_meta = SplitInfo( + split_lens=split_lens, + attn_modes=attn_modes, + sample_lens=sample_lens, + actual_len=int(packed_sequence.shape[0]), + ) + make_pack = factored_from_joint_sequence + elif joint_attn_implementation == "three_way": + attention_meta = SplitInfo( + split_lens=split_lens, + attn_modes=attn_modes, + sample_lens=sample_lens, + actual_len=int(packed_sequence.shape[0]), + is_three_way=True, + vision_token_shapes=vision_token_shapes, + action_token_shapes=action_token_shapes, + num_action_tokens_per_supertoken=num_action_tokens_per_supertoken, + null_action_supertokens=null_action_supertokens, + ) + make_pack = factored_from_joint_sequence + # The rolling KV-cache path implements temporal causality in + # three_way_attention_with_kv_cache; skip NATTEN metadata. + if not use_rolling_kv_cache: + # Temporal causal: encode (T, S) supertoken layout; spatial NATTEN: encode (H, W) layout. 
+ if video_temporal_causal: + natten_metadata_list = generate_temporal_causal_natten_metadata( + vision_token_shapes=vision_token_shapes, + num_action_tokens_per_supertoken=num_action_tokens_per_supertoken, + num_layers=num_layers, + head_dim=head_dim, + device=device, + dtype=packed_sequence.dtype, + requires_grad=packed_sequence.requires_grad, + ) + else: + natten_metadata_list = generate_natten_metadata( + token_shapes=token_shapes, + head_dim=head_dim, + num_layers=num_layers, + device=device, + dtype=packed_sequence.dtype, + requires_grad=packed_sequence.requires_grad, + natten_parameter_list=natten_parameter_list, + ) + else: + raise ValueError( + f"Invalid joint_attn_implementation: {joint_attn_implementation}. " + "Must be 'two_way', 'three_way', or 'flex'." + ) + + input_pack = make_pack( + packed_sequence=packed_sequence, + attn_modes=attn_modes, + split_lens=split_lens, + sample_lens=sample_lens, + packed_und_token_indexes=packed_und_token_indexes.to(device), + packed_gen_token_indexes=packed_gen_token_indexes.to(device), + is_image_batch=is_image_batch, + cp_world_size=cp_world_size, + pad_for_cuda_graphs=pad_for_cuda_graphs, + ) + # Not needed anymore, can cause recompilations. + input_pack.pop("split_lens", None) + input_pack.pop("attn_modes", None) + return input_pack, attention_meta, natten_metadata_list diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/context_parallel_utils.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/context_parallel_utils.py new file mode 100644 index 00000000..677d4f7a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/context_parallel_utils.py @@ -0,0 +1,427 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Context Parallelism Utilities. + +Integration Guide: +------------------ +1. Shard the Input Sequence: + Call `get_context_parallel_sharded_sequence` at the start of the forward pass to split + the global input pack into local shards. + + ```python + input_pack, position_ids = get_context_parallel_sharded_sequence( + attn_implementation, input_pack, position_ids, parallel_dims + ) + ``` + +2. Apply Context Parallel Attention: + Use `context_parallel_attention` inside your attention block. It handles All-to-All + communication (gather seq, scatter heads -> attn -> gather heads, scatter seq). + + ```python + output, kv_to_store = context_parallel_attention( + cp_mesh, query_pack, key_pack, value_pack, mask, local_attn_func + ) + ``` + +3. Gather Final Hidden States (Optional): + Use `get_context_parallel_last_hidden_state` if the full global sequence is needed for + loss or post-processing. 
+""" + +from typing import Callable + +import torch +import torch.distributed as dist +from torch.distributed.device_mesh import DeviceMesh +from torch.distributed.tensor import DTensor, Replicate, Shard +from torch.nn.attention.flex_attention import BlockMask + +from cosmos3._src.vfm.datasets.sequence_packing import ( + FactoredSequencePack, + JointSequencePack, + from_mode_splits, + get_all_seq, + get_causal_seq, + get_full_only_seq, + get_gen_position_ids, + get_gen_seq, + get_und_position_ids, + get_und_seq, +) +from cosmos3._src.vfm.models.mot.attention import SplitInfo +from cosmos3._src.vfm.models.utils.memory import KVToStore, MemoryValue +from cosmos3._src.vfm.utils.parallelism import ParallelDims + + +def _pad_to_N(N, x: torch.Tensor) -> torch.Tensor: + assert x.shape[0] <= N + padded = x.new_zeros((N, *x.shape[1:])) + padded[: x.shape[0]] = x + return padded + + +def _filter_and_rebase_sparse_index( + global_indices: torch.Tensor, + start_offset: int, + end_offset: int, +) -> torch.Tensor: + """Filters sparse global indices to the local physical slice and shifts them to local 0-based coordinates.""" + + # Keep only global indices that fall within [start_offset, end_offset) + mask = (global_indices >= start_offset) & (global_indices < end_offset) + local_global_indices = global_indices[mask] + + # Subtract the start_offset to make them local (0-based) + local_rebased_indices = local_global_indices - start_offset + + return local_rebased_indices + + +def get_context_parallel_sharded_sequence( + attn_implementation: str, + input_pack: FactoredSequencePack, + position_ids: torch.Tensor, + parallel_dims: ParallelDims | None, +) -> tuple[FactoredSequencePack, torch.Tensor]: + """ + Splits the full input_pack into a local shard for Context Parallelism. 
+ """ + if parallel_dims is None or not parallel_dims.cp_enabled: + return input_pack, position_ids + + assert attn_implementation in ("two_way", "three_way"), ( + f"Context parallel is only supported for two_way and three_way joint attention modes, " + f"got {attn_implementation!r}" + ) + cp_mesh = parallel_dims.cp_mesh + cp_group = cp_mesh.get_group() + rank = dist.get_rank(cp_group) + world_size = dist.get_world_size(cp_group) + + text_seq = get_und_seq(input_pack) + gen_seq = get_gen_seq(input_pack) + assert text_seq.shape[0] % world_size == 0, "text_seq.shape[0] must be divisible by world_size" + assert gen_seq.shape[0] % world_size == 0, "gen_seq.shape[0] must be divisible by world_size" + + text_len = text_seq.shape[0] + text_shard_len = text_len // world_size + text_shard = text_seq.narrow(0, rank * text_shard_len, text_shard_len) + + gen_len = gen_seq.shape[0] + gen_shard_len = gen_len // world_size + gen_shard = gen_seq.narrow(0, rank * gen_shard_len, gen_shard_len) + + text_position_ids = get_und_position_ids(position_ids, input_pack) + gen_position_ids = get_gen_position_ids(position_ids, input_pack) + + # Handle 3D mRoPE position IDs: shape (3, L) + is_mrope = position_ids.dim() == 2 and position_ids.shape[0] == 3 + if is_mrope: + text_position_ids = text_position_ids.transpose(0, 1) # [text_len,3] + gen_position_ids = gen_position_ids.transpose(0, 1) # [gen_len,3] + + # pad to N + text_position_ids = _pad_to_N(text_seq.shape[0], text_position_ids) + gen_position_ids = _pad_to_N(gen_seq.shape[0], gen_position_ids) + + text_position_ids_shard = text_position_ids.narrow(0, rank * text_shard_len, text_shard_len) + gen_position_ids_shard = gen_position_ids.narrow(0, rank * gen_shard_len, gen_shard_len) + + # create local pack + local_pack = from_mode_splits(text_shard, gen_shard, input_pack, is_sharded=True) + local_position_ids = torch.cat( + [text_position_ids_shard, gen_position_ids_shard], dim=0 + ) # [text_shard_len+gen_shard_len] or 
[text_shard_len+gen_shard_len,3] + + if is_mrope: + local_position_ids = local_position_ids.transpose(0, 1) # [3,text_shard_len+gen_shard_len] + + return local_pack, local_position_ids + + +def get_context_parallel_last_hidden_state( + packed_outputs: FactoredSequencePack, + parallel_dims: ParallelDims | None, +) -> torch.Tensor: + if parallel_dims is None or not parallel_dims.cp_enabled: + return get_all_seq(packed_outputs) + + # since unpatchify assumes full images, for now using all_gather to gather the predictions from all context parallel ranks + # This step can be removed once we make unpatchify work with context parallel local sequences + und_hidden_seq = get_und_seq(packed_outputs) # [text_shard_len,hidden_size] + gen_hidden_seq = get_gen_seq(packed_outputs) # [gen_shard_len,hidden_size] + + gathered_und_seq = all_gather_tensor( + und_hidden_seq, gather_dim=0, cp_mesh=parallel_dims.cp_mesh + ) # [text_len,hidden_size] + gathered_gen_seq = all_gather_tensor( + gen_hidden_seq, gather_dim=0, cp_mesh=parallel_dims.cp_mesh + ) # [gen_len,hidden_size] + + gathered_hidden_pack = from_mode_splits(gathered_und_seq, gathered_gen_seq, packed_outputs, is_sharded=False) + last_hidden_state = get_all_seq(gathered_hidden_pack) + return last_hidden_state + + +def context_parallel_broadcast( + data: torch.Tensor | dict[str, torch.Tensor], parallel_dims: ParallelDims, iteration: int +) -> torch.Tensor | dict[str, torch.Tensor]: + """ + Broadcasts a tensor or a dictionary of tensors to all ranks in the context parallel group. 
+ """ + rank = parallel_dims.cp_rank + cp_world_size = parallel_dims.cp_mesh.size() + cp_data_batch_owner = iteration % cp_world_size + + broadcast_list = [None] + if rank == cp_data_batch_owner: + broadcast_list = [data] + + global_src_rank = dist.get_global_rank(parallel_dims.cp_mesh.get_group(), cp_data_batch_owner) + dist.broadcast_object_list(broadcast_list, src=global_src_rank, group=parallel_dims.cp_mesh.get_group()) + local_data = broadcast_list[0] + assert local_data is not None + return local_data + + +def all_to_all_tensor( + local_input: torch.Tensor, + scatter_dim: int, + gather_dim: int, + cp_mesh: "DeviceMesh", +) -> torch.Tensor: + """ + All-to-all via DTensor redistribute. + Input placement: Shard(gather_dim) -> The dimension we are about to gather was split. + Output placement: Shard(scatter_dim) -> The dimension we are about to scatter will be split. + """ + # Wrap local tensor as DTensor with current placement + # gather_dim is the dimension that is currently sharded locally (so we can gather it to full) + global_dt = DTensor.from_local(local_input, cp_mesh, [Shard(gather_dim)], run_check=False) + + # Redistribute to new placement (shard scatter_dim) + new_dt = global_dt.redistribute(cp_mesh, [Shard(scatter_dim)]) + + # Convert back to local + return new_dt.to_local() + + +def all_gather_tensor( + local_input: torch.Tensor, + gather_dim: int, + cp_mesh: "DeviceMesh", +) -> torch.Tensor: + """ + All-gather via DTensor redistribute. + Input placement: Shard(gather_dim) -> The dimension we are about to gather was split. + Output placement: Replicate() -> Full copy on each rank. 
+ """ + # Wrap local tensor as DTensor with current placement + global_dt = DTensor.from_local(local_input, cp_mesh, [Shard(gather_dim)], run_check=False) + + # Redistribute to new placement (Replicate) + new_dt = global_dt.redistribute(cp_mesh, [Replicate()]) + + # Convert back to local + return new_dt.to_local() + + +def gather_seq_scatter_heads( + x: torch.Tensor, + seq_dim: int, + head_dim: int, + cp_mesh: DeviceMesh, +) -> torch.Tensor: + """ + Sync the embedding input with an all-to-all in sequence parallel. + Gathers the sequence dimension and scatters the head dimension. + For example, when seq_dim is 0, head_dim is 1, the transformation is: + [seq/n, h, ...] -> [seq, h/n, ...] + Args: + x: shape of [seq/n, h, ...] + seq_dim: the dimension to gather + head_dim: the dimension to scatter + cp_mesh: device mesh of the Ulysses sequence-parallel group (n ranks) + Returns: + torch.Tensor: shape of [seq, h/n, ...] + """ + return all_to_all_tensor(x, head_dim, seq_dim, cp_mesh) + + +def gather_heads_scatter_seq( + x: torch.Tensor, + head_dim: int, + seq_dim: int, + cp_mesh: DeviceMesh, +) -> torch.Tensor: + """ + Sync the attention result with an all-to-all in sequence parallel. + Gathers the head dimension and scatters the sequence dimension. + For example, when seq_dim is 0, head_dim is 1, the transformation is: + [seq, h/n, ...] -> [seq/n, h, ...] + + Args: + x (torch.Tensor): shape of [seq, h/n, ...] + head_dim (int): the dimension to gather + seq_dim (int): the dimension to scatter + cp_mesh (DeviceMesh): device mesh of the Ulysses sequence-parallel group; + n in the shapes above is the number of ranks in this mesh + + Returns: + torch.Tensor: shape of [seq/n, h, ...] 
+ """ + return all_to_all_tensor(x, seq_dim, head_dim, cp_mesh) + + +def context_parallel_attention( + cp_mesh: DeviceMesh, + packed_query_states: FactoredSequencePack, + packed_key_states: FactoredSequencePack, + packed_value_states: FactoredSequencePack, + attention_mask: BlockMask | SplitInfo, + attention_function: Callable, + natten_metadata: dict | None = None, + memory_value: MemoryValue | None = None, +) -> tuple[FactoredSequencePack | JointSequencePack, KVToStore | None]: + """Ulysses-style context parallel attention for packed und+gen sequences. + + Each rank holds a sequence shard [S/cp, H, D]. Two all-to-all calls convert + between seq-sharded and head-sharded representations: + 1. gather seq, scatter heads → [S, H/cp, D] (head-sharded) + 2. run attention on full sequence with reduced heads + 3. gather heads, scatter seq → [S/cp, H, D] (seq-sharded) + + When ``memory_value`` is present, produces head-sharded ``kv_to_store`` + from the post-all-to-all K/V tensors for the caller to write back via + ``MemoryState.write_for_layer()``. Does **not** write to any cache + directly. + + Args: + cp_mesh: Device mesh for context parallelism. + packed_query_states: Packed Q for both und and gen tokens, seq-sharded [S/cp, H, D]. + packed_key_states: Packed K for both und and gen tokens, seq-sharded [S/cp, H, D]. + packed_value_states: Packed V for both und and gen tokens, seq-sharded [S/cp, H, D]. + attention_mask: Block mask or split info describing causal/full attention pattern. + attention_function: Callable implementing the actual attention kernel. + natten_metadata: Optional neighborhood attention metadata. + memory_value: Optional memory value for KV-cache training / AR inference. + + Returns: + (output_pack, kv_to_store): + output_pack: Packed attention output, seq-sharded [S/cp, H, D]. + kv_to_store: Head-sharded ``(gen_k, gen_v, und_k, und_v)`` when + ``memory_value`` is present, ``None`` otherwise. 
+ """ + cp_group = cp_mesh.get_group() + cp_world_size = torch.distributed.get_world_size(cp_group) + assert cp_world_size > 1, "Context parallel world size must be greater than 1" + q_und_seq, _ = get_causal_seq(packed_query_states) # [text_shard_len,H,head_dim] + q_gen_seq, _ = get_full_only_seq(packed_query_states) # [gen_shard_len,H,head_dim] + k_und_seq, _ = get_causal_seq(packed_key_states) # [text_shard_len,H,head_dim] + k_gen_seq, _ = get_full_only_seq(packed_key_states) # [gen_shard_len,H,head_dim] + v_und_seq, _ = get_causal_seq(packed_value_states) # [text_shard_len,H,head_dim] + v_gen_seq, _ = get_full_only_seq(packed_value_states) # [gen_shard_len,H,head_dim] + + # Check that number of heads is divisible by CP world size + assert q_und_seq.shape[1] % cp_world_size == 0, ( + f"Query heads ({q_und_seq.shape[1]}) must be divisible by context parallel world size ({cp_world_size})" + ) + assert k_und_seq.shape[1] % cp_world_size == 0, ( + f"Key heads ({k_und_seq.shape[1]}) must be divisible by context parallel world size ({cp_world_size})" + ) + assert v_und_seq.shape[1] % cp_world_size == 0, ( + f"Value heads ({v_und_seq.shape[1]}) must be divisible by context parallel world size ({cp_world_size})" + ) + + + # when doing AR-inference with a KV-cache. 
+ + # all2all: gather sequence, scatter heads → head-sharded + q_und_seq = gather_seq_scatter_heads( + q_und_seq, seq_dim=0, head_dim=1, cp_mesh=cp_mesh + ) # [text_len,H_local,head_dim] + q_gen_seq = gather_seq_scatter_heads( + q_gen_seq, seq_dim=0, head_dim=1, cp_mesh=cp_mesh + ) # [gen_len,H_local,head_dim] + k_und_seq = gather_seq_scatter_heads( + k_und_seq, seq_dim=0, head_dim=1, cp_mesh=cp_mesh + ) # [text_len,H_local,head_dim] + k_gen_seq = gather_seq_scatter_heads( + k_gen_seq, seq_dim=0, head_dim=1, cp_mesh=cp_mesh + ) # [gen_len,H_local,head_dim] + v_und_seq = gather_seq_scatter_heads( + v_und_seq, seq_dim=0, head_dim=1, cp_mesh=cp_mesh + ) # [text_len,H_local,head_dim] + v_gen_seq = gather_seq_scatter_heads( + v_gen_seq, seq_dim=0, head_dim=1, cp_mesh=cp_mesh + ) # [gen_len,H_local,head_dim] + + # Build head-sharded kv_to_store when memory is active. + kv_to_store: KVToStore | None = None + if memory_value is not None: + und_len = packed_key_states["_num_causal_tokens"] + gen_len = packed_key_states["_num_full_tokens"] + kv_to_store = ( + k_gen_seq[:gen_len].unsqueeze(0), + v_gen_seq[:gen_len].unsqueeze(0), + k_und_seq[:und_len].unsqueeze(0), + v_und_seq[:und_len].unsqueeze(0), + ) + + q_und_seq_len = q_und_seq.shape[0] + q_gen_seq_len = q_gen_seq.shape[0] + meta = dict(packed_query_states) + packed_query_states_ = from_mode_splits(q_und_seq, q_gen_seq, meta, is_sharded=False) + packed_key_states_ = from_mode_splits(k_und_seq, k_gen_seq, meta, is_sharded=False) + packed_value_states_ = from_mode_splits(v_und_seq, v_gen_seq, meta, is_sharded=False) + + # dispatch_attention returns (output, kv_to_store | None) + attn_output_pack_hp, _inner_kv_to_store = attention_function( + packed_query_states_, + packed_key_states_, + packed_value_states_, + attention_mask, + natten_metadata=natten_metadata, + memory_value=memory_value, + ) + + attn_output_und_hp = get_und_seq(attn_output_pack_hp) # [text_len,H_local,head_dim] + attn_output_gen_hp = 
get_gen_seq(attn_output_pack_hp) # [gen_len,H_local,head_dim] + + attn_output_und_hp = attn_output_und_hp[:q_und_seq_len].contiguous() # [text_len,H_local,head_dim] + attn_output_gen_hp = attn_output_gen_hp[:q_gen_seq_len].contiguous() # [gen_len,H_local,head_dim] + + # all2all: gather heads, scatter seq → seq-sharded + attn_output_und_sp = gather_heads_scatter_seq( + attn_output_und_hp, + seq_dim=0, + head_dim=1, + cp_mesh=cp_mesh, + ) # [text_shard_len,H,head_dim] + attn_output_gen_sp = gather_heads_scatter_seq( + attn_output_gen_hp, + seq_dim=0, + head_dim=1, + cp_mesh=cp_mesh, + ) # [gen_shard_len,H,head_dim] + + final_output_pack_sp = from_mode_splits( + attn_output_und_sp, attn_output_gen_sp, packed_query_states, is_sharded=True + ) + + return final_output_pack_sp, kv_to_store diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/cosmos3_vfm_network.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/cosmos3_vfm_network.py new file mode 100644 index 00000000..3a2a6333 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/cosmos3_vfm_network.py @@ -0,0 +1,1118 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import math +from typing import List, Optional, Tuple + +import torch +from torch import nn +from transformers.configuration_utils import PretrainedConfig +from transformers.modeling_utils import PreTrainedModel + +from cosmos3._src.vfm.datasets.sequence_packing import ModalityData, PackedSequence, verify_natten_parameter_list +from cosmos3._src.vfm.models.mot.attention import build_packed_sequence +from cosmos3._src.vfm.models.mot.context_parallel_utils import ( + get_context_parallel_last_hidden_state, + get_context_parallel_sharded_sequence, +) +from cosmos3._src.vfm.models.mot.domain_aware_linear import DomainAwareLinear +from cosmos3._src.vfm.models.mot.modeling_utils import ( + FlattenedSinCosPositionEmbedding, + TimestepEmbedder, + VideoRopePosition3DEmb, +) +from cosmos3._src.vfm.models.utils.memory import MemoryState + + +class Cosmos3VFMNetworkConfig(PretrainedConfig): + def __init__( + self, + vision_gen=True, + action_gen=False, + sound_gen=False, + vlm_config=None, + latent_patch_size=2, + latent_downsample_factor=8, + latent_channel_size=16, + position_embedding_type="3d_rope", + max_latent_h=32, + max_latent_w=32, + max_latent_t=32, + rope_h_extrapolation_ratio=1.0, + rope_w_extrapolation_ratio=1.0, + rope_t_extrapolation_ratio=1.0, + enable_fps_modulation=False, + base_fps=24, + vit_max_num_patch_per_side=70, + connector_act="gelu_pytorch_tanh", + interpolate_pos=False, + timestep_shift=1.0, + timestep_scale=0.001, + predict_text_tokens=False, + joint_attn_implementation="two_way", + action_dim=32, + num_embodiment_domains=32, + temporal_compression_factor_vision=4, + temporal_compression_factor_action=1, + natten_parameter_list=None, + video_temporal_causal=False, + # Sound generation parameters + sound_dim: int | None = None, + temporal_compression_factor_sound=1, + sound_latent_fps: int = 25, + **kwargs, + ): + self.vision_gen = vision_gen + self.sound_gen = sound_gen + self.vlm_config = vlm_config + self.latent_patch_size = latent_patch_size 
+ self.latent_downsample_factor = latent_downsample_factor + self.latent_channel_size = latent_channel_size + self.position_embedding_type = position_embedding_type + self.max_latent_h = max_latent_h + self.max_latent_w = max_latent_w + self.max_latent_t = max_latent_t + self.rope_h_extrapolation_ratio = rope_h_extrapolation_ratio + self.rope_w_extrapolation_ratio = rope_w_extrapolation_ratio + self.rope_t_extrapolation_ratio = rope_t_extrapolation_ratio + self.enable_fps_modulation = enable_fps_modulation + self.base_fps = base_fps + self.vit_max_num_patch_per_side = vit_max_num_patch_per_side + self.connector_act = connector_act + self.interpolate_pos = interpolate_pos + self.timestep_shift = timestep_shift + self.timestep_scale = timestep_scale + self.predict_text_tokens = predict_text_tokens + self.joint_attn_implementation = joint_attn_implementation + self.temporal_compression_factor_vision = temporal_compression_factor_vision + self.natten_parameter_list = natten_parameter_list + self.video_temporal_causal = video_temporal_causal + + # action related parameters + self.action_gen = action_gen # whether to generate action tokens + self.action_dim = action_dim + self.num_embodiment_domains = num_embodiment_domains + self.temporal_compression_factor_action = temporal_compression_factor_action + if self.action_gen: + assert self.vision_gen, ( + "Action generation requires visual generation! We do NOT support action only training!" + ) + + # sound related parameters + self.sound_dim = sound_dim + self.temporal_compression_factor_sound = temporal_compression_factor_sound + self.sound_latent_fps = sound_latent_fps + if self.sound_gen: + assert self.vision_gen, ( + "Sound generation requires visual generation! We do NOT support sound only training!" 
+ ) + + super().__init__(**kwargs) + + +class Cosmos3VFMNetwork(PreTrainedModel): + config_class = Cosmos3VFMNetworkConfig + base_model_prefix = "cosmos3" + + def __init__(self, language_model, config: Cosmos3VFMNetworkConfig): + super().__init__(config) + self.language_model = language_model + self.hidden_size = config.vlm_config.hidden_size + self.use_moe = "Mo" in config.vlm_config.layer_module + self.num_heads = config.vlm_config.num_attention_heads + self.num_kv_heads = config.vlm_config.num_key_value_heads + self.head_dim = config.vlm_config.head_dim + self.num_hidden_layers = config.vlm_config.num_hidden_layers + self.predict_text_tokens = config.predict_text_tokens + + if config.natten_parameter_list is not None and config.joint_attn_implementation != "three_way": + raise NotImplementedError( + f"Sparsity is only supported with 'three_way' attention, but got {config.joint_attn_implementation=}, " + "and 'natten_parameter_list' was not None." + ) + self.natten_parameter_list = verify_natten_parameter_list( + config.natten_parameter_list, num_layers=self.num_hidden_layers + ) + + if config.video_temporal_causal and config.joint_attn_implementation != "three_way": + raise ValueError( + f"video_temporal_causal=True requires joint_attn_implementation='three_way', " + f"but got {config.joint_attn_implementation!r}." 
+ ) + self.video_temporal_causal = config.video_temporal_causal + self.pad_for_cuda_graphs = False + + if config.vision_gen: + self.latent_patch_size = config.latent_patch_size + self.timestep_shift = config.timestep_shift + self.timestep_scale = config.timestep_scale + self.latent_downsample = config.latent_downsample_factor * config.latent_patch_size + self.max_latent_h = config.max_latent_h + self.max_latent_w = config.max_latent_w + self.max_latent_t = config.max_latent_t + self.latent_channel = config.latent_channel_size + self.patch_latent_dim = self.latent_patch_size**2 * self.latent_channel + + self.time_embedder = TimestepEmbedder(self.hidden_size) + self.vae2llm = nn.Linear(self.patch_latent_dim, self.hidden_size) + self.llm2vae = nn.Linear(self.hidden_size, self.patch_latent_dim) + + assert config.position_embedding_type in ["3d_rope", "flattened_sin_cos", "unified_3d_mrope"] + if config.position_embedding_type == "3d_rope": + self.latent_pos_embed = VideoRopePosition3DEmb( + head_dim=self.hidden_size, + len_h=self.max_latent_h, + len_w=self.max_latent_w, + len_t=self.max_latent_t, + h_extrapolation_ratio=config.rope_h_extrapolation_ratio, + w_extrapolation_ratio=config.rope_w_extrapolation_ratio, + t_extrapolation_ratio=config.rope_t_extrapolation_ratio, + enable_fps_modulation=config.enable_fps_modulation, # fps_modulation scales RoPE by fps. By default, disable FPS RoPE modulation. 
+ base_fps=config.base_fps, + base_temporal_compression_factor=config.temporal_compression_factor_vision, + temporal_compression_factor=config.temporal_compression_factor_vision, + ) + elif config.position_embedding_type == "flattened_sin_cos": + self.latent_pos_embed = FlattenedSinCosPositionEmbedding( + max_latent_h=self.max_latent_h, max_latent_w=self.max_latent_w, hidden_size=self.hidden_size + ) + elif config.position_embedding_type == "unified_3d_mrope": + # No additive position embedding - position info is in 3D position IDs for attention + self.latent_pos_embed = None + else: + raise ValueError(f"Unknown position_embedding_type: {config.position_embedding_type!r}") + + if config.action_gen: + self.action_dim = config.action_dim + self.num_embodiment_domains = config.num_embodiment_domains + self.action2llm = DomainAwareLinear(self.action_dim, self.hidden_size, self.num_embodiment_domains) + self.llm2action = DomainAwareLinear(self.hidden_size, self.action_dim, self.num_embodiment_domains) + + if config.position_embedding_type == "3d_rope": + self.action_pos_embed = VideoRopePosition3DEmb( + head_dim=self.hidden_size, + len_h=1, + len_w=1, + len_t=self.max_latent_t * config.temporal_compression_factor_vision, + h_extrapolation_ratio=config.rope_h_extrapolation_ratio, + w_extrapolation_ratio=config.rope_w_extrapolation_ratio, + t_extrapolation_ratio=config.rope_t_extrapolation_ratio, + enable_fps_modulation=config.enable_fps_modulation, + base_fps=config.base_fps, + base_temporal_compression_factor=config.temporal_compression_factor_vision, # vision compression factor is used for base tps + temporal_compression_factor=config.temporal_compression_factor_action, # Action is at frame rate (no temporal compression) + ) + elif config.position_embedding_type == "unified_3d_mrope": + # No additive position embedding - position info is in 3D position IDs for attention + self.action_pos_embed = None + else: + raise ValueError(f"Unknown position_embedding_type: 
{config.position_embedding_type!r}") + + self.action_modality_embed = nn.Parameter(torch.zeros(self.hidden_size)) + + if config.sound_gen: + self.sound_dim = config.sound_dim + self.sound2llm = nn.Linear(config.sound_dim, self.hidden_size) + self.llm2sound = nn.Linear(self.hidden_size, config.sound_dim) + self.sound_modality_embed = nn.Parameter(torch.zeros(self.hidden_size)) + + self.config = config + self.parallel_dims = None + + def init_weights(self, buffer_device: torch.device | None): + if self.config.vision_gen or self.config.action_gen or self.config.sound_gen: + self.time_embedder._init_weights() + + if self.config.vision_gen: + std = 1.0 / math.sqrt(self.patch_latent_dim) + torch.nn.init.trunc_normal_(self.vae2llm.weight, std=std, a=-3 * std, b=3 * std) + torch.nn.init.zeros_(self.vae2llm.bias) + + std = 1.0 / math.sqrt(self.hidden_size) + torch.nn.init.trunc_normal_(self.llm2vae.weight, std=std, a=-3 * std, b=3 * std) + torch.nn.init.zeros_(self.llm2vae.bias) + + if self.latent_pos_embed is not None: + self.latent_pos_embed._init_weights() + + if self.config.action_gen: + # DomainAwareLinear uses embeddings for weights, so we initialize them differently + # action2llm: input_size=action_dim, output_size=hidden_size + std = 1.0 / math.sqrt(self.action_dim) + torch.nn.init.trunc_normal_(self.action2llm.fc.weight, std=std, a=-3 * std, b=3 * std) + torch.nn.init.zeros_(self.action2llm.bias.weight) + + # llm2action: input_size=hidden_size, output_size=action_dim + std = 1.0 / math.sqrt(self.hidden_size) + torch.nn.init.trunc_normal_(self.llm2action.fc.weight, std=std, a=-3 * std, b=3 * std) + torch.nn.init.zeros_(self.llm2action.bias.weight) + + std = 1.0 / math.sqrt(self.hidden_size) + torch.nn.init.trunc_normal_(self.action_modality_embed, std=std, a=-3 * std, b=3 * std) + + if self.action_pos_embed is not None: + self.action_pos_embed._init_weights() + + if self.config.sound_gen: + # sound2llm: input_size=sound_dim, output_size=hidden_size + std = 1.0 / 
math.sqrt(self.sound_dim) + torch.nn.init.trunc_normal_(self.sound2llm.weight, std=std, a=-3 * std, b=3 * std) + torch.nn.init.zeros_(self.sound2llm.bias) + + # llm2sound: input_size=hidden_size, output_size=sound_dim + std = 1.0 / math.sqrt(self.hidden_size) + torch.nn.init.trunc_normal_(self.llm2sound.weight, std=std, a=-3 * std, b=3 * std) + torch.nn.init.zeros_(self.llm2sound.bias) + + std = 1.0 / math.sqrt(self.hidden_size) + torch.nn.init.trunc_normal_(self.sound_modality_embed, std=std, a=-3 * std, b=3 * std) + + self.language_model.init_weights(buffer_device=buffer_device) + + def patchify_and_pack_latents( + self, tokens_vision: torch.Tensor, token_shapes_vision: List[Tuple[int, int, int]] + ) -> tuple[torch.Tensor, List[Tuple[int, int, int]]]: + p = self.latent_patch_size + # Patchify and pack the latents + packed_latent = [] + original_latent_shapes = [] # Store original shapes for unpadding later + + # C, T, H, W + for latent, (t, h, w) in zip(tokens_vision, token_shapes_vision): + latent = latent.squeeze(0) # [C,T,H,W] + + # Get original latent dimensions + _, t_actual, h_actual, w_actual = latent.shape + original_latent_shapes.append((t_actual, h_actual, w_actual)) + + # Compute padded dimensions (must be divisible by p) + h_padded = ((h_actual + p - 1) // p) * p + w_padded = ((w_actual + p - 1) // p) * p + + # Zero-pad if dimensions are not divisible by p + if h_padded != h_actual or w_padded != w_actual: + padded = torch.zeros( + (self.latent_channel, t_actual, h_padded, w_padded), + device=latent.device, + dtype=latent.dtype, + ) # [C,T,H_padded,W_padded] + padded[:, :, :h_actual, :w_actual] = latent + latent = padded # [C,T,H_padded,W_padded] + + # Compute number of patches after padding + h_patches = h_padded // p + w_patches = w_padded // p + + # Patchify + latent = latent.reshape( + self.latent_channel, t_actual, h_patches, p, w_patches, p + ) # [C,T,h_patches,p,w_patches,p] + latent = torch.einsum("cthpwq->thwpqc", latent).reshape( + -1, p * p 
* self.latent_channel + ) # [T*h_patches*w_patches,patch_latent_dim] + packed_latent.append(latent) + + # We assume the latents passed to the network are already noised + packed_latent = torch.cat(packed_latent, dim=0) # [total_vision_patches,patch_latent_dim] + return packed_latent, original_latent_shapes + + def unpatchify_and_unpack_latents( + self, + packed_mse_preds: torch.Tensor, + token_shapes_vision: List[Tuple[int, int, int]], + noisy_frame_indexes_vision: list[torch.Tensor], + original_latent_shapes: List[Tuple[int, int, int]] | None = None, + ) -> list[torch.Tensor]: + p = self.latent_patch_size + unpatchified_latents = [] + + # Split packed_mse_preds back into individual latents based on token_shapes_vision + start_idx = 0 + for i, (t_c, h_c, w_c) in enumerate(token_shapes_vision): + # Get original shape for unpadding (if provided) + if original_latent_shapes is not None: + t_orig, h_orig, w_orig = original_latent_shapes[i] + # Compute padded dimensions used during patchify + h_padded = ((h_orig + p - 1) // p) * p + w_padded = ((w_orig + p - 1) // p) * p + h_patches = h_padded // p + w_patches = w_padded // p + else: + # Fallback: use token shapes directly (assumes no padding was needed) + t_orig, h_orig, w_orig = t_c, h_c * p, w_c * p + h_patches, w_patches = h_c, w_c + + # noisy_frame_indexes_vision is a list of tensors, each with shape (T,), + # where the values are the noisy frame indices.
+ noisy_frame_indexes = noisy_frame_indexes_vision[i] + t_n = len(noisy_frame_indexes) + + # Initialize with the original shape (after unpadding), zeros for clean frames + output_tensor = torch.zeros( + (self.latent_channel, t_c, h_orig, w_orig), + device=packed_mse_preds.device, + dtype=packed_mse_preds.dtype, + ) # [C,T,H_orig,W_orig] + num_patches = t_n * h_patches * w_patches + if num_patches > 0: + end_idx = start_idx + num_patches + # Extract patches for this latent + latent_patches = packed_mse_preds[start_idx:end_idx] # [num_patches,patch_latent_dim] + # Reshape back to [t_n, h_patches, w_patches, p, p, channels] + latent_patches = latent_patches.reshape( + t_n, h_patches, w_patches, p, p, self.latent_channel + ) # [T_n,h_patches,w_patches,p,p,C] + # Invert the einsum operation: "thwpqc->cthpwq" + latent = torch.einsum("thwpqc->cthpwq", latent_patches) # [C,T_n,h_patches,p,w_patches,p] + # Reshape back to [channels, t_n, h_padded, w_padded] + latent = latent.reshape( + self.latent_channel, t_n, h_patches * p, w_patches * p + ) # [C,T_n,H_padded,W_padded] + + # Crop to original dimensions (unpad the zeros) + latent = latent[:, :, :h_orig, :w_orig] # [C,T_n,H_orig,W_orig] + + # Fill only the noisy frame positions using the actual mask indices + output_tensor[:, noisy_frame_indexes] = latent + + start_idx = end_idx + + unpatchified_latents.append(output_tensor.unsqueeze(0)) # [1,C,T,H,W] + + # Return list of unpatchified latents (supports variable shapes) + return unpatchified_latents + + def pack_action( + self, + tokens_action: list[torch.Tensor], + token_shapes_action: list[tuple[int, ...]], + domain_id_action: list[torch.Tensor], + ) -> tuple[torch.Tensor, torch.Tensor]: + """Pack variable-length action tokens into a 1D sequence for transformer input. + + Args: + tokens_action: List of action tensors, each [T_i, action_dim] (T_i may vary). + token_shapes_action: List of (T_i,) tuples per sample. 
+ domain_id_action: List of domain ID tensors, each of shape [1]. + + Returns: + Tuple of (packed_tokens, per_token_domain_id): + packed_tokens: [total_action_tokens, action_dim] + per_token_domain_id: [total_action_tokens] + """ + packed: list[torch.Tensor] = [] + domain_ids: list[torch.Tensor] = [] + for tokens, shape, d_id in zip(tokens_action, token_shapes_action, domain_id_action): + T = shape[0] + packed.append(tokens[:T]) + domain_ids.append(d_id.expand(T)) + return torch.cat(packed, dim=0), torch.cat(domain_ids, dim=0) + + def unpack_action( + self, + packed_action_preds: torch.Tensor, + token_shapes_action: list[tuple[int, ...]], + noisy_frame_indexes_action: list[torch.Tensor], + ) -> list[torch.Tensor]: + """Unpack action predictions back into per-sample action tensors. + + Args: + packed_action_preds: Packed action predictions of shape (total_noisy_tokens, action_dim) + token_shapes_action: Per-sample token shapes, each (T_i,) tuple. + noisy_frame_indexes_action: List of tensors, each with shape (Tn_i,), where the values + are the noisy frame indices for sample i. + + Returns: + List of per-sample tensors, each of shape (T_i, action_dim), with predictions + placed at noisy positions. Clean positions are left as zeros. + """ + unpacked: list[torch.Tensor] = [] + start_idx = 0 + for shape, noisy_frame_indexes in zip(token_shapes_action, noisy_frame_indexes_action): + T = shape[0] + output = torch.zeros( + (T, self.action_dim), + device=packed_action_preds.device, + dtype=packed_action_preds.dtype, + ) + t_n = len(noisy_frame_indexes) + if t_n > 0: + end_idx = start_idx + t_n + output[noisy_frame_indexes] = packed_action_preds[start_idx:end_idx] + start_idx = end_idx + unpacked.append(output) + return unpacked + + def pack_sound_latents( + self, + tokens_sound: list[torch.Tensor], + token_shapes_sound: list[tuple[int, int, int]], + ) -> torch.Tensor: + """Pack sound latents into a 1D sequence for transformer input. 
+ + Args: + tokens_sound: List of sound latent tensors, each [C, T] + token_shapes_sound: List of (T, 1, 1) tuples per sample + + Returns: + Packed tensor of shape [total_sound_tokens, C] + """ + packed = [] + for sound, shape in zip(tokens_sound, token_shapes_sound): + T = shape[0] + # sound: [C, T] → take first T frames → [C, T] + # Then permute to [T, C] for packing + sound_tokens = sound[:, :T].permute(1, 0) # [T,C] + packed.append(sound_tokens) + return torch.cat(packed, dim=0) # [total_sound_tokens,C] + + def unpack_sound_latents( + self, + packed_sound_preds: torch.Tensor, + token_shapes_sound: list[tuple[int, int, int]], + noisy_frame_indexes_sound: list[torch.Tensor], + ) -> list[torch.Tensor]: + """Unpack sound predictions back into per-sample sound latents. + + Args: + packed_sound_preds: Packed sound predictions of shape (total_noisy_tokens, sound_dim) + token_shapes_sound: List of (T, 1, 1) tuples per sample + noisy_frame_indexes_sound: List of tensors, each with shape (T_i,), where the values + are the noisy frame indices. T_i <= max_T. + + Returns: + List of per-sample tensors, each [C, T], with predictions placed at noisy positions. + Clean positions are left as zeros.
+ """ + unpacked = [] + start_idx = 0 + for shape, noisy_frame_indexes in zip(token_shapes_sound, noisy_frame_indexes_sound): + T = shape[0] + # Initialize output with zeros for clean positions + output = torch.zeros( + (self.sound_dim, T), + device=packed_sound_preds.device, + dtype=packed_sound_preds.dtype, + ) + + t_n = len(noisy_frame_indexes) + + if t_n > 0: + end_idx = start_idx + t_n + # packed_sound_preds: [total_noisy_tokens, C] → transpose and fill at noisy positions + output[:, noisy_frame_indexes] = packed_sound_preds[ + start_idx:end_idx + ].T # packed_sound_preds[...]: [T_n,C] → .T: [C,T_n] + start_idx = end_idx + + unpacked.append(output) + return unpacked + + def _encode_text( + self, + packed_seq: PackedSequence, + ) -> tuple[torch.Tensor, torch.dtype]: + """Embed text tokens and initialize packed_sequence. + + Args: + packed_seq: PackedSequence containing text_ids and text_indexes. + + Returns: + tuple of (packed_sequence, target_dtype) where packed_sequence has text embeddings filled in. + """ + packed_text_embedding = self.language_model.model.embed_tokens(packed_seq.text_ids) # [N_text,hidden_size] + packed_sequence = packed_text_embedding.new_zeros( + size=(packed_seq.sequence_length, self.hidden_size) + ) # [N_total,hidden_size] + packed_sequence[packed_seq.text_indexes] = ( + packed_text_embedding # [N_text,hidden_size] scattered into [N_total,hidden_size] + ) + return packed_sequence, packed_text_embedding.dtype + + def _encode_vision( + self, + packed_seq: PackedSequence, + packed_sequence: torch.Tensor, + target_dtype: torch.dtype, + fps: Optional[torch.Tensor] = None, + ) -> List[Tuple[int, int, int]] | None: + """Project vision tokens and fill into packed_sequence. + + Args: + packed_seq: PackedSequence containing vision tokens and metadata. + packed_sequence: The packed sequence tensor to fill vision embeddings into (modified in-place). + target_dtype: Target dtype for embeddings (typically from text embedding). 
+ fps: Optional FPS tensor for RoPE modulation. + + Returns: + Original latent shapes before padding (for unpadding during decode), or None if no vision tokens. + """ + if packed_seq.vision is None or packed_seq.vision.tokens is None: + # No vision tokens in this batch + return None + + vision = packed_seq.vision + assert vision.tokens is not None # Type narrowing (checked above but reassignment loses it) + assert vision.token_shapes is not None + assert isinstance(vision.sequence_indexes, torch.Tensor) + assert isinstance(vision.timesteps, torch.Tensor) + assert isinstance(vision.mse_loss_indexes, torch.Tensor) + + packed_tokens_vision, original_latent_shapes = self.patchify_and_pack_latents( + vision.tokens, vision.token_shapes + ) # packed_tokens_vision: [total_vision_patches,patch_latent_dim] + + packed_tokens_vision = self.vae2llm(packed_tokens_vision) # [total_vision_patches,hidden_size] + + # Add absolute position embedding only when NOT using unified 3D mRoPE + # (3D mRoPE provides positional information via rotary embeddings instead) + if self.latent_pos_embed is not None: + latent_token_pos_emb = self.latent_pos_embed(vision.token_shapes, fps=fps).to( + target_dtype + ) # [total_vision_patches,hidden_size] + packed_tokens_vision = packed_tokens_vision + latent_token_pos_emb # [total_vision_patches,hidden_size] + + has_noisy_vision = vision.mse_loss_indexes.numel() > 0 + + if has_noisy_vision: + timesteps_vision = vision.timesteps * self.timestep_scale # [N_noisy_frames_vision] + + # Timesteps are computed in FP32 for numerical stability. 
+ with torch.autocast("cuda", enabled=True, dtype=torch.float32): + packed_timestep_embeds_vision = self.time_embedder( + timesteps_vision + ) # [N_noisy_frames_vision,hidden_size] + packed_timestep_embeds_vision = packed_timestep_embeds_vision.to( + target_dtype + ) # [N_noisy_frames_vision,hidden_size] + + packed_tokens_vision = _apply_timestep_embeds_to_noisy_tokens( + packed_tokens=packed_tokens_vision, + packed_timestep_embeds=packed_timestep_embeds_vision, + noisy_frame_indexes=vision.noisy_frame_indexes, + token_shapes=vision.token_shapes, + ) # [total_vision_patches,hidden_size] + + packed_sequence[vision.sequence_indexes] = ( + packed_tokens_vision # [total_vision_patches,hidden_size] scattered into [N_total,hidden_size] + ) + return original_latent_shapes + + def _decode_vision( + self, + packed_seq: PackedSequence, + last_hidden_state: torch.Tensor, + output_dict: dict, + original_latent_shapes: List[Tuple[int, int, int]] | None = None, + ) -> None: + """Decode vision tokens from hidden states and update output_dict. + + Args: + packed_seq: PackedSequence containing mse_loss_indexes_vision and token_shapes_vision. + last_hidden_state: Hidden states from the transformer. + output_dict: Output dictionary to update with preds_vision (modified in-place). + original_latent_shapes: Original latent shapes before padding (for unpadding). + """ + vision = packed_seq.vision + # Check if no vision or no noisy vision tokens + has_noisy_vision = ( + vision is not None + and vision.tokens is not None + and isinstance(vision.mse_loss_indexes, torch.Tensor) + and vision.mse_loss_indexes.numel() > 0 + ) + if not has_noisy_vision: + # No noisy vision tokens present. The model is predicting actions + # given clean vision tokens. We need to execute a dummy forward to maintain + # computation graph consistency across ranks (FSDP should touch all weights).
+ preds_vision = torch.zeros( + [1, self.patch_latent_dim], device=last_hidden_state.device, dtype=last_hidden_state.dtype + ) # [1,patch_latent_dim] + preds_vision = self.vae2llm(preds_vision) # [1,hidden_size] + preds_vision = self.llm2vae(preds_vision) # [1,patch_latent_dim] + # Return a list of per-sample zero tensors with correct shapes (e.g. (C, T, H, W)), + # so downstream code (_get_velocity, _compute_flow_matching_loss) that iterates over preds_vision + # gets properly-shaped tensors. Without this, the dummy tensor (1, patch_latent_dim) + # would cause a size mismatch when concatenating vision+action velocities. + # When vision is None (no vision in batch), fall back to [preds_vision] purely for + # gradient graph consistency — it won't be iterated over. + if vision is not None and vision.tokens is not None: + preds_vision_list = [torch.zeros_like(tok) for tok in vision.tokens] + # Inject dummy forward's computation graph so vae2llm/llm2vae params + # stay in the autograd graph (zeros_like creates detached tensors). 
+ preds_vision_list[0] = preds_vision_list[0] + 0.0 * preds_vision.sum() + else: + preds_vision_list = [preds_vision] + output_dict.update(preds_vision=preds_vision_list) + else: + assert vision is not None # Type narrowing + assert isinstance(vision.mse_loss_indexes, torch.Tensor) + assert vision.noisy_frame_indexes is not None + preds_vision = self.llm2vae( + last_hidden_state[vision.mse_loss_indexes] + ) # [total_noisy_vision_patches,patch_latent_dim] + preds_vision = self.unpatchify_and_unpack_latents( + preds_vision, + token_shapes_vision=vision.token_shapes, + noisy_frame_indexes_vision=vision.noisy_frame_indexes, + original_latent_shapes=original_latent_shapes, + ) + output_dict.update(preds_vision=preds_vision) + + def _encode_action( + self, + packed_seq: PackedSequence, + packed_sequence: torch.Tensor, + target_dtype: torch.dtype, + fps_action: Optional[torch.Tensor] = None, + ) -> None: + """Encode action tokens and fill into packed_sequence.""" + if packed_seq.action is None or packed_seq.action.tokens is None: + # No action tokens in this batch + return + + action: ModalityData = packed_seq.action + assert action.token_shapes is not None + assert isinstance(action.sequence_indexes, torch.Tensor) + assert isinstance(action.timesteps, torch.Tensor) + assert isinstance(action.mse_loss_indexes, torch.Tensor) + + # Pack variable-length action tokens into a 1D sequence (same pattern as pack_sound_latents) + packed_tokens_action, per_token_domain_id = self.pack_action( + action.tokens, action.token_shapes, action.domain_id + ) + packed_tokens_action = self.action2llm(packed_tokens_action, per_token_domain_id) + + # Add additive position embedding only if not using unified_3d_mrope + if self.action_pos_embed is not None: + # VideoRopePosition3DEmb expects shapes as (t, h, w). For actions we use a 1x1 spatial grid. 
+ action_shapes_3d = [(ts[0], 1, 1) for ts in action.token_shapes] + action_token_pos_emb = self.action_pos_embed( + action_shapes_3d, + fps=fps_action, + start_frame_offset=1, + ).to(target_dtype) # [B_action*T_action,hidden_size] + packed_tokens_action = packed_tokens_action + action_token_pos_emb # [B_action*T_action,hidden_size] + + packed_tokens_action = packed_tokens_action + self.action_modality_embed.view( + 1, -1 + ) # [B_action*T_action,hidden_size] + + has_noisy_actions = action.mse_loss_indexes.numel() > 0 + if has_noisy_actions: + timesteps_action = action.timesteps * self.timestep_scale # [N_noisy_frames_action] + with torch.autocast("cuda", enabled=True, dtype=torch.float32): + packed_timestep_embeds_action = self.time_embedder( + timesteps_action + ) # [N_noisy_frames_action,hidden_size] + packed_timestep_embeds_action = packed_timestep_embeds_action.to( + target_dtype + ) # [N_noisy_frames_action,hidden_size] + + packed_tokens_action = _apply_timestep_embeds_to_noisy_tokens( + packed_tokens=packed_tokens_action, + packed_timestep_embeds=packed_timestep_embeds_action, + noisy_frame_indexes=action.noisy_frame_indexes, + token_shapes=action.token_shapes, + ) # [B_action*T_action,hidden_size] + + packed_sequence[action.sequence_indexes] = ( + packed_tokens_action # [B_action*T_action,hidden_size] scattered into [N_total,hidden_size] + ) + + def _decode_action( + self, + packed_seq: PackedSequence, + last_hidden_state: torch.Tensor, + output_dict: dict, + ) -> None: + """Decode action tokens from hidden states and update output_dict.""" + action = packed_seq.action + # Check if no action or no noisy action tokens + has_noisy_action = ( + action is not None + and action.tokens is not None + and isinstance(action.mse_loss_indexes, torch.Tensor) + and action.mse_loss_indexes.numel() > 0 + ) + if not has_noisy_action: + # dummy forward to maintain computation graph consistency across ranks + preds_action = torch.zeros( + [1, self.action_dim], 
device=last_hidden_state.device, dtype=last_hidden_state.dtype + ) # [1,action_dim] + dummy_domain_id = torch.zeros([1], device=last_hidden_state.device, dtype=torch.long) # [1] + preds_action = self.action2llm(preds_action, dummy_domain_id) + self.action_modality_embed.view( + 1, -1 + ) # [1,hidden_size] + preds_action = self.llm2action(preds_action, dummy_domain_id) # [1,action_dim] + # Return a list of per-sample zero tensors with correct shapes (e.g. (T, action_dim)), + # so downstream code (_get_velocity, _compute_flow_matching_loss) that iterates over preds_action + # gets properly-shaped tensors. Without this, the dummy tensor (1, action_dim) + # would cause a size mismatch when concatenating vision+action velocities. + if action is not None and action.tokens is not None: + preds_action_list = [torch.zeros_like(tok) for tok in action.tokens] + # Inject dummy forward's computation graph so DomainAwareLinear params + # stay in the autograd graph (zeros_like creates detached tensors). + preds_action_list[0] = preds_action_list[0] + 0.0 * preds_action.sum() + # When action is None (no action in batch), fall back to [preds_action] purely for + # gradient graph consistency — it won't be iterated over. 
+ else: + preds_action_list = [preds_action] + output_dict.update(preds_action=preds_action_list) + else: + assert action is not None # Type narrowing + assert isinstance(action.mse_loss_indexes, torch.Tensor) + assert action.condition_mask is not None + assert len(action.domain_id) > 0 + + action_hidden_states = last_hidden_state[action.mse_loss_indexes] # [total_noisy_action_tokens,hidden_size] + + # Build per-token domain IDs for the noisy tokens (same expansion logic as pack_action) + domain_ids: list[torch.Tensor] = [] + for nfi, d_id in zip(action.noisy_frame_indexes, action.domain_id): + domain_ids.append(d_id.expand(len(nfi))) + per_token_domain_id = torch.cat(domain_ids, dim=0) + + preds_action = self.llm2action( + action_hidden_states, per_token_domain_id + ) # [total_noisy_action_tokens,action_dim] + preds_action = self.unpack_action(preds_action, action.token_shapes, action.noisy_frame_indexes) + output_dict.update(preds_action=preds_action) + + def _encode_sound( + self, + packed_seq: PackedSequence, + packed_sequence: torch.Tensor, + target_dtype: torch.dtype, + fps_sound: Optional[torch.Tensor] = None, + ) -> None: + """Encode sound tokens and fill into packed_sequence. + + Args: + packed_seq: PackedSequence containing sound tokens and metadata. + packed_sequence: The packed sequence tensor to fill sound embeddings into (modified in-place). + target_dtype: Target dtype for embeddings (typically from text embedding). + fps_sound: FPS tensor for RoPE modulation. Should be the sound latent rate (e.g., 25 Hz). 
+ """ + if packed_seq.sound is None or packed_seq.sound.tokens is None: + # No sound tokens in this batch + return + + sound = packed_seq.sound + assert sound.token_shapes is not None + assert isinstance(sound.sequence_indexes, torch.Tensor) + assert isinstance(sound.timesteps, torch.Tensor) + assert isinstance(sound.mse_loss_indexes, torch.Tensor) + + # Pack sound latents: list of [C, T] tensors → [total_tokens, C] + packed_tokens_sound = self.pack_sound_latents( + sound.tokens, sound.token_shapes + ) # [total_sound_tokens,sound_dim] + packed_tokens_sound = packed_tokens_sound.to(target_dtype) # [total_sound_tokens,sound_dim] + + # Project sound tokens + modality embedding + + # No additive position embedding is used (unlike legacy video which keeps one for backward compat). + packed_tokens_sound = ( + self.sound2llm(packed_tokens_sound) + self.sound_modality_embed + ) # [total_sound_tokens,hidden_size] + + has_noisy_sound = sound.mse_loss_indexes.numel() > 0 + if has_noisy_sound: + timesteps_sound = sound.timesteps * self.timestep_scale # [N_noisy_frames_sound] + with torch.autocast("cuda", enabled=True, dtype=torch.float32): + packed_timestep_embeds_sound = self.time_embedder(timesteps_sound) # [N_noisy_frames_sound,hidden_size] + packed_timestep_embeds_sound = packed_timestep_embeds_sound.to( + target_dtype + ) # [N_noisy_frames_sound,hidden_size] + + packed_tokens_sound = _apply_timestep_embeds_to_noisy_tokens( + packed_tokens=packed_tokens_sound, + packed_timestep_embeds=packed_timestep_embeds_sound, + noisy_frame_indexes=sound.noisy_frame_indexes, + token_shapes=sound.token_shapes, + ) # [total_sound_tokens,hidden_size] + + packed_sequence[sound.sequence_indexes] = ( + packed_tokens_sound # [total_sound_tokens,hidden_size] scattered into [N_total,hidden_size] + ) + + def _decode_sound( + self, + packed_seq: PackedSequence, + last_hidden_state: torch.Tensor, + output_dict: dict, + ) -> None: + """Decode sound tokens from hidden states and update output_dict. 
+ + Args: + packed_seq: PackedSequence containing sound modality data. + last_hidden_state: Hidden states from the transformer. + output_dict: Output dictionary to update with preds_sound (modified in-place). + """ + sound = packed_seq.sound + # Check if no sound or no noisy sound tokens + has_noisy_sound = ( + sound is not None + and sound.tokens is not None + and isinstance(sound.mse_loss_indexes, torch.Tensor) + and sound.mse_loss_indexes.numel() > 0 + ) + if not has_noisy_sound: + # dummy forward to maintain computation graph consistency across ranks + preds_sound = torch.zeros( + [1, self.sound_dim], device=last_hidden_state.device, dtype=last_hidden_state.dtype + ) # [1,sound_dim] + preds_sound = self.sound2llm(preds_sound) + self.sound_modality_embed # [1,hidden_size] + preds_sound = self.llm2sound(preds_sound) # [1,sound_dim] + if sound is not None and sound.tokens is not None: + preds_sound_list = [torch.zeros_like(tok) for tok in sound.tokens] + preds_sound_list[0] = preds_sound_list[0] + 0.0 * preds_sound.sum() + else: + preds_sound_list = [preds_sound] + output_dict.update(preds_sound=preds_sound_list) + else: + assert sound is not None # Type narrowing + assert isinstance(sound.mse_loss_indexes, torch.Tensor) + assert sound.condition_mask is not None + preds_sound = self.llm2sound( + last_hidden_state[sound.mse_loss_indexes] + ) # [total_noisy_sound_tokens,sound_dim] + preds_sound = self.unpack_sound_latents( + preds_sound, sound.token_shapes, sound.noisy_frame_indexes + ) # list of [C,T] per sample + output_dict.update(preds_sound=preds_sound) + + def forward( + self, + packed_seq: PackedSequence, + fps_vision: Optional[torch.Tensor] = None, + fps_action: Optional[torch.Tensor] = None, + fps_sound: Optional[torch.Tensor] = None, + memory: MemoryState | None = None, + ) -> dict: + """ + Forward pass for Cosmos3VFMNetwork. + + Args: + packed_seq: PackedSequence containing all packed tensors and metadata. + See PackedSequence dataclass for field details. 
+ fps_vision: Optional FPS tensor for vision RoPE modulation. + fps_action: Optional FPS tensor for action RoPE modulation. + fps_sound: Optional FPS tensor for sound RoPE modulation (e.g., sound_latent_fps=25). + memory: Optional MemoryState for persistent KV-cache memory + (AR inference or rolling-KV-cache training). Built by + ``OmniMoTModel.build_memory_state()``. + + Returns: + dict with keys: + - "preds_vision": list[Tensor[C,T,H,W]], one per sample. + - "preds_action": Velocity predictions for action tokens (if action_gen). + - "preds_sound": Velocity predictions for sound tokens (if sound_gen). + - "last_hidden_state": Last hidden state from the transformer. + - "lbl_metadata_*": Load balancing metadata. + - "ce_preds": Cross-entropy predictions (if predict_text_tokens is True). + """ + # Note: During inference with @torch.no_grad(), model may be in training mode + # This is intentional for proper batch norm / dropout behavior + # assert self.training, "Cosmos3VFMNetwork only supports training mode" + + packed_sequence, target_dtype = self._encode_text(packed_seq) # packed_sequence: [N_total,hidden_size] + + # encode vision tokens + original_latent_shapes: List[Tuple[int, int, int]] | None = None + if self.config.vision_gen: + original_latent_shapes = self._encode_vision(packed_seq, packed_sequence, target_dtype, fps_vision) + + # encode action tokens + if self.config.action_gen: + self._encode_action(packed_seq, packed_sequence, target_dtype, fps_action) + + # encode sound tokens + if self.config.sound_gen: + self._encode_sound(packed_seq, packed_sequence, target_dtype, fps_sound) + + assert self.use_moe + assert packed_seq.attn_modes is not None + assert packed_seq.split_lens is not None + + # Get all generation sequence indexes for MoE routing + # IMPORTANT: Include ALL latent tokens (video + action + sound), not just generation targets. 
+ # Condition tokens still need to be routed to diffusion experts; they are excluded from + # LOSS computation, not from routing. + all_gen_indexes = [] + if packed_seq.vision is not None: + assert packed_seq.vision.token_shapes is not None + assert isinstance(packed_seq.vision.sequence_indexes, torch.Tensor) + all_gen_indexes.append(packed_seq.vision.sequence_indexes) + if packed_seq.action is not None and isinstance(packed_seq.action.sequence_indexes, torch.Tensor): + all_gen_indexes.append(packed_seq.action.sequence_indexes) + if packed_seq.sound is not None and isinstance(packed_seq.sound.sequence_indexes, torch.Tensor): + all_gen_indexes.append(packed_seq.sound.sequence_indexes) + vision_sequence_indexes = torch.cat(all_gen_indexes, dim=0) if all_gen_indexes else None # [N_gen_tokens] + + # When temporal causal is enabled the buffer is [action_t0, vision_t0, action_t1, vision_t1, ...]. + # After torch.cat([vision_indexes, action_indexes]) the interleaved order is lost; sorting restores it. + if self.video_temporal_causal: + assert packed_seq.sound is None, "Sound generation is not supported with video_temporal_causal=True." + if vision_sequence_indexes is not None: + vision_sequence_indexes = vision_sequence_indexes.sort().values # [N_gen_tokens] + + vision_token_shapes = packed_seq.vision.token_shapes if packed_seq.vision else None + + # The packer is the single source of truth for the supertoken layout. + # ``num_action_tokens_per_supertoken`` is stamped onto ``packed_seq`` by + # ``_pack_supertokens_temporal_causal`` (= tcf when actions are packed + # inline, 0 otherwise) and read unchanged by the attention builder, the + # NATTEN metadata generator, and the rolling KV-cache state — keeping + # all downstream supertoken geometry automatically in sync with the pack. 
+ num_action_tokens_per_supertoken = packed_seq.num_action_tokens_per_supertoken + + input_pack, attention_meta, natten_metadata_list = build_packed_sequence( + self.config.joint_attn_implementation, + packed_sequence=packed_sequence, + attn_modes=packed_seq.attn_modes, + split_lens=packed_seq.split_lens, + sample_lens=packed_seq.sample_lens, + packed_und_token_indexes=packed_seq.text_indexes, + packed_gen_token_indexes=vision_sequence_indexes, + num_heads=self.num_heads, + is_image_batch=packed_seq.is_image_batch, + head_dim=self.head_dim, + num_layers=self.num_hidden_layers, + token_shapes=packed_seq.vision.token_shapes, + natten_parameter_list=self.natten_parameter_list, + cp_world_size=self.parallel_dims.cp_size if self.parallel_dims else 1, + video_temporal_causal=self.video_temporal_causal, + use_rolling_kv_cache=memory is not None and memory.uses_rolling_kv_cache, + vision_token_shapes=vision_token_shapes, + action_token_shapes=packed_seq.action.token_shapes if packed_seq.action else None, + num_action_tokens_per_supertoken=num_action_tokens_per_supertoken, + null_action_supertokens=packed_seq.null_action_supertokens, + pad_for_cuda_graphs=self.pad_for_cuda_graphs, + ) + + input_pack, packed_position_ids = get_context_parallel_sharded_sequence( + attn_implementation=self.config.joint_attn_implementation, + input_pack=input_pack, + position_ids=packed_seq.position_ids, + parallel_dims=self.parallel_dims, + ) + + packed_outputs, lbl_metadata = self.language_model( + input_pack, + attention_mask=attention_meta, + position_ids=packed_position_ids, + natten_metadata_list=natten_metadata_list, + memory=memory, + ) + last_hidden_state = get_context_parallel_last_hidden_state( + packed_outputs=packed_outputs, + parallel_dims=self.parallel_dims, + ) # [N_total,hidden_size] + output_dict = dict() + + # decode vision tokens + if self.config.vision_gen: + self._decode_vision(packed_seq, last_hidden_state, output_dict, original_latent_shapes) + + # decode action tokens + 
if self.config.action_gen: + self._decode_action(packed_seq, last_hidden_state, output_dict) + + # decode sound tokens + if self.config.sound_gen: + self._decode_sound(packed_seq, last_hidden_state, output_dict) + + output_dict.update(last_hidden_state=last_hidden_state) + for lbl_metadata_key, lbl_metadata_value in lbl_metadata.items(): + output_dict.update({f"lbl_metadata_{lbl_metadata_key}": lbl_metadata_value}) + if self.predict_text_tokens: + packed_ce_preds = self.language_model.lm_head( + last_hidden_state[packed_seq.ce_loss_indexes] + ) # [N_ce_tokens,vocab_size] + output_dict["ce_preds"] = packed_ce_preds + + return output_dict + + +def _apply_timestep_embeds_to_noisy_tokens( + packed_tokens: torch.Tensor, + packed_timestep_embeds: torch.Tensor, + noisy_frame_indexes: List[torch.Tensor], + token_shapes: list[tuple[int, ...]], +) -> torch.Tensor: + """Apply timestep embeddings to noisy tokens. + Tn is the number of noisy frames for a given sample. + Tc is the number of clean frames for a given sample. + T is the total number of frames for a given sample. + T = Tn + Tc + + Args: + packed_tokens: The packed tokens to apply timestep embeddings to. + packed_timestep_embeds: The packed timestep embeddings to apply. + noisy_frame_indexes: The frame indices to apply timestep embeddings to + (list of tensors, each with shape (Tn,)). + token_shapes: The token shapes for each sample. Each entry is a tuple + shaped like ``(T, ...)`` where trailing dimensions represent the spatial grid. + + Returns: + The packed tokens with timestep embeddings applied to the noisy tokens. + """ + + # Handle variable token shapes by processing each sample's noisy_frame_indexes individually. + # The noisy indices are first expanded to cover the entire spatial grid of each frame. + # + # For video frames, the spatial grid is (H, W). + # For action frames, the spatial grid is (). + # For sound frames, the spatial grid is (1, 1). 
+ # + # The noisy indices are then flattened into a single tensor overall. When flattening, + # we must ensure that the noisy indices from each sample are unique by adding the + # cumulative sum of the token shapes of previous samples to the noisy indices for + # a given sample. + start_noisy_index = 0 + flattened_noisy_frame_indexes = [] + + for noisy_indexes_i, token_shape_i in zip(noisy_frame_indexes, token_shapes): + assert noisy_indexes_i.numel() <= token_shape_i[0] + spatial_numel_i = math.prod(token_shape_i[1:]) + spatial_indexes_i = torch.arange(spatial_numel_i, device=packed_tokens.device) # [spatial_numel_i] + noisy_indexes_i = ( + (noisy_indexes_i * spatial_numel_i).unsqueeze(-1).expand(-1, spatial_numel_i) + ) # [Tn_i,spatial_numel_i] + noisy_indexes_i = noisy_indexes_i.clone() + spatial_indexes_i + start_noisy_index # [Tn_i,spatial_numel_i] + flattened_noisy_frame_indexes.append(noisy_indexes_i.flatten()) # [Tn_i*spatial_numel_i] + start_noisy_index += math.prod(token_shape_i) + + flattened_noisy_frame_indexes = torch.cat(flattened_noisy_frame_indexes, dim=0) # [total_noisy_patches] + + assert packed_tokens.dim() == 2 + assert packed_timestep_embeds.dim() == 2 + assert packed_timestep_embeds.shape[1] == packed_tokens.shape[1] + assert packed_timestep_embeds.shape[0] <= packed_tokens.shape[0] + assert flattened_noisy_frame_indexes.dim() == 1 + assert flattened_noisy_frame_indexes.shape[0] == packed_timestep_embeds.shape[0] + + flattened_noisy_frame_indexes = flattened_noisy_frame_indexes.unsqueeze(-1).expand( + -1, + packed_tokens.shape[1], + ) # [total_noisy_patches,hidden_size] + + return packed_tokens.scatter_add( + dim=0, + index=flattened_noisy_frame_indexes, + src=packed_timestep_embeds, + ) # [total_tokens,hidden_size] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/domain_aware_linear.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/domain_aware_linear.py new file mode 100644 index 00000000..d4f86c27 --- /dev/null +++ 
b/cosmos-inference/cosmos3/_src/vfm/models/mot/domain_aware_linear.py @@ -0,0 +1,90 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Domain-aware linear layer for multi-embodiment robot learning. + +This module provides a linear layer with domain-conditioned parameters, +where each domain (embodiment) has its own weight and bias vectors. + +Based on the X-VLA implementation: +https://github.com/2toinf/X-VLA/blob/main/models/transformer.py +""" + +import torch +from torch import nn + + +class DomainAwareLinear(nn.Module): + """Linear layer with domain-conditioned parameters (per-sample). + + Each domain has its own weight and bias vectors, stored in embeddings. + During forward pass, weights are retrieved based on per-sample domain IDs. + + This enables learning domain-specific transformations for different robot + embodiments while sharing the overall model architecture. + """ + + def __init__(self, input_size: int, output_size: int, num_domains: int = 50) -> None: + """Initialize the domain-aware linear layer. + + Args: + input_size: Dimension of input features. + output_size: Dimension of output features. + num_domains: Number of domains (embodiments) to support. 
+ """ + super().__init__() + self.input_size = input_size + self.output_size = output_size + self.num_domains = num_domains + + # Store per-domain weights as embeddings: [num_domains, output_size * input_size] + self.fc = nn.Embedding(num_domains, output_size * input_size) + # Store per-domain biases as embeddings: [num_domains, output_size] + self.bias = nn.Embedding(num_domains, output_size) + + # Initialize weights + nn.init.xavier_uniform_(self.fc.weight) + nn.init.zeros_(self.bias.weight) + + def forward(self, x: torch.Tensor, domain_id: torch.LongTensor) -> torch.Tensor: + """Forward pass with domain-specific weights. + + Args: + x: Input tensor of shape [B, I] or [B, T, I] where B is batch size, + T is sequence length, and I is input_size. + domain_id: Domain indices of shape [B], one per sample in the batch. + + Returns: + Output tensor of shape [B, O] or [B, T, O] where O is output_size. + """ + B = domain_id.shape[0] + + # Retrieve per-sample weights: [B, input_size, output_size] + W = self.fc(domain_id).view(B, self.input_size, self.output_size) # [B,input_size,output_size] + + # Retrieve per-sample biases: [B, output_size] + b = self.bias(domain_id).view(B, self.output_size) # [B,output_size] + + if x.dim() == 2: + # 2D input: [B, I] @ [B, I, O] -> [B, O] + return ( + torch.bmm(x.unsqueeze(1), W).squeeze(1) + b + ) # [B,1,input_size] @ [B,input_size,output_size] -> [B,output_size] + else: + # 3D input: [B, T, I] @ [B, I, O] -> [B, T, O] + # Bias [B, O] -> [B, 1, O] for broadcasting + return torch.bmm(x, W) + b.unsqueeze( + 1 + ) # [B,T,input_size] @ [B,input_size,output_size] -> [B,T,output_size] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/dot_product_attention.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/dot_product_attention.py new file mode 100644 index 00000000..2a080705 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/dot_product_attention.py @@ -0,0 +1,452 @@ +# Copyright (c) 2022-2025, NVIDIA CORPORATION & 
AFFILIATES. All rights reserved. +# +# See LICENSE for license information. + +""" +Simplified wrapper around TransformerEngine's C++ pytorch backend. +This supports torch.compile(fullgraph=True). +Lowers to cudnn ultimately. +Only bf16 / fp16 is supported. +Only THD layout is supported. +Currently, tensors are made contiguous -- packed th2d, th3d not supported yet. +""" + +import math +from typing import List, Optional, Tuple + +import torch +import transformer_engine + +_TE_VER = tuple(int(x) for x in transformer_engine.__version__.split(".")[:2]) + + +try: + # transformer_engine 2.8.0 + import transformer_engine.pytorch.attention.dot_product_attention.utils as dpa_utils +except ImportError: + # older transformer_engine + import transformer_engine.pytorch.dot_product_attention.utils as dpa_utils # type: ignore + +import transformer_engine_torch as tex +from transformer_engine.pytorch.constants import ( + TE_DType, +) +from transformer_engine.pytorch.cpp_extensions.fused_attn import ( + AttnBiasType, + AttnMaskType, + QKVLayout, +) + +if _TE_VER >= (2, 8): + from transformer_engine.pytorch.cpp_extensions.fused_attn import SoftmaxType + + +__all__ = ["cudnn_fused_attention"] + + +def get_window_size(attn_mask_type: str) -> Tuple[int, int]: + return dpa_utils.check_set_window_size(attn_mask_type) + + +def cudnn_fused_attention( + query_layer: torch.Tensor, # [total_tokens_q,num_heads,head_dim] + key_layer: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + value_layer: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + cu_seqlens_q: torch.Tensor, + cu_seqlens_kv: torch.Tensor, + max_seqlen_q: Optional[int] = None, + max_seqlen_kv: Optional[int] = None, + attn_mask_type: str = "causal", + attention_dropout: float = 0.0, + training: bool = True, +) -> torch.Tensor: # [total_tokens_q,num_heads*head_dim] + """fused attention fprop""" + + deterministic = torch.are_deterministic_algorithms_enabled() + window_size = get_window_size(attn_mask_type) + softmax_scale = 
1.0 / math.sqrt(key_layer.shape[-1]) + + output_tensors = cudnn_fused_attn( + training, + max_seqlen_q, + max_seqlen_kv, + cu_seqlens_q, + cu_seqlens_kv, + query_layer, + key_layer, + value_layer, + window_size, + softmax_scale, + attention_dropout if training else 0.0, + attn_mask_type, + deterministic, + ) + output = output_tensors[0] # [total_tokens_q,num_heads,head_dim] + + # ...hd -> ...(hd) + return output.view(*output.shape[:-2], -1) # [total_tokens_q,num_heads*head_dim] + + +BACKEND_F16arb_ELTS_PER_THREADS = 16 + + +@torch.library.custom_op("cosmos3::cudnn_fused_attn", mutates_args=()) +def cudnn_fused_attn( + is_training: bool, + max_seqlen_q: torch.Tensor, + max_seqlen_kv: torch.Tensor, + cu_seqlens_q: torch.Tensor, + cu_seqlens_kv: torch.Tensor, + q: torch.Tensor, # [total_tokens_q,num_heads,head_dim] + k: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + v: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + window_size: List[int], + attn_scale: float, + dropout: float, + attn_mask_type: str, + deterministic: bool, +) -> List[torch.Tensor]: + attn_bias = None + attn_bias_type = "no_bias" + fast_zero_fill = True + softmax_offset = None + softmax_type = "vanilla" + fake_dtype = q.dtype + + rng_elts_per_thread = BACKEND_F16arb_ELTS_PER_THREADS + s_quantizer = None + o_quantizer = None + rng_gen = None + + + # "thd_thd_thd" format requires contiguous tensors. + # We should benchmark thd_th2d / th3d formats as well. 
+ q = q.contiguous() + k = k.contiguous() + v = v.contiguous() + qkv_layout = "thd_thd_thd" + + cu_seqlens_q_padded = cu_seqlens_q + cu_seqlens_kv_padded = cu_seqlens_kv + + args = ( + max_seqlen_q.item(), + max_seqlen_kv.item(), + is_training, + attn_scale, + dropout, + fast_zero_fill, + QKVLayout[qkv_layout], + AttnBiasType[attn_bias_type], + AttnMaskType[attn_mask_type], + ) + + if _TE_VER >= (2, 8): + args += (SoftmaxType[softmax_type],) + + args += ( + tuple(window_size), + cu_seqlens_q, + cu_seqlens_kv, + q, + k, + v, + fake_dtype, + cu_seqlens_q_padded, + cu_seqlens_kv_padded, + None, # page_table_k, + None, # page_table_v, + s_quantizer, + o_quantizer, + attn_bias, + ) + + if _TE_VER >= (2, 8): + args += (softmax_offset,) + + args += ( + rng_gen, + rng_elts_per_thread, + ) + + if _TE_VER >= (2, 9): + # return_max_logit + args += (False,) + + if _TE_VER >= (2, 10): + # is_cuda_graph + args += (False,) + + + # I'd have to create DotProductAttention class and somehow pass it in here, but argument types for these torch.ops are very strict. + # Moreover, back-propagation would still need additional tweaks to work properly. + output_tensors = tex.fused_attn_fwd(*args) + return output_tensors + + +import math + + +def _get_max_tokens(num_tokens: int) -> int: + """ + Quantize token count: + - t = 0, ..., 1024 -> max_t = 1024 + - t = 1025, ..., 32k -> max_t = next power of 2 + - t = 32k+1, ... -> max_t = increment by 32k steps + + Note: translated from transformer_engine/common/fused_attn/utils.cu::get_max_tokens + """ + if num_tokens <= 0: + return 1024 + log2_t = math.ceil(math.log2(num_tokens)) + if log2_t <= 10: + max_t = 1024 + elif log2_t <= 15: + max_t = 2**log2_t + else: + max_t = ((num_tokens + 32767) // 32768) * 32768 + return max_t + + + +# The goal for this function is to return fake tensors of the correct shape and dtype +# without having to run the actual operator. 
+ + +@cudnn_fused_attn.register_fake +def _( + is_training: bool, + max_seqlen_q: torch.Tensor, + max_seqlen_kv: torch.Tensor, + cu_seqlens_q: torch.Tensor, + cu_seqlens_kv: torch.Tensor, + q: torch.Tensor, # [total_tokens_q,num_heads,head_dim] + k: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + v: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + window_size: List[int], + attn_scale: float, + dropout: float, + attn_mask_type: str, + deterministic: bool, +) -> List[torch.Tensor]: + max_tokens = _get_max_tokens(q.shape[0]) + return [ + q.new_empty(tuple(q.shape[:-1]) + (v.shape[-1],)), # [total_tokens_q,num_heads,head_dim] + q.new_empty( + max_tokens, q.shape[1], 1, dtype=torch.float32 + ), # these are the softmax outputs from cudnn; will always be float32 + q.new_empty((2,)), + ] + + +def cudnn_fused_attn_bwd_setup_context(ctx, inputs, output) -> None: + ( + _, # is_training + max_seqlen_q, + max_seqlen_kv, + cu_seqlens_q, + cu_seqlens_kv, + q, + k, + v, + window_size, + attn_scale, + dropout, + attn_mask_type, + deterministic, + ) = inputs + + out = output[0] + aux_ctx_tensors = output[1:] + qkvo_tensors = (q, k, v, out) + + # assume fwd and bwd always use the same high precision, i.e. 
torch.float16 or torch.bfloat16 + # used when some tensors are base tensors and lose the "dtype" attribute + ctx.nominal_dtype = q.dtype + + ctx.save_for_backward( + *qkvo_tensors, + cu_seqlens_q, + cu_seqlens_kv, + cu_seqlens_q, + cu_seqlens_kv, + *aux_ctx_tensors, + ) + + ctx.max_seqlen_q = max_seqlen_q + ctx.max_seqlen_kv = max_seqlen_kv + ctx.attn_scale = attn_scale + ctx.dropout_p = dropout + ctx.fast_zero_fill = True + ctx.attn_bias_type = "no_bias" + ctx.attn_mask_type = attn_mask_type + ctx.softmax_type = "vanilla" + ctx.window_size = window_size + ctx.deterministic = deterministic + + +@torch.library.custom_op("cosmos3::cudnn_fused_attn_bwd_op", mutates_args=()) +def cudnn_fused_attn_bwd_op( + max_seqlen_q: torch.Tensor, + max_seqlen_kv: torch.Tensor, + attn_scale: float, + dropout: float, + fast_zero_fill: bool, + attn_bias_type: str, + attn_mask_type: str, + softmax_type: str, + window_size: List[int], + deterministic: bool, + cu_seqlens_q: torch.Tensor, + cu_seqlens_kv: torch.Tensor, + q: torch.Tensor, # [total_tokens_q,num_heads,head_dim] + k: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + v: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + out: torch.Tensor, # [total_tokens_q,num_heads,head_dim] + d_out: torch.Tensor, # [total_tokens_q,num_heads,head_dim] + dqkv_nominal_dtype: torch.dtype, + dqkv_te_dtype: torch.dtype, + aux_ctx_tensors: List[torch.Tensor], + cu_seqlens_q_padded: torch.Tensor, + cu_seqlens_kv_padded: torch.Tensor, +) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]: # dq,dk,dv each [total_tokens,num_heads,head_dim] + qkv_layout = "thd_thd_thd" + args = ( + max_seqlen_q.item(), + max_seqlen_kv.item(), + attn_scale, + dropout, + fast_zero_fill, + QKVLayout[qkv_layout], + AttnBiasType[attn_bias_type], + AttnMaskType[attn_mask_type], + ) + + if _TE_VER >= (2, 8): + args += (SoftmaxType[softmax_type],) + + args += ( + window_size, + deterministic, + cu_seqlens_q, + cu_seqlens_kv, + q, + k, + v, + out, + d_out, +
dqkv_nominal_dtype, + TE_DType[dqkv_te_dtype], + aux_ctx_tensors, + cu_seqlens_q_padded, + cu_seqlens_kv_padded, + None, # s_quantizer, + None, # dp_quantizer, + None, # dqkv_quantizer, + ) + + if _TE_VER >= (2, 10): + # is_cuda_graph + args += (False,) + + dq, dk, dv, *rest = tex.fused_attn_bwd(*args) + return dq, dk, dv + + +@cudnn_fused_attn_bwd_op.register_fake +def _( + max_seqlen_q: torch.Tensor, + max_seqlen_kv: torch.Tensor, + attn_scale: float, + dropout: float, + fast_zero_fill: bool, + attn_bias_type: str, + attn_mask_type: str, + softmax_type: str, + window_size: List[int], + deterministic: bool, + cu_seqlens_q: torch.Tensor, + cu_seqlens_kv: torch.Tensor, + q: torch.Tensor, # [total_tokens_q,num_heads,head_dim] + k: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + v: torch.Tensor, # [total_tokens_kv,num_heads,head_dim] + out: torch.Tensor, # [total_tokens_q,num_heads,head_dim] + d_out: torch.Tensor, # [total_tokens_q,num_heads,head_dim] + dqkv_nominal_dtype: torch.dtype, + dqkv_te_dtype: torch.dtype, + aux_ctx_tensors: List[torch.Tensor], + cu_seqlens_q_padded: torch.Tensor, + cu_seqlens_kv_padded: torch.Tensor, +) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]: # dq,dk,dv each [total_tokens,num_heads,head_dim] + return torch.empty_like(q), torch.empty_like(k), torch.empty_like(v) + + +def cudnn_fused_attn_bwd_impl(ctx, grad): + d_out, _, _ = grad + d_out = d_out.contiguous() + + ( + q, + k, + v, + out, + cu_seqlens_q, + cu_seqlens_kv, + cu_seqlens_q_padded, + cu_seqlens_kv_padded, + *aux_ctx_tensors, + ) = ctx.saved_tensors + + if not aux_ctx_tensors[0].is_contiguous(): + aux_ctx_tensors[0] = aux_ctx_tensors[0].contiguous() + + with torch.cuda.nvtx.range("FusedAttnFunc.backward"): + # get nominal data type of dq, dk, dv + # FP16/BF16 attention: torch.float16 or torch.bfloat16 + dqkv_nominal_dtype = ctx.nominal_dtype + + # q, k, v, out, d_out, dq, dk, dv: torch.Tensor; torch.float16 or torch.bfloat16 + dq, dk, dv = cudnn_fused_attn_bwd_op( + 
ctx.max_seqlen_q, + ctx.max_seqlen_kv, + ctx.attn_scale, + ctx.dropout_p, + ctx.fast_zero_fill, + ctx.attn_bias_type, + ctx.attn_mask_type, + ctx.softmax_type, + ctx.window_size, + ctx.deterministic, + cu_seqlens_q, + cu_seqlens_kv, + q, + k, + v, + out, + d_out, + dqkv_nominal_dtype, + d_out.dtype, + aux_ctx_tensors, + cu_seqlens_q_padded, + cu_seqlens_kv_padded, + ) + + output = ( + None, # is_training + None, # max_seqlen_q + None, # max_seqlen_kv + None, # cu_seqlens_q + None, # cu_seqlens_kv + dq, + dk, + dv, + None, # window_size + None, # attn_scale + None, # dropout + None, # attn_mask_type + None, # deterministic + ) + return output + + +cudnn_fused_attn.register_autograd(cudnn_fused_attn_bwd_impl, setup_context=cudnn_fused_attn_bwd_setup_context) diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/modeling_utils.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/modeling_utils.py new file mode 100644 index 00000000..2896b0c9 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/modeling_utils.py @@ -0,0 +1,407 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import math +from typing import Optional + +import numpy as np +import torch +from einops import rearrange, repeat +from torch import nn +from torch.distributed import ProcessGroup +from transformers.activations import ACT2FN + +from cosmos3._src.vfm.datasets.sequence_packing import ModalityData + + +def has_noisy_tokens(modality_data: ModalityData | None) -> bool: + """Check if a modality has valid noisy tokens for loss computation.""" + return ( + modality_data is not None + and modality_data.tokens is not None + and isinstance(modality_data.mse_loss_indexes, torch.Tensor) + and modality_data.mse_loss_indexes.numel() > 0 + ) + + +# -------------------------------------------------------- +# 2D sine-cosine position embedding (flattened) +# References: +# DiT: https://github.com/facebookresearch/DiT/blob/main/models.py +# -------------------------------------------------------- +def get_2d_sincos_pos_embed( + embed_dim: int, grid_size_h: int, grid_size_w: int, cls_token: bool = False, extra_tokens: int = 0 +) -> np.ndarray: + grid_h = np.arange(grid_size_h, dtype=np.float32) + grid_w = np.arange(grid_size_w, dtype=np.float32) + grid = np.meshgrid(grid_w, grid_h) # here w goes first + grid = np.stack(grid, axis=0) + + grid = grid.reshape([2, 1, grid_size_h, grid_size_w]) + pos_embed = get_2d_sincos_pos_embed_from_grid(embed_dim, grid) + if cls_token and extra_tokens > 0: + pos_embed = np.concatenate([np.zeros([extra_tokens, embed_dim]), pos_embed], axis=0) + return pos_embed + + +def get_2d_sincos_pos_embed_from_grid(embed_dim: int, grid: np.ndarray) -> np.ndarray: + assert embed_dim % 2 == 0 + + # use half of dimensions to encode grid_h + emb_h = get_1d_sincos_pos_embed_from_grid(embed_dim // 2, grid[0]) # [H*W,D/2] + emb_w = get_1d_sincos_pos_embed_from_grid(embed_dim // 2, grid[1]) # [H*W,D/2] + + emb = np.concatenate([emb_h, emb_w], axis=1) # [H*W,D] + return emb + + +def get_1d_sincos_pos_embed_from_grid(embed_dim: int, pos: np.ndarray) -> np.ndarray: + """ 
+ embed_dim: output dimension for each position + pos: a list of positions to be encoded: size [M] + out: [M,D] + """ + assert embed_dim % 2 == 0 + omega = np.arange(embed_dim // 2, dtype=np.float64) + omega /= embed_dim / 2.0 + omega = 1.0 / 10000**omega # [D/2] + + pos = pos.reshape(-1) # [M] + out = np.einsum("m,d->md", pos, omega) # [M,D/2], outer product + + emb_sin = np.sin(out) # [M,D/2] + emb_cos = np.cos(out) # [M,D/2] + + emb = np.concatenate([emb_sin, emb_cos], axis=1) # [M,D] + return emb + + +class FlattenedSinCosPositionEmbedding(nn.Module): + # This module creates a flattened sin-cos position embedding for a given number of patches per side. + # Indices are created for 2D array and flattened into 1D array. + + def __init__(self, max_latent_h: int, max_latent_w: int, hidden_size: int, interpolate_pos: bool = False): + super().__init__() + self.max_latent_h = max_latent_h + self.max_latent_w = max_latent_w + self.hidden_size = hidden_size + self.interpolate_pos = interpolate_pos + self.pos_embed = nn.Parameter(torch.zeros(max_latent_h * max_latent_w, hidden_size), requires_grad=False) + self._init_weights() + + def _get_flattened_position_ids_extrapolate(self, latent_dim_h: int, latent_dim_w: int) -> torch.Tensor: + coords_h = torch.arange(0, latent_dim_h) # [H] + coords_w = torch.arange(0, latent_dim_w) # [W] + pos_ids = (coords_h[:, None] * self.max_latent_w + coords_w).flatten() # [H*W] + return pos_ids + + def _get_flattened_position_ids_interpolate(self, latent_dim_h: int, latent_dim_w: int) -> torch.Tensor: + boundaries = torch.arange(1 / self.max_latent_w, 1.0, 1 / self.max_latent_w) # [max_latent_w-1] + fractional_coords_h = torch.arange(0, 1 - 1e-6, 1 / latent_dim_h) # [H] + fractional_coords_w = torch.arange(0, 1 - 1e-6, 1 / latent_dim_w) # [W] + bucket_coords_h = torch.bucketize(fractional_coords_h, boundaries, right=True) # [H] + bucket_coords_w = torch.bucketize(fractional_coords_w, boundaries, right=True) # [W] + pos_ids = 
(bucket_coords_h[:, None] * self.max_latent_w + bucket_coords_w).flatten() # [H*W] + return pos_ids + + def _create_flattened_position_ids_packed(self, token_shapes_vision: list[tuple[int, int, int]]) -> torch.Tensor: + flattened_position_ids = [] + for t, h, w in token_shapes_vision: + if self.interpolate_pos: + flattened_position_ids.append(self._get_flattened_position_ids_interpolate(h, w)) # [H*W] + else: + flattened_position_ids.append(self._get_flattened_position_ids_extrapolate(h, w)) # [H*W] + flattened_position_ids_packed = torch.cat(flattened_position_ids, dim=0) # [N_vision] + return flattened_position_ids_packed + + def _init_weights(self): + # Initialize (and freeze) pos_embed by sin-cos embedding: + pos_embed = get_2d_sincos_pos_embed( + embed_dim=self.hidden_size, grid_size_h=self.max_latent_h, grid_size_w=self.max_latent_w + ) + self.pos_embed.data.copy_(torch.from_numpy(pos_embed).float()) + + def forward(self, token_shapes_vision: list[tuple[int, int, int]], fps: Optional[torch.Tensor] = None) -> torch.Tensor: + # First create 2D index array + flattened_position_ids_packed = self._create_flattened_position_ids_packed(token_shapes_vision) # [N_vision] + return self.pos_embed[flattened_position_ids_packed] # [N_vision,hidden_size] + + +# -------------------------------------------------------- +# 2D / 3D RoPE Position Embedding +# -------------------------------------------------------- + + +class VideoRopePosition3DEmb(nn.Module): + def __init__( + self, + *, # enforce keyword arguments + head_dim: int, + len_h: int, + len_w: int, + len_t: int, + base_fps: int = 24, + base_temporal_compression_factor: int = 4, + temporal_compression_factor: int = 4, + h_extrapolation_ratio: float = 1.0, + w_extrapolation_ratio: float = 1.0, + t_extrapolation_ratio: float = 1.0, + enable_fps_modulation: bool = False, + **kwargs, # used for compatibility with other positional embeddings; unused in this class + ): + del kwargs + super().__init__() + self.base_tps = base_fps /
base_temporal_compression_factor + self.temporal_compression_factor = temporal_compression_factor + self.max_h = len_h + self.max_w = len_w + self.max_t = len_t + self.enable_fps_modulation = enable_fps_modulation + dim = head_dim + dim_h = dim // 6 * 2 + dim_w = dim_h + dim_t = dim - 2 * dim_h + assert dim == dim_h + dim_w + dim_t, f"bad dim: {dim} != {dim_h} + {dim_w} + {dim_t}" + + self.register_buffer( + "dim_spatial_range", + torch.arange(0, dim_h, 2)[: (dim_h // 2)].float() / dim_h, + persistent=True, + ) + self.register_buffer( + "dim_temporal_range", + torch.arange(0, dim_t, 2)[: (dim_t // 2)].float() / dim_t, + persistent=True, + ) + self._dim_h = dim_h + self._dim_t = dim_t + + self.h_ntk_factor = h_extrapolation_ratio ** (dim_h / (dim_h - 2)) + self.w_ntk_factor = w_extrapolation_ratio ** (dim_w / (dim_w - 2)) + self.t_ntk_factor = t_extrapolation_ratio ** (dim_t / (dim_t - 2)) + self._init_weights() + + def _init_weights(self) -> None: + dim_h = self._dim_h + dim_t = self._dim_t + + self.dim_spatial_range = ( + torch.arange(0, dim_h, 2)[: (dim_h // 2)].float().to(self.dim_spatial_range.device) / dim_h + ) + self.dim_temporal_range = ( + torch.arange(0, dim_t, 2)[: (dim_t // 2)].float().to(self.dim_spatial_range.device) / dim_t + ) + + def enable_context_parallel(self, process_group: ProcessGroup): + pass + + def disable_context_parallel(self): + pass + + def generate_embeddings( + self, + latent_shape: torch.Size, + input_fps: Optional[torch.Tensor] = None, + h_ntk_factor: Optional[float] = None, + w_ntk_factor: Optional[float] = None, + t_ntk_factor: Optional[float] = None, + start_frame_offset: int = 0, + ): + """ + Generate embeddings for the given input size. + + Args: + latent_shape (torch.Size): Input tensor size (Batch, Time, Height, Width). + input_fps (Optional[torch.Tensor], optional): Frames per second. Defaults to None. + h_ntk_factor (Optional[float], optional): Height NTK factor. If None, uses self.h_ntk_factor. 
+ w_ntk_factor (Optional[float], optional): Width NTK factor. If None, uses self.w_ntk_factor. + t_ntk_factor (Optional[float], optional): Time NTK factor. If None, uses self.t_ntk_factor. + start_frame_offset (int, optional): Offset for frame indices. Use 1 for action embeddings + so that action frame indices start at 1 instead of 0. Defaults to 0. + + Returns: + torch.Tensor: Float tensor of shape [T*H*W, head_dim] holding the rotary angles, flattened in (t, h, w) order. + """ + if input_fps is not None: + tps = input_fps / self.temporal_compression_factor + else: + tps = None + + h_ntk_factor = h_ntk_factor if h_ntk_factor is not None else self.h_ntk_factor + w_ntk_factor = w_ntk_factor if w_ntk_factor is not None else self.w_ntk_factor + t_ntk_factor = t_ntk_factor if t_ntk_factor is not None else self.t_ntk_factor + assert h_ntk_factor is not None and w_ntk_factor is not None and t_ntk_factor is not None + + h_theta = 10000.0 * h_ntk_factor + w_theta = 10000.0 * w_ntk_factor + t_theta = 10000.0 * t_ntk_factor + + h_spatial_freqs = 1.0 / (h_theta ** self.dim_spatial_range.float()) # [dim_h/2] + w_spatial_freqs = 1.0 / (w_theta ** self.dim_spatial_range.float()) # [dim_w/2] + temporal_freqs = 1.0 / (t_theta ** self.dim_temporal_range.float()) # [dim_t/2] + + B, T, H, W = latent_shape + assert H <= self.max_h and W <= self.max_w, ( + f"Input dimensions (H={H}, W={W}) exceed the maximum dimensions (max_h={self.max_h}, max_w={self.max_w})" + ) + + # Build an index sequence long enough for the largest of T, H, and W; + # it is sliced per axis below. + max_needed = max(T, H, W) + seq = torch.arange(max_needed, device=self.dim_spatial_range.device, dtype=torch.float) + + half_emb_h = torch.outer(seq[:H], h_spatial_freqs) # [H,dim_h/2] + half_emb_w = torch.outer(seq[:W], w_spatial_freqs) # [W,dim_w/2] + + # Frame indices for the embedding (always 0, 1, 2, ...)
+ frame_indices = seq[:T] # [T] + + if self.enable_fps_modulation: + uniform_tps = tps is None or tps.shape == (1,) + assert uniform_tps or B == 1 or T == 1, ( + "For video batch, B should be 1 for non-uniform fps. For image batch, T should be 1." + ) + + # apply sequence scaling in temporal dimension + if tps is None: # image case + assert T == 1, "T should be 1 for image batch." + half_emb_t = torch.outer(frame_indices, temporal_freqs) # [T,dim_t/2] + else: + # Calculate scaled time indices + # Apply start_frame_offset to the time calculation (not frame indices) + # This allows one to manipulate the start frame index of embeddings for cross-modality alignment. + scaled_time = (frame_indices + start_frame_offset) / tps[:1] * self.base_tps # [T] + half_emb_t = torch.outer(scaled_time, temporal_freqs) # [T,dim_t/2] + else: + half_emb_t = torch.outer(frame_indices, temporal_freqs) # [T,dim_t/2] + + rope_embed = torch.cat( + [ + repeat(half_emb_t, "t d -> t h w d", h=H, w=W), # [T,H,W,dim_t/2] + repeat(half_emb_h, "h d -> t h w d", t=T, w=W), # [T,H,W,dim_h/2] + repeat(half_emb_w, "w d -> t h w d", t=T, h=H), # [T,H,W,dim_w/2] + ] + * 2, + dim=-1, + ) # [T,H,W,head_dim] + + return rearrange(rope_embed, "t h w d -> (t h w) d").float() # [T*H*W,head_dim] + + def forward( + self, + token_shapes_vision: list[tuple[int, int, int]], + fps: Optional[torch.Tensor] = None, + start_frame_offset: int = 0, + ) -> torch.Tensor: + """ + With CP, this function assumes that the input tensor is already split. + It delegates the embedding generation to the generate_embeddings function. + + Args: + token_shapes_vision: List of (t, h, w) tuples for each latent. + fps: Frames per second tensor. + start_frame_offset: Offset for frame indices. Use 1 for action embeddings + so that action frame indices start at 1 instead of 0. Defaults to 0.
+ """ + + embeddings_packed = [] + for i, latent_shape in enumerate(token_shapes_vision): + # latent_shape: (t, h, w) + shape = (1, latent_shape[0], latent_shape[1], latent_shape[2]) + + # Extract FPS for this specific video + video_fps = None + if fps is not None: + assert i < fps.shape[0], f"Index {i} out of bounds for fps tensor of shape {fps.shape}" + video_fps = fps[i : i + 1] + + embeddings = self.generate_embeddings(shape, input_fps=video_fps, start_frame_offset=start_frame_offset) + embeddings_packed.append(embeddings) + + embeddings_packed = torch.cat(embeddings_packed, dim=0) # [N_vision,head_dim] + return embeddings_packed + + @property + def seq_dim(self): + return 0 + + +# -------------------------------------------------------- +# TimestepEmbedder +# Reference: +# DiT: https://github.com/facebookresearch/DiT/blob/main/models.py +# -------------------------------------------------------- +class TimestepEmbedder(nn.Module): + """ + Embeds scalar timesteps into vector representations. + """ + + def __init__(self, hidden_size, frequency_embedding_size=256): + super().__init__() + self.mlp = nn.Sequential( + nn.Linear(frequency_embedding_size, hidden_size, bias=True), + nn.SiLU(), + nn.Linear(hidden_size, hidden_size, bias=True), + ) + self.frequency_embedding_size = frequency_embedding_size + self.hidden_size = hidden_size + + def _init_weights(self): + std = 1.0 / math.sqrt(self.frequency_embedding_size) + torch.nn.init.trunc_normal_(self.mlp[0].weight, std=std, a=-3 * std, b=3 * std) + torch.nn.init.zeros_(self.mlp[0].bias) + + std = 1.0 / math.sqrt(self.hidden_size) + torch.nn.init.trunc_normal_(self.mlp[2].weight, std=std, a=-3 * std, b=3 * std) + torch.nn.init.zeros_(self.mlp[2].bias) + + @staticmethod + def timestep_embedding(t, dim, max_period=10000): + """ + Create sinusoidal timestep embeddings. + :param t: a 1-D Tensor of N indices, one per batch element. + These may be fractional. + :param dim: the dimension of the output. 
+ :param max_period: controls the minimum frequency of the embeddings. + :return: an (N, D) Tensor of positional embeddings. + """ + half = dim // 2 + freqs = torch.exp(-math.log(max_period) * torch.arange(start=0, end=half, dtype=torch.float32) / half).to( + device=t.device + ) # [D/2] + args = t[:, None].float() * freqs[None] # [N,D/2] + embedding = torch.cat([torch.cos(args), torch.sin(args)], dim=-1) # [N,2*(D//2)] + if dim % 2: + embedding = torch.cat([embedding, torch.zeros_like(embedding[:, :1])], dim=-1) # [N,D], zero-padded for odd D + return embedding + + def forward(self, t): + t_freq = self.timestep_embedding(t, self.frequency_embedding_size) # [N,frequency_embedding_size] + t_emb = self.mlp(t_freq) # [N,hidden_size] + return t_emb + + +class MLPconnector(nn.Module): + def __init__(self, in_dim: int, out_dim: int, hidden_act: str): + super().__init__() + self.activation_fn = ACT2FN[hidden_act] + self.fc1 = nn.Linear(in_dim, out_dim) + self.fc2 = nn.Linear(out_dim, out_dim) + + def forward(self, hidden_states: torch.Tensor) -> torch.Tensor: + hidden_states = self.fc1(hidden_states) # [N,out_dim] + hidden_states = self.activation_fn(hidden_states) # [N,out_dim] + hidden_states = self.fc2(hidden_states) # [N,out_dim] + return hidden_states diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/parallelize_unified_mot.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/parallelize_unified_mot.py new file mode 100644 index 00000000..9682264a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/parallelize_unified_mot.py @@ -0,0 +1,89 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import torch +import torch.nn as nn +from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import ( + checkpoint_wrapper as ptd_checkpoint_wrapper, +) +from torch.distributed.fsdp import fully_shard + +from cosmos3._src.vfm.configs.base.defaults.model_config import ParallelismConfig +from cosmos3._src.vfm.utils.parallelism import ParallelDims + + +def apply_ac(model: nn.Module): + """Apply activation checkpointing to the model.""" + + for layer_id, block in model.model.layers.named_children(): + block = ptd_checkpoint_wrapper(block, preserve_rng_state=True) + model.model.layers.register_module(layer_id, block) + + +def apply_compile(model: nn.Module, config: ParallelismConfig): + """ + Apply torch.compile to each TransformerBlock, which makes compilation efficient due to + repeated structure. Alternatively one can compile the whole model (after applying DP). + """ + compile_options = {} + if config.max_autotune_pointwise: + compile_options["max_autotune_pointwise"] = True + if config.coordinate_descent_tuning: + compile_options["coordinate_descent_tuning"] = True + + for layer_id, block in model.model.layers.named_children(): + block = torch.compile( + block, + fullgraph=True, + dynamic=config.compile_dynamic, + mode="reduce-overhead" if config.use_cuda_graphs else None, + options=compile_options or None, + ) + model.model.layers.register_module(layer_id, block) + + +def apply_fsdp( + model: nn.Module, + parallel_dims: ParallelDims, +): + """ + Apply data parallelism (via FSDP2) to the model. 
+ + Args: + model (nn.Module): The model to apply data parallelism to. + parallel_dims (ParallelDims): The device mesh to use for data parallelism and expert parallel. + """ + for _, block in model.model.layers.named_children(): + fully_shard(block, mesh=parallel_dims.dp_mesh) + + +def parallelize_unified_mot( + model: nn.Module, + parallel_dims: ParallelDims | None, + config: ParallelismConfig, +) -> nn.Module: + """Optimize the model using FSDP, activation checkpointing, and torch.compile. + + FSDP reduces memory usage by sharding the model parameters across multiple GPUs. + Activation checkpointing reduces memory usage by selectively checkpointing only + the outputs of each layer. Torch.compile compiles the model for faster training. + """ + if config.use_activation_checkpointing: + apply_ac(model) + if config.use_torch_compile: + apply_compile(model, config) + if parallel_dims is not None and parallel_dims.dp_enabled: + apply_fsdp(model, parallel_dims) + return model diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/parallelize_vfm_network.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/parallelize_vfm_network.py new file mode 100644 index 00000000..f5eec1ad --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/parallelize_vfm_network.py @@ -0,0 +1,172 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Callable + +import torch +from torch.distributed.fsdp import fully_shard +from torch.nn.attention.flex_attention import BlockMask + +from cosmos3._src.vfm.configs.base.defaults.model_config import ParallelismConfig +from cosmos3._src.vfm.datasets.sequence_packing import ( + FactoredSequencePack, + JointSequencePack, +) +from cosmos3._src.vfm.models.mot.attention import SplitInfo, dispatch_attention +from cosmos3._src.vfm.models.mot.context_parallel_utils import context_parallel_attention +from cosmos3._src.vfm.models.mot.parallelize_unified_mot import parallelize_unified_mot +from cosmos3._src.vfm.models.utils.memory import KVToStore, MemoryValue +from cosmos3._src.vfm.utils.parallelism import ParallelDims + + +class ContextParallelDispatch(torch.nn.Module): + """CP-aware wrapper for the installed attention dispatch function. + + Installed on ``PackedAttentionMoT.dispatch_attention_fn`` when context + parallelism is enabled, replacing whatever dispatch function was there + previously. The call signature of :meth:`forward` matches + ``dispatch_attention`` so the two are interchangeable. + + All paths delegate to :func:`context_parallel_attention`, which wraps + the inner ``wrapped_dispatch`` with Ulysses-style all-to-all + communication. This includes the AR frame 1+ gen-only path — the inner + dispatch routes to ``attention_AR_gen_only`` which operates on the + head-sharded tensors produced by the all-to-all. + + All cache writes flow through the ``MemoryState`` interface; neither this + class nor the CP attention functions write to the cache directly. 
+ """ + + def __init__( + self, + cp_mesh, + wrapped_dispatch: Callable = dispatch_attention, + ): + super().__init__() + self.cp_mesh = cp_mesh + self.wrapped_dispatch = wrapped_dispatch + + def forward( + self, + packed_query_states: FactoredSequencePack | JointSequencePack, + packed_key_states: FactoredSequencePack | JointSequencePack, + packed_value_states: FactoredSequencePack | JointSequencePack, + attention_mask: BlockMask | SplitInfo, + natten_metadata: dict | None = None, + memory_value: MemoryValue | None = None, + ) -> tuple[FactoredSequencePack | JointSequencePack, KVToStore | None]: + if memory_value is not None and not memory_value.supports_context_parallel_attention: + raise ValueError("Context-parallel doesn't work when training with a KV-cache.") + + return context_parallel_attention( + self.cp_mesh, + packed_query_states, + packed_key_states, + packed_value_states, + attention_mask, + attention_function=self.wrapped_dispatch, + natten_metadata=natten_metadata, + memory_value=memory_value, + ) + + +def apply_compile(model: torch.nn.Module, config: ParallelismConfig): + """Apply torch.compile to the VFM encode/decode heads. + + The MoT-side ``compile_dynamic`` knob on ``ParallelismConfig`` intentionally + does **not** propagate here. The VFM encode/decode paths have no graph + breaks and their input shapes are stable across a prompt, so we always + trace them as a single dynamic graph (``fullgraph=True, dynamic=True``). + This keeps AR inference (which sets ``compile_dynamic=False`` on MoT for + shape-specialized kernels) from accidentally regressing the VFM compile. 
+ """ + + inductor_options = {} + if config.max_autotune_pointwise: + inductor_options["max_autotune_pointwise"] = True + if config.coordinate_descent_tuning: + inductor_options["coordinate_descent_tuning"] = True + + compile_options = { + "fullgraph": True, + "dynamic": True, + "mode": "reduce-overhead" if config.use_cuda_graphs else None, + "options": inductor_options or None, + } + + model._encode_text = torch.compile(model._encode_text, **compile_options) + model._encode_vision = torch.compile(model._encode_vision, **compile_options) + model._encode_action = torch.compile(model._encode_action, **compile_options) + model._decode_vision = torch.compile(model._decode_vision, **compile_options) + model._decode_action = torch.compile(model._decode_action, **compile_options) + return model + + +def context_parallel_unified_mot( + model: torch.nn.Module, + parallel_dims: ParallelDims | None, +) -> torch.nn.Module: + for i in range(len(model.model.layers)): + attn = model.model.layers[i].self_attn + cp_dispatch = ContextParallelDispatch( + parallel_dims.cp_mesh, + wrapped_dispatch=attn.dispatch_attention_fn, + ) + attn.dispatch_attention_fn = cp_dispatch + attn.cp_mesh = parallel_dims.cp_mesh + + return model + + +def parallelize_vfm_network( + model: torch.nn.Module, + parallel_dims: ParallelDims | None, + config: ParallelismConfig, +) -> torch.nn.Module: + """Optimize the model using FSDP, CP, activation checkpointing, and torch.compile. + + FSDP reduces memory usage by sharding the model parameters across multiple GPUs. + Activation checkpointing reduces memory usage by selectively checkpointing only + the outputs of each layer. Torch.compile compiles the model for faster training. 
+ """ + if parallel_dims is not None and parallel_dims.cp_enabled: + model.parallel_dims = parallel_dims + model.language_model = context_parallel_unified_mot( + model.language_model, + parallel_dims=parallel_dims, + ) + + model.language_model = parallelize_unified_mot( + model.language_model, + parallel_dims=parallel_dims, + config=config, + ) + + if config.use_torch_compile and config.compiled_region == "all": + model = apply_compile(model, config) + + if parallel_dims is not None and parallel_dims.dp_enabled: + # Collect parameters to ignore during FSDP wrapping + ignored_params = set() + if model.latent_pos_embed is not None: + ignored_params.update(model.latent_pos_embed.parameters()) + + model = fully_shard( + module=model, + mesh=parallel_dims.dp_mesh, + ignored_params=ignored_params, + ) + + return model diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/qwen3_vl_unified_mot.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/qwen3_vl_unified_mot.py new file mode 100644 index 00000000..06c4ea63 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/qwen3_vl_unified_mot.py @@ -0,0 +1,34 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Backward-compatibility shim: this module was renamed to unified_mot.py. 
+# Existing serialized configs / checkpoints may reference the old module path, +# so we re-export everything from the new location. +from cosmos3._src.vfm.models.mot.unified_mot import * # noqa: F401, F403 +from cosmos3._src.vfm.models.mot.unified_mot import ( # noqa: F401 # explicit re-exports for type checkers + LayerTypes, + MoTDecoderLayer, + Nemotron3DenseVLTextConfig, + Nemotron3DenseVLTextForCausalLM, + Nemotron3DenseVLTextModel, + PackedAttentionMoT, + Qwen3VLMoeTextConfig, + Qwen3VLMoeTextForCausalLM, + Qwen3VLMoeTextModel, + Qwen3VLTextConfig, + Qwen3VLTextForCausalLM, + Qwen3VLTextModel, + Qwen3VLTextMoTDecoderLayer, +) diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/unified_3dmrope_utils.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/unified_3dmrope_utils.py new file mode 100644 index 00000000..30f9cd96 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/unified_3dmrope_utils.py @@ -0,0 +1,206 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Utility functions for generating 3D multi-modal RoPE (mRoPE) position IDs. + +3D mRoPE uses three axes (temporal, height, width) for position embedding, +following the Qwen3VL design for multi-modal RoPE: + +- **Text tokens**: All three axes share the same monotonically increasing position IDs. + For example: (0,0,0), (1,1,1), (2,2,2), ... 
+- **Vision tokens** (image/video latents): Creates a local 3D grid (T, H, W) with a + temporal offset. For each frame t in [0, T), for each row h in [0, H), for each + column w in [0, W), the position is (temporal_offset + t, h_offset, w_offset). + +The ``reset_spatial_indices`` flag controls spatial axis behavior: +- ``True`` (default): Spatial (H, W) indices start from 0 for each vision segment, + giving the model absolute spatial position within each image/video. +- ``False`` (Qwen2VL-style): All axes are offset by ``temporal_offset``. + +After each segment, the ``temporal_offset`` is updated to ``max(all_positions) + 1`` +(Qwen3VL design), ensuring subsequent segments start at a non-overlapping position. + +**FPS Modulation** (optional): +When ``fps`` is provided, the temporal position IDs are scaled to reflect real time +rather than just frame indices. The formula is: + scaled_time = (frame_index + start_frame_offset) / tps * base_tps +where: + tps = fps / temporal_compression_factor + base_tps = base_fps / base_temporal_compression_factor + +This ensures that videos with different FPS values have comparable temporal position +embeddings, allowing the model to understand temporal relationships across different +video sources. +""" + +import math + +import torch + + +def get_3d_mrope_ids_text_tokens( + num_tokens: int, + temporal_offset: int | float, + use_float_positions: bool = False, +) -> tuple[torch.Tensor, int | float]: + """Generate 3D mRoPE position IDs for text tokens. + + For text tokens, all three axes (temporal, height, width) share the same + monotonically increasing position IDs, starting from ``temporal_offset``. + + Args: + num_tokens: Number of text tokens. + temporal_offset: Current temporal offset to start from. Can be float when + FPS modulation is enabled for vision tokens. + use_float_positions: If ``True``, generate float position IDs (for consistency + with FPS-modulated vision tokens). If ``False``, generate integer IDs. 
+ + Returns: + Tuple of: + - Position IDs tensor of shape ``(3, num_tokens)`` where each row is identical. + - Updated temporal offset (``temporal_offset + num_tokens``). + """ + if use_float_positions: + # Float mode: for consistency with FPS-modulated vision tokens + ids = torch.arange(num_tokens, dtype=torch.float32) + temporal_offset # [num_tokens] + else: + # Integer mode (default) + ids = torch.arange(num_tokens, dtype=torch.long) + int(temporal_offset) # [num_tokens] + + mrope_ids = ids.unsqueeze(0).expand(3, -1).contiguous() # [3,num_tokens] + next_temporal_offset = temporal_offset + num_tokens + return mrope_ids, next_temporal_offset + + +def get_3d_mrope_ids_vae_tokens( + grid_t: int, + grid_h: int, + grid_w: int, + temporal_offset: int | float, + reset_spatial_indices: bool = True, + fps: float | None = None, + base_fps: float = 24.0, + temporal_compression_factor: int = 4, + base_temporal_compression_factor: int | None = None, + start_frame_offset: int = 0, +) -> tuple[torch.Tensor, int | float]: + """Generate 3D mRoPE position IDs for VAE vision tokens (image/video latents). + + Creates a 3D position grid for vision tokens with shape ``(T, H, W)``, then flattens + to produce position IDs for each axis. The flattening order is T-major: + for each temporal frame, iterate over height then width. + + Args: + grid_t: Number of temporal frames in the latent grid. + grid_h: Height of the latent grid (after patchification). + grid_w: Width of the latent grid (after patchification). + temporal_offset: Current temporal offset. Always applied to the temporal axis. + When ``reset_spatial_indices=False``, also applied to spatial axes. + Can be float when FPS modulation is enabled. + reset_spatial_indices: If ``True``, spatial (height, width) indices start from 0 + for each vision segment, giving the model absolute spatial position + within each image/video. If ``False``, spatial indices are also offset by + ``temporal_offset`` (Qwen2VL-style behavior). 
+ fps: Frames per second of the video. ``None`` disables fps modulation + (integer positions); pass the real fps for fps-scaled, possibly + fractional positions. Honored at grid_t=1 too (per-frame AR packs), + where it collapses to ``scaled_t[0] = temporal_offset``. + base_fps: Base FPS for normalization. Default is 24.0. + temporal_compression_factor: VAE temporal compression factor. Default is 4. + base_temporal_compression_factor: Base temporal compression factor. If ``None``, + defaults to ``temporal_compression_factor`` (typical case where base matches actual). + start_frame_offset: Offset added to frame indices before FPS scaling. + Use 1 for action embeddings so they start at frame 1 instead of 0. + + Returns: + Tuple of: + - Position IDs tensor of shape ``(3, grid_t * grid_h * grid_w)``. + Row 0: temporal axis (float if FPS modulation enabled, else long). + Row 1: height axis (long), Row 2: width axis (long). + - Updated temporal offset for the next segment, computed as + ``ceil(max(all_positions)) + 1`` (Qwen3VL design). The ceiling keeps + it an integer even when FPS modulation yields fractional positions. + """ + # Enabled whenever fps is provided, including grid_t=1 (per-frame AR packs). + # Callers that want integer positions (e.g. images) pass fps=None.
+ fps_modulation_enabled = fps is not None + + # Default base_temporal_compression_factor to temporal_compression_factor if not specified + effective_base_tcf = ( + base_temporal_compression_factor + if base_temporal_compression_factor is not None + else temporal_compression_factor + ) + + if fps_modulation_enabled: + # FPS modulation: scale temporal indices to reflect real time + # tps = tokens per second (fps divided by temporal compression) + # base_tps = base tokens per second + tps = fps / temporal_compression_factor + base_tps = base_fps / effective_base_tcf + + # Frame indices: 0, 1, 2, ..., grid_t-1 + frame_indices = torch.arange(grid_t, dtype=torch.float32) # [grid_t] + + # Apply FPS scaling: scaled_time = (frame_index + start_frame_offset) / tps * base_tps + scaled_t = (frame_indices + start_frame_offset) / tps * base_tps + temporal_offset # [grid_t] + + # Expand temporal indices for all spatial positions + t_index = scaled_t.view(-1, 1).expand(-1, grid_h * grid_w).flatten() # [grid_t*grid_h*grid_w] + t_dtype = torch.float32 + else: + # No FPS modulation: use integer frame indices + # Apply start_frame_offset for cross-modality alignment (e.g., action tokens start at frame 1) + t_index = ( + ( + torch.arange(grid_t, dtype=torch.long).view(-1, 1).expand(-1, grid_h * grid_w).flatten() + ) # [grid_t*grid_h*grid_w] + + int(temporal_offset) + + start_frame_offset + ) + t_dtype = torch.long + + # Height axis: for each temporal frame, cycles through h values, each repeated w times + h_index = ( + torch.arange(grid_h, dtype=torch.long).view(1, -1, 1).expand(grid_t, -1, grid_w).flatten() + ) # [grid_t*grid_h*grid_w] + + # Width axis: for each temporal frame and height, cycles through w values + w_index = ( + torch.arange(grid_w, dtype=torch.long).view(1, 1, -1).expand(grid_t, grid_h, -1).flatten() + ) # [grid_t*grid_h*grid_w] + + if not reset_spatial_indices: + # Qwen2VL-style: offset all axes by temporal_offset (use int for spatial) + spatial_offset = 
int(temporal_offset) + h_index = h_index + spatial_offset # [grid_t*grid_h*grid_w] + w_index = w_index + spatial_offset # [grid_t*grid_h*grid_w] + + # Stack into (3, T*H*W) tensor + # Note: When FPS modulation is enabled, temporal axis is float, spatial axes are long + # We convert h_index and w_index to the same dtype as t_index for stacking + if fps_modulation_enabled: + mrope_ids = torch.stack( + [t_index, h_index.to(torch.float32), w_index.to(torch.float32)], dim=0 + ) # [3,grid_t*grid_h*grid_w] + else: + mrope_ids = torch.stack([t_index, h_index, w_index], dim=0) # [3,grid_t*grid_h*grid_w] + + # Compute next temporal offset: max position + 1 + # Use the actual computed positions to handle FPS modulation correctly + max_position = mrope_ids.max().item() + next_temporal_offset = math.ceil(max_position) + 1 + + return mrope_ids, next_temporal_offset diff --git a/cosmos-inference/cosmos3/_src/vfm/models/mot/unified_mot.py b/cosmos-inference/cosmos3/_src/vfm/models/mot/unified_mot.py new file mode 100644 index 00000000..a649f560 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/mot/unified_mot.py @@ -0,0 +1,1041 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import json +from dataclasses import dataclass +from typing import Optional, Tuple + +import torch +from torch import nn +from transformers.utils import ModelOutput + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.datasets.sequence_packing import ( + FactoredSequencePack, + from_joint, + from_und_gen_splits, + get_device_and_dtype, + get_gen_seq, + get_und_seq, + set_gen_seq, + set_und_seq, + zeros_like, +) +from cosmos3._src.vfm.models.mot.attention import ( + AttentionMaskType, + dispatch_attention, +) +from cosmos3._src.vfm.models.utils.memory import KVToStore, MemoryState, MemoryValue + +# Nemotron 3 Dense VL imports +from cosmos3._src.vfm.models.vlm.nemotron_3_dense_vl.configuration_nemotron_3_dense_vl import ( + Nemotron3DenseVLTextConfig as _Nemotron3DenseVLTextConfig, +) +from cosmos3._src.vfm.models.vlm.nemotron_3_dense_vl.nemotron_3_dense_vl import ( + MultiModalRotaryEmbedding, + Nemotron3DenseVLMLP, + Nemotron3DenseVLPreTrainedModel, + Nemotron3DenseVLRMSNorm, + apply_rotary_pos_emb_partial, +) + +# Qwen3-VL imports +from cosmos3._src.vfm.models.vlm.qwen3_vl.configuration_qwen3_vl import ( + Qwen3VLTextConfig as _Qwen3VLTextConfig, +) +from cosmos3._src.vfm.models.vlm.qwen3_vl.qwen3_vl import ( + Qwen3VLPreTrainedModel, + Qwen3VLTextMLP, + Qwen3VLTextRMSNorm, + Qwen3VLTextRotaryEmbedding, +) +from cosmos3._src.vfm.models.vlm.qwen3_vl.qwen3_vl import ( + apply_rotary_pos_emb as qwen3_vl_apply_rotary_pos_emb, +) + +# Qwen3-VL-MoE imports +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.configuration_qwen3_vl_moe import ( + Qwen3VLMoeTextConfig as _Qwen3VLMoeTextConfig, +) +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.qwen3_vl_moe import ( + LBLMetadata, + Qwen3VLMoePreTrainedModel, + Qwen3VLMoeTextMLP, + Qwen3VLMoeTextRMSNorm, + Qwen3VLMoeTextRotaryEmbedding, + Qwen3VLMoeTextSparseMoeBlock, +) + +# Torch optimization settings +torch._dynamo.config.cache_size_limit = 512 +torch._dynamo.config.accumulated_cache_size_limit = 4096 + +# 
----------------------------------------------------------------------------- +# Unified MoT (Mixture of Transformers) implementation supporting: +# - Qwen3-VL Dense, Qwen3-VL MoE, and Nemotron 3 Dense VL +# +# Shared components: +# - PackedAttentionMoT (config-driven QK norm and RoPE) +# - MoTDecoderLayer (used by all variants) +# - _impl_* (shared init/forward) +# +# Variant-specific wrapper classes are needed for different PreTrainedModel bases. +# Sub-layer classes (MLP, RMSNorm, RotaryEmbedding, RoPE fn) are selected via LayerTypes. +# ----------------------------------------------------------------------------- + + +class LayerTypes: + def __init__(self, variant: str): + self.variant = variant + if variant == "qwen3_vl_moe": + self.mlp = Qwen3VLMoeTextMLP + self.rms_norm = Qwen3VLMoeTextRMSNorm + self.rotary_embedding = Qwen3VLMoeTextRotaryEmbedding + self.apply_rotary_pos_emb = qwen3_vl_apply_rotary_pos_emb + elif variant == "nemotron_dense": + self.mlp = Nemotron3DenseVLMLP + self.rms_norm = Nemotron3DenseVLRMSNorm + self.rotary_embedding = MultiModalRotaryEmbedding + self.apply_rotary_pos_emb = apply_rotary_pos_emb_partial + elif variant == "qwen3_vl_dense": + self.mlp = Qwen3VLTextMLP + self.rms_norm = Qwen3VLTextRMSNorm + self.rotary_embedding = Qwen3VLTextRotaryEmbedding + self.apply_rotary_pos_emb = qwen3_vl_apply_rotary_pos_emb + else: + raise ValueError(f"Unknown LayerTypes variant: {variant!r}") + + @property + def is_moe(self) -> bool: + return self.variant == "qwen3_vl_moe" + + +class NaiveCache: + def __init__(self, num_layers): + self.key_cache = {k: None for k in range(num_layers)} + self.value_cache = {k: None for k in range(num_layers)} + + @property + def num_layers(self): + return len(self.key_cache) + + @property + def seq_lens(self): + if self.key_cache[0] is not None: + return self.key_cache[0].shape[0] + else: + return 0 + + +@dataclass +class BaseOutputWithPast(ModelOutput): + packed_query_sequence: torch.FloatTensor = None + 
past_key_values: Optional[NaiveCache] = None + + +# Qwen3-VL MoT (Mixture of Transformers) implementation +# Combines Qwen3-VL vision-language capabilities with MoT dual-pathway architecture + + +class Qwen3VLTextConfig(_Qwen3VLTextConfig): + r""" + Qwen3VLTextConfig with MoT-specific parameters. + Extends Qwen3VLTextConfig for text component MoT support with comprehensive configuration. + """ + + def __init__( + self, + # MoT-specific parameters with comprehensive defaults + qk_norm_for_text: bool = False, # Whether to apply QK norm in the understanding (text) pathway + qk_norm_for_diffusion: bool = True, # Whether to apply QK norm in the generation (diffusion) pathway + freeze_und: bool = False, # Freeze understanding pathway + layer_module: str = "MoTDecoderLayer", + tie_word_embeddings: bool = True, + **kwargs, + ): + # Store MoT-specific parameters + self.qk_norm_for_text = qk_norm_for_text + self.qk_norm_for_diffusion = qk_norm_for_diffusion + self.freeze_und = freeze_und + self.layer_module = layer_module + super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs) + + @classmethod + def from_json_file(cls, json_file): + """ + Enhanced from_json_file that handles both nested and flat configs. + + For nested configs (with text_config section), extracts the text_config. + For flat configs, loads directly. 
+ """ + + # Load the raw JSON + with open(json_file, encoding="utf-8") as reader: + config_dict = json.load(reader) + + # Check if this is a nested config with text_config section + if "text_config" in config_dict and isinstance(config_dict["text_config"], dict): + # Extract the text_config section for nested configs + log.debug("Detected nested config, extracting text_config section") + config_dict = config_dict["text_config"] + else: + # Use the config as-is for flat configs + log.debug("Detected flat config, using directly") + + # Create config from the (potentially extracted) dict + return cls(**config_dict) + + +# Qwen3-VL-MoE MoT (Mixture of Transformers) implementation +# Combines Qwen3-VL-MoE vision-language capabilities with MoT dual-pathway architecture + + +class Qwen3VLMoeTextConfig(_Qwen3VLMoeTextConfig): + r""" + Qwen3VLMoeTextConfig with MoT-specific parameters. + Extends Qwen3VLMoeTextConfig for text component MoT support with comprehensive configuration. + """ + + def __init__( + self, + # MoT-specific parameters with comprehensive defaults + qk_norm_for_text: bool = False, # Whether to apply QK norm in the understanding (text) pathway + qk_norm_for_diffusion: bool = True, # Whether to apply QK norm in the generation (diffusion) pathway + freeze_und: bool = False, # Freeze understanding pathway + layer_module: str = "MoTDecoderLayer", + tie_word_embeddings: bool = True, + **kwargs, + ): + # Store MoT-specific parameters + self.qk_norm_for_text = qk_norm_for_text + self.qk_norm_for_diffusion = qk_norm_for_diffusion + self.freeze_und = freeze_und + self.layer_module = layer_module + super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs) + + @classmethod + def from_json_file(cls, json_file): + """ + Enhanced from_json_file that handles both nested and flat configs. + For nested configs (with text_config section), extracts the text_config. + For flat configs, loads directly. 
+ """ + + # Load the raw JSON + with open(json_file, encoding="utf-8") as reader: + config_dict = json.load(reader) + + # Check if this is a nested config with text_config section + if "text_config" in config_dict and isinstance(config_dict["text_config"], dict): + # Extract the text_config section for nested configs + log.debug("Detected nested config, extracting text_config section") + config_dict = config_dict["text_config"] + else: + # Use the config as-is for flat configs + log.debug("Detected flat config, using directly") + + # Create config from the (potentially extracted) dict + return cls(**config_dict) + + +# Nemotron 3 Dense VL MoT config + +_NEMOTRON_MOT_TEXT_CONFIG_KEYS = { + "vocab_size", + "tie_word_embeddings", + "hidden_size", + "intermediate_size", + "num_hidden_layers", + "num_attention_heads", + "head_dim", + "num_key_value_heads", + "mlp_hidden_act", + "attention_bias", + "mlp_bias", + "initializer_range", + "layer_norm_epsilon", + "residual_in_fp32", + "use_cache", + "num_logits_to_keep", + "pad_token_id", + "bos_token_id", + "eos_token_id", + "sliding_window", + "max_position_embeddings", + "attention_dropout", + "hidden_dropout", + "enable_rope", + "rope_scaling", + "rope_theta", + "enable_mrope", + "mrope_section", + "torch_dtype", +} + + +class Nemotron3DenseVLTextConfig(_Nemotron3DenseVLTextConfig): + """MoT-enabled config for the Nemotron 3 Dense VL text backbone. + + Extends the upstream ``Nemotron3DenseVLTextConfig`` with MoT-specific + fields (per-pathway QK normalisation, freeze control, decoder layer class). + Supports both the VLM nested config and the flat LLM config format. 
+ """ + + def __init__( + self, + qk_norm_for_text: bool = False, + qk_norm_for_diffusion: bool = True, + freeze_und: bool = False, + layer_module: str = "MoTDecoderLayer", + tie_word_embeddings: bool = False, + **kwargs, + ) -> None: + self.qk_norm_for_text = qk_norm_for_text + self.qk_norm_for_diffusion = qk_norm_for_diffusion + self.freeze_und = freeze_und + self.layer_module = layer_module + super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs) + + @classmethod + def from_json_file(cls, json_file: str) -> "Nemotron3DenseVLTextConfig": + """Load config from a JSON file, handling both VLM nested and flat LLM formats.""" + with open(json_file, encoding="utf-8") as reader: + config_dict = json.load(reader) + if "text_config" in config_dict and isinstance(config_dict["text_config"], dict): + log.debug("Detected nested config, extracting text_config section") + config_dict = dict(config_dict["text_config"]) + else: + log.debug("Detected flat config, using directly") + if config_dict.get("num_hidden_layers") == 56: + # Upstream VLM stores attention and MLP as separate alternating blocks (56 total); + # MoT combines both into standard transformer layers (28 total). + config_dict = {**config_dict, "num_hidden_layers": 28} + filtered = {k: v for k, v in config_dict.items() if k in _NEMOTRON_MOT_TEXT_CONFIG_KEYS} + return cls(**filtered) + + +# ----------------------------------------------------------------------------- +# Common layers between Qwen3VL Dense, MoE, and Nemotron 3 Dense VL models +# ----------------------------------------------------------------------------- + + +class PackedAttentionMoT(nn.Module): + """ + Dual-pathway packed attention for MoT architectures. + Implements understanding and generation pathways with separate projections. + + Used for Qwen3VL (Dense), Qwen3VL-MoE, and Nemotron 3 Dense VL variants. 
+ QK normalisation and RoPE function are selected via ``layer_types`` and config + attributes (``qk_norm_for_text`` / ``qk_norm_for_diffusion``). + """ + + def __init__(self, config, layer_idx: int, layer_types: LayerTypes): + super().__init__() + self.config = config + self.layer_idx = layer_idx + self.head_dim = getattr(config, "head_dim", config.hidden_size // config.num_attention_heads) + self.hidden_size = config.hidden_size + self.num_attention_heads = config.num_attention_heads + self.num_key_value_heads = config.num_key_value_heads + self.num_key_value_groups = self.num_attention_heads // self.num_key_value_heads + self.scaling = self.head_dim**-0.5 + self.attention_dropout = config.attention_dropout + + eps = config.rms_norm_eps + + # Understanding pathway projections + self.q_proj = nn.Linear(self.hidden_size, self.num_attention_heads * self.head_dim, bias=config.attention_bias) + self.k_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=config.attention_bias) + self.v_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=config.attention_bias) + self.o_proj = nn.Linear(self.num_attention_heads * self.head_dim, self.hidden_size, bias=config.attention_bias) + + # Understanding pathway QK norm + if config.qk_norm_for_text: + self.q_norm = layer_types.rms_norm(self.head_dim, eps=eps) + self.k_norm = layer_types.rms_norm(self.head_dim, eps=eps) + else: + self.q_norm = nn.Identity() + self.k_norm = nn.Identity() + + # Generation pathway QK norm + if config.qk_norm_for_diffusion: + self.q_norm_moe_gen = layer_types.rms_norm(self.head_dim, eps=eps) + self.k_norm_moe_gen = layer_types.rms_norm(self.head_dim, eps=eps) + else: + self.q_norm_moe_gen = nn.Identity() + self.k_norm_moe_gen = nn.Identity() + + # Generation pathway linear projections + self.q_proj_moe_gen = nn.Linear( + self.hidden_size, self.num_attention_heads * self.head_dim, bias=config.attention_bias + ) + self.k_proj_moe_gen = nn.Linear( + 
self.hidden_size, self.num_key_value_heads * self.head_dim, bias=config.attention_bias + ) + self.v_proj_moe_gen = nn.Linear( + self.hidden_size, self.num_key_value_heads * self.head_dim, bias=config.attention_bias + ) + self.o_proj_moe_gen = nn.Linear( + self.num_attention_heads * self.head_dim, self.hidden_size, bias=config.attention_bias + ) + + self._apply_rotary_pos_emb = layer_types.apply_rotary_pos_emb + self.dispatch_attention_fn = dispatch_attention + self.cp_mesh = None + + def forward( + self, + pack: FactoredSequencePack, + attention_mask: AttentionMaskType, + packed_position_embeddings: Tuple[FactoredSequencePack, FactoredSequencePack], + natten_metadata: dict | None = None, + memory_value: MemoryValue | None = None, + ) -> tuple[FactoredSequencePack, KVToStore | None]: + """Forward pass with optional memory-augmented attention. + + When ``memory_value`` is provided, ``dispatch_attention_fn`` routes to + the appropriate attention kernel (e.g. three-way KV-cache attention + for training, or AR inference concat + dense attention). + + ``kv_to_store`` is produced when ``memory_value`` is present: + ``(gen_k, gen_v, und_k, und_v)`` for the caller to write back via + ``MemoryState.write_for_layer()``. The tensors are passed with + gradients attached; each ``MemoryState`` decides whether to detach + (e.g. for truncated BPTT) or keep gradients (e.g. teacher forcing). + + Args: + pack: Packed sequence with und/gen tokens + attention_mask: Attention mask (BlockMask or SplitInfo) + packed_position_embeddings: RoPE embeddings (cos, sin) + natten_metadata: Optional NATTEN metadata for neighborhood attention. + memory_value: Optional read-only tensor container for memory-augmented attention. 
+ """ + + q_und_in = self.q_proj(get_und_seq(pack)) # [N_und,num_heads*head_dim] + q_gen_in = self.q_proj_moe_gen(get_gen_seq(pack)) # [N_gen,num_heads*head_dim] + + k_und_in = self.k_proj(get_und_seq(pack)) # [N_und,num_kv_heads*head_dim] + k_gen_in = self.k_proj_moe_gen(get_gen_seq(pack)) # [N_gen,num_kv_heads*head_dim] + + v_und_in = self.v_proj(get_und_seq(pack)) # [N_und,num_kv_heads*head_dim] + v_gen_in = self.v_proj_moe_gen(get_gen_seq(pack)) # [N_gen,num_kv_heads*head_dim] + + q_und = q_und_in.view(-1, self.num_attention_heads, self.head_dim) # [N_und,num_heads,head_dim] + k_und = k_und_in.view(-1, self.num_key_value_heads, self.head_dim) # [N_und,num_kv_heads,head_dim] + v_und = v_und_in.view(-1, self.num_key_value_heads, self.head_dim) # [N_und,num_kv_heads,head_dim] + + q_gen = q_gen_in.view(-1, self.num_attention_heads, self.head_dim) # [N_gen,num_heads,head_dim] + k_gen = k_gen_in.view(-1, self.num_key_value_heads, self.head_dim) # [N_gen,num_kv_heads,head_dim] + v_gen = v_gen_in.view(-1, self.num_key_value_heads, self.head_dim) # [N_gen,num_kv_heads,head_dim] + + q_und = self.q_norm(q_und) # [N_und,num_heads,head_dim] + k_und = self.k_norm(k_und) # [N_und,num_kv_heads,head_dim] + + q_gen = self.q_norm_moe_gen(q_gen) # [N_gen,num_heads,head_dim] + k_gen = self.k_norm_moe_gen(k_gen) # [N_gen,num_kv_heads,head_dim] + + if self.config.freeze_und: + q_und = q_und.detach() + k_und = k_und.detach() + v_und = v_und.detach() + + packed_cos = packed_position_embeddings[0] + packed_sin = packed_position_embeddings[1] + + q_und_, k_und_ = self._apply_rotary_pos_emb( + q_und, + k_und, + get_und_seq(packed_cos), + get_und_seq(packed_sin), + unsqueeze_dim=1, + ) # q_und_: [N_und,num_heads,head_dim], k_und_: [N_und,num_kv_heads,head_dim] + q_gen_, k_gen_ = self._apply_rotary_pos_emb( + q_gen, + k_gen, + get_gen_seq(packed_cos), + get_gen_seq(packed_sin), + unsqueeze_dim=1, + ) # q_gen_: [N_gen,num_heads,head_dim], k_gen_: [N_gen,num_kv_heads,head_dim] + + 
packed_query_states_ = from_und_gen_splits(q_und_, q_gen_, pack) # [N_und+N_gen,num_heads,head_dim] + packed_key_states_ = from_und_gen_splits(k_und_, k_gen_, pack) # [N_und+N_gen,num_kv_heads,head_dim] + packed_value_states_ = from_und_gen_splits(v_und, v_gen, pack) # [N_und+N_gen,num_kv_heads,head_dim] + + packed_attn_output, kv_to_store = self.dispatch_attention_fn( + packed_query_states_, + packed_key_states_, + packed_value_states_, + attention_mask, + natten_metadata=natten_metadata, + memory_value=memory_value, + ) + + # Produce kv_to_store for MemoryState.write_for_layer() when the + # dispatch didn't already provide one (e.g. standard or AR frame-0 + # non-CP paths). CP dispatch returns head-sharded kv_to_store + # directly, so kv_to_store is already non-None in that case. + # + # Gradient detach is NOT done here; each MemoryState.write_for_layer() + # decides its own gradient policy (e.g. detach for truncated BPTT, + # keep gradients for teacher forcing). + if memory_value is not None and kv_to_store is None: + und_len = pack["_num_causal_tokens"] + gen_len = pack["_num_full_tokens"] + kv_to_store = ( + k_gen_[:gen_len].unsqueeze(0), + v_gen[:gen_len].unsqueeze(0), + k_und_[:und_len].unsqueeze(0), + v_und[:und_len].unsqueeze(0), + ) + + # Apply projections directly to get final results + und_seq = self.o_proj(get_und_seq(packed_attn_output)) # [N_und,hidden_size] + gen_seq = self.o_proj_moe_gen(get_gen_seq(packed_attn_output)) # [N_gen,hidden_size] + return from_und_gen_splits(und_seq, gen_seq, pack), kv_to_store # [N_und+N_gen,hidden_size] + + +def _impl_init( + self, config: Qwen3VLTextConfig | Qwen3VLMoeTextConfig | Nemotron3DenseVLTextConfig, layer_types: LayerTypes +): + """ + Common implementation for Qwen3VLTextModel, Qwen3VLMoeTextModel, and Nemotron3DenseVLTextModel __init__. 
+ """ + self.padding_idx = config.pad_token_id + self.vocab_size = config.vocab_size + assert "Mo" in config.layer_module, "Only MoT layers are supported" + + # Text configuration for decoder layers + + # Embeddings from Qwen3VL base + self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx) + + self.layers = nn.ModuleList( + [MoTDecoderLayer(config, layer_idx, layer_types) for layer_idx in range(config.num_hidden_layers)] + ) + + # Layer norm and rotary embeddings (text-only optimized) + self.norm = layer_types.rms_norm(config.hidden_size, eps=config.rms_norm_eps) + + # Pathway-specific normalization + self.norm_moe_gen = layer_types.rms_norm(config.hidden_size, eps=config.rms_norm_eps) + + # Rotary embedding (text-only optimized) + self.rotary_emb = layer_types.rotary_embedding(config) + + # Initialize weights and apply final processing + self.post_init() + + +def _impl_init_taylorseer(self, cache_dic=None, current=None): + """ + Initialize TaylorSeer acceleration attributes. + Common implementation for Qwen3VLTextModel.init_taylorseer and Qwen3VLMoeTextModel.init_taylorseer + """ + self.cache_dic = cache_dic or {} + self.current = current or { + "step": 0, + "type": "full", + "stream": "layers_stream", + "layer": 0, + "module": "total", + "activated_steps": [0], + } + # Enable TaylorSeer flag + self.enable_taylorseer = True + + +def _impl_forward( + self, + pack: FactoredSequencePack, + attention_mask, + position_ids: torch.Tensor, + natten_metadata_list: list | None = None, + memory: MemoryState | None = None, +) -> tuple[FactoredSequencePack, dict[str, LBLMetadata]]: + """ + Training forward pass - Attempted port from qwen2_mot + Common implementation for Qwen3VLTextModel.forward_train and Qwen3VLMoeTextModel.forward_train + + Args: + pack: Packed sequence + attention_mask: Attention mask + position_ids: Position IDs + natten_metadata_list: Optional per-layer NATTEN metadata. 
+ memory: Optional MemoryState for persistent memory across forward passes. + """ + + # Create position embeddings (Qwen3 style) - squeeze once at model level + # tensor below is only used for its dtype and device + device, dtype = get_device_and_dtype(pack) + _meta_tensor = torch.tensor([], dtype=dtype, device=device) + cos, sin = self.rotary_emb( + _meta_tensor, position_ids=position_ids.unsqueeze(0) if position_ids.ndim == 1 else position_ids.unsqueeze(1) + ) # if ndim == 2, then the mrope position_ids is (3, seq_len), we need to put batch dimension in the middle to make it compatible with the rotary_emb + # cos, sin: [1,N,head_dim] (1D pos_ids) or [3,1,N,head_dim] (mrope pos_ids) + cos = cos.squeeze(0) # [N,head_dim] or [3,N,head_dim] + sin = sin.squeeze(0) # [N,head_dim] or [3,N,head_dim] + position_embeddings = ( + from_joint(cos, pack), + from_joint(sin, pack), + ) + + # Tracking the load balancing loss across all layers. For dense models, lbl_metadata_all + # will be a dictionary with empty lists for each pathway. For MoE models, the lists + # for each pathway will be populated with the load balancing loss metadata for each layer. 
+ lbl_metadata_all = dict(und=[], gen=[]) + + hidden_states = pack + + # --- MemoryState: per-step init (outside compile) --- + if memory is not None: + memory.init(hidden_states, device) + + # Derive gen_only once (outside compile) if using MemoryState + memory_gen_only = memory.is_gen_only() if memory is not None else False + + for i, decoder_layer in enumerate(self.layers): + # MemoryState: produce read-only MemoryValue for this layer (outside compile) + memory_value = memory.read_for_layer(i) if memory is not None else None + + hidden_states, lbl_metadata_dict, kv_to_store = decoder_layer( + hidden_states, + attention_mask, + position_embeddings, + natten_metadata=None if natten_metadata_list is None else natten_metadata_list[i], + memory_value=memory_value, + gen_only=memory_gen_only, + ) + + # MemoryState: store K/V produced by this layer (outside compile) + if kv_to_store is not None and memory is not None: + memory.write_for_layer(i, kv_to_store) + + for pathway, lbl_metadata in lbl_metadata_dict.items(): + lbl_metadata_all[pathway].append(lbl_metadata) + + # Compute the load balancing loss across all layers. For dense models, final_lbl_metadata + # will be an empty dictionary. For MoE models, it will be a dictionary with the stacked + # load balancing loss metadata for each pathway. 
+ final_lbl_metadata: dict[str, LBLMetadata] = dict() + for pathway, lbl_metadata_list in lbl_metadata_all.items(): + if len(lbl_metadata_list) > 0: + num_tokens_per_expert = torch.stack( + [lbl_metadata.num_tokens_per_expert for lbl_metadata in lbl_metadata_list] + ) # [num_layers,num_experts] + num_tokens = torch.stack([lbl_metadata.num_tokens for lbl_metadata in lbl_metadata_list]) # [num_layers] + mean_router_prob_per_expert = torch.stack( + [lbl_metadata.mean_router_prob_per_expert for lbl_metadata in lbl_metadata_list] + ) # [num_layers,num_experts] + final_lbl_metadata[pathway] = LBLMetadata( + num_tokens_per_expert=num_tokens_per_expert, + num_tokens=num_tokens, + mean_router_prob_per_expert=mean_router_prob_per_expert, + ) + + hidden_states_out = zeros_like(hidden_states) + set_und_seq(hidden_states_out, self.norm(get_und_seq(hidden_states))) # [N_und,hidden_size] + set_gen_seq(hidden_states_out, self.norm_moe_gen(get_gen_seq(hidden_states))) # [N_gen,hidden_size] + + return hidden_states_out, final_lbl_metadata + + +def _run_mlp( + mlp: torch.nn.Module, + input: torch.Tensor, +) -> tuple[torch.Tensor, LBLMetadata | None]: + if isinstance(mlp, Qwen3VLMoeTextSparseMoeBlock): + ( + output_tensor, + lbl_metadata, + ) = mlp(input) + else: + output_tensor = mlp(input) + lbl_metadata = None + return output_tensor, lbl_metadata + + +class MoTDecoderLayer(nn.Module): + """ + Unified MoT (Mixture of Transformers) decoder layer. + Features dual-pathway attention for understanding vs generation. + + This is used for both Dense and MoE models. 
+ """ + + def __init__( + self, + config: Qwen3VLTextConfig | Qwen3VLMoeTextConfig | Nemotron3DenseVLTextConfig, + layer_idx: int, + layer_types: LayerTypes, + ): + super().__init__() + self.hidden_size = config.hidden_size + self.freeze_und = config.freeze_und + self.self_attn = PackedAttentionMoT(config, layer_idx, layer_types) + + if ( + hasattr(config, "mlp_only_layers") + and (layer_idx not in config.mlp_only_layers) + and (config.num_experts > 0 and (layer_idx + 1) % config.decoder_sparse_step == 0) + ): + self.mlp = Qwen3VLMoeTextSparseMoeBlock(config) + self.mlp_moe_gen = Qwen3VLMoeTextSparseMoeBlock(config) + else: + self.mlp = layer_types.mlp(config) + self.mlp_moe_gen = layer_types.mlp(config) + + self.input_layernorm = layer_types.rms_norm(config.hidden_size, eps=config.rms_norm_eps) + self.input_layernorm_moe_gen = layer_types.rms_norm(config.hidden_size, eps=config.rms_norm_eps) + self.post_attention_layernorm = layer_types.rms_norm(config.hidden_size, eps=config.rms_norm_eps) + self.post_attention_layernorm_moe_gen = layer_types.rms_norm(config.hidden_size, eps=config.rms_norm_eps) + + def forward( + self, + input: FactoredSequencePack, + attention_mask, + packed_position_embeddings: Tuple[FactoredSequencePack, FactoredSequencePack], + natten_metadata: dict | None = None, + memory_value: MemoryValue | None = None, + gen_only: bool = False, + ) -> tuple[FactoredSequencePack, dict[str, LBLMetadata], KVToStore | None]: + """Forward pass with MoT routing and optional memory-augmented attention. + + Returns a 3-tuple: ``(hidden_states, lbl_metadata_dict, kv_to_store)``. + ``kv_to_store`` is non-None when ``memory_value`` is provided, + containing ``(gen_k, gen_v, und_k, und_v)`` to be written back by + ``MemoryState.write_for_layer()`` outside the ``torch.compile`` + boundary. 
+ + Args: + input: Packed sequence with und/gen tokens + attention_mask: Attention mask + packed_position_embeddings: RoPE embeddings (cos, sin) + natten_metadata: Optional NATTEN metadata for neighborhood attention. + memory_value: Read-only tensor container from MemoryState.read_for_layer(). + gen_only: When True, skip the understanding pathway (und K/V come from cache). + """ + # Pre-Attention layernorm + pack_norm_out = from_und_gen_splits( + self.input_layernorm(get_und_seq(input)), # [N_und,hidden_size] + self.input_layernorm_moe_gen(get_gen_seq(input)), # [N_gen,hidden_size] + input, + ) # [N_und+N_gen,hidden_size] + + # Self Attention + Residual + kv_to_store: KVToStore | None = None + if gen_only: + assert natten_metadata is None + # gen_only: skip und, compute gen tokens only (und K/V come from cache) + _gen_norm = get_gen_seq(pack_norm_out) + gen_pack = from_und_gen_splits( + _gen_norm.new_empty(0, _gen_norm.shape[-1]), + _gen_norm, + pack_norm_out, + ) + + # Build position embeddings whose und length matches gen_pack's + # und length (always 0). Required when the outer pack carries + # a padded causal_seq (``pad_for_cuda_graphs=True``): without + # this, the und RoPE inside ``PackedAttentionMoT.forward`` + # would broadcast cos/sin of shape ``(MAX_CAUSAL_LEN, head_dim)`` + # onto a length-0 ``q_und`` / ``k_und`` and crash. When the + # outer pack is unpadded (eager AR path), the und cos/sin + # already have length 0 and this slice is a no-op. 
+ _cos, _sin = packed_position_embeddings + _empty_cos_und = get_und_seq(_cos)[:0] + _empty_sin_und = get_und_seq(_sin)[:0] + gen_position_embeddings = ( + from_und_gen_splits(_empty_cos_und, get_gen_seq(_cos), _cos), + from_und_gen_splits(_empty_sin_und, get_gen_seq(_sin), _sin), + ) + + pack_attn_out, kv_to_store = self.self_attn( + gen_pack, + attention_mask, + gen_position_embeddings, + natten_metadata=natten_metadata, + memory_value=memory_value, + ) + gen_attn_out = get_gen_seq(pack_attn_out) + residual_und = gen_attn_out.new_empty(0, gen_attn_out.shape[-1]) + residual_gen = get_gen_seq(input) + gen_attn_out + else: + # STANDARD PATH: Process both und and gen tokens + pack_attn_out, kv_to_store = self.self_attn( + pack_norm_out, + attention_mask, + packed_position_embeddings, + natten_metadata=natten_metadata, + memory_value=memory_value, + ) + residual_und = get_und_seq(input) + get_und_seq(pack_attn_out) # [N_und,hidden_size] + residual_gen = get_gen_seq(input) + get_gen_seq(pack_attn_out) # [N_gen,hidden_size] + + # Pre-MLP layernorm and processing + lbl_metadata_dict: dict[str, LBLMetadata] = dict() + + if gen_only: + # gen_only: skip und, compute gen tokens only + ln_out_und = residual_gen.new_empty(0, residual_gen.shape[-1]) + ln_out_gen = self.post_attention_layernorm_moe_gen(residual_gen) + + # UNPAD MLP INPUT (gen only) + gen_len = pack_attn_out["_num_full_tokens"] + ln_out_gen_unpadded = ln_out_gen[:gen_len] # [N_gen_unpadded,hidden_size] + + # Run MLP (gen only) + mlp_out_gen_unpadded, lbl_metadata_gen = _run_mlp(self.mlp_moe_gen, ln_out_gen_unpadded) + # mlp_out_gen_unpadded: [N_gen_unpadded,hidden_size] + + # PAD MLP OUTPUT (gen only) + mlp_out_gen = torch.cat([mlp_out_gen_unpadded, ln_out_gen[gen_len:]], dim=0) # [N_gen,hidden_size] + + # Build metadata dict (no und metadata in optimized path) + if lbl_metadata_gen is not None: + lbl_metadata_dict["gen"] = lbl_metadata_gen + + # Final output with residual (gen only) + mlp_out_und_seq = 
residual_gen.new_empty(0, residual_gen.shape[-1]) + mlp_out_gen_seq = residual_gen + mlp_out_gen + else: + # STANDARD PATH: Process both und and gen tokens + ln_out_und = self.post_attention_layernorm(residual_und) # [N_und,hidden_size] + ln_out_gen = self.post_attention_layernorm_moe_gen(residual_gen) # [N_gen,hidden_size] + + # UNPAD MLP INPUT =============== + + # Avoid artificial expert imbalance due to routing padding tokens. + gen_len = pack_attn_out["_num_full_tokens"] + und_len = pack_attn_out["_num_causal_tokens"] + ln_out_und_unpadded = ln_out_und[:und_len] # [N_und_unpadded,hidden_size] + ln_out_gen_unpadded = ln_out_gen[:gen_len] # [N_gen_unpadded,hidden_size] + + mlp_out_und_unpadded, lbl_metadata_und = _run_mlp(self.mlp, ln_out_und_unpadded) + # mlp_out_und_unpadded: [N_und_unpadded,hidden_size] + mlp_out_gen_unpadded, lbl_metadata_gen = _run_mlp(self.mlp_moe_gen, ln_out_gen_unpadded) + # mlp_out_gen_unpadded: [N_gen_unpadded,hidden_size] + + # PAD MLP OUTPUT =============== + mlp_out_und = torch.cat([mlp_out_und_unpadded, ln_out_und[und_len:]], dim=0) # [N_und,hidden_size] + mlp_out_gen = torch.cat([mlp_out_gen_unpadded, ln_out_gen[gen_len:]], dim=0) # [N_gen,hidden_size] + + if lbl_metadata_und is not None: + lbl_metadata_dict["und"] = lbl_metadata_und + if lbl_metadata_gen is not None: + lbl_metadata_dict["gen"] = lbl_metadata_gen + + mlp_out_und_seq = residual_und + mlp_out_und # [N_und,hidden_size] + mlp_out_gen_seq = residual_gen + mlp_out_gen # [N_gen,hidden_size] + + return from_und_gen_splits(mlp_out_und_seq, mlp_out_gen_seq, input), lbl_metadata_dict, kv_to_store + + +# Backward-compat alias: serialized checkpoint configs reference the old name. +Qwen3VLTextMoTDecoderLayer = MoTDecoderLayer + + +class Qwen3VLTextModel(Qwen3VLPreTrainedModel): + """ + Qwen3VL text model for MoT with dense MLPs. + This is a wrapper around the _impl_* helpers defined above, + specialized for dense models. 
+ """ + + def __init__(self, config: Qwen3VLMoeTextConfig): + super().__init__(config) + _impl_init(self, config, layer_types=LayerTypes("qwen3_vl_dense")) + + def init_taylorseer(self, cache_dic=None, current=None): + _impl_init_taylorseer(self, cache_dic=cache_dic, current=current) + + def forward(self, *args, **kwargs): + return _impl_forward(self, *args, **kwargs) + + +class Qwen3VLMoeTextModel(Qwen3VLMoePreTrainedModel): + """ + Qwen3VL text model for MoT with MoE MLPs. + This is a wrapper around the _impl_* helpers defined above, + specialized for MoE models. + """ + + def __init__(self, config: Qwen3VLMoeTextConfig): + super().__init__(config) + _impl_init(self, config, layer_types=LayerTypes("qwen3_vl_moe")) + + def init_taylorseer(self, cache_dic=None, current=None): + _impl_init_taylorseer(self, cache_dic=cache_dic, current=current) + + def forward(self, *args, **kwargs): + return _impl_forward(self, *args, **kwargs) + + +class Qwen3VLTextForCausalLM(Qwen3VLPreTrainedModel): + """ + Qwen3VL text causal language model for MoT. + This variant is used for dense-only MLP models. 
+ """ + + _tied_weights_keys = ["lm_head.weight"] + + def __init__(self, config: Qwen3VLTextConfig): + super().__init__(config) + self.model = Qwen3VLTextModel(config) + self.vocab_size = config.vocab_size + self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False) + + # Initialize weights and apply final processing + self.post_init() + + def init_moe(self) -> None: + """Initialize MoE/MoT weights by copying understanding to generation pathway.""" + state_dict = self.state_dict() + for name, param in self.named_parameters(): + if "moe_gen" in name: + original_name = name.replace("_moe_gen", "").replace("_checkpoint_wrapped_module.", "") + if original_name in state_dict: + param.data.copy_(state_dict[original_name].data) + else: + raise ValueError(f"Could not find {original_name} in state_dict for initialization of {name}") + + def forward( + self, + pack: FactoredSequencePack, + attention_mask, + position_ids: torch.Tensor, + natten_metadata_list: list | None = None, + memory: MemoryState | None = None, + ) -> tuple[FactoredSequencePack, dict[str, LBLMetadata]]: + """Training forward pass - simplified to match qwen3_mot""" + outputs = self.model( + pack=pack, + attention_mask=attention_mask, + position_ids=position_ids, + natten_metadata_list=natten_metadata_list, + memory=memory, + ) + return outputs + + +class Qwen3VLMoeTextForCausalLM(Qwen3VLMoePreTrainedModel): + """ + Qwen3VL text causal language model for MoT with MoE on the generation pathway. + This variant is used for MoE MLP models. 
+ """ + + _tied_weights_keys = ["lm_head.weight"] + + def __init__(self, config: Qwen3VLMoeTextConfig): + super().__init__(config) + self.model = Qwen3VLMoeTextModel(config) + self.vocab_size = config.vocab_size + self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False) + + # Initialize weights and apply final processing + self.post_init() + + def init_moe(self) -> None: + """Initialize MoE/MoT weights by copying understanding to generation pathway.""" + state_dict = self.state_dict() + for name, param in self.named_parameters(): + if "moe_gen" in name: + original_name = name.replace("_moe_gen", "").replace("_checkpoint_wrapped_module.", "") + if original_name in state_dict: + param.data.copy_(state_dict[original_name].data) + else: + raise ValueError(f"Could not find {original_name} in state_dict for initialization of {name}") + + def forward( + self, + pack: FactoredSequencePack, + attention_mask, + position_ids: torch.Tensor, + natten_metadata_list: list | None = None, + memory: MemoryState | None = None, + ) -> tuple[FactoredSequencePack, dict[str, torch.Tensor]]: + """Training forward pass - simplified to match qwen3_mot""" + + outputs = self.model( + pack=pack, + attention_mask=attention_mask, + position_ids=position_ids, + natten_metadata_list=natten_metadata_list, + memory=memory, + ) + + return outputs + + +# ----------------------------------------------------------------------------- +# Nemotron 3 Dense VL MoT model wrappers +# ----------------------------------------------------------------------------- + + +class Nemotron3DenseVLTextModel(Nemotron3DenseVLPreTrainedModel): + """Nemotron 3 Dense VL text model adapted for MoT training.""" + + def __init__(self, config: Nemotron3DenseVLTextConfig) -> None: + super().__init__(config) + _impl_init(self, config, layer_types=LayerTypes("nemotron_dense")) + + def init_taylorseer(self, cache_dic=None, current=None) -> None: + _impl_init_taylorseer(self, cache_dic=cache_dic, current=current) + 
+ def forward(self, *args, **kwargs): + return _impl_forward(self, *args, **kwargs) + + +class Nemotron3DenseVLTextForCausalLM(Nemotron3DenseVLPreTrainedModel): + """Causal LM head on top of the Nemotron 3 Dense VL MoT text model.""" + + _tied_weights_keys: list[str] = [] + + def __init__(self, config: Nemotron3DenseVLTextConfig) -> None: + super().__init__(config) + self.model = Nemotron3DenseVLTextModel(config) + self.vocab_size = config.vocab_size + self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False) + self.post_init() + + def init_moe(self) -> None: + """Copy understanding-pathway weights into the generation-pathway parameters.""" + state_dict = self.state_dict() + for name, param in self.named_parameters(): + if "moe_gen" not in name: + continue + original_name = name.replace("_moe_gen", "").replace("_checkpoint_wrapped_module.", "") + if original_name in state_dict: + param.data.copy_(state_dict[original_name].data) + elif any(norm_key in original_name for norm_key in ("q_norm", "k_norm")): + # qk_norm_for_text=False → q_norm/k_norm are nn.Identity() with no parameters; + # the moe_gen counterpart (q_norm_moe_gen) is a real RMSNorm, so skip init here. 
+ pass + else: + raise ValueError(f"Could not find {original_name} in state_dict for initialization of {name}") + + def forward( + self, + pack: FactoredSequencePack, + attention_mask, + position_ids: torch.Tensor, + natten_metadata_list: list | None = None, + memory: MemoryState | None = None, + ) -> tuple[FactoredSequencePack, dict[str, LBLMetadata]]: + return self.model( + pack=pack, + attention_mask=attention_mask, + position_ids=position_ids, + natten_metadata_list=natten_metadata_list, + memory=memory, + ) diff --git a/cosmos-inference/cosmos3/_src/vfm/models/omni_mot_model.py b/cosmos-inference/cosmos3/_src/vfm/models/omni_mot_model.py new file mode 100644 index 00000000..a6175e33 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/omni_mot_model.py @@ -0,0 +1,3037 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from __future__ import annotations + +import collections +from contextlib import contextmanager +from typing import Any, Callable, Dict, Mapping, Optional, Tuple + +import numpy as np +import torch +import torch.distributed as dist +from einops import rearrange +from torch.distributed._composable.fsdp import FSDPModule +from torch.nn.modules.module import _IncompatibleKeys + +from cosmos3._src.imaginaire.flags import DEVICE, TRAINING, Device +from cosmos3._src.imaginaire.lazy_config import LazyDict +from cosmos3._src.imaginaire.lazy_config import instantiate as lazy_instantiate +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import log, misc +from cosmos3._src.imaginaire.utils.count_params import count_params +from cosmos3._src.imaginaire.utils.timer import Timer +from cosmos3._src.vfm.algorithm.loss.flow_matching import compute_flow_matching_loss +from cosmos3._src.vfm.algorithm.loss.load_balancing import compute_load_balancing_loss +from cosmos3._src.vfm.configs.base.defaults.model_config import OmniMoTModelConfig +from cosmos3._src.vfm.datasets.sequence_packing import ( + PackedSequence, + SequencePlan, + add_special_tokens, + build_sequence_plans_from_data_batch, + pack_input_sequence, +) +from cosmos3._src.vfm.datasets.utils import VIDEO_RES_SIZE_INFO +from cosmos3._src.vfm.diffusion.rectified_flow import RectifiedFlow +from cosmos3._src.vfm.diffusion.samplers.edm import EDMSampler +from cosmos3._src.vfm.diffusion.samplers.fixed_step import FixedStepSampler +from cosmos3._src.vfm.diffusion.samplers.unipc import UniPCSampler, UniPCSamplerConfig +from cosmos3._src.vfm.models.mot.cosmos3_vfm_network import Cosmos3VFMNetwork, Cosmos3VFMNetworkConfig +from cosmos3._src.vfm.models.mot.modeling_utils import has_noisy_tokens +from cosmos3._src.vfm.models.mot.parallelize_vfm_network import parallelize_vfm_network +from cosmos3._src.vfm.models.utils.data_and_condition import ( + GenerationDataClean, + 
GenerationDataNoised, + _expand_per_sample_to_per_vision_item, + build_dense_sound_schedule, + unwrap_and_densify, +) +from cosmos3._src.vfm.models.utils.memory import MemoryState +from cosmos3._src.vfm.models.utils.safetensors_loader import load_language_model as load_language_model_safetensors +from cosmos3._src.vfm.models.vlm.qwen3_vl.utils import tokenize_caption +from cosmos3._src.vfm.tokenizers.interface import VideoTokenizerInterface +from cosmos3._src.vfm.utils.data_utils import get_vision_data_resolution +from cosmos3._src.vfm.utils.dtensor_helper import DTensorFastEmaModelUpdater +from cosmos3._src.vfm.utils.model_weights_stats import WeightTrainingStat +from cosmos3._src.vfm.utils.parallelism import ParallelDims + + +class OmniMoTModel(ImaginaireModel): + """ + Mixture of Transformers (MoT) model to be trained with the flow matching objective + for visual / sound / action generation. + """ + + def __init__(self, config: OmniMoTModelConfig): + super().__init__() + self.config = config + log.info(f"OmniMoTModel: config {self.config}") + # 0. Set up precision + self.set_precision() + + # 1. Set data keys and data information + self.set_up_data_key() + + # 2. Text, vision, audio, action tokenizers + self.set_up_tokenizers() + + # 3. FSDP setup. Note: call this before building the model. + self.set_up_parallelism() + + # 4. Build the denoiser network + self.set_up_model() + + # 5. Set up training time scheduler and inference time sampler + self.set_up_scheduler_and_sampler() + + self.log_enc_time_every_n = config.log_enc_time_every_n + + def set_precision(self) -> None: + self.precision = getattr(torch, self.config.parallelism.precision) + self.tensor_kwargs = {"device": DEVICE, "dtype": self.precision} + self.tensor_kwargs_fp32 = {"device": DEVICE, "dtype": torch.float32} + log.warning(f"OmniMoTModel: precision {self.precision}") + + # Disable TF32 for CUDA matrix multiplications since this may impact model quality. 
+ torch.backends.cudnn.allow_tf32 = torch.backends.cuda.matmul.allow_tf32 = False + + def set_up_data_key(self) -> None: + + self.input_video_key = self.config.input_video_key # defaults to the video key for video diffusion models + self.input_image_key = self.config.input_image_key + self.input_caption_key = self.config.input_caption_key + + @misc.timer("OmniMoTModel: set_up_tokenizers") + def set_up_tokenizers(self) -> None: + """ + Variable names follow the naming convention: + - tokenizer_{modality}_gen if used for the generation branch + - tokenizer_{modality}_und if used for the understanding branch + """ + # 1. Text tokenizer + self.vlm_config = self.config.vlm_config + _vlm_proc = lazy_instantiate(self.vlm_config.tokenizer) + vlm_tokenizer = _vlm_proc.tokenizer + vlm_tokenizer, special_tokens = add_special_tokens(vlm_tokenizer) + self.vlm_tokenizer = vlm_tokenizer + + self.llm_special_tokens = special_tokens + self.llm_special_tokens["eos_token_id"] = vlm_tokenizer.eos_token_id + + # 2. Vision tokenizer (images/videos) for generation. + self.tokenizer_vision_gen: VideoTokenizerInterface = lazy_instantiate(self.config.tokenizer) + assert self.tokenizer_vision_gen.latent_ch == self.config.state_ch, ( + f"vision tokenizer latent_ch {self.tokenizer_vision_gen.latent_ch} != state_ch {self.config.state_ch}" + ) + if hasattr(self.tokenizer_vision_gen, "reset_dtype"): + self.tokenizer_vision_gen.reset_dtype() + + # 3. 
Sound/audio tokenizer (optional) + if self.config.sound_gen: + assert self.config.sound_tokenizer is not None, "sound_tokenizer must be provided when sound_gen is True" + self.tokenizer_sound_gen = lazy_instantiate(self.config.sound_tokenizer) + assert self.config.sound_dim is not None, "sound_dim must be provided when sound_gen is True" + assert self.tokenizer_sound_gen.latent_ch == self.config.sound_dim, ( + f"sound tokenizer latent_ch {self.tokenizer_sound_gen.latent_ch} != sound_dim {self.config.sound_dim}" + ) + if hasattr(self.tokenizer_sound_gen, "reset_dtype"): + self.tokenizer_sound_gen.reset_dtype() + log.info(f"Sound tokenizer initialized: {type(self.tokenizer_sound_gen).__name__}") + else: + self.tokenizer_sound_gen = None + + + + def build_net(self, dtype: torch.dtype): + # Build model network and parallelize it. + with torch.device("meta"): + assert self.vlm_config.model_instance is not None, "Model instance should be specified" + + language_model = lazy_instantiate(self.vlm_config.model_instance) + + # Timesteps reach the network in scheduler units + # (i.e., roughly [0, num_train_timesteps]). The MoT network expects to internally + # rescale timesteps before embedding; avoid hard-coding 1e-3 by computing it from + # the configured scheduler resolution. 
+ num_train_timesteps = self.config.rectified_flow_inference_config.num_train_timesteps + network_config = Cosmos3VFMNetworkConfig( + vlm_config=language_model.config, + latent_patch_size=self.config.diffusion_expert_config.patch_spatial, + latent_downsample_factor=self.config.latent_downsample_factor, + latent_channel_size=self.config.state_ch, + max_latent_h=self.config.diffusion_expert_config.max_vae_latent_side_after_patchify, + max_latent_w=self.config.diffusion_expert_config.max_vae_latent_side_after_patchify, + max_latent_t=self.config.state_t, + rope_h_extrapolation_ratio=self.config.diffusion_expert_config.rope_h_extrapolation_ratio, + rope_w_extrapolation_ratio=self.config.diffusion_expert_config.rope_w_extrapolation_ratio, + rope_t_extrapolation_ratio=self.config.diffusion_expert_config.rope_t_extrapolation_ratio, + enable_fps_modulation=self.config.diffusion_expert_config.enable_fps_modulation, + base_fps=self.config.diffusion_expert_config.base_fps, + vision_gen=self.config.vision_gen, + action_gen=self.config.action_gen, + sound_gen=self.config.sound_gen, + position_embedding_type=self.config.diffusion_expert_config.position_embedding_type, + joint_attn_implementation=self.config.joint_attn_implementation, + timestep_scale=1.0 / float(num_train_timesteps) * self.config.diffusion_expert_config.timestep_range, + action_dim=self.config.max_action_dim, + num_embodiment_domains=self.config.num_embodiment_domains, + temporal_compression_factor_vision=self.tokenizer_vision_gen.temporal_compression_factor, + natten_parameter_list=self.config.natten_parameter_list, + video_temporal_causal=self.config.video_temporal_causal, + # Sound generation parameters + sound_dim=self.config.sound_dim, + sound_latent_fps=self.config.sound_latent_fps, + ) + network_config._attn_implementation_internal = "eager" + net = Cosmos3VFMNetwork( + language_model=language_model, + config=network_config, + ) + net.pad_for_cuda_graphs = self.config.parallelism.use_cuda_graphs + + # 
Inject LoRA BEFORE FSDP wrap, while still on meta device. The + # injector must see unsharded Linear shapes; injecting post-FSDP causes + # lora_B to be created at the per-rank shard size and crashes at + # forward time. See `OmniMoTModel.add_lora` for details. + if getattr(self.config, "lora_enabled", False): + net = self.add_lora( + net, + lora_rank=self.config.lora_rank, + lora_alpha=self.config.lora_alpha, + lora_target_modules=self.config.lora_target_modules, + ) + + self.install_attention_dispatch(net) + + net = parallelize_vfm_network( + net, + parallel_dims=self.parallel_dims, + config=self.config.parallelism, + ) + + with misc.timer("meta to cuda and broadcast model states"): + net = net.to(dtype=dtype) + net.to_empty(device=DEVICE) + if DEVICE == Device.CUDA: + # Weight initialization is not needed for other devices (cpu, + # meta), since they are only for checkpoint conversion and smoke + # tests. + net.init_weights(buffer_device=DEVICE) + if getattr(self.config, "lora_enabled", False): + self._init_lora_weights_post_materialization(net) + + return net + + def load_pretrained_model_if_needed(self): + """ + This function is used to load the pretrained model weights from HF if needed. + + 1. If self.vlm_config.load_pretrained is False, we skip loading the pretrained + model weights. + 2. If self.vlm_config.load_pretrained is True, and + self.config.diffusion_expert_config.load_weights_from_pretrained is True, + we load the understanding pathway weights from HF, and copy them to the + generation pathway. + 3. If self.vlm_config.load_pretrained is True, and + self.config.diffusion_expert_config.load_weights_from_pretrained is False, + we load the understanding pathway weights from HF, but do not copy them to + the generation pathway. This is used when we warm-start from a load_path + (but no previous checkpoint exists), and we want to switch the understanding + pathway weights to a new model (e.g., Qwen3-VL to Cosmos-Reason2). 
+ """ + if not self.vlm_config.load_pretrained: + return + + def _load_language_model(net: torch.nn.Module): + load_language_model_safetensors( + model=net.language_model, + checkpoint_path=self.vlm_config.checkpoint_path, + credential_path=self.vlm_config.credential_path, + parallel_dims=self.parallel_dims, + checkpoint_format=getattr(self.vlm_config, "vlm_checkpoint_format", None), + ) + + # When specified, we load pretrained LLM weights. + log.info(f"Loading understanding pathway weights from {self.vlm_config.checkpoint_path}") + _load_language_model(self.net) + if self.config.ema.enabled: + _load_language_model(self.net_ema) + log.info("Successfully loaded understanding pathway weights.") + + if self.config.diffusion_expert_config.load_weights_from_pretrained: + log.info("Copying understanding pathway weights to generation pathway.") + self.net.language_model.init_moe() + if self.config.ema.enabled: + self.net_ema.language_model.init_moe() + log.info("Successfully copied understanding pathway weights to generation pathway.") + + @misc.timer("OmniMoTModel: set_up_model") + def set_up_model(self): + assert hasattr(self, "parallel_dims"), "parallel_dims must be set" + config = self.config + with misc.timer("Creating PyTorch model and ema if enabled"): + self.net = self.build_net(dtype=self.precision) + self._param_count = count_params(self.net, verbose=False) + + if config.ema.enabled: + self.net_ema = self.build_net(dtype=torch.float32) + self.net_ema.requires_grad_(False) + + self.net_ema_worker = DTensorFastEmaModelUpdater() + + + s = config.ema.rate + self.ema_exp_coefficient = np.roots([1, 7, 16 - s**-2, 12 - s**-2]).real.max() + + self.net_ema_worker.copy_to(src_model=self.net, tgt_model=self.net_ema) + + self.set_up_memory() + + torch.cuda.empty_cache() + + def install_attention_dispatch(self, net: torch.nn.Module) -> None: + """Install a custom attention dispatch function on the network. 
+ + Called during ``build_net()`` after the network is constructed but + before parallelization. The base implementation is a no-op; + ``OmniMoTCausalModel`` overrides this to install + ``dispatch_attention_with_memory`` on every attention layer. + """ + pass + + def set_up_memory(self) -> None: + """Initialize memory state used during training (e.g. KV caches). + + The base implementation is a no-op. ``OmniMoTCausalModel`` overrides + this to allocate a KV cache. + """ + pass + + def set_up_parallelism(self) -> None: + """Set up the fsdp for the model.""" + if not torch.distributed.is_initialized(): + self.parallel_dims = None + return + + self.parallel_dims = ParallelDims( + enable_inference_mode=self.config.parallelism.enable_inference_mode, + world_size=torch.distributed.get_world_size(), + dp_shard=self.config.parallelism.data_parallel_shard_degree, + cfgp=self.config.parallelism.cfg_parallel_shard_degree, + cp=self.config.parallelism.context_parallel_shard_degree, + ) + self.parallel_dims.build_meshes(device_type=DEVICE) + + def set_up_scheduler_and_sampler(self): + # Get shift value - support both int and dict-based resolution lookup + # For scheduler initialization, use model's configured resolution + shift_config = self.config.rectified_flow_training_config.shift + if isinstance(shift_config, int): + shift = shift_config + else: + # shift set in RectifiedFlow is only used during inference. + # So, set it to the resolution of the model. + # This part gets executed only when we specify shift as a dict + # This is needed during multi-resolution training. + shift_dict = dict(shift_config) + resolution = self.config.resolution + if resolution not in shift_dict: + raise ValueError( + f"Resolution '{resolution}' not found in shift dict. 
Available resolutions: {list(shift_dict.keys())}" + ) + shift = shift_dict[resolution] + + # Rectified Flow timestep scheduler and sampler for training (separate for image and video) + if self.config.vision_gen: + self.rectified_flow_image = RectifiedFlow( + velocity_field=self.net, + train_time_distribution=self.config.rectified_flow_training_config.train_time_image_distribution, + use_dynamic_shift=self.config.rectified_flow_training_config.use_dynamic_shift, + shift=shift, + train_time_weight_method=self.config.rectified_flow_training_config.train_time_weight, + device=torch.device(DEVICE), + dtype=self.tensor_kwargs_fp32["dtype"], + ) + self.rectified_flow_video = RectifiedFlow( + velocity_field=self.net, + train_time_distribution=self.config.rectified_flow_training_config.train_time_video_distribution, + use_dynamic_shift=self.config.rectified_flow_training_config.use_dynamic_shift, + shift=shift, + train_time_weight_method=self.config.rectified_flow_training_config.train_time_weight, + device=torch.device(DEVICE), + dtype=self.tensor_kwargs_fp32["dtype"], + ) + if self.config.action_gen: + self.rectified_flow_action = RectifiedFlow( + velocity_field=self.net, + train_time_distribution=self.config.rectified_flow_training_config.train_time_action_distribution, + use_dynamic_shift=self.config.rectified_flow_training_config.use_dynamic_shift, + shift=shift, + train_time_weight_method=self.config.rectified_flow_training_config.train_time_weight, + device=torch.device(DEVICE), + dtype=self.tensor_kwargs_fp32["dtype"], + ) + if self.config.sound_gen: + self.rectified_flow_sound = RectifiedFlow( + velocity_field=self.net, + train_time_distribution=self.config.rectified_flow_training_config.train_time_sound_distribution, + use_dynamic_shift=self.config.rectified_flow_training_config.use_dynamic_shift, + shift=shift, + train_time_weight_method=self.config.rectified_flow_training_config.train_time_weight, + device=torch.device(DEVICE), + 
dtype=self.tensor_kwargs_fp32["dtype"], + ) + + # Denoising sampler (solver) for inference + assert self.config.rectified_flow_inference_config.scheduler_type in ["unipc", "edm"] + if self.config.rectified_flow_inference_config.scheduler_type == "unipc": + unipc_sampler_config = UniPCSamplerConfig( + num_train_timesteps=self.config.rectified_flow_inference_config.num_train_timesteps, + shift=self.config.rectified_flow_inference_config.shift, + use_dynamic_shifting=self.config.rectified_flow_inference_config.use_dynamic_shifting, + ) + self.sampler = UniPCSampler(cfg=unipc_sampler_config, tensor_kwargs=self.tensor_kwargs) + else: + self.sampler = EDMSampler() + + # Fixed-step sampler for distilled models (None for base models) + if self.config.fixed_step_sampler_config is not None: + cfg = self.config.fixed_step_sampler_config + self.fixed_step_sampler = FixedStepSampler( + t_list=list(cfg.t_list), + sample_type=cfg.sample_type, + num_train_timesteps=float(self.config.rectified_flow_inference_config.num_train_timesteps), + ) + else: + self.fixed_step_sampler = None + + def init_optimizer_scheduler( + self, optimizer_config: LazyDict, scheduler_config: LazyDict + ) -> tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LRScheduler]: + """Creates the optimizer and scheduler for the model. + + Args: + optimizer_config (LazyDict): The lazy config for the optimizer. + scheduler_config (LazyDict): The lazy config for the learning rate scheduler. + + Returns: + optimizer (torch.optim.Optimizer): The model optimizer. + scheduler (torch.optim.lr_scheduler.LRScheduler): The optimization scheduler. + """ + + optimizer = lazy_instantiate(optimizer_config, model=self) + scheduler = lazy_instantiate(scheduler_config, optimizer=optimizer) + return optimizer, scheduler + + def _derive_include_end_of_generation_token(self) -> bool: + impl = self.config.joint_attn_implementation + assert impl in ("flex", "two_way", "three_way"), ( + f"Invalid joint_attn_implementation: {impl}. 
Must be 'flex', 'two_way', or 'three_way'." + ) + return impl == "flex" + + # ------------------------ training hooks ------------------------ + def on_before_zero_grad( + self, optimizer: torch.optim.Optimizer, scheduler: torch.optim.lr_scheduler.LRScheduler, iteration: int + ) -> None: + """ + update the net_ema + """ + del scheduler, optimizer + + if self.config.ema.enabled: + # calculate beta for EMA update + ema_beta = self.ema_beta(iteration) + self.net_ema_worker.update_average(self.net, self.net_ema, beta=ema_beta) + + # ------------------------ helpers ------------------------ + + def _pack_input_sequence( + self, + sequence_plans: list[SequencePlan], + input_text_indexes: list[list[int]], + gen_data_clean: GenerationDataClean, + input_timesteps: torch.Tensor, + include_end_of_generation_token: bool = False, + skip_text_tokens: bool = False, + initial_mrope_temporal_offset: int | float = 0, + ) -> PackedSequence: + """Wrap ``pack_input_sequence`` with all config-derived args pre-filled. + + Centralises the 9 config-derived positional/embedding args so callers only + supply the four per-call arguments (sequence_plans, text tokens, data, timesteps) + plus three optional flags. 
+ """ + assert self.tokenizer_vision_gen is not None + return pack_input_sequence( + sequence_plans=sequence_plans, + input_text_indexes=input_text_indexes, + gen_data_clean=gen_data_clean, + input_timesteps=input_timesteps, + special_tokens=self.llm_special_tokens, + latent_patch_size=self.config.diffusion_expert_config.patch_spatial, + skip_text_tokens=skip_text_tokens, + include_end_of_generation_token=include_end_of_generation_token, + position_embedding_type=self.config.diffusion_expert_config.position_embedding_type, + unified_3d_mrope_reset_spatial_ids=self.config.diffusion_expert_config.unified_3d_mrope_reset_spatial_ids, + unified_3d_mrope_temporal_modality_margin=self.config.diffusion_expert_config.unified_3d_mrope_temporal_modality_margin, + enable_fps_modulation=self.config.diffusion_expert_config.enable_fps_modulation, + base_fps=float(self.config.diffusion_expert_config.base_fps), + temporal_compression_factor=self.tokenizer_vision_gen.temporal_compression_factor, + video_temporal_causal=self.config.video_temporal_causal, + action_dim=self.config.max_action_dim, + initial_mrope_temporal_offset=initial_mrope_temporal_offset, + ) + + # ------------------------ training ------------------------ + + def memory_init_training( + self, + gen_data_clean: GenerationDataClean, + data_batch: dict[str, torch.Tensor], + input_text_indexes: list[list[int]], + ) -> tuple[GenerationDataClean, dict]: + """Prepare the memory for a single training step. + + Called at the start of ``training_step`` to give the causal subclass + an injection point for memory-based segment handling (frame trimming, + segment bookkeeping, cache resets, packing overrides). + + The base implementation returns *gen_data_clean* unmodified and a + default memory_info dict that does not support memory-backed training. + + The ``skip_text`` and ``initial_temporal_offset`` fields are required, + and are used for both sequence packing and memory. 
+ + Returns: + ``(gen_data_clean, memory_info)`` where *memory_info* is a dict with keys: + ``skip_text``, ``initial_temporal_offset`` + """ + return gen_data_clean, { + "skip_text": False, + "initial_temporal_offset": 0, + } + + def build_memory_state( + self, + packed_seq: PackedSequence, + memory_info: dict, + ) -> MemoryState | None: + """Construct a ``MemoryState`` from a packed sequence and context dict. + + Called after packing in ``training_step()``, and before ``denoise()`` + in AR inference. The base implementation returns ``None`` (no + persistent memory). ``OmniMoTCausalModel`` overrides this to build + the appropriate ``ARMemoryState`` or ``KVCacheTrainMemoryState``. + + Args: + packed_seq: The packed multi-modal sequence produced by + ``_pack_input_sequence``. + memory_info: Context dict returned by ``memory_init_training()`` + (for the training path) or constructed by the AR inference + caller. See ``memory_init_training()`` for the base keys. + """ + return None + + def pre_noise_memory_hook( + self, + packed_sequence: PackedSequence, + gen_data_clean: GenerationDataClean, + memory_info: dict, + ) -> dict: + """Hook called after sequence packing and before noising. Returns (possibly updated) memory_info. + + The packed sequence still contains clean tokens at this point. + Override in subclasses to run a clean forward pass (e.g. for teacher forcing). + """ + return memory_info + + def training_step( + self, data_batch: dict[str, torch.Tensor], iteration: int + ) -> tuple[dict[str, torch.Tensor], torch.Tensor]: + """ + Performs a single training step for the rectified-flow (flow-matching) model. + + This method executes one iteration of the model's training. It involves: + 1. Tokenizing generation modalities (vision/action/sound) into latents (tokens). + 2. Sampling a training timestep (t) for each modality and constructing noised latents (xt) + per the rectified-flow formulation. + 3. 
Packing text + generation tokens into a single sequence and running the MoT network to predict + the flow field velocity at the given t. + 4. Computing flow-matching loss (plus optional auxiliary load-balancing losses). + + Args: + data_batch (dict): Raw data batch drawn from the training data loader. + iteration (int): Current iteration number. + + Returns: + tuple: A tuple containing two elements: + - dict: Additional data used for debugging, logging, and callbacks. + - Tensor: The computed loss for the training step as a PyTorch Tensor. + + """ + if self.parallel_dims is None or self.parallel_dims.cp_rank == 0: + self._update_train_stats(data_batch) + + # Load, apply dropout, and tokenize input captions + input_text_indexes = self._load_and_tokenize_text_data(data_batch, iteration) + + # Build sequence plans if not present. SequencePlan has the conditioning information. + sequence_plans = build_sequence_plans_from_data_batch( + data_batch=data_batch, + input_video_key=self.input_video_key, + input_image_key=self.input_image_key, + ) + + # Get data from raw data batch and tokenize into corresponding tokens for *generation* task + # The unnoised, tokenized data for the generation task. + gen_data_clean = self.get_data_and_condition(data_batch, iteration=iteration) + + gen_data_clean, memory_info = self.memory_init_training(gen_data_clean, data_batch, input_text_indexes) + + # Compute resolution per sample for per-sample shift lookup + # image_size[i] may be (1, 4) from IterativeJointDataLoader or (4,) from custom_collate_fn. 
+ if "image_size" in data_batch: + data_resolutions = [] + for i in range(gen_data_clean.batch_size): + img_size = data_batch["image_size"][i] + if img_size.dim() == 2: + img_size = img_size[0] + target_h = int(img_size[0].item()) + target_w = int(img_size[1].item()) + data_resolutions.append(get_vision_data_resolution((target_h, target_w))) + else: + data_resolutions = None + + # Calculate number of tokens per sample (before 2x2 merge) for dynamic shift + # gen_data_clean.x0_tokens_vision: B, C, T, H, W + assert all(x.shape[0] == 1 for x in gen_data_clean.x0_tokens_vision), ( + "Batch size must be 1 for individual samples" + ) + num_tokens_per_sample = [x.shape[2] * x.shape[3] * x.shape[4] for x in gen_data_clean.x0_tokens_vision] + + # Sample a random noise level (sigma) and corresponding interpolation coefficient ("timesteps" in RF) + # Apply shift per sample based on each sample's resolution + num_vision_latent_frames = [x.shape[2] for x in gen_data_clean.x0_tokens_vision] + timesteps_vision, sigmas_vision = self._get_train_noise_level_vision( + batch_size=gen_data_clean.batch_size, + is_image_batch=gen_data_clean.is_image_batch, + resolutions=data_resolutions, + num_vision_latent_frames=num_vision_latent_frames, + num_tokens=num_tokens_per_sample, + iteration=iteration, + ) # [B, T_vis] each + + # Optional independent action schedule (sampled from rectified_flow_action with + # action-specific shift/high-sigma overrides). Only active when the config opts in and + # the batch contains action data. + # + # Mixed-batch indexing: gen_data_clean.x0_tokens_action (and every packed_sequence.action.* + # field) is *dense* — one entry per sample with has_action=True, in the original batch order + # but skipping non-action samples. To feed each dense action entry its sample's sigma, we + # sample σ for the full batch and reindex with action_sample_indices (the batch positions + # of action-bearing samples). 
This avoids the mismatch that happens when, e.g., batch + # sample 1 has action but the dense entry 0 would otherwise read σ from batch position 0. + rf_cfg = self.config.rectified_flow_training_config + action_sample_indices = [i for i, plan in enumerate(sequence_plans) if plan.has_action] + if rf_cfg.independent_action_schedule and action_sample_indices: + ts_full, sg_full = self._get_train_noise_level_action( + batch_size=gen_data_clean.batch_size, iteration=iteration + ) # [B, 1] each + idx = torch.tensor(action_sample_indices, dtype=torch.long) # [n_action] + timesteps_action = ts_full[idx] # [n_action, 1] + sigmas_action = sg_full[idx] # [n_action, 1] + else: + timesteps_action, sigmas_action = (None, None) + + # Optional independent sound schedule: sample a scalar sound sigma per batch + # slot, then reindex to the dense audio-bearing subset. + sound_sample_indices = [i for i, plan in enumerate(sequence_plans) if getattr(plan, "has_sound", False)] + if getattr(rf_cfg, "independent_sound_schedule", False) and sound_sample_indices: + ts_sound_full, sg_sound_full = self._get_train_noise_level_sound( + batch_size=gen_data_clean.batch_size + ) # [B,1] each + timesteps_sound, sigmas_sound = build_dense_sound_schedule( + sequence_plans, + gen_data_clean.x0_tokens_sound, + ts_sound_full, + sg_sound_full, + ) # [n_sound,1], [n_sound,1] + else: + timesteps_sound, sigmas_sound = (None, None) + + # Broadcast timesteps/sigmas across CP group to ensure consistency + if self.parallel_dims is not None and self.parallel_dims.cp_enabled: + src_rank = 0 # use cp rank 0 to broadcast timesteps/sigmas + cp_group = self.parallel_dims.cp_mesh.get_group() + global_src_rank = torch.distributed.get_global_rank(cp_group, src_rank) + timesteps_vision = timesteps_vision.contiguous() + sigmas_vision = sigmas_vision.contiguous() + torch.distributed.broadcast(timesteps_vision, src=global_src_rank, group=cp_group) + torch.distributed.broadcast(sigmas_vision, src=global_src_rank, 
group=cp_group) + if sigmas_action is not None: + timesteps_action = timesteps_action.contiguous() + sigmas_action = sigmas_action.contiguous() + torch.distributed.broadcast(timesteps_action, src=global_src_rank, group=cp_group) + torch.distributed.broadcast(sigmas_action, src=global_src_rank, group=cp_group) + if sigmas_sound is not None: + timesteps_sound = timesteps_sound.contiguous() # [n_sound,1] + sigmas_sound = sigmas_sound.contiguous() # [n_sound,1] + torch.distributed.broadcast(timesteps_sound, src=global_src_rank, group=cp_group) + torch.distributed.broadcast(sigmas_sound, src=global_src_rank, group=cp_group) + + if timesteps_sound is None: + # Sound tensors are dense over audio-bearing samples, while the vision timestep/sigma schedule + # is indexed by original batch position. Reindex here so mixed audio/no-audio batches use each + # sound sample's own schedule for noising and loss weighting. + timesteps_sound, sigmas_sound = build_dense_sound_schedule( + sequence_plans, + gen_data_clean.x0_tokens_sound, + timesteps_vision, + sigmas_vision, + ) # [n_sound,T_vis] or None, [n_sound,T_vis] or None + + packed_sequence = self._pack_input_sequence( + sequence_plans, + input_text_indexes, + gen_data_clean, + timesteps_vision.cpu(), + skip_text_tokens=memory_info["skip_text"], + initial_mrope_temporal_offset=memory_info["initial_temporal_offset"], + ) + + # Under independent_action_schedule, overwrite the vision-based action timestep the + # packer injected with the action timestep, so the denoiser's action timestep embedding + # matches the sigma used to noise action tokens. 
+ if timesteps_action is not None and packed_sequence.action is not None: + action_has_noisy_tokens = any(nfi.numel() > 0 for nfi in packed_sequence.action.noisy_frame_indexes) + if action_has_noisy_tokens: + sample_ts = timesteps_action.squeeze(1).cpu() # [n_action] + packed_sequence.action.timesteps = torch.cat( + [ + sample_ts[i : i + 1].expand(nfi.numel()) + for i, nfi in enumerate(packed_sequence.action.noisy_frame_indexes) + ] + ).to(dtype=torch.float32) # [N_action_noisy] + else: + timesteps_action, sigmas_action = (None, None) + + # Under independent_sound_schedule, overwrite the vision-based sound timestep the packer + # injected with the sound timestep, so the denoiser's sound timestep embedding matches + # the sigma used to noise sound tokens. + if ( + getattr(rf_cfg, "independent_sound_schedule", False) + and timesteps_sound is not None + and packed_sequence.sound is not None + ): + sound_has_noisy_tokens = any(nfi.numel() > 0 for nfi in packed_sequence.sound.noisy_frame_indexes) + if sound_has_noisy_tokens: + sample_ts = timesteps_sound.squeeze(1).cpu() # [n_sound] + packed_sequence.sound.timesteps = torch.cat( + [ + sample_ts[i : i + 1].expand(nfi.numel()) + for i, nfi in enumerate(packed_sequence.sound.noisy_frame_indexes) + ] + ).to(dtype=torch.float32) # [N_sound_noisy] + else: + timesteps_sound, sigmas_sound = (None, None) + + # For image editing (multi-item vision), expand per-sample timesteps/sigmas to + # per-vision-item so downstream noise/loss indexing matches the flat x0_tokens_vision + # list. No-op when num_vision_items_per_sample is None (standard T2I/T2V/policy cases). + # Conditioning items get sigma=0 via their condition_mask, so the actual timestep value + # for them does not matter. 
+ timesteps_vision = _expand_per_sample_to_per_vision_item( + timesteps_vision, gen_data_clean.num_vision_items_per_sample + ) # [B_items, T_vis] + sigmas_vision = _expand_per_sample_to_per_vision_item( + sigmas_vision, gen_data_clean.num_vision_items_per_sample + ) # [B_items, T_vis] + + memory_info = self.pre_noise_memory_hook(packed_sequence, gen_data_clean, memory_info) + + # Flow matching/diffusion forward process: noise the input signal with the sampled noise level + gen_data_noised = self._add_noise_to_input( + gen_data_clean, + packed_sequence, + sigmas_vision, + sigmas_action=sigmas_action, + sigmas_sound=sigmas_sound, + iteration=iteration, + ) + self._replace_clean_with_noised(packed_sequence, gen_data_noised) + + # Move packed sequence to CUDA + packed_sequence.to_cuda() + + # Network forward pass + memory = self.build_memory_state(packed_sequence, memory_info) # pylint: disable=assignment-from-none + out_net = self.denoise( + data_batch_packed=packed_sequence, + fps_vision=gen_data_clean.fps_vision, + fps_action=gen_data_clean.fps_action, + fps_sound=gen_data_clean.fps_sound, + memory=memory, + ) + + loss, losses_dict = self._compute_losses( + out_net=out_net, + data_batch_packed=packed_sequence, + gen_data_noised=gen_data_noised, + timesteps=timesteps_vision, + is_image_batch=gen_data_clean.is_image_batch, + timesteps_action=timesteps_action, + timesteps_sound=timesteps_sound, + ) + + # Pixel-space video shapes for VAE FLOPs estimation in callbacks (e.g. MFU). + _vae_pixel_shapes: list[tuple[int, int, int]] = [] + if gen_data_clean.raw_state_vision is not None: + for _v in gen_data_clean.raw_state_vision: + if _v is not None: + assert _v.dim() in [4, 5], ( + "Currently only [C, T, H, W] and [B, C, T, H, W] formats are supported for the VAE encoding." 
+ ) + t_h_w = ( + (int(_v.shape[2]), int(_v.shape[3]), int(_v.shape[4])) + if _v.dim() == 5 + else (int(_v.shape[1]), int(_v.shape[2]), int(_v.shape[3])) + ) + _vae_pixel_shapes.append(t_h_w) + + _vision_tokens = len(packed_sequence.vision.sequence_indexes) if packed_sequence.vision else 0 + _action_tokens = len(packed_sequence.action.sequence_indexes) if packed_sequence.action else 0 + _sound_tokens = len(packed_sequence.sound.sequence_indexes) if packed_sequence.sound else 0 + + output_batch = { + "x0": gen_data_clean.x0_tokens_vision, + "xt": gen_data_noised.xt_tokens_vision, + "sigma": sigmas_vision, # [B_items, T_vis] + "model_pred": out_net["preds_vision"], + "condition_mask_vision": packed_sequence.vision.condition_mask if packed_sequence.vision else None, + "condition_mask_action": packed_sequence.action.condition_mask if packed_sequence.action else None, + "und_token_length": packed_sequence.text_indexes.shape[0], + "gen_token_length": packed_sequence.sequence_length - packed_sequence.text_indexes.shape[0], + "vision_token_length": _vision_tokens, + "action_token_length": _action_tokens, + "sound_token_length": _sound_tokens, + "is_image_batch": gen_data_clean.is_image_batch, + "batch_size": gen_data_clean.batch_size, + "split_lens": packed_sequence.split_lens, + "attn_modes": packed_sequence.attn_modes, + "vae_pixel_shapes": _vae_pixel_shapes, + **losses_dict, + } + if sigmas_action is not None: + output_batch["sigma_action"] = sigmas_action # [n_action, 1] — dense over action-bearing samples + if getattr(rf_cfg, "independent_sound_schedule", False) and sigmas_sound is not None: + output_batch["sigma_sound"] = sigmas_sound # [n_sound, 1] — dense over sound-bearing samples + + return output_batch, loss + + def _compute_flow_matching_loss( + self, + pred: list[torch.Tensor], + target: list[torch.Tensor], + condition_mask: list[torch.Tensor], + timesteps: torch.Tensor, + has_valid_tokens: bool, + rectified_flow: RectifiedFlow, + loss_scale: float | None = 
None, + raw_action_dim: list[torch.Tensor] | None = None, + normalize_by_active: bool = False, + ) -> tuple[torch.Tensor, torch.Tensor]: + """Compute flow matching loss for a modality. + + Args: + pred: Predicted velocity field (list of tensors, one per sample). + target: Target velocity field (list of tensors, one per sample). + Under rectified flow the target is ``v = eps - x0``. + condition_mask: Mask where 1 = clean/conditioning, 0 = noisy/generation (list of tensors). + timesteps: Diffusion timesteps for time weighting. Shape [B,1] for + base/teacher_forcing (all frames share one timestep) or [B,T_max] + for diffusion_forcing (per-frame independent timesteps). Time weights + are applied per-frame before averaging, so non-uniform weight functions + are handled correctly. + has_valid_tokens: Whether this modality has valid noisy tokens. + rectified_flow: The rectified flow object for time weighting. + loss_scale: Optional per-modality loss scale. Falls back to the global + ``rectified_flow_training_config.loss_scale`` when *None*. + normalize_by_active: When True, normalize per-instance loss by the count of + active (noisy) elements rather than all elements. Preserves the + ``sum / active_count`` semantics needed for distillation critics where + conditioned frames contribute no signal and should not dilute the + denominator. + + Returns: + tuple: A tuple containing two elements: + - Flow matching loss (or dummy loss for gradient consistency). + - Per-instance loss (or dummy loss for gradient consistency).
+ """ + return compute_flow_matching_loss( + pred=pred, + target=target, + condition_mask=condition_mask, + timesteps=timesteps, + has_valid_tokens=has_valid_tokens, + rectified_flow=rectified_flow, + tensor_kwargs_fp32=self.tensor_kwargs_fp32, + loss_scale=loss_scale, + raw_action_dim=raw_action_dim, + normalize_by_active=normalize_by_active, + ) + + def _compute_losses( + self, + out_net: dict, + data_batch_packed: PackedSequence, + gen_data_noised: GenerationDataNoised, + timesteps: torch.Tensor, + is_image_batch: bool, + timesteps_action: torch.Tensor | None = None, + timesteps_sound: torch.Tensor | None = None, + ) -> tuple[torch.Tensor, dict[str, torch.Tensor]]: + """Compute flow matching loss and auxiliary load balancing losses. + + ``timesteps_action`` is an optional ``[n_action, 1]`` override for the action loss + time-weighting — dense over action-bearing samples, matching ``data_batch_packed.action.*``. + When None, action reuses ``timesteps`` (vision timesteps, legacy behavior). Set by + ``training_step`` under ``independent_action_schedule=True``. + + ``timesteps_sound`` is an optional dense sound timestep tensor, matching + ``data_batch_packed.sound.*``. When None, sound reuses ``timesteps``. + """ + total_loss = 0.0 + losses_dict = {} + # ts_action shape: vision fallback [B_items, T_vis] (legacy) or [n_action, 1] (independent). + ts_action = timesteps if timesteps_action is None else timesteps_action # [B_items,T_vis] or [n_action,1] + # ts_sound shape: vision fallback [B_items,T_vis] or dense sound schedule [n_sound,...]. + ts_sound = timesteps if timesteps_sound is None else timesteps_sound # [B_items,T_vis] or [n_sound,...] 
+ + rf_cfg = self.config.rectified_flow_training_config + normalize_by_active = rf_cfg.normalize_loss_by_active + if self.config.vision_gen: + assert data_batch_packed.vision is not None, "Vision packed data required when vision_gen is True" + assert isinstance(data_batch_packed.vision.condition_mask, list), ( + "Vision condition mask must be a list of tensors for loss computation" + ) + rectified_flow_vision = self.rectified_flow_image if is_image_batch else self.rectified_flow_video + + fm_loss_vision, fm_loss_vision_per_instance = self._compute_flow_matching_loss( + pred=out_net["preds_vision"], + target=gen_data_noised.vt_target_vision, + condition_mask=data_batch_packed.vision.condition_mask, + timesteps=timesteps, + has_valid_tokens=has_noisy_tokens(data_batch_packed.vision), + rectified_flow=rectified_flow_vision, + normalize_by_active=normalize_by_active, + ) + loss_scale = ( + rf_cfg.image_loss_scale if is_image_batch and rf_cfg.image_loss_scale is not None else rf_cfg.loss_scale + ) + total_loss += fm_loss_vision * loss_scale + losses_dict["flow_matching_loss_vision"] = fm_loss_vision + losses_dict["flow_matching_loss_vision_per_instance"] = fm_loss_vision_per_instance + else: + losses_dict["flow_matching_loss_vision"] = torch.tensor(0.0, **self.tensor_kwargs_fp32) + + if self.config.action_gen: + if data_batch_packed.action is not None: + assert isinstance(data_batch_packed.action.condition_mask, list), ( + "Action condition mask must be a list of tensors for loss computation" + ) + assert gen_data_noised.vt_target_action is not None, "Action targets required when action_gen is True" + fm_loss_action, _ = self._compute_flow_matching_loss( + pred=out_net["preds_action"], + target=gen_data_noised.vt_target_action, + condition_mask=data_batch_packed.action.condition_mask, + timesteps=ts_action, + has_valid_tokens=has_noisy_tokens(data_batch_packed.action), + rectified_flow=self.rectified_flow_action, + raw_action_dim=data_batch_packed.action.raw_action_dim, 
+ normalize_by_active=normalize_by_active, + ) + + # Yihuai: In case the video loss is too large (1.5) and covers the action loss (0.05), we scale up the action loss to match the video loss to improve action precision. + total_loss += fm_loss_action * rf_cfg.action_loss_weight + losses_dict["flow_matching_loss_action"] = fm_loss_action + else: + # No action data in this batch. Connect the network's dummy preds_action + # to the loss so action-specific params + # (llm2action, action2llm, action_modality_embed) stay in the backward + # graph. Without this, FSDP reduce-scatter / DDP all-reduce will hang + # when other ranks do have action data. + dummy_loss = 0.0 * sum(p.sum() for p in out_net["preds_action"]) + total_loss += dummy_loss + losses_dict["flow_matching_loss_action"] = dummy_loss + else: + losses_dict["flow_matching_loss_action"] = torch.tensor(0.0, **self.tensor_kwargs_fp32) + + if self.config.sound_gen: + if data_batch_packed.sound is not None: + assert isinstance(data_batch_packed.sound.condition_mask, list), ( + "Sound condition mask must be a list of tensors for loss computation" + ) + assert gen_data_noised.vt_target_sound is not None, "Sound targets required when sound_gen is True" + # Sound preds/targets are (C, T); condition_mask is (T, 1) — transpose to (1, T) for broadcasting + fm_loss_sound, _ = self._compute_flow_matching_loss( + pred=out_net["preds_sound"], + target=gen_data_noised.vt_target_sound, + condition_mask=[m.T for m in data_batch_packed.sound.condition_mask], + timesteps=ts_sound, + has_valid_tokens=has_noisy_tokens(data_batch_packed.sound), + rectified_flow=self.rectified_flow_sound, + normalize_by_active=normalize_by_active, + ) + loss_scale = rf_cfg.sound_loss_scale if rf_cfg.sound_loss_scale is not None else rf_cfg.loss_scale + total_loss += fm_loss_sound * loss_scale + losses_dict["flow_matching_loss_sound"] = fm_loss_sound + else: + # No sound data in this batch. 
Connect the network's dummy preds_sound + # to the loss so sound-specific params (sound2llm, llm2sound, + # sound_modality_embed) stay in the backward graph. Without this, + # FSDP gradient reduce hangs when other ranks do have sound data. + dummy_loss = 0.0 * sum(p.sum() for p in out_net["preds_sound"]) + total_loss += dummy_loss + losses_dict["flow_matching_loss_sound"] = dummy_loss + else: + losses_dict["flow_matching_loss_sound"] = torch.tensor(0.0, **self.tensor_kwargs_fp32) + + # 2. Load balancing auxiliary losses + for load_balancing_type in ["und", "gen"]: + lbl_metadata = out_net.get(f"lbl_metadata_{load_balancing_type}", None) + if lbl_metadata is None: + continue + load_balancing_loss = compute_load_balancing_loss( + lbl_metadata, + coeff=getattr(self.config.lbl, f"coeff_{load_balancing_type}"), + method=self.config.lbl.method, + device_mesh=self.parallel_dims.dp_mesh if self.parallel_dims else None, + ) + if load_balancing_loss is not None: + total_loss += load_balancing_loss + losses_dict[f"aux_loss_{load_balancing_type}"] = load_balancing_loss + + return total_loss, losses_dict + + def _update_train_stats(self, data_batch: dict[str, torch.Tensor]) -> None: + is_image = self.is_image_batch(data_batch) + input_key = self.input_image_key if is_image else self.input_video_key + if isinstance(self.net, WeightTrainingStat): + val = data_batch[input_key] + # For image editing data_batch[input_key] is a list-of-lists, not a tensor. + sample_count = len(val) if isinstance(val, list) else val.shape[0] + if is_image: + self.net.accum_image_sample_counter += sample_count + else: + self.net.accum_video_sample_counter += sample_count + + def _load_and_tokenize_text_data(self, data_batch: dict[str, torch.Tensor], iteration: int) -> list[list[int]]: + """ + Load and tokenize the text data from the data batch. + + Args: + data_batch (dict[str, torch.Tensor]): The data batch. + iteration (int): The current iteration number. 
+ + Returns: + list[list[int]]: The input text token IDs, one list per sample. + """ + input_text_indexes = [] + + input_captions = data_batch[self.input_caption_key] + input_text_tokens = data_batch["text_token_ids"] + if isinstance(input_text_tokens, list): + # Convert text tokens to list of lists of ints + input_text_tokens = [tokens.tolist() for x in input_text_tokens for tokens in x] + else: + input_text_tokens = [tokens.squeeze(0).tolist() for tokens in input_text_tokens] + + return input_text_tokens + + def _get_train_noise_level_vision( + self, + batch_size: int, + is_image_batch: bool, + num_vision_latent_frames: list[int], + resolutions: list[str] | str | None = None, + num_tokens: list[int] | None = None, + iteration: int | None = None, + ) -> tuple[torch.Tensor, torch.Tensor]: + """ + Sample the rectified flow interpolation coefficient (timesteps), optionally adjust the sampled + timesteps with high sigma strategy, and obtain the corresponding normalized timestep. + + Args: + batch_size: Batch size for sampling timesteps. + is_image_batch: Whether this is an image batch (vs video). + num_vision_latent_frames: Per-sample vision latent frame counts [T_0, ..., T_{B-1}]. + For causal_training_strategy="diffusion_forcing", resamples B*T_max independent + times and returns tensors of shape [B,T_max]. For base/TF strategies, ignored — + returns shape [B,1] (all frames share the same sigma). + resolutions: Resolution string(s) (e.g., "256", "512") for dict-based shift lookup. + Can be a single string (applied to all samples) or a list of strings (one per sample). + Required when shift is a resolution-keyed dict; unused for an int shift or a dynamic shift. + num_tokens: Number of tokens for each sample (before 2x2 merge). Needed for dynamic shift. + + Returns: + (timesteps, sigmas): Both [B,1] for TF/base, or [B,T_max] for diffusion_forcing.
+ """ + + + rectified_flow = self.rectified_flow_image if is_image_batch else self.rectified_flow_video + + assert not self.config.rectified_flow_training_config.use_discrete_rf, ( + "Discrete RF is not supported for Cosmos3" + ) + # Continuous RF implementation + max_timestep = rectified_flow.noise_scheduler.config.num_train_timesteps + + # Get shift value(s) - support both int and dict-based resolution lookup + shift_config = self.config.rectified_flow_training_config.shift + if isinstance(shift_config, int): + # Int-based shift: use directly for all samples + shifts = torch.full((batch_size,), shift_config, dtype=torch.float32) + else: + # Convert to plain dict to avoid traceback-based memory leaks when GC is disabled + # (OmegaConf's `in` operator uses exception control flow internally). + shift_dict = dict(shift_config) + if not is_image_batch and "dynamic_shift_base_num_tokens_video" in shift_dict: + # Dynamic shift based on token count + assert num_tokens is not None and len(num_tokens) == batch_size + base_num_tokens = shift_dict["dynamic_shift_base_num_tokens_video"] + shifts = torch.sqrt(torch.tensor(num_tokens, dtype=torch.float32) / base_num_tokens) + elif is_image_batch and "dynamic_shift_base_num_tokens_image" in shift_dict: + assert num_tokens is not None and len(num_tokens) == batch_size + base_num_tokens = shift_dict["dynamic_shift_base_num_tokens_image"] + shifts = torch.sqrt(torch.tensor(num_tokens, dtype=torch.float32) / base_num_tokens) + else: + # Dict-based shift: lookup per sample + if resolutions is None: + raise ValueError("Resolutions must be provided when shift is a dict") + + # Normalize to list format + if isinstance(resolutions, str): + resolutions = [resolutions] * batch_size + + assert len(resolutions) == batch_size, ( + f"Number of resolutions ({len(resolutions)}) must match batch_size ({batch_size})" + ) + + # Lookup shift per sample + shifts_list = [] + for resolution in resolutions: + if resolution not in shift_dict: + raise 
ValueError( + f"Resolution '{resolution}' not found in shift dict. Available resolutions: {list(shift_dict.keys())}" + ) + shifts_list.append(shift_dict[resolution]) + shifts = torch.tensor(shifts_list, dtype=torch.float32) + + # Sample noise times: B×T_max for DF (one per video latent frame), B×1 for base/TF + if self.config.causal_training_strategy == "diffusion_forcing": + # T_max = max(num_vision_latent_frames) across the batch; trailing entries for shorter + # sequences are unused (sliced away in _add_noise_to_input). + T_max = max(num_vision_latent_frames) + t_raw = ( + rectified_flow.sample_train_time(batch_size * T_max, iteration=iteration) + .to(**self.tensor_kwargs_fp32) + .reshape(batch_size, T_max) + ) # [B,T_max] + else: + t_raw = ( + rectified_flow.sample_train_time(batch_size, iteration=iteration) + .to(**self.tensor_kwargs_fp32) + .unsqueeze(1) + ) # [B,1] + + # Apply shift and scale: t_raw ∈ [0,1] → timesteps ∈ [0,max_timestep] + # shifts.unsqueeze(1) → [B,1], broadcasts with both [B,1] (base/TF) and [B,T_max] (DF) + t = 1 - t_raw # [B,1] or [B,T_max] + shifts_2d = shifts.unsqueeze(1).to(t_raw.device) # [B,1], broadcasts with [B,1] and [B,T_max] + timesteps = shifts_2d * t / (1 + (shifts_2d - 1) * t) * max_timestep # [B,1] or [B,T_max] + + if self.config.rectified_flow_training_config.use_high_sigma_strategy: + timesteps = self._apply_high_noise_strategy(timesteps, max_timestep) # [B,1] or [B,T_max] + + sigmas = timesteps / max_timestep # [B,1] for base/TF, [B,T_max] for DF + return timesteps, sigmas + + def _apply_high_noise_strategy(self, timesteps: torch.Tensor, max_timestep: int) -> torch.Tensor: + """ + Update the sampled RF timesteps to shift the distribution towards higher noise levels (high sigmas). + + Args: + timesteps (torch.Tensor): Input timesteps. Shape [B,1] for base/TF or [B,T_max] for DF. + max_timestep (int): The maximum timestep value. + + Returns: + torch.Tensor: Timesteps with the same shape as input — [B,1] or [B,T_max]. 
+ """ + mask = ( + torch.rand(timesteps.shape, device=timesteps.device) + < self.config.rectified_flow_training_config.high_sigma_ratio + ) + new_timesteps = ( + torch.rand(timesteps.shape, device=timesteps.device).type_as(timesteps) + * ( + self.config.rectified_flow_training_config.high_sigma_timesteps_max + - self.config.rectified_flow_training_config.high_sigma_timesteps_min + ) + + self.config.rectified_flow_training_config.high_sigma_timesteps_min + ) + timesteps = torch.where(mask, new_timesteps, timesteps) + + return timesteps + + def _get_train_noise_level_action( + self, batch_size: int, iteration: int | None = None + ) -> tuple[torch.Tensor, torch.Tensor]: + """Sample ``(timesteps, sigmas)`` of shape ``[batch_size, 1]`` from ``rectified_flow_action``. + + This helper is locally-scoped: it just draws ``batch_size`` independent σ values and + applies action-specific shift / high-sigma config. The caller decides what ``batch_size`` + means semantically — ``training_step`` passes the full batch size and then reindexes to + the dense action-bearing subset with ``action_sample_indices``. + + ``shift_action`` must be an int (or ``None`` to inherit ``shift``). Dict-keyed + per-resolution shifts are vision-only — multi-resolution action training would need + per-sample lookup, which this helper does not implement; if the global ``shift`` is a + dict and ``shift_action`` is None, this raises so the user sets shift_action explicitly. + ``use_high_sigma_strategy_action`` toggles the high-σ strategy for action; when on, the + global ``high_sigma_ratio`` / ``_min`` / ``_max`` apply. σ is a shared scalar per input + slot (no per-frame σ for action). + """ + rf_cfg = self.config.rectified_flow_training_config + rf = self.rectified_flow_action + max_timestep = rf.noise_scheduler.config.num_train_timesteps # int + + # Resolve shift. shift_action, when provided, must be an int. 
+ if rf_cfg.shift_action is not None: + if not isinstance(rf_cfg.shift_action, int): + raise ValueError( + f"shift_action must be an int; got {type(rf_cfg.shift_action).__name__}. " + "Dict-keyed per-resolution shifts are vision-only." + ) + shift_val = rf_cfg.shift_action # int + elif isinstance(rf_cfg.shift, int): + shift_val = rf_cfg.shift # inherit the global int shift + else: + raise ValueError( + "shift_action=None requires the global `shift` to be an int. When `shift` is a " + f"dict (multi-resolution vision training), set shift_action explicitly as an int. " + f"Got shift={rf_cfg.shift!r}." + ) + + t_raw = ( + rf.sample_train_time(batch_size, iteration=iteration).to(**self.tensor_kwargs_fp32).unsqueeze(1) + ) # [B,1] + t = 1 - t_raw # [B,1] + shifts_2d = torch.full((batch_size, 1), shift_val, dtype=torch.float32, device=t_raw.device) # [B,1] + timesteps = shifts_2d * t / (1 + (shifts_2d - 1) * t) * max_timestep # [B,1] + + if rf_cfg.use_high_sigma_strategy_action: + timesteps = self._apply_high_noise_strategy(timesteps, max_timestep) # [B,1] + + sigmas = timesteps / max_timestep # [B,1] + return timesteps, sigmas + + def _get_train_noise_level_sound(self, batch_size: int) -> tuple[torch.Tensor, torch.Tensor]: + """Sample ``(timesteps, sigmas)`` of shape ``[batch_size, 1]`` from ``rectified_flow_sound``. + + Sound uses a shared scalar sigma per audio-bearing sample, then training_step + reindexes the full-batch samples to the dense sound tensor list. + """ + rf_cfg = self.config.rectified_flow_training_config + rf = self.rectified_flow_sound + max_timestep = rf.noise_scheduler.config.num_train_timesteps # int + + # Resolve shift. shift_sound, when provided, must be an int. + if rf_cfg.shift_sound is not None: + if not isinstance(rf_cfg.shift_sound, int): + raise ValueError( + f"shift_sound must be an int; got {type(rf_cfg.shift_sound).__name__}. " + "Dict-keyed per-resolution shifts are vision-only." 
+ ) + shift_val = rf_cfg.shift_sound # int + elif isinstance(rf_cfg.shift, int): + shift_val = rf_cfg.shift # inherit the global int shift + else: + raise ValueError( + "shift_sound=None requires the global `shift` to be an int. When `shift` is a " + f"dict (multi-resolution vision training), set shift_sound explicitly as an int. " + f"Got shift={rf_cfg.shift!r}." + ) + + t_raw = rf.sample_train_time(batch_size).to(**self.tensor_kwargs_fp32).unsqueeze(1) # [B,1] + t = 1 - t_raw # [B,1] + shifts_2d = torch.full((batch_size, 1), shift_val, dtype=torch.float32, device=t_raw.device) # [B,1] + timesteps = shifts_2d * t / (1 + (shifts_2d - 1) * t) * max_timestep # [B,1] + + if rf_cfg.use_high_sigma_strategy_sound: + timesteps = self._apply_high_noise_strategy(timesteps, max_timestep) # [B,1] + + sigmas = timesteps / max_timestep # [B,1] + return timesteps, sigmas + + def _add_noise_to_input( + self, + gen_data_clean: GenerationDataClean, + packed_sequence: PackedSequence, + sigmas: torch.Tensor, + sigmas_action: torch.Tensor | None = None, + sigmas_sound: torch.Tensor | None = None, + iteration: int | None = None, + ) -> GenerationDataNoised: + """ + Diffusion / Flow matching forward process: apply noise of given noise level (sigmas) to input data. + + Args: + gen_data_clean (GenerationDataClean): The input dataclass containing the clean data *latents* (tokens). + packed_sequence (PackedSequence): Packed sequence with condition masks attached to modalities. + sigmas (torch.Tensor): The noise levels. Shape [B,1] for base/teacher_forcing (all video + latent frames share the same sigma) or [B,T_max] for diffusion_forcing (per-latent-frame + independent sigma). T_max is the number of video latent frames (temporally compressed + tokens), not RGB frames. In all modes, sigmas are multiplied by (1 - condition_mask) + so conditioning latent frames get sigma_eff=0 and only non-conditioned frames contribute + to the loss. 
+ sigmas_action: Optional ``[n_action, 1]`` override for action noising — dense over + action-bearing samples, matching ``packed_sequence.action.*``. When None, action + reuses ``sigmas`` (vision σ, legacy behavior). Set by ``training_step`` when + ``independent_action_schedule=True``. + sigmas_sound: Optional dense sound sigma tensor matching ``packed_sequence.sound.*``. + When None, sound reuses ``sigmas``. + + Returns: + GenerationDataNoised: A dataclass containing the noise, noisy data (xt), and velocity field (vt). + """ + # Action sigma defaults to the shared vision sigma (legacy behavior). + # Legacy (sigmas_action=None): vision σ of shape [B_items, T_vis]. + # Independent (sigmas_action provided): dense action σ of shape [n_action, 1]. + sigmas_for_action = sigmas if sigmas_action is None else sigmas_action # [B_items,T_vis] or [n_action,1] + # Sound uses a dense view of the per-sample vision schedule so mixed audio/no-audio + # batches do not index full-batch sigmas with dense sound positions. + sigmas_for_sound = sigmas if sigmas_sound is None else sigmas_sound # [B_items,T_vis] or [n_sound,...] + + # Seeded noise generator (deterministic mode only): keyed on (iteration, rank) so + # noise is identical across independent runs. Built on the same CUDA device as + # tensor_kwargs_fp32 so we can fuse it into a single torch.randn call below + # (no extra CPU alloc + H2D copy). Offset +32768 keeps this seed distinct from + # the sigma seed in sample_train_time. When noise_gen is None, torch.randn + # falls back to the default CUDA RNG, matching prior non-deterministic behavior. 
+ noise_gen: torch.Generator | None = None + if iteration is not None and torch.are_deterministic_algorithms_enabled(): + rank = torch.distributed.get_rank() if torch.distributed.is_initialized() else 0 + noise_gen = torch.Generator(device=self.tensor_kwargs_fp32["device"]) + noise_gen.manual_seed(iteration * 65536 + rank + 32768) + + # Vision + x0_vision = gen_data_clean.x0_tokens_vision # list of [C,T,H,W] + epsilon_vision = [ + torch.randn(x0_vision_i.size(), generator=noise_gen, **self.tensor_kwargs_fp32) for x0_vision_i in x0_vision + ] # list of [C,T,H,W] + + # Derive noisy mask (1 for noised, 0 for clean) for sigmas computation + assert packed_sequence.vision is not None, "Packed vision data required for noise scheduling" + assert packed_sequence.vision.condition_mask is not None, "Vision condition mask required for noise scheduling" + assert isinstance(packed_sequence.vision.condition_mask, list), ( + "Vision condition mask must be a list of tensors for noise scheduling" + ) + + # Compute sigmas per vision item (supports variable shapes). + # For image editing, x0_tokens_vision is a flat list with multiple items per sample + # and sigmas has already been expanded to match (see _expand_per_sample_to_per_vision_item). + # Conditioning latent frames are zeroed via (1 - condition_mask) in all modes (base/TF/DF). + # view(-1,1,1)[:T_latent]: for base/TF sigmas[i] is (1,), view gives (1,1,1) and the slice is a no-op; + # for DF sigmas[i] is (T_max,) — one sigma per video latent frame — view gives (T_max,1,1) + # and [:T_latent] slices to (T_latent,1,1) matching the per-item latent frame count. 
+ num_vision_items = len(packed_sequence.vision.condition_mask) + noisy_mask_vision = [1.0 - cond_mask for cond_mask in packed_sequence.vision.condition_mask] + sigmas_vision = [ + sigmas[i].view(-1, 1, 1)[: x0_vision[i].shape[2]] * noisy_mask_vision[i] for i in range(num_vision_items) + ] + rectified_flow_vision = ( + self.rectified_flow_image if gen_data_clean.is_image_batch else self.rectified_flow_video + ) + xt_vision, vt_vision = rectified_flow_vision.get_interpolation( + epsilon_vision, x0_vision, sigmas_vision + ) # list of [C,T,H,W], list of [C,T,H,W] + + xt_vision = [ + xt_vision_i.to(**self.tensor_kwargs) for xt_vision_i in xt_vision + ] # list of [C,T,H,W]; to make tensor compatible with the precision of the model + + # Action (x0_tokens_action is already a dense list with no None entries). + # Gate on action_gen: the dataset may emit action tensors for models that + # don't consume them (e.g. camera dataset on a vision-only config), in + # which case packed_sequence.action is None and we must skip this block. 
+ x0_action = gen_data_clean.x0_tokens_action # list of [T,action_dim] + if self.config.action_gen and x0_action is not None and len(x0_action) > 0: + assert packed_sequence.action is not None, "Packed action data required when action tokens exist" + assert packed_sequence.action.condition_mask is not None, ( + "Action condition mask required when action tokens exist" + ) + action_batch_size = len(packed_sequence.action.condition_mask) + all_actions_are_conditioning = all( + torch.all(condition_mask == 1).item() for condition_mask in packed_sequence.action.condition_mask + ) + if all_actions_are_conditioning: + epsilon_action = [ + torch.zeros(x0_action_i.size(), **self.tensor_kwargs_fp32) for x0_action_i in x0_action + ] # list of [T,action_dim] + sigmas_action = [ + torch.zeros_like(condition_mask, dtype=torch.float32, device=condition_mask.device) + for condition_mask in packed_sequence.action.condition_mask + ] # list of [T,1] + xt_action = [ + x0_action_i.to(**self.tensor_kwargs) for x0_action_i in x0_action + ] # list of [T,action_dim] + vt_action = [ + torch.zeros(x0_action_i.size(), **self.tensor_kwargs_fp32) for x0_action_i in x0_action + ] # list of [T,action_dim] + else: + epsilon_action = [ + torch.randn(x0_action_i.size(), generator=noise_gen, **self.tensor_kwargs_fp32) + for x0_action_i in x0_action + ] # list of [T,action_dim] + # Conditioning action timesteps are zeroed via (1 - condition_mask) in all modes (base/TF/DF). + # Action timesteps are aligned 1-to-1 with video latent frames, not RGB frames. + # view(-1,1)[:T_i]: for base/TF sigmas[i] is (1,) → (1,1), slice is a no-op; + # for DF sigmas[i] is (T_max,) → (T_max,1) → (T_i,1) per-action-timestep sigmas. + # condition_mask[i] shape [T_i,1]; result broadcasts with x0 shape [T_i,C]. 
+ sigmas_action = [ + sigmas_for_action[i].view(-1, 1)[: x0_action[i].shape[0]] + * (1.0 - packed_sequence.action.condition_mask[i]) + for i in range(action_batch_size) + ] # list of [T_i,1] + xt_action, vt_action = self.rectified_flow_action.get_interpolation( + epsilon_action, x0_action, sigmas_action + ) # list of [T,action_dim], list of [T,action_dim] + xt_action = [ + xt_action_i.to(**self.tensor_kwargs) for xt_action_i in xt_action + ] # list of [T,action_dim]; to make tensor compatible with the precision of the model + for i in range(len(xt_action)): + if gen_data_clean.raw_action_dim is not None and gen_data_clean.raw_action_dim[i] is not None: + xt_action[i][:, gen_data_clean.raw_action_dim[i] :] = 0 + + else: + epsilon_action = None + sigmas_action = None + xt_action = None + vt_action = None + + # Sound (x0_tokens_sound is a list of [C, T] tensors, or None) + x0_sound = gen_data_clean.x0_tokens_sound # list of [sound_channels,T_sound] + if x0_sound is not None and len(x0_sound) > 0: + assert packed_sequence.sound is not None, "Packed sound data required when sound tokens exist" + assert packed_sequence.sound.condition_mask is not None, ( + "Sound condition mask required when sound tokens exist" + ) + sound_batch_size = len(packed_sequence.sound.condition_mask) + epsilon_sound = [ + torch.randn(x0_i.size(), generator=noise_gen, **self.tensor_kwargs_fp32) for x0_i in x0_sound + ] + # Conditioning frames are zeroed via (1 - condition_mask) in all modes (base/TF/DF). + # view(-1,1)[:T_sound].T: for base/TF sigmas[i] is (1,) → (1,1) → no-op → (1,1); + # for DF sigmas[i] is (T_max,) → (T_max,1) → (T_sound,1) → (1,T_sound). + # condition_mask[i] shape [T_sound,1]; .T gives [1,T_sound]; result broadcasts with x0 [C,T_sound]. 
+ sigmas_sound = [ + sigmas_for_sound[i].view(-1, 1)[: x0_sound[i].shape[1]].T + * (1.0 - packed_sequence.sound.condition_mask[i].T) + for i in range(sound_batch_size) + ] + xt_sound, vt_sound = self.rectified_flow_sound.get_interpolation(epsilon_sound, x0_sound, sigmas_sound) + xt_sound = [xt_i.to(**self.tensor_kwargs) for xt_i in xt_sound] + else: + epsilon_sound = None + sigmas_sound = None + xt_sound = None + vt_sound = None + + # create the GenerationDataNoised object + gen_data_noised = GenerationDataNoised( + batch_size=gen_data_clean.batch_size, + # vision + epsilon_vision=epsilon_vision, + xt_tokens_vision=xt_vision, + vt_target_vision=vt_vision, + sigmas_vision=sigmas_vision, + # action + epsilon_action=epsilon_action, + xt_tokens_action=xt_action, + vt_target_action=vt_action, + sigmas_action=sigmas_action, + raw_action_dim=gen_data_clean.raw_action_dim, + # sound + epsilon_sound=epsilon_sound, + xt_tokens_sound=xt_sound, + vt_target_sound=vt_sound, + sigmas_sound=sigmas_sound, + ) + + return gen_data_noised + + def _replace_clean_with_noised( + self, + packed_sequence: PackedSequence, + gen_data_noised: GenerationDataNoised, + ) -> None: + """Replace packed clean tokens with noised tokens.""" + if packed_sequence.vision is not None: + packed_sequence.vision.tokens = gen_data_noised.xt_tokens_vision + if packed_sequence.action is not None and gen_data_noised.xt_tokens_action is not None: + action_all_conditioning = all( + torch.all(condition_mask == 1).item() for condition_mask in packed_sequence.action.condition_mask + ) + if not action_all_conditioning: + packed_sequence.action.tokens = gen_data_noised.xt_tokens_action + if packed_sequence.sound is not None and gen_data_noised.xt_tokens_sound is not None: + packed_sequence.sound.tokens = gen_data_noised.xt_tokens_sound + + # ------------------------ Inference Utils ------------------------ + def _get_inference_text_tokens( + self, data_batch: dict, has_negative_prompt: bool + ) -> 
tuple[list[list[int]], list[list[int]]]: + """Tokenize conditional and unconditional captions for inference.""" + use_system_prompt = self.vlm_config.use_system_prompt + system_prompt: str | None = data_batch.get("system_prompt") + + cond_tokens = [ + tokenize_caption( + c, + self.vlm_tokenizer, + is_video=False, + use_system_prompt=use_system_prompt, + system_prompt=system_prompt, + ) + for c in data_batch[self.input_caption_key] + ] + + if has_negative_prompt: + neg_key = "neg_" + self.input_caption_key + assert neg_key in data_batch, f"Negative prompt ({neg_key}) not found" + uncond_captions = data_batch[neg_key] + else: + uncond_captions = [""] * len(cond_tokens) + + uncond_tokens = [ + tokenize_caption( + c, + self.vlm_tokenizer, + is_video=False, + use_system_prompt=use_system_prompt, + system_prompt=system_prompt, + ) + for c in uncond_captions + ] + return cond_tokens, uncond_tokens + + def _prepare_inference_data( + self, + data_batch: dict, + seed: list[int], + has_negative_prompt: bool = False, + ) -> tuple[ + list[SequencePlan], + GenerationDataClean, + list[list[int]], + list[list[int]], + list[torch.Tensor], + ]: + """ + Prepare all data needed for inference sampling. + Mirrors training_step's data preparation flow. + + This method: + 1. Builds sequence plans (conditioning information) + 2. Gets data and condition (encodes vision) + 3. Tokenizes text (conditional and unconditional for CFG) + 4. Builds a packed sequence to fetch conditioning masks + 5. Initializes noise with conditioning applied (as lists for variable shapes) + 6. If action_gen is True, concatenates action noise with vision noise + + Args: + data_batch: Raw data batch from dataloader. + seed: Random seed(s) for noise generation. + has_negative_prompt: If True, use negative prompt for unconditional branch. 
+ + Returns: + Tuple of: + - sequence_plans: List of SequencePlan objects + - gen_data_clean: GenerationDataClean with encoded tokens + - cond_text_tokens: Conditional text tokens + - uncond_text_tokens: Unconditional text tokens (for CFG) + - initial_noise: List of noise tensors (one per sample), each containing + flattened vision (and optionally action) noise concatenated + """ + # 1. Build sequence plans (same as training) + sequence_plans = build_sequence_plans_from_data_batch( + data_batch=data_batch, + input_video_key=self.input_video_key, + input_image_key=self.input_image_key, + ) + + # 2. Get data and condition (same as training) + # This encodes vision to x0_tokens + gen_data_clean = self.get_data_and_condition(data_batch) + + num_items_per_sample = gen_data_clean.num_vision_items_per_sample # None for standard T2I/T2V + + # 3. Tokenize text (similar to training's _load_and_tokenize_text_data) + cond_text_tokens, uncond_text_tokens = self._get_inference_text_tokens(data_batch, has_negative_prompt) + + # 4. Build packed sequence to fetch conditioning masks + mask_timesteps = torch.zeros((gen_data_clean.batch_size,), dtype=torch.float32) # [B] + packed_sequence = self._pack_input_sequence( + sequence_plans, + cond_text_tokens, + gen_data_clean, + mask_timesteps, + include_end_of_generation_token=self._derive_include_end_of_generation_token(), + ) + + # 5. 
Initialize vision noise with conditioning + assert packed_sequence.vision is not None, "Packed vision data required for inference noise" + assert packed_sequence.vision.condition_mask is not None, "Vision condition mask required for inference noise" + assert isinstance(packed_sequence.vision.condition_mask, list), ( + "Vision condition mask must be a list of tensors for inference noise" + ) + assert gen_data_clean.x0_tokens_vision is not None, "Vision data required for inference noise" + n_sample = ( + len(gen_data_clean.x0_tokens_vision) + if gen_data_clean.num_vision_items_per_sample is None + else len(gen_data_clean.num_vision_items_per_sample) + ) + + assert len(seed) == n_sample, ( + f"Seed list length {len(seed)} must have the same length as the number of samples {n_sample}" + ) + + # For image2image, num_items_per_sample could be > 1 (multi-vision), + # so we need to repeat the seed for each vision item. + seed_dict = {"vision": [], "action": [], "sound": []} + for sample_idx in range(n_sample): + num_vision_items = num_items_per_sample[sample_idx] if num_items_per_sample is not None else 1 + seed_dict["vision"].extend([seed[sample_idx]] * num_vision_items) + seed_dict["action"].append(seed[sample_idx]) + seed_dict["sound"].append(seed[sample_idx]) + + # Generate noise and apply conditioning per vision item (supports variable shapes) + noise_vision_list: list[torch.Tensor] = [] + for i, (x0_token, cond_mask) in enumerate( + zip(gen_data_clean.x0_tokens_vision, packed_sequence.vision.condition_mask, strict=True) + ): + pure_noise_i = misc.arch_invariant_rand( + tuple(x0_token.shape), + self.tensor_kwargs["dtype"], + self.tensor_kwargs["device"], + seed_dict["vision"][i], # Different seed per sample for diversity + ) # [C,T,H,W] + noise_i = cond_mask * x0_token.to(**self.tensor_kwargs) + (1.0 - cond_mask) * pure_noise_i # [C,T,H,W] + noise_vision_list.append(noise_i) + + # 6. 
Initialize action noise if action_gen is True + has_action = self.config.action_gen and any(plan.has_action for plan in sequence_plans) + noise_action_list: list[torch.Tensor] | None = None + + if has_action: + assert gen_data_clean.x0_tokens_action is not None, "Action data required when sequence plan has action" + assert packed_sequence.action is not None, "Packed action data required when action_gen is True" + assert packed_sequence.action.condition_mask is not None, "Action condition mask required" + assert isinstance(packed_sequence.action.condition_mask, list), ( + "Action condition mask must be a list of tensors for inference noise" + ) + + # Generate action noise per sample (x0_tokens_action is already dense, no None entries) + noise_action_list = [] + for i, (x0_action, cond_mask_action) in enumerate( + zip(gen_data_clean.x0_tokens_action, packed_sequence.action.condition_mask, strict=True) + ): + pure_noise_action_i = misc.arch_invariant_rand( + tuple(x0_action.shape), + self.tensor_kwargs["dtype"], + self.tensor_kwargs["device"], + seed_dict["action"][i], # Different seed per sample for diversity + ) # [T,action_dim] + noise_action_i = ( + cond_mask_action * x0_action.to(**self.tensor_kwargs) + + (1.0 - cond_mask_action) * pure_noise_action_i + ) + if gen_data_clean.raw_action_dim is not None and gen_data_clean.raw_action_dim[i] is not None: + noise_action_i[:, gen_data_clean.raw_action_dim[i] :] = 0 + noise_action_list.append(noise_action_i) + + # 7. 
Initialize sound noise if sound_gen is True + has_sound = self.config.sound_gen and any(plan.has_sound for plan in sequence_plans) + noise_sound_list: list[torch.Tensor] | None = None + + if has_sound: + assert gen_data_clean.x0_tokens_sound is not None, "Sound data required when sequence plan has sound" + assert packed_sequence.sound is not None, "Packed sound data required when sound_gen is True" + assert packed_sequence.sound.condition_mask is not None, "Sound condition mask required" + assert isinstance(packed_sequence.sound.condition_mask, list), ( + "Sound condition mask must be a list of tensors for inference noise" + ) + + noise_sound_list = [] + for i, (x0_sound, cond_mask_sound) in enumerate( + zip(gen_data_clean.x0_tokens_sound, packed_sequence.sound.condition_mask, strict=True) + ): + pure_noise_sound_i = misc.arch_invariant_rand( + tuple(x0_sound.shape), + self.tensor_kwargs["dtype"], + self.tensor_kwargs["device"], + seed_dict["sound"][i], # Different seed per sample for diversity + ) # [sound_channels,T_sound] + # cond_mask_sound is (T, 1), x0_sound is (C, T) — transpose mask for broadcasting + noise_sound_i = ( + cond_mask_sound.T * x0_sound.to(**self.tensor_kwargs) + + (1.0 - cond_mask_sound.T) * pure_noise_sound_i + ) # [sound_channels,T_sound] + noise_sound_list.append(noise_sound_i) + + # 8. Concatenate vision, action, and sound noise per sample (flattened) + # Order: [vision | action (if present) | sound (if present)] + # noise_action_list and noise_sound_list are dense (only modality-having samples), + # so we use separate indexes. 
+ initial_noise: list[torch.Tensor] = [] + idx_vision = 0 + idx_action = 0 + idx_sound = 0 + + for i in range(n_sample): + parts = [] + + # Flatten and concatenate all vision items for this sample + num_vis = num_items_per_sample[i] if num_items_per_sample is not None else 1 + for _ in range(num_vis): + parts.append(noise_vision_list[idx_vision].reshape(-1)) + idx_vision += 1 + + if noise_action_list is not None and sequence_plans[i].has_action: + parts.append(noise_action_list[idx_action].reshape(-1)) + idx_action += 1 + + if noise_sound_list is not None and sequence_plans[i].has_sound: + parts.append(noise_sound_list[idx_sound].reshape(-1)) + idx_sound += 1 + + initial_noise.append(torch.cat(parts, dim=0)) # [N_tokens_flat] + + return ( + sequence_plans, + gen_data_clean, + cond_text_tokens, + uncond_text_tokens, + initial_noise, + ) + + def _get_velocity( + self, + *, + net: torch.nn.Module | None = None, + noise_x: list[torch.Tensor], + timestep: torch.Tensor, + text_tokens: list[list[int]], + sequence_plans: list[SequencePlan], + gen_data_clean: GenerationDataClean, + skip_text_tokens: bool = False, + ) -> list[torch.Tensor]: + """ + Compute velocity prediction for a single sampling step. + + This method handles the full pipeline for one denoising step: + 1. Splits flattened noise_x into vision (and action) parts per sample + 2. Packs the input sequence with current noisy latents + 3. Runs the network via self.denoise() + 4. Applies velocity masks (zeroes out conditioned parts) + 5. Returns flattened velocities (concatenated vision + action per sample) + + Args: + noise_x: List of noisy latents, each containing concatenated + vision (and optionally action) noise. 
+ len(noise_x) == B, noise_x[i] is shape (D) + timestep: Current timestep for each sample + text_tokens: Tokenized text for each sample + sequence_plans: Pre-computed sequence plans (from _prepare_inference_data) + gen_data_clean: Pre-computed clean data (from _prepare_inference_data) + skip_text_tokens: If True, skip text tokens (for CFG unconditional branch) + + Returns: + List of flattened velocity tensors (one per sample), each containing + concatenated vision (and optionally action and sound) velocity + """ + n_samples = len(noise_x) + is_image_batch = gen_data_clean.is_image_batch + has_action = self.config.action_gen and any(plan.has_action for plan in sequence_plans) + num_items = gen_data_clean.num_vision_items_per_sample # None for standard T2I/T2V + has_sound = self.config.sound_gen and any(plan.has_sound for plan in sequence_plans) + + # Split flattened noise_x into vision, action, and sound parts per sample + # Order must match _prepare_inference_data: [vision | action (if present) | sound (if present)] + noise_x_vision: list[torch.Tensor] = [] + noise_x_action: list[torch.Tensor] | None = [] if has_action else None + noise_x_sound: list[torch.Tensor] | None = [] if has_sound else None + + vision_offset = 0 # tracks position in the flat x0_tokens_vision list + idx_action = 0 + idx_sound = 0 + for i in range(n_samples): + n_vis = num_items[i] if num_items is not None else 1 + offset = 0 + for j in range(n_vis): + vision_shape = gen_data_clean.x0_tokens_vision[vision_offset + j].shape + vision_dim = int(torch.prod(torch.tensor(vision_shape))) + noise_vision_ij = noise_x[i][offset : offset + vision_dim].reshape(vision_shape) + noise_x_vision.append(noise_vision_ij) + offset += vision_dim + vision_offset += n_vis + + # Extract action if present for this sample. x0_tokens_action is dense (only + # action-bearing samples), so gate on the per-sample plan to keep idx_action aligned + # with the layout built in _prepare_inference_data. + if has_action and noise_x_action is not None and sequence_plans[i].has_action: + assert gen_data_clean.x0_tokens_action is not None + action_shape = gen_data_clean.x0_tokens_action[idx_action].shape + action_dim = int(torch.prod(torch.tensor(action_shape))) + 
noise_x_action.append(noise_x[i][offset : offset + action_dim].reshape(action_shape)) # [T,action_dim] + offset += action_dim + idx_action += 1 + + # Extract sound if present for this sample + if has_sound and noise_x_sound is not None and sequence_plans[i].has_sound: + assert gen_data_clean.x0_tokens_sound is not None + sound_shape = gen_data_clean.x0_tokens_sound[idx_sound].shape + sound_dim = int(torch.prod(torch.tensor(sound_shape))) + noise_x_sound.append( + noise_x[i][offset : offset + sound_dim].reshape(sound_shape) + ) # [sound_channels,T_sound] + offset += sound_dim + idx_sound += 1 + + gen_data_for_packing = GenerationDataClean( + batch_size=n_samples, + is_image_batch=is_image_batch, + raw_state_vision=gen_data_clean.raw_state_vision, + x0_tokens_vision=noise_x_vision, + fps_vision=gen_data_clean.fps_vision, + # Action fields + raw_state_action=gen_data_clean.raw_state_action if has_action else None, + x0_tokens_action=noise_x_action if has_action else None, + action_domain_id=gen_data_clean.action_domain_id if has_action else None, + fps_action=gen_data_clean.fps_action if has_action else None, + raw_action_dim=gen_data_clean.raw_action_dim if has_action else None, + # Sound fields + raw_state_sound=gen_data_clean.raw_state_sound if has_sound else None, + x0_tokens_sound=noise_x_sound if has_sound else None, + fps_sound=gen_data_clean.fps_sound if has_sound else None, + num_vision_items_per_sample=num_items, + ) + + packed_sequence = self._pack_input_sequence( + sequence_plans, + text_tokens, + gen_data_for_packing, + timestep.cpu(), + include_end_of_generation_token=self._derive_include_end_of_generation_token(), + skip_text_tokens=skip_text_tokens, + ) + + # Set the actual noisy latents (as lists) + if packed_sequence.vision is not None: + packed_sequence.vision.tokens = [x.to(**self.tensor_kwargs) for x in noise_x_vision] + + if has_action and noise_x_action is not None: + assert packed_sequence.action is not None, "packed_sequence.action must exist 
when has_action is True" + packed_sequence.action.tokens = [x.to(**self.tensor_kwargs) for x in noise_x_action] + packed_sequence.action.domain_id = gen_data_clean.action_domain_id + + if has_sound and noise_x_sound is not None: + assert packed_sequence.sound is not None, "packed_sequence.sound must exist when has_sound is True" + packed_sequence.sound.tokens = [x.to(**self.tensor_kwargs) for x in noise_x_sound] + + packed_sequence.to_cuda() + + # --- Network forward --- + fps_action = gen_data_clean.fps_action if has_action else None + fps_sound = gen_data_clean.fps_sound if has_sound else None + out = self.denoise( + net=net, + data_batch_packed=packed_sequence, + fps_vision=gen_data_clean.fps_vision, + fps_action=fps_action, + fps_sound=fps_sound, + ) + + # --- Apply velocity masks --- + # Zero out velocity for conditioned parts (they don't change during sampling) + assert packed_sequence.vision is not None, "packed_sequence.vision must exist for velocity masking" + assert packed_sequence.vision.condition_mask is not None, "Vision condition mask required for masking" + assert isinstance(packed_sequence.vision.condition_mask, list), ( + "Vision condition mask must be a list of tensors for masking" + ) + # Compute noisy_mask per sample (supports variable shapes) + noisy_mask_vision = [1.0 - cond_mask for cond_mask in packed_sequence.vision.condition_mask] + + # Apply velocity mask per element - check if each sample has noisy tokens + velocity_vision: list[torch.Tensor] = [] + for i, (pred, noisy_mask) in enumerate(zip(out["preds_vision"], noisy_mask_vision)): + # pred: [C,T,H,W], noisy_mask: [T,1,1] + has_noisy_tokens_i = noisy_mask.sum() > 0 + if has_noisy_tokens_i: + # Apply mask to prediction + velocity_vision.append(pred * noisy_mask.to(dtype=pred.dtype, device=pred.device)) # [C,T,H,W] + else: + # All tokens are conditioned - velocity should be zero + velocity_vision.append(torch.zeros_like(pred)) # [C,T,H,W] + + # Handle action velocity + velocity_action: 
list[torch.Tensor] | None = None + if ( + has_action + and packed_sequence.action is not None + and packed_sequence.action.condition_mask is not None + and isinstance(packed_sequence.action.condition_mask, list) + ): + noisy_mask_action = [1.0 - cond_mask for cond_mask in packed_sequence.action.condition_mask] + + velocity_action = [] + for i, (pred, noisy_mask) in enumerate(zip(out["preds_action"], noisy_mask_action)): + # pred: [T,action_dim], noisy_mask: [T,1] + has_noisy_tokens_i = noisy_mask.sum() > 0 + if has_noisy_tokens_i: + v = pred * noisy_mask.to(dtype=pred.dtype, device=pred.device) # [T,action_dim] + else: + v = torch.zeros_like(pred) # [T,action_dim] + if gen_data_clean.raw_action_dim is not None and gen_data_clean.raw_action_dim[i] is not None: + v[:, gen_data_clean.raw_action_dim[i] :] = 0 + velocity_action.append(v) + + # Handle sound velocity + velocity_sound: list[torch.Tensor] | None = None + if ( + has_sound + and packed_sequence.sound is not None + and packed_sequence.sound.condition_mask is not None + and isinstance(packed_sequence.sound.condition_mask, list) + ): + noisy_mask_sound = [1.0 - cond_mask for cond_mask in packed_sequence.sound.condition_mask] + + velocity_sound = [] + for i, (pred, noisy_mask) in enumerate(zip(out["preds_sound"], noisy_mask_sound)): + # pred: [sound_channels,T_sound], noisy_mask: [T_sound,1] + has_noisy_tokens_i = noisy_mask.sum() > 0 + if has_noisy_tokens_i: + # noisy_mask is (T, 1), pred is (C, T) — transpose mask for broadcasting + velocity_sound.append( + pred * noisy_mask.T.to(dtype=pred.dtype, device=pred.device) + ) # [sound_channels,T_sound] + else: + velocity_sound.append(torch.zeros_like(pred)) # [sound_channels,T_sound] + + # Concatenate vision, action, and sound velocities per sample (flattened) + # Order must match _prepare_inference_data: [vision | action | sound] + velocity_output: list[torch.Tensor] = [] + vis_offset = 0 + idx_action = 0 + idx_sound = 0 + for i in range(n_samples): + parts = [] + 
n_vis = num_items[i] if num_items is not None else 1 + + for _ in range(n_vis): + parts.append(velocity_vision[vis_offset].reshape(-1)) + vis_offset += 1 + + if velocity_action is not None and sequence_plans[i].has_action: + parts.append(velocity_action[idx_action].reshape(-1)) + idx_action += 1 + + if velocity_sound is not None and sequence_plans[i].has_sound: + parts.append(velocity_sound[idx_sound].reshape(-1)) + idx_sound += 1 + + velocity_output.append(torch.cat(parts, dim=0)) # [N_tokens_flat] + + return velocity_output + + def _remove_padding_from_latent( + self, x0_tokens_vision: list[torch.Tensor], frame_size: list[torch.Tensor] + ) -> list[torch.Tensor]: + """ + Remove reflection padding from encoded latent vision tokens. + + Each sample in the batch may have different original dimensions, so we process + each sample individually and return a list of latents with varying spatial sizes. + + The padding coordinates are scaled down by the spatial compression factor since + we're operating in latent space. + + Args: + x0_tokens_vision (list[torch.Tensor]): List of encoded latent tensors, + each of shape (1, C, T, H_latent, W_latent) + where H_latent, W_latent include scaled padding. + frame_size (list[torch.Tensor]): List of tensors, each of shape (1,4) or (4,) containing + [target_h, target_w, orig_h, orig_w] for each sample (in pixel space). + + Returns: + list[torch.Tensor]: List of cropped latent tokens, each of shape (1, C, T, H_latent_cropped, W_latent_cropped). + Each element may have different spatial sizes based on original image dimensions. + """ + batch_size = len(x0_tokens_vision) + spatial_factor = self.tokenizer_vision_gen.spatial_compression_factor + cropped_latents = [] + for i in range(batch_size): + # frame_size: [target_h, target_w, orig_h, orig_w] in pixel space + # Normalize: frame_size[i] may be (1, 4) from IterativeJointDataLoader + # or (4,) when loaded from safetensors in the eval/export path. 
+ fs = frame_size[i] + if fs.dim() == 2: + fs = fs[0] + orig_h = int(fs[2].item()) + orig_w = int(fs[3].item()) + + # Scale to latent space + if orig_h // spatial_factor == 0 or orig_w // spatial_factor == 0: + log.warning( + f"Zero-sized latent found: orig_h: {orig_h}, orig_w: {orig_w}, spatial_factor: {spatial_factor}" + ) + + orig_h_latent = max(orig_h // spatial_factor, 1) + orig_w_latent = max(orig_w // spatial_factor, 1) + + # Crop to remove padding: x0_tokens_vision[i] shape is (1, C, T, H, W) + cropped_latent = x0_tokens_vision[i][:, :, :, :orig_h_latent, :orig_w_latent].contiguous() + cropped_latents.append(cropped_latent) + + return cropped_latents + + def _run_classifier_free_guidance( + self, + cond_tokens: list[list[int]], + uncond_tokens: list[list[int]], + skip_text_tokens_for_cfg: bool, + single_velocity_fn: Callable[[list[list[int]], bool], list[torch.Tensor]], + ) -> tuple[list[torch.Tensor], list[torch.Tensor]]: + """Run classifier-free guidance, optionally in parallel via CFG parallelism. + + Args: + cond_tokens: Tokenized text for the conditional branch. + uncond_tokens: Tokenized text for the unconditional branch. + skip_text_tokens_for_cfg: If True, skip text tokens in the + unconditional branch. + single_velocity_fn: Computes velocity for a given set of tokens. + Accepts ``(tokens, skip_text_tokens)`` and returns a list of + velocity tensors (one per sample). + + Returns: + A tuple ``(cond_v, uncond_v)`` where each element is a list of + velocity tensors (one per sample). 
+ """ + if self.parallel_dims is None or not self.parallel_dims.cfgp_enabled: + return ( + single_velocity_fn(cond_tokens, False), + single_velocity_fn(uncond_tokens, skip_text_tokens_for_cfg), + ) + + cfgp_rank = self.parallel_dims.cfgp_rank + cfgp_size = self.parallel_dims.cfgp_size + cfgp_group = self.parallel_dims.cfgp_mesh.get_group() + cfgp_peer = (cfgp_rank + 1) % cfgp_size + + if cfgp_rank == 0: + v_list = single_velocity_fn(cond_tokens, False) + else: + v_list = single_velocity_fn(uncond_tokens, skip_text_tokens_for_cfg) + + other_v_list = [torch.empty_like(v_i) for v_i in v_list] + + ops: list[dist.P2POp] = [] + for v_i, other_v_i in zip(v_list, other_v_list): + ops.append(dist.P2POp(op=dist.isend, tensor=v_i, group_peer=cfgp_peer, group=cfgp_group)) + ops.append(dist.P2POp(op=dist.irecv, tensor=other_v_i, group_peer=cfgp_peer, group=cfgp_group)) + + reqs = dist.batch_isend_irecv(ops) + for req in reqs: + req.wait() + + if cfgp_rank == 0: + return v_list, other_v_list + else: + return other_v_list, v_list + + @torch.no_grad() + def generate_samples_from_batch( + self, + data_batch: Dict, + net: torch.nn.Module | None = None, + sampler: Any | None = None, + guidance: float = 1.5, + guidance_interval: Optional[list[float]] = None, + seed: list[int] | int = 1, + n_sample: int | None = None, + has_negative_prompt: bool = False, + num_steps: int = 35, + shift: float = 5.0, + sigma_max: float = 80.0, + skip_text_tokens_for_cfg: bool = False, + normalize_cfg: bool = False, + **kwargs, + ) -> dict[str, list[torch.Tensor]]: + """ + Generate samples from the batch. Based on given batch, it will automatically determine + whether to generate image or video samples. + + This method follows the same structure as training_step: + 1. Build sequence plans + 2. Get data and condition (encode vision) + 3. Initialize noise with conditioning (as lists for variable shapes) + 4. Run sampling loop with velocity function + 5. 
Return latents as lists (supports variable shapes) + + Args: + data_batch (dict): Raw data batch from the dataloader. + guidance (float): Classifier-free guidance weight. + guidance_interval (list[float] | None): Optional ``[lo, hi]`` timestep interval in which to apply guidance. + CFG is performed only for timesteps (ranging between 0-1000) strictly inside the interval; + otherwise the unconditional branch is skipped. + seed (list[int] | int): Random seeds for noise generation, given as a list with one seed per + sample; the length of the list must match the number of samples. The legacy form, a single + integer seed incremented by 1 for each sample, is no longer supported and raises a + ValueError. + n_sample (int | None): Expected number of samples; if given, it must match the number of + samples in the batch. Defaults to the batch size. + has_negative_prompt (bool): If True, use negative prompt for unconditional branch. + num_steps (int): Number of sampling steps for the diffusion process. + shift (float): Time shift parameter for the sampler. + sigma_max (float): Maximum sigma for the EDM sampler. + skip_text_tokens_for_cfg (bool): If True, skip text tokens in unconditional branch. + normalize_cfg (bool): If True, rescale the guided velocity so that its norm does not exceed + the norm of the conditional velocity. + + Returns: + Dict with keys: + - "vision": List of vision latent tensors (one per sample, variable shapes) + - "action": List of action latent tensors or None (only present when action_gen=True and has_action) + + Raises: + AssertionError: If the number of samples does not match the number of noise tensors or seeds. + ValueError: If the seed is a single integer. This is not supported anymore: `seed` must be + a list of integers, one for each sample. + """ + if isinstance(seed, int): + raise ValueError( + "Single integer seed is not supported anymore: `seed` must be a list of integers, one for each sample." 
+ ) + assert isinstance(seed, list) + + if self.parallel_dims is not None and self.parallel_dims.cp_enabled: + seed = _broadcast_seed(seed, self.parallel_dims.cp_mesh.get_group(), self.parallel_dims.cp_rank) + + if self.parallel_dims is not None and self.parallel_dims.cfgp_enabled: + seed = _broadcast_seed(seed, self.parallel_dims.cfgp_mesh.get_group(), self.parallel_dims.cfgp_rank) + + # Prepare all data (initial noise as list of flattened tensors per sample) + ( + sequence_plans, + gen_data_clean, + cond_tokens, + uncond_tokens, + initial_noise, + ) = self._prepare_inference_data(data_batch, seed, has_negative_prompt) + + if n_sample is not None: + assert n_sample == len(initial_noise), ( + f"Number of samples {n_sample} must match number of noise tensors {len(initial_noise)}" + ) + else: + n_sample = len(initial_noise) + + assert n_sample == len(seed), f"Number of samples {n_sample} must match number of seeds {len(seed)}" + + # Create a velocity function for a single sample (for use with self.sampler). 
+ + def velocity_fn(noise_x: list[torch.Tensor], timestep: torch.Tensor) -> list[torch.Tensor]: + # len(noise_x) == B, noise_x[i] is shape (D) + # timestep is shape (B, 1) + torch.compiler.cudagraph_mark_step_begin() + + assert timestep.ndim == 2, f"timestep must be 2D, got {timestep.shape}" + assert timestep.shape == (1, 1), f"timestep must be (1, 1), got {timestep.shape}" + + # Expand timestep to (B, 1) + timestep = timestep.repeat(len(noise_x), 1) + + def _single_velocity_fn(tokens: list[list[int]], skip_text_tokens: bool): + return self._get_velocity( + net=net, + noise_x=noise_x, + timestep=timestep, + text_tokens=tokens, + sequence_plans=sequence_plans, + gen_data_clean=gen_data_clean, + skip_text_tokens=skip_text_tokens, + ) + + # Skip unconditional branch when outside the guidance interval + needs_cfg = guidance != 1.0 + if needs_cfg and guidance_interval is not None: + assert len(guidance_interval) == 2, f"guidance_interval must be [lo, hi], got {guidance_interval}" + t_lo, t_hi = guidance_interval + needs_cfg = t_lo < timestep[0].item() < t_hi + + if not needs_cfg: + return _single_velocity_fn(cond_tokens, skip_text_tokens=False) + + cond_v, uncond_v = self._run_classifier_free_guidance( + cond_tokens=cond_tokens, + uncond_tokens=uncond_tokens, + skip_text_tokens_for_cfg=skip_text_tokens_for_cfg, + single_velocity_fn=_single_velocity_fn, + ) + + v_pred = [u_i + guidance * (c_i - u_i) for c_i, u_i in zip(cond_v, uncond_v)] + + if normalize_cfg: + v_pred = [ + v_i * (torch.norm(c_i) / (torch.norm(v_i) + 1e-8)).clamp(min=0.0, max=1.0) + for v_i, c_i in zip(v_pred, cond_v) + ] + + return v_pred + + # Run sampler for all samples at once. 
+ sampler = sampler or self.sampler + scheduler_type = self.config.rectified_flow_inference_config.scheduler_type + if scheduler_type == "unipc": + log.info(f"Using sampler: UniPC (shift={shift}, num_steps={num_steps})") + else: + log.info(f"Using sampler: EDM (sigma_max={sigma_max}, num_steps={num_steps})") + + if scheduler_type == "unipc": + latents = sampler( + velocity_fn, + initial_noise, + num_steps=num_steps, + shift=shift, + seed=seed, + ) + else: + # EDM Sampler + chunk_sizes = [_x.shape[0] for _x in initial_noise] + initial_noise = torch.cat(initial_noise, dim=0) + + def x0_fn(noise_x: torch.Tensor, sigma: torch.Tensor) -> torch.Tensor: + assert sigma.ndim == 0, f"sigma must be 0D, got {sigma.shape}" + timestep_rf = sigma * float(self.config.rectified_flow_inference_config.num_train_timesteps) + + # Convert noise_x to list of tensors for velocity_fn, and then + # concatenate the results back into a single tensor. + _noise_x = list(torch.split(noise_x, chunk_sizes, dim=0)) + _velocity_pred = velocity_fn(_noise_x, timestep_rf.reshape(1, 1)) + velocity_pred = torch.cat(_velocity_pred, dim=0) + + x0_pred = noise_x - sigma * velocity_pred + return x0_pred + + latents = sampler( + x0_fn, + initial_noise, + num_steps=num_steps, + sigma_max=sigma_max, + sigma_min=0.002, + solver_option="2ab", + ) + latents = list(torch.split(latents, chunk_sizes, dim=0)) + + # Split flattened latents back into vision, action, and sound + # Mirror the per-sample logic from _prepare_inference_data: + # Order: [vision | action (if present) | sound (if present)] + # action/sound lists are dense (only modality-having samples), so use separate indexes. 
+ result_vision: list[torch.Tensor] = [] + result_action: list[torch.Tensor] = [] + result_sound: list[torch.Tensor] = [] + idx_vision = 0 + idx_action = 0 + idx_sound = 0 + num_vision_items = gen_data_clean.num_vision_items_per_sample + + for i in range(n_sample): + offset = 0 + + # Extract vision + n_vis = num_vision_items[i] if num_vision_items is not None else 1 + for j in range(n_vis): + vision_shape = gen_data_clean.x0_tokens_vision[idx_vision + j].shape + vision_dim = int(torch.prod(torch.tensor(vision_shape))) + if j == n_vis - 1: # the last vision item is the only target for each sample. + + result_vision.append(latents[i][offset : offset + vision_dim].reshape(vision_shape)) + else: # the other vision items are the condition inputs that we don't need to return + pass + offset += vision_dim + idx_vision += n_vis + + # Extract action if present + if self.config.action_gen and sequence_plans[i].has_action: + assert gen_data_clean.x0_tokens_action is not None + action_shape = gen_data_clean.x0_tokens_action[idx_action].shape + action_dim = int(torch.prod(torch.tensor(action_shape))) + result_action.append(latents[i][offset : offset + action_dim].reshape(action_shape)) + offset += action_dim + idx_action += 1 + + # Extract sound if present + if self.config.sound_gen and sequence_plans[i].has_sound: + assert gen_data_clean.x0_tokens_sound is not None + sound_shape = gen_data_clean.x0_tokens_sound[idx_sound].shape + sound_dim = int(torch.prod(torch.tensor(sound_shape))) + result_sound.append(latents[i][offset : offset + sound_dim].reshape(sound_shape)) + offset += sound_dim + idx_sound += 1 + + result: dict[str, list[torch.Tensor]] = {"vision": result_vision} + if self.config.action_gen and len(result_action) > 0: + result["action"] = result_action + if self.config.sound_gen and len(result_sound) > 0: + result["sound"] = result_sound + return result + + def _extract_condition_images_for_visualization( + self, + gen_data_clean: GenerationDataClean, + 
sequence_plans: list[SequencePlan], + n_samples: int, + ) -> list[torch.Tensor | None]: + """Extract condition images from gen_data_clean for visualization. + + For image editing, raw_state_vision is a flat list of individually-encoded + images (e.g. [src1, tgt1, src2, tgt2, ...]). The first vision item for + each sample is the condition (source) image. This method extracts it and + resizes to match the target for side-by-side display. + + Args: + gen_data_clean: Clean data containing raw vision states. + sequence_plans: Sequence plans for each sample. + n_samples: Number of samples to process. + + Returns: + List of condition image tensors (one per sample with condition frames). + """ + condition_images: list[torch.Tensor | None] = [] + + if gen_data_clean.num_vision_items_per_sample is not None: + # Multi-item (image editing): raw_state_vision is flat [src1, tgt1, src2, tgt2, ...] + vision_offset = 0 + for i in range(n_samples): + num_items = gen_data_clean.num_vision_items_per_sample[i] + if num_items >= 2: + cond_frame = gen_data_clean.raw_state_vision[vision_offset] # (1, C, 1, H_s, W_s) + target_frame = gen_data_clean.raw_state_vision[vision_offset + 1] # (1, C, 1, H_t, W_t) + # Resize condition frame to match target size for visualization + if cond_frame.shape[-2:] != target_frame.shape[-2:]: + cond_frame = torch.nn.functional.interpolate( + cond_frame.squeeze(2), # (1, C, H, W) + size=target_frame.shape[-2:], + mode="bilinear", + align_corners=False, + ).unsqueeze(2) # (1, C, 1, H, W) + condition_images.append(cond_frame) + else: + condition_images.append(None) + vision_offset += num_items + else: + # Standard single-item mode: check condition_frame_indexes_vision + for i in range(n_samples): + plan = sequence_plans[i] + if len(plan.condition_frame_indexes_vision) > 0 and gen_data_clean.raw_state_vision is not None: + raw_vision = gen_data_clean.raw_state_vision[i] # (1, C, T, H, W) + condition_images.append(raw_vision[:, :, 0:1, :, :]) + else: + 
condition_images.append(None) + + return condition_images + + def _slice_gen_data_clean(self, gen_data_clean: GenerationDataClean, start: int, limit: int) -> GenerationDataClean: + """Extract a subset of GenerationDataClean for inference. + + The samples in [start:limit] are extracted from the original GenerationDataClean. + + For image editing (``num_vision_items_per_sample`` is set), the sample index refers to + the *real sample* index. The method computes the correct slice of the flat + ``x0_tokens_vision`` / ``raw_state_vision`` lists using the item counts and + preserves ``num_vision_items_per_sample`` on the returned subset so that + downstream packing works correctly. + + Args: + gen_data_clean: GenerationDataClean to slice. + start: Start index of the slice. + limit: Limit index of the slice. + + Returns: + Sliced GenerationDataClean. + """ + # x0_tokens_action can be an empty list (e.g. image2video mode), not just None + has_action = bool(gen_data_clean.x0_tokens_action) + has_sound = bool(gen_data_clean.x0_tokens_sound) + + # Determine vision slice for this sample + num_items = gen_data_clean.num_vision_items_per_sample + if num_items is not None: + # Multi-item mode: compute flat-list offset + vis_start = sum(num_items[:start]) # number of all the vision tokens before the start + vis_end = sum(num_items[:limit]) + subset_x0_vision = gen_data_clean.x0_tokens_vision[vis_start:vis_end] + subset_raw_vision = ( + gen_data_clean.raw_state_vision[vis_start:vis_end] if gen_data_clean.raw_state_vision else None + ) + subset_num_items = num_items[start:limit] + else: + # Standard single-item mode + subset_x0_vision = gen_data_clean.x0_tokens_vision[start:limit] + subset_raw_vision = ( + gen_data_clean.raw_state_vision[start:limit] if gen_data_clean.raw_state_vision else None + ) + subset_num_items = None + fps_vision = gen_data_clean.fps_vision[start:limit] if gen_data_clean.fps_vision is not None else None + + if has_action: + subset_raw_action = ( + 
gen_data_clean.raw_state_action[start:limit] if gen_data_clean.raw_state_action else None + ) + x0_tokens_action = gen_data_clean.x0_tokens_action[start:limit] + fps_action = gen_data_clean.fps_action[start:limit] if gen_data_clean.fps_action is not None else None + action_domain_id = gen_data_clean.action_domain_id[start:limit] if gen_data_clean.action_domain_id else None + raw_action_dim = gen_data_clean.raw_action_dim[start:limit] if gen_data_clean.raw_action_dim else None + else: + subset_raw_action = None + x0_tokens_action = None + fps_action = None + action_domain_id = None + raw_action_dim = None + + if has_sound: + subset_raw_sound = gen_data_clean.raw_state_sound[start:limit] if gen_data_clean.raw_state_sound else None + x0_tokens_sound = gen_data_clean.x0_tokens_sound[start:limit] + fps_sound = gen_data_clean.fps_sound[start:limit] if gen_data_clean.fps_sound is not None else None + else: + subset_raw_sound = None + x0_tokens_sound = None + fps_sound = None + + return GenerationDataClean( + batch_size=limit - start, + is_image_batch=gen_data_clean.is_image_batch, + raw_state_vision=subset_raw_vision, + raw_state_action=subset_raw_action, + raw_state_sound=subset_raw_sound, + x0_tokens_vision=subset_x0_vision, + x0_tokens_action=x0_tokens_action, + x0_tokens_sound=x0_tokens_sound, + fps_vision=fps_vision, + fps_action=fps_action, + fps_sound=fps_sound, + action_domain_id=action_domain_id, + raw_action_dim=raw_action_dim, + num_vision_items_per_sample=subset_num_items, + ) + + @torch.no_grad() + def validation_step(self, data_batch: dict[str, torch.Tensor], iteration: int): + pass + + @torch.no_grad() + def forward(self, xt, t): + pass + + def get_data_and_condition(self, data_batch: dict[str, torch.Tensor], iteration: int = 1) -> GenerationDataClean: + """ + - Get raw data of different modalities from databatch + - Tokenize into corresponding latents + - Load other conditioning information if any (fps, etc.) 
+ """ + # Detect whether any sample has multiple vision items (e.g. image editing). + # If so, track the count per sample before all vision items from this batch are flattened into a list. + is_image_batch = self.is_image_batch(data_batch) + sample_vision_list = data_batch[self.input_image_key if is_image_batch else self.input_video_key] + + + # During training this information is always computed here. If the field can already be read + # from data_batch, we are in the visualization callback: + if "num_vision_items_per_sample" not in data_batch: + # Each element must be a list/tuple of tensors (not a bare tensor) to count + # as multi-vision. A bare tensor's len() returns its first dim size (e.g. C=3), + # which would incorrectly trigger the multi-vision path for regular video batches. + has_multiple_vision_per_sample = any( + isinstance(v, (list, tuple)) and len(v) > 1 for v in sample_vision_list + ) + num_vision_items_per_sample: list[int] | None = ( + [len(v) for v in sample_vision_list] if has_multiple_vision_per_sample else None + ) + + # Cache the counts in data_batch: otherwise the information is only stored in the + # GenerationDataClean object, which is discarded outside the training loop, and an + # error is raised when the data batch is passed to the visualization callbacks. + data_batch["num_vision_items_per_sample"] = num_vision_items_per_sample + + # If has_multiple_vision_per_sample, the input media is a list of lists of tensors; flatten it to a list of tensors. + if has_multiple_vision_per_sample: + media_key = self.input_video_key if not is_image_batch else self.input_image_key + data_batch[media_key] = [item.unsqueeze(0) for sublist in sample_vision_list for item in sublist] + if data_batch[media_key][0].dtype == torch.float32 and not is_image_batch: + data_batch["is_preprocessed"] = ( + True # for video batch, is_preprocessed = True means the video data is normalized.
However, for the image batch, is_preprocessed = True means the image data is augmented with a temporal dimension. + ) + else: + num_vision_items_per_sample = data_batch["num_vision_items_per_sample"] + + batch_size = ( + len(sample_vision_list) if num_vision_items_per_sample is None else len(num_vision_items_per_sample) + ) + + log_enc_time = False + timer = None + if TRAINING: + import wandb + + log_enc_time = iteration % self.log_enc_time_every_n == 0 and wandb.run + if log_enc_time: + timer = Timer(unit="s") + timer.start() + # Vision (image/video) raw state and tokenized latent state + self._normalize_video_databatch_inplace(data_batch) + self._augment_image_dim_inplace(data_batch) # converts each image tensor to (1, C, 1, H, W) + raw_state_vision = data_batch[self.input_image_key if is_image_batch else self.input_video_key] + x0_tokens_vision = [ + self.encode(raw_state_vision_i).contiguous().float() for raw_state_vision_i in raw_state_vision + ] + + frame_size = data_batch.get("image_size", None) + if frame_size is not None: + x0_tokens_vision = self._remove_padding_from_latent(x0_tokens_vision, frame_size) + + # Action – extract dense action / domain_id without mutating data_batch, + # so downstream callbacks can still read the original per-sample domain_ids.
+ raw_state_action, action_domain_id = self._normalize_action_databatch(data_batch) + x0_tokens_action = raw_state_action + raw_action_dim = data_batch.get("raw_action_dim", None) + + # Sound/audio - normalize, encode if present and sound_gen is enabled + self._normalize_sound_databatch_inplace(data_batch) + raw_state_sound = data_batch.get("sound", None) + if raw_state_sound is not None and self.tokenizer_sound_gen is not None: + x0_tokens_sound = [self.encode_sound(s).contiguous().float() for s in raw_state_sound] + else: + x0_tokens_sound = None + + # We pass the conditioning FPS along to the denoising function + # It will not be used for RoPE FPS modulation unless enabled in the training config + # Note: conditioning_fps from data is converted to TPS via temporal_compression_factor + # in VideoRopePosition3DEmb. + fps_raw = data_batch.get("conditioning_fps", None) + if isinstance(fps_raw, list): + fps_raw = torch.stack(fps_raw).flatten() # list of scalar tensors -> (B,) + fps_vision = fps_raw.to(**self.tensor_kwargs) if fps_raw is not None else None + fps_action = fps_raw.to(**self.tensor_kwargs) if fps_raw is not None else None + + # Sound FPS for RoPE alignment (constant, from config) + if x0_tokens_sound is not None: + sound_batch_size = len(x0_tokens_sound) + fps_sound = torch.full( + (sound_batch_size,), + self._get_sound_fps_for_rope(), + dtype=torch.float32, + ).to(**self.tensor_kwargs) + else: + fps_sound = None + + if TRAINING and log_enc_time and timer is not None: + timer.end() + elapsed = timer.get_cuda_time() + h, w = raw_state_vision[0].shape[-2], raw_state_vision[0].shape[-1] + resolution_label = "unknown" + for res_name, aspect_ratios in VIDEO_RES_SIZE_INFO.items(): + if (h, w) in aspect_ratios.values(): + resolution_label = res_name + if res_name == "704": + # 720 shares some aspect ratios with 704 (e.g., 1:1 at 960x960); prefer 720. 
+ if (h, w) in VIDEO_RES_SIZE_INFO.get("720", {}).values(): + resolution_label = "720" + break + wandb.log( + { + f"timer/encoding_{resolution_label}p": elapsed, + "timer/encoding": elapsed, + }, + step=iteration, + ) + return GenerationDataClean( + batch_size=batch_size, + is_image_batch=is_image_batch, + raw_state_vision=raw_state_vision, + raw_state_action=raw_state_action, + raw_state_sound=raw_state_sound, + x0_tokens_vision=x0_tokens_vision, + x0_tokens_action=x0_tokens_action, + x0_tokens_sound=x0_tokens_sound, + fps_vision=fps_vision, + fps_action=fps_action, + fps_sound=fps_sound, + action_domain_id=action_domain_id, + num_vision_items_per_sample=num_vision_items_per_sample, + raw_action_dim=raw_action_dim, + ) + + def _normalize_video_databatch_inplace( + self, data_batch: dict[str, torch.Tensor], input_key: str | None = None + ) -> None: + """ + Normalizes video data in-place on a CUDA device to reduce data loading overhead. + + This function modifies the video data tensor within the provided data_batch dictionary + in-place, scaling the uint8 data from the range [0, 255] to the normalized range [-1, 1]. + + Args: + data_batch (dict[str, Tensor]): A dictionary containing the video data under a specific key. + This tensor is expected to be on a CUDA device and have dtype of torch.uint8. + input_key (str | None): The key for the video tensor in the data_batch. Defaults to + `self.input_video_key` if not provided. + + Side Effects: + Modifies the tensor at `input_key` within `data_batch` in-place. + + Note: + This operation is performed directly on the CUDA device to avoid the overhead associated + with moving data to/from the GPU. Ensure that the tensor is already on the appropriate device + and has the correct dtype (torch.uint8) to avoid unexpected behaviors. 
+ """ + IS_PREPROCESSED_KEY = "is_preprocessed" + input_key = self.input_video_key if input_key is None else input_key + # Only handle video batches + if input_key in data_batch: + if IS_PREPROCESSED_KEY in data_batch and data_batch[IS_PREPROCESSED_KEY] is True: + for i in range(len(data_batch[input_key])): + assert torch.is_floating_point(data_batch[input_key][i]), "Video data is not in float format." + assert torch.all((data_batch[input_key][i] >= -1.0001) & (data_batch[input_key][i] <= 1.0001)), ( + f"Video data is not in the range [-1, 1]. Got data range " + f"[{data_batch[input_key][i].min()}, {data_batch[input_key][i].max()}]" + ) + else: + for i in range(len(data_batch[input_key])): + item = data_batch[input_key][i] + if isinstance(item, torch.Tensor): + item = [item] + assert item[0].dtype == torch.uint8, "Video data is not in uint8 format." + data_batch[input_key][i] = torch.stack(item).to(**self.tensor_kwargs) / 127.5 - 1.0 + data_batch[IS_PREPROCESSED_KEY] = True + + def _normalize_action_databatch( + self, data_batch: dict[str, torch.Tensor] + ) -> tuple[list[torch.Tensor] | None, list[torch.Tensor] | None]: + """Extract dense action and domain_id lists from the data batch. + + The joint dataloader produces action and domain_id data as + ``[[tensor], [None], [tensor], ...]`` (each sample wrapped in a + single-element list). This method unwraps inner lists and filters + out ``None`` entries to produce dense lists suitable for the model, + **without mutating** ``data_batch``. + + Returns: + (dense_action, dense_domain_id): Each is a list of device tensors + containing only non-None entries, or ``None`` if all entries are + ``None`` / the key is absent.
+ """ + dense_action = unwrap_and_densify(data_batch.get("action", None), self.tensor_kwargs) + dense_domain_id = unwrap_and_densify( + data_batch.get("domain_id", None), {"device": self.tensor_kwargs["device"]} + ) + return dense_action, dense_domain_id + + def _normalize_sound_databatch_inplace(self, data_batch: dict[str, torch.Tensor]) -> None: + """Flatten and densify nested sound lists in-place. + + The joint dataloader produces sound data as + ``[[tensor], [None], [tensor], ...]`` (each sample wrapped in a single-element + list). This method: + + 1. Unwraps inner lists: ``[[t], [None], [t]]`` -> ``[t, None, t]`` + 2. Clears ``sequence_plan.has_sound`` for samples whose sound is ``None`` + (kept aligned by ``custom_collate_fn`` preserving ``None`` placeholders). + 3. Filters out None entries: ``[t, None, t]`` -> ``[t, t]`` + 4. Moves tensors to the model device. + 5. Sets ``data_batch["sound"]`` to ``None`` if no valid sound data remains. + + Alignment invariant: ``custom_collate_fn`` keeps the ``"sound"`` key + as a list with ``None`` placeholders for samples that lack audio (e.g. + audio-extraction failures), so the unwrapped ``raw_state_sound`` is + 1:1 with ``sequence_plan``. ``SoundSequencePlanBuilder`` already sets + each plan's ``has_sound`` according to that sample's actual sound + presence, so clearing flags for ``None`` slots here is just defensive. + """ + raw_state_sound = data_batch.get("sound", None) + sequence_plans = data_batch.get("sequence_plan", None) + sound_enabled = self.tokenizer_sound_gen is not None + + def _disable_sound_on_plans() -> None: + if isinstance(sequence_plans, list): + for plan in sequence_plans: + if hasattr(plan, "has_sound"): + plan.has_sound = False + plan.condition_frame_indexes_sound = [] + + if not isinstance(raw_state_sound, list) or len(raw_state_sound) == 0: + # No sound entries at all (image-only batches, or every sample + # came from a non-audio stream). 
Defensively clear has_sound on + # any plan that somehow has it set so packing does not look up + # missing tensors. + _disable_sound_on_plans() + data_batch["sound"] = None + return + + # Unwrap single-element inner lists produced by IterativeJointDataLoader + if isinstance(raw_state_sound[0], list): + raw_state_sound = [item[0] if isinstance(item, list) else item for item in raw_state_sound] + + if not sound_enabled: + # Model is not configured for sound generation: drop tensors and + # clear any has_sound flags so packing skips the sound path. + _disable_sound_on_plans() + data_batch["sound"] = None + return + + if isinstance(sequence_plans, list): + if len(sequence_plans) == len(raw_state_sound): + # Expected path: 1:1 alignment between plans and per-sample + # sound slots. Clear has_sound where the per-sample tensor + # is None so sequence_packing's idx_sound counter stays in + # sync with the filtered dense list. + for plan, sound in zip(sequence_plans, raw_state_sound, strict=True): + if hasattr(plan, "has_sound") and sound is None: + plan.has_sound = False + plan.condition_frame_indexes_sound = [] + else: + # Length mismatch can only happen if some upstream code path + # (e.g. a stale collate that drops "sound" when any sample is + # None) leaves the dense list shorter than the plans. Without + # 1:1 alignment we cannot safely associate tensors with plans, + # so we conservatively disable sound for the whole batch. + # This trades a small amount of training signal for guaranteed + # correctness — better than silently feeding sound from one + # sample into another sample's plan. + log.warning( + f"Sound/plan length mismatch ({len(sequence_plans)} plans vs " + f"{len(raw_state_sound)} sound entries). Disabling sound for " + "this batch. Check that custom_collate_fn preserves the " + "'sound' key with None placeholders." 
+ ) + _disable_sound_on_plans() + data_batch["sound"] = None + return + + # Filter out None entries (samples without audio) and move to device. + # After the alignment step above, the remaining dense list has the + # same cardinality as plans with has_sound=True. + raw_state_sound = [ + s.to(self.tensor_kwargs["device"]) for s in raw_state_sound if s is not None + ] # list of [C,T_audio] + + if len(raw_state_sound) == 0: + _disable_sound_on_plans() + data_batch["sound"] = None + else: + data_batch["sound"] = raw_state_sound + + def _augment_image_dim_inplace(self, data_batch: dict[str, torch.Tensor], input_key: str | None = None) -> None: + """ + Augments each image tensor with batch and temporal dimensions: (C, H, W) -> (1, C, 1, H, W). + + Args: + data_batch (dict[str, Tensor]): A dictionary containing the image data. + input_key (str | None): The key for the image tensor. Defaults to `self.input_image_key`. + + Side Effects: + Modifies the tensor at `input_key` within `data_batch` in-place. + """ + IS_PREPROCESSED_KEY = "is_preprocessed" + + input_key = self.input_image_key if input_key is None else input_key + if input_key in data_batch: + # Check if the data has already been augmented and avoid re-augmenting + if IS_PREPROCESSED_KEY in data_batch and data_batch[IS_PREPROCESSED_KEY] is True: + for i in range(len(data_batch[input_key])): + assert data_batch[input_key][i].shape[2] == 1, ( + f"Image data is claimed to be augmented while its shape is {data_batch[input_key][i].shape} for sample {i}" + ) + return + else: + new_image_tensor_list = [] + for i in range(len(data_batch[input_key])): + for img_tensor in data_batch[input_key][i]: + img_tensor = rearrange(img_tensor, "c h w -> 1 c 1 h w").contiguous() + if img_tensor.dtype == torch.uint8: + img_tensor = img_tensor.to(**self.tensor_kwargs) / 127.5 - 1.0 + new_image_tensor_list.append(img_tensor) + data_batch[input_key] = new_image_tensor_list + data_batch[IS_PREPROCESSED_KEY] = True + + # ------------------ Checkpointing
------------------ + + def state_dict(self, prefix: str = "", **kwargs) -> Dict[str, Any]: + final_state_dict = self.net.state_dict(prefix=prefix + "net.", **kwargs) + if self.config.ema.enabled: + ema_state_dict = self.net_ema.state_dict(prefix=prefix + "net_ema.", **kwargs) + final_state_dict.update(ema_state_dict) + return final_state_dict + + def load_state_dict(self, state_dict: Mapping[str, Any], strict: bool = True, assign: bool = False): + """ + Loads a state dictionary into the model and optionally its EMA counterpart. + + Parameters: + state_dict (Mapping[str, Any]): A dictionary containing separate state + dictionaries for the model and potentially for an EMA version of the model + under the keys 'net' and 'net_ema', respectively. + strict (bool, optional): If True, the method will enforce that the keys in + the state dict match exactly those in the model and EMA model (if applicable). + Defaults to True. + assign (bool, optional): If True and in strict mode, will assign the state dictionary + directly rather than matching keys one-by-one. This is typically used when loading + parts of state dicts or using customized loading procedures. Defaults to False. 
+ """ + if not strict: + raise ValueError("Strict mode is required for OmniMoTModel load_state_dict") + if assign: + raise ValueError("Assign mode is not supported for OmniMoTModel load_state_dict") + + _reg_state_dict = collections.OrderedDict() + _ema_state_dict = collections.OrderedDict() + for k, v in state_dict.items(): + # Strip only the leading prefix: str.replace would also rewrite any later + # "net." substring inside the key (e.g. in a submodule named "subnet"). + if k.startswith("net."): + _reg_state_dict[k.removeprefix("net.")] = v + elif k.startswith("net_ema."): + _ema_state_dict[k.removeprefix("net_ema.")] = v + + state_dict = _reg_state_dict + + reg_results: _IncompatibleKeys = self.net.load_state_dict(_reg_state_dict, strict=True, assign=False) + missing_keys = reg_results.missing_keys + unexpected_keys = reg_results.unexpected_keys + + if self.config.ema.enabled: + ema_results: _IncompatibleKeys = self.net_ema.load_state_dict(_ema_state_dict, strict=True, assign=False) + missing_keys += ema_results.missing_keys + unexpected_keys += ema_results.unexpected_keys + else: + assert len(_ema_state_dict) == 0, f"EMA is disabled but EMA state dict is not empty: {len(_ema_state_dict)}" + + return _IncompatibleKeys(missing_keys=missing_keys, unexpected_keys=unexpected_keys) + + # ------------------ public methods ------------------ + + def ema_beta(self, iteration: int) -> float: + """ + Calculate the beta value for EMA update. + weights = weights * beta + (1 - beta) * new_weights + + Args: + iteration (int): Current iteration number. + + Returns: + float: The calculated beta value. + """ + iteration = iteration + self.config.ema.iteration_shift + if iteration < 1: + return 0.0 + return (1 - 1 / (iteration + 1)) ** (self.ema_exp_coefficient + 1) + + def model_param_stats(self) -> Dict[str, int]: + return {"total_learnable_param_num": self._param_count} + + def is_image_batch(self, data_batch: dict[str, torch.Tensor]) -> bool: + """Check if the data_batch contains images (vs. videos).
+ + We handle two types of data_batch: one from a joint_dataloader where "dataset_name" can + differentiate image_batch and video_batch, another from a single dataloader which we + assume as video_data by default. + """ + is_image = self.input_image_key in data_batch + is_video = self.input_video_key in data_batch + assert is_image != is_video, ( + "Only one of the input_image_key or input_video_key should be present in the data_batch." + ) + return is_image + + def denoise( + self, + net: torch.nn.Module | None = None, + data_batch_packed: PackedSequence | None = None, + fps_vision: torch.Tensor | None = None, + fps_action: torch.Tensor | None = None, + fps_sound: torch.Tensor | None = None, + memory: MemoryState | None = None, + ) -> dict: + """ + Runs the MoT network on a packed multi-modal sequence to predict velocity (v) targets. + + Args: + data_batch_packed: PackedSequence from `pack_input_sequence(...)`. + fps_vision: Optional FPS tensor used for RoPE FPS modulation (if enabled in config). + fps_action: Optional FPS tensor used for action RoPE FPS modulation (if enabled in config). + fps_sound: Optional FPS tensor for sound RoPE modulation (e.g., sound_latent_fps=25). + memory: Optional pre-built MemoryState for autoregressive generation + or KV-cache training. + + Returns: + dict containing: + - "preds_vision": list[Tensor[C,T,H,W]], one per sample. + - "preds_action": Velocity prediction for action modality (if action_gen enabled). + - "preds_sound": Velocity prediction for sound modality (if sound_gen enabled). + - "lbl_metadata_und": Load balancing metadata for understanding pathway (if present). + - "lbl_metadata_gen": Load balancing metadata for generation pathway (if present). 
+ """ + net = net or self.net + out_net = net( + packed_seq=data_batch_packed, + fps_vision=fps_vision, + fps_action=fps_action, + fps_sound=fps_sound, + memory=memory, + ) + output_dict = dict() + output_dict["preds_vision"] = out_net["preds_vision"] + if self.config.action_gen and "preds_action" in out_net: + output_dict["preds_action"] = out_net["preds_action"] + if self.config.sound_gen and "preds_sound" in out_net: + output_dict["preds_sound"] = out_net["preds_sound"] + for key, value in out_net.items(): + if "lbl_metadata_" in key: + output_dict[key] = value + + return output_dict + + @torch.no_grad() + def encode(self, state: torch.Tensor) -> torch.Tensor: + return self.tokenizer_vision_gen.encode(state) + + @torch.no_grad() + def decode(self, latent: torch.Tensor) -> torch.Tensor: + return self.tokenizer_vision_gen.decode(latent) + + @torch.no_grad() + def encode_sound(self, waveform: torch.Tensor) -> torch.Tensor: + """Encode audio waveform into latent tokens. + + Args: + waveform: Audio tensor of shape (C, N). A batch dim is added/removed + internally since AVAE expects (B, C, N). + Mono audio is duplicated to stereo if the tokenizer expects 2 channels. + """ + assert self.tokenizer_sound_gen is not None, "Sound tokenizer not initialized" + # Ensure correct number of channels (AVAE typically expects stereo) + expected_channels = self.tokenizer_sound_gen.audio_channels + if waveform.shape[0] == 1 and expected_channels == 2: + waveform = waveform.repeat(2, 1) # mono → stereo + elif waveform.shape[0] > expected_channels: + waveform = waveform[:expected_channels] + # AVAE expects (B, C, N) + latent = self.tokenizer_sound_gen.encode(waveform.unsqueeze(0)) # [1,sound_channels,T_sound] + return latent.squeeze(0) # [sound_channels,T_sound] + + @torch.no_grad() + def decode_sound(self, latent: torch.Tensor) -> torch.Tensor: + """Decode sound latent tokens back to waveform. + + Args: + latent: Sound latent tensor of shape (C, T). 
A batch dim is added/removed + internally since AVAE expects (B, C, T). + """ + assert self.tokenizer_sound_gen is not None, "Sound tokenizer not initialized" + # AVAE expects (B, C, T) + waveform = self.tokenizer_sound_gen.decode(latent.unsqueeze(0)) # [1,audio_channels,N_samples] + return waveform.squeeze(0) # [audio_channels,N_samples] + + def _get_sound_fps_for_rope(self) -> float: + """Compute the sound FPS to pass to RoPE for temporal alignment with video. + + Returns the sound tokenizer's latent rate (e.g., 25 Hz for 48kHz/1920 hop). + This is passed as input_fps to the sound RoPE's generate_embeddings(), where + the FPS modulation formula aligns sound indices with video indices. + """ + return float(self.config.sound_latent_fps) + + def get_video_height_width(self) -> Tuple[int, int]: + return VIDEO_RES_SIZE_INFO[self.config.resolution]["9,16"] + + def get_video_latent_height_width(self) -> Tuple[int, int]: + height, width = VIDEO_RES_SIZE_INFO[self.config.resolution]["9,16"] + return ( + height // self.tokenizer_vision_gen.spatial_compression_factor, + width // self.tokenizer_vision_gen.spatial_compression_factor, + ) + + def get_num_video_latent_frames(self) -> int: + return self.config.state_t + + @contextmanager + def ema_scope(self, context=None, is_cpu=False): + if self.config.ema.enabled: + # https://github.com/pytorch/pytorch/issues/144289 + for module in self.net.modules(): + if isinstance(module, FSDPModule): + module.reshard() + self.net_ema_worker.cache(self.net.parameters(), is_cpu=is_cpu) + self.net_ema_worker.copy_to(src_model=self.net_ema, tgt_model=self.net) + if context is not None: + log.info(f"{context}: Switched to EMA weights") + try: + yield None + finally: + if self.config.ema.enabled: + for module in self.net.modules(): + if isinstance(module, FSDPModule): + module.reshard() + self.net_ema_worker.restore(self.net.parameters()) + if context is not None: + log.info(f"{context}: Restored training weights") + + def add_lora( + self, + 
network: torch.nn.Module, + lora_rank: int, + lora_alpha: int, + lora_target_modules: str, + ) -> torch.nn.Module: + """Pre-FSDP LoRA injection — see :func:`inject_lora_pre_fsdp` for details.""" + from cosmos3._src.vfm.utils.lora import inject_lora_pre_fsdp + + self.lora_alpha = lora_alpha + return inject_lora_pre_fsdp( + network, + lora_rank=lora_rank, + lora_alpha=lora_alpha, + lora_target_modules=lora_target_modules, + ) + + def _init_lora_weights_post_materialization(self, network: torch.nn.Module) -> None: + """Post-materialization LoRA init — see :func:`init_lora_weights_post_materialization`.""" + from cosmos3._src.vfm.utils.lora import init_lora_weights_post_materialization + + init_lora_weights_post_materialization(network) + + +def _broadcast_seed(seed: list[int], group: dist.ProcessGroup, rank: int) -> list[int]: + global_src_rank = torch.distributed.get_global_rank(group, 0) + + if rank == 0: + seed_tensor = torch.tensor(seed, dtype=torch.int64, device=DEVICE) # [len(seed)] + else: + seed_tensor = torch.zeros(len(seed), dtype=torch.int64, device=DEVICE) # [len(seed)] + + torch.distributed.broadcast(seed_tensor, src=global_src_rank, group=group) + return seed_tensor.tolist() diff --git a/cosmos-inference/cosmos3/_src/vfm/models/parallelize_vlm.py b/cosmos-inference/cosmos3/_src/vfm/models/parallelize_vlm.py new file mode 100644 index 00000000..cde4518a --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/parallelize_vlm.py @@ -0,0 +1,105 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""FSDP2 wrapping for Cosmos3 VLM ``HFModel`` instances. + +Hosts the single VLM-specific ``parallelize`` entry point used by +``vlm_model.VLMModel._init_vlm``. Lives under ``cosmos3/_src/vfm/models/`` +so the FSDP wrapping concern sits next to the model class it operates on +(mirroring the layout of ``models/mot/parallelize_unified_mot.py`` for the +MoT path). + +Pure parallelism plumbing (:class:`~cosmos3._src.vfm.utils.parallelism.ParallelDims` +and its meshes) stays in ``vfm/utils/parallelism.py``. +""" + +import torch +from torch.distributed.fsdp import CPUOffloadPolicy, MixedPrecisionPolicy, fully_shard + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.models.hf_model import HFModel +from cosmos3._src.vfm.utils.parallelism import ParallelDims + +_PRECISION_TO_TORCH_DTYPE: dict[str, torch.dtype] = { + "bfloat16": torch.bfloat16, + "float16": torch.float16, + "float32": torch.float32, +} + + +def parallelize( + model: HFModel, + parallel_dims: ParallelDims, + precision: str, + fsdp_offload: bool = False, +) -> None: + """Apply FSDP2 to an HFModel in-place. + + Uses torch.distributed.fsdp.fully_shard (FSDP2). Each transformer block is + sharded individually for fine-grained memory savings; the outer model is then + wrapped to cover remaining parameters (embeddings, layer norms, lm_head).
+ + Supported architectures: + - Language models: ``inner.model.layers`` (standard HF LLM structure) + - Vision-language models: additionally ``inner.visual.blocks`` (Qwen3-VL) + + No-op when FSDP is not needed (single-GPU or replicate-only). + + Args: + model: HFModel instance (``_model`` attribute must be on meta or CPU device). + parallel_dims: ParallelDims with meshes already built via + :meth:`ParallelDims.build_meshes`. + precision: Activation / parameter dtype for FSDP MixedPrecisionPolicy + (one of ``"bfloat16"``, ``"float16"``, ``"float32"``). + Sourced from ``policy.parallelism.precision``. + fsdp_offload: If True, wrap with CPUOffloadPolicy. Sourced from + ``train.fsdp_offload``. + """ + if not parallel_dims.dp_shard_enabled: + # No shard axis: dp_shard <= 1. FSDP2 (fully_shard) has nothing to do. + # For replicate-only (dp_replicate > 1, dp_shard == 1), use DDP outside + # this function. + log.info("parallelize: dp_shard <= 1 — skipping FSDP2 wrapping") + return + + mp_policy = MixedPrecisionPolicy( + param_dtype=_PRECISION_TO_TORCH_DTYPE[precision], + reduce_dtype=torch.float32, + ) + + # 2-D (dp_replicate × dp_shard) mesh for HSDP, or 1-D dp_shard sub-mesh + # for pure FSDP. In the overlay design cp does NOT fold into the FSDP + # shard axis; cp/cfgp are handled by separate meshes. + if parallel_dims.dp_replicate_enabled: + fsdp_mesh = parallel_dims.dp_mesh + else: + fsdp_mesh = parallel_dims.dp_shard_mesh + fsdp_kwargs = {"mesh": fsdp_mesh, "mp_policy": mp_policy} + + inner = model._model + + no_split_names = set(getattr(inner, "_no_split_modules", [])) + wrapped = 0 + for module in reversed(list(inner.modules())): + if type(module).__name__ in no_split_names: + fully_shard(module, **fsdp_kwargs) + wrapped += 1 + log.info(f"Wrapped {wrapped} sub-modules.") + + # Wrap the full inner model to cover remaining parameters + # (embed_tokens, final layer norm, lm_head, visual projector stem, etc.) 
+ cpu_offload_policy = CPUOffloadPolicy() if fsdp_offload else None + + fully_shard(inner, offload_policy=cpu_offload_policy, **fsdp_kwargs) + log.info("parallelize: FSDP2 applied to HFModel._model") diff --git a/cosmos-inference/cosmos3/_src/vfm/models/utils/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/utils/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/utils/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/models/utils/data_and_condition.py b/cosmos-inference/cosmos3/_src/vfm/models/utils/data_and_condition.py new file mode 100644 index 00000000..58456c01 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/utils/data_and_condition.py @@ -0,0 +1,181 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +""" +Unified data and condition interface where we save the tokenized states and/or +noised latent states for diffusion/flow-matching training. +Used for the VFM generation model. +""" + +from dataclasses import dataclass + +import torch + + +@dataclass(slots=True) +class GenerationDataClean: + """ + Container for tokenized states and conditioning info (clean states) + for the multi-modal (vision, sound, action) MoT training. + Used for the VFM generation model. + """ + + batch_size: int + # Vision (list of per-sample tensors) + is_image_batch: bool + raw_state_vision: list[torch.Tensor] | None = None # raw state in pixel space + x0_tokens_vision: list[torch.Tensor] | None = None # tokenized latent state + fps_vision: torch.Tensor | None = None + + # Image editing: number of vision items per sample. + # When set, x0_tokens_vision is a flat list of individually-encoded image latents + # (e.g. [src1, tgt1, src2, tgt2, ...]) and this field records how many items belong + # to each sample (e.g. [2, 2, ...]). None for standard T2I/T2V (one item per sample). 
+ num_vision_items_per_sample: list[int] | None = None + + # Audio (Sound) + raw_state_sound: torch.Tensor | None = None + x0_tokens_sound: torch.Tensor | None = None + fps_sound: torch.Tensor | None = None + + # Action (dense list of per-sample tensors, only action-having samples) + raw_state_action: list[torch.Tensor] | None = None + x0_tokens_action: list[torch.Tensor] | None = None + fps_action: torch.Tensor | None = None + action_domain_id: list[torch.Tensor] | None = None # per-sample domain IDs, None when no action samples + raw_action_dim: list[torch.Tensor] | None = None # raw action dimension, used adding masks to loss calculation + + +@dataclass(slots=True) +class GenerationDataNoised: + """Container for states after noise addition, along with other + helper attributes for the flow-matching (gt velocity and noise) + for the multi-modal (vision, sound, action) MoT training. + Used for the VFM generation model. + """ + + batch_size: int + # Vision + epsilon_vision: torch.Tensor # unit gaussian noise tensor + xt_tokens_vision: torch.Tensor # tokens added with noise level t per flow-matching formulation + vt_target_vision: torch.Tensor # gt rectified flow field + sigmas_vision: torch.Tensor | None = None # SNR to add to the vision tokens + + # Audio (Sound) + epsilon_sound: torch.Tensor | None = None + xt_tokens_sound: torch.Tensor | None = None + vt_target_sound: torch.Tensor | None = None + sigmas_sound: torch.Tensor | None = None + + # Action + epsilon_action: torch.Tensor | None = None + xt_tokens_action: torch.Tensor | None = None + vt_target_action: torch.Tensor | None = None + sigmas_action: torch.Tensor | None = None + raw_action_dim: list[torch.Tensor] | None = None # raw action dimension, used adding masks to loss calculation + + +def unwrap_and_densify(raw: list | torch.Tensor | None, to_kwargs: dict) -> list[torch.Tensor] | None: + """Unwrap nested single-element lists and filter ``None`` entries. 
+ + The joint dataloader can produce data as nested single-element lists + (e.g. ``[[t1], [None], [t2]]``). This helper flattens the nesting, + drops ``None`` entries, and moves the remaining tensors to the target + device/dtype. + + Args: + raw: The raw value from ``data_batch``. May be ``None``, a bare + tensor, or a (possibly nested) list of tensors / ``None`` entries. + Each tensor in the list has shape ``(...)``. + to_kwargs: Keyword arguments forwarded to ``torch.Tensor.to`` + (e.g. ``{"device": "cuda"}`` or ``{"device": "cuda", "dtype": torch.bfloat16}``). + + Returns: + A dense list of device tensors each with shape ``(...)``, or ``None`` + if the input is ``None`` or every entry is ``None``. + + Examples: + >>> unwrap_and_densify([[t1], [None], [t2]], {"device": "cuda"}) + [t1.cuda(), t2.cuda()] + >>> unwrap_and_densify(None, {"device": "cuda"}) + None + """ + if raw is None: + return None + if not isinstance(raw, list): + return [raw.to(**to_kwargs)] # list of 1 tensor: [(...)] + # Unwrap single-element inner lists: [[t], [None]] -> [t, None] + if len(raw) > 0 and isinstance(raw[0], list): + raw = [item[0] if isinstance(item, list) else item for item in raw] + # Filter None entries and move to device + dense = [x.to(**to_kwargs) for x in raw if x is not None] # list of B tensors: [(...), ...] + return dense if dense else None + + +def _expand_per_sample_to_per_vision_item( + tensor: torch.Tensor, # [B,...] + num_vision_items_per_sample: list[int] | None, +) -> torch.Tensor: # [N_vision_items,...] + """Expand a per-sample tensor to a per-vision-item tensor. + + For image editing, each sample may contribute multiple vision items + (e.g. source + target). This helper repeats each sample's value for + all of its vision items so that downstream indexing by vision-item + position works correctly. + + Args: + tensor: Per-sample tensor of shape ``(B, ...)``. + num_vision_items_per_sample: Number of vision items per sample, + e.g. ``[2, 2, ...]``.
If ``None``, the tensor is returned as-is + (standard single-item-per-sample case). + + Returns: + Tensor of shape ``(sum(num_vision_items_per_sample), ...)``, or the + original tensor when ``num_vision_items_per_sample`` is ``None``. + """ + if num_vision_items_per_sample is None: + return tensor # [B,...] + expanded = [] + for sample_idx, num_items in enumerate(num_vision_items_per_sample): + for _ in range( + num_items + ): # torch.stack(tensor[idx].repeat(num_vision_items_per_sample[idx]) for idx in range(len(num_vision_items_per_sample))) + expanded.append(tensor[sample_idx]) # [...] + return torch.stack(expanded) # [N_vision_items,...] + + +def build_dense_sound_schedule( + sequence_plans: list, + x0_tokens_sound: list[torch.Tensor] | None, + timesteps: torch.Tensor, # [B,...] + sigmas: torch.Tensor, # [B,...] +) -> tuple[torch.Tensor | None, torch.Tensor | None]: + """Reindex per-sample schedules to match the dense sound tensor list. + + Sound tensors are dense over samples with ``has_sound=True``, while input + timesteps/sigmas are indexed by original batch position. This helper maps + dense sound entry ``i`` back to its source sample's schedule row. + """ + sound_sample_indices = [i for i, plan in enumerate(sequence_plans) if getattr(plan, "has_sound", False)] + num_sound_tensors = 0 if x0_tokens_sound is None else len(x0_tokens_sound) + assert len(sound_sample_indices) == num_sound_tensors, ( + "Sound tensor count must match sequence plans with has_sound=True. " + f"Got {num_sound_tensors} sound tensor(s) for {len(sound_sample_indices)} sound plan(s)." + ) + + if not sound_sample_indices: + return None, None + + idx_sound = torch.tensor(sound_sample_indices, dtype=torch.long, device=timesteps.device) # [n_sound] + return timesteps[idx_sound], sigmas[idx_sound] # [n_sound,...], [n_sound,...] 
diff --git a/cosmos-inference/cosmos3/_src/vfm/models/utils/dcp_loader.py b/cosmos-inference/cosmos3/_src/vfm/models/utils/dcp_loader.py new file mode 100644 index 00000000..0d111a90 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/utils/dcp_loader.py @@ -0,0 +1,132 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Checkpoint utilities for VFM models. +Contains utilities for loading model checkpoints with key remapping. +""" + +import time + +import torch +import torch.distributed.checkpoint as dcp +from torch.distributed.checkpoint.metadata import STATE_DICT_TYPE, Metadata + +from cosmos3._src.imaginaire.checkpointer.s3_filesystem import S3StorageReader +from cosmos3._src.imaginaire.utils import log + + +class RenameLoadPlanner(dcp.DefaultLoadPlanner): + """ + RenameLoadPlanner that does two renaming operations during pretrained model load. + 1. Renames model parameters from "model." to "model.language_model." if the core model is a VLM. + 2. Renames model parameters from "_orig_mod." to "." since the pretrained model does not have torch compiled modules. + """ + + def set_up_planner( + self, + state_dict: STATE_DICT_TYPE, + metadata: Metadata | None = None, + is_coordinator: bool = False, + ) -> None: + state_dict = remove_orig_mode_from_state_dict(state_dict) + + # Renames model parameters from "model." 
to "model.language_model." if the core model is a VLM. + # This is necessary because the checkpoint is saved with "model.language_model." for a VLM. + # If the core model is an LLM, this renaming is not necessary. + has_language_model = any("model.language_model." in key for key in metadata.state_dict_metadata) + if has_language_model: + state_dict = insert_language_model_in_state_dict(state_dict) + + # Perform more error checking to ensure the checkpoint is valid. If the keys are missing, + # then the missing keys should be from the generation pathway. All keys from the + # understanding pathway must be present in the checkpoint. Additionally, for 2B and 4B + # dense Qwen VLMs, the `lm_head.weight` key is not present in the checkpoint. For these + # models, the input embedding and generation layer share the same params due to + # `tie_word_embeddings` being set to True in the configs. For the 0.6B LLM, 8B and 32B dense + # VLMs, and the 30B and 235B MoE VLMs, the `lm_head.weight` key is present in the + # checkpoint. + for key in state_dict: + if key not in metadata.state_dict_metadata and "_moe_gen" not in key and "lm_head.weight" not in key: + raise ValueError(f"Missing key: {key} in checkpoint.") + + # If the keys are unexpected, then the unexpected keys should be from the visual part of the + # VLM. All keys in the language part should be used by the Cosmos3 model. + for key in metadata.state_dict_metadata: + if key not in state_dict and "model.visual." not in key: + raise ValueError(f"Unexpected key: {key} in checkpoint.") + + super().set_up_planner( + state_dict=state_dict, + metadata=metadata, + is_coordinator=is_coordinator, + ) + + +def remove_orig_mode_from_state_dict(state_dict: STATE_DICT_TYPE) -> STATE_DICT_TYPE: + """Removes "_orig_mod." from model parameter names
since the pretrained model does not have torch compiled modules.""" + return {key.replace("_orig_mod.", ""): value for key, value in state_dict.items()} + + +def insert_language_model_in_state_dict(state_dict: STATE_DICT_TYPE) -> STATE_DICT_TYPE: + """Renames model parameters from "model." to "model.language_model." if the core model is a VLM.""" + new_state_dict = {} + for key, value in state_dict.items(): + if key.startswith("model."): + new_state_dict[key.replace("model.", "model.language_model.", 1)] = value + else: + new_state_dict[key] = value + return new_state_dict + + +def load_language_model( + model: torch.nn.Module, checkpoint_path: str, credential_path: str, enable_gcs_patch_in_boto3: bool = False +) -> None: + """ + Universal language model loading function using DCP format. + Handles key remapping for "model.language_model." -> "model." by default. + + Args: + model: The language model to load weights into + checkpoint_path: Path to checkpoint (must end with dcp) + credential_path: Path to S3 credentials + enable_gcs_patch_in_boto3: Whether to enable GCS patch in boto3 for DCP loading from GCS + """ + start_time = time.time() + if not checkpoint_path.strip("/").endswith("dcp"): + raise ValueError(f"Checkpoint path {checkpoint_path} must end with dcp") + + log.info(f"Loading language model weights in DCP format from: {checkpoint_path}") + state_dict = model.state_dict() + + # Choose storage reader + if checkpoint_path.startswith("s3://"): + storage_reader = S3StorageReader( + credential_path=credential_path, + path=checkpoint_path, + enable_gcs_patch_in_boto3=enable_gcs_patch_in_boto3, + ) + else: + storage_reader = dcp.FileSystemReader(checkpoint_path) + + # Load checkpoint + dcp.load( + state_dict=state_dict, + storage_reader=storage_reader, + planner=RenameLoadPlanner(allow_partial_load=False), + ) + + log.info(f"Successfully loaded language model from {checkpoint_path}") + log.info(f"Time taken to load language model: {time.time() -
start_time} seconds") diff --git a/cosmos-inference/cosmos3/_src/vfm/models/utils/memory.py b/cosmos-inference/cosmos3/_src/vfm/models/utils/memory.py new file mode 100644 index 00000000..9a6189bf --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/utils/memory.py @@ -0,0 +1,115 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Abstract interfaces for persistent memory in the MoT transformer stack. + +``MemoryState`` is a *mutable* Python object that lives **outside** the +``torch.compile`` boundary. It is responsible for reading cached tensors +(``read_for_layer``) and writing new tensors back (``write_for_layer``). + +``MemoryValue`` is a *read-only* tensor container that is safe to pass +**into** a compiled decoder layer. Concrete implementations are plain +dataclasses whose fields are tensors (or None). No methods on +``MemoryValue`` should mutate state. + +``KVToStore`` is a type alias for the 4-tuple of tensors +``(gen_k, gen_v, und_k, und_v)`` returned by each compiled layer so +the caller can write them back into the cache outside the compile boundary. 
+""" + +from __future__ import annotations + +from abc import ABC, abstractmethod +from dataclasses import dataclass + +import torch + +# (gen_k, gen_v, und_k, und_v) returned by each compiled layer for the caller +# to write back into the cache outside the torch.compile boundary. +KVToStore = tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor] + + +@dataclass +class MemoryValue(ABC): + """Read-only tensor container safe to pass into ``torch.compile``. + + Concrete subclasses (e.g. ``ARMemoryValue``, ``KVTrainMemoryValue``) + are plain dataclasses of tensors. No methods on this class should + mutate state or perform non-trivial computation. + """ + + @property + def supports_context_parallel_attention(self) -> bool: + """Whether this memory value is compatible with context-parallel attention. + + Overridden by ``KVTrainMemoryValue`` to return ``False``. Used by + ``ContextParallelDispatch`` to reject an unsupported combination + without importing the concrete subclass. + """ + return True + + +class MemoryState(ABC): + """Mutable persistent memory that lives outside ``torch.compile``. + + The outer loop in ``_impl_forward`` calls ``read_for_layer`` before + each decoder layer and ``write_for_layer`` after. The ``MemoryState`` + object itself is **never** passed into a compiled region. + """ + + @abstractmethod + def init(self, hidden_states: dict, device: torch.device) -> None: + """Initialization per training step. + + Called once before any transformer layers are processed. + + Args: + hidden_states: The packed sequence (``FactoredSequencePack``). + device: Target device for any new tensors. + """ + + @abstractmethod + def read_for_layer(self, layer_idx: int) -> MemoryValue: + """Produce a read-only tensor snapshot for *layer_idx*. + + Used to retrieve KV values from the cache. + The returned ``MemoryValue`` is passed into the compiled decoder + layer as ``memory_value``. 
+ """ + + @abstractmethod + def write_for_layer(self, layer_idx: int, kv_to_store: KVToStore) -> None: + """Store the K/V tensors produced by *layer_idx* back into the cache. + + Called outside the ``torch.compile`` boundary. + """ + + @abstractmethod + def is_gen_only(self) -> bool: + """Return ``True`` when only the generation pathway should run. + + When ``True``, the decoder layer assumes that the text caption has + already been processed and cached in the MemoryState object. + Used for autoregressive frame-by-frame generation of video. + """ + + @property + def uses_rolling_kv_cache(self) -> bool: + """Whether this memory uses the rolling KV-cache / compile-safe path. + + When ``True``, the network skips NATTEN metadata computation because + temporal causality is handled inside three-way attention instead. + """ + return False diff --git a/cosmos-inference/cosmos3/_src/vfm/models/utils/safetensors_loader.py b/cosmos-inference/cosmos3/_src/vfm/models/utils/safetensors_loader.py new file mode 100644 index 00000000..ef8c488b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/utils/safetensors_loader.py @@ -0,0 +1,1177 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Distributed safetensors loading and HF→Cosmos3 weight conversion. + +Layered API +----------- + +Three layers of functionality, lowest first: + +1. 
**Multi-rank checkpoint I/O** — :class:`MultiRankCheckpointLoader` distributes + safetensors file reads across the FSDP ``dp_shard`` ranks and then + broadcasts each tensor to every rank. It is checkpoint-format-agnostic: + it just yields ``(name, tensor)`` pairs from the raw HF state dict. + +2. **Name / weight conversion** — Per-family converters translate raw HF + parameter names (and optionally shard the tensor along FSDP / EP axes) + into the Cosmos3 VFM layout: + + - :func:`convert_weight_from_qwen3_hf` — Qwen3 VL / LLM (dense + MoE). + - :func:`convert_weight_from_nemotron_vl_hf` — Nemotron-3 Dense VL (hybrid 56-block layout). + - :func:`convert_weight_from_nemotron_llm_hf` — Nemotron-3 pure LLM. + + For the generic VLM path, :func:`_make_name_converter` consumes the model's + ``_checkpoint_conversion_mapping`` (transformers v4) or falls back to + suffix-lookup against the model's own state dict (transformers v5). + +3. **High-level loaders** — Composing the above: + + - :func:`load_language_model` — loads HF text-tower weights into the MoT + language model. Auto-detects the checkpoint format + (:func:`detect_vlm_checkpoint_format`). + - :func:`load_vlm_model` — generic loader for HF VLM checkpoints into an + FSDP-wrapped ``HFModel``; honors a skip-pattern overlay and the + ``fsdp_offload`` mode. + +Borrowed from cosmos_rl's ``MultiRankWeightLoader`` (renamed to +``MultiRankCheckpointLoader`` here) with modifications for loading from +S3 / GCS and support for Cosmos3 VFM models. 
+https://github.com/nvidia-cosmos/cosmos-rl/blob/main/cosmos_rl/utils/multi_rank_weight_loader.py +""" + +import os +import re +import time +from collections.abc import Callable, Iterator + +import torch +import torch.distributed as dist +from safetensors.torch import load as load_safetensors +from torch.distributed.device_mesh import DeviceMesh +from torch.distributed.tensor import DTensor + +from cosmos3._src.imaginaire.flags import INTERNAL +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.imaginaire.utils.easy_io import easy_io +from cosmos3._src.vfm.utils.parallelism import ParallelDims + +# Prefixes stripped when matching checkpoint keys to model state-dict keys. +# Order matters: longest first. For each model key, the longest matching +# prefix is stripped (yielding the most specific tail) before we record it +# in the lookup table. The trailing empty string acts as a default that +# leaves keys without any known prefix unchanged. +# Ref: cosmos-rl cosmos_rl/policy/model/hf_models/__init__.py:465-472. +_VLM_KEY_PREFIXES: tuple[str, ...] = ( + "model.language_model.model.", + "model.language_model.", + "language_model.model.", + "language_model.", + "model.", + "", +) + +_HF_URI_PREFIX = "hf://" + + +def _looks_like_hf_repo_id(checkpoint_path: str) -> bool: + """Return True for unambiguous bare Hugging Face repo IDs. + + Explicit ``hf://`` paths are handled separately. For bare paths, require the + common ``namespace/repo`` shape so local relative paths such as ``ckpt`` are + not silently treated as Hub repos. 
+ """ + if os.path.exists(os.path.expanduser(checkpoint_path)): + return False + if checkpoint_path.startswith(("/", "./", "../", "~")): + return False + if "://" in checkpoint_path: + return False + return re.fullmatch(r"[\w.-]+/[\w.-]+", checkpoint_path) is not None + + +def _download_hf_checkpoint(checkpoint_path: str) -> str: + """Download safetensors from Hugging Face Hub and return the local snapshot path.""" + from huggingface_hub import snapshot_download + + repo_id = checkpoint_path.removeprefix(_HF_URI_PREFIX) + hf_home = os.environ.get("HF_HOME") + cache_dir = os.path.join(hf_home, "hub") if hf_home else None + token = os.environ.get("HF_TOKEN") + log.info(f"Resolving Hugging Face checkpoint: {repo_id}", rank0_only=False) + local_path = snapshot_download( + repo_id=repo_id, + token=token, + cache_dir=cache_dir, + allow_patterns=["*.safetensors", "*.safetensors.index.json"], + ) + log.info(f"Resolved Hugging Face checkpoint {repo_id} to {local_path}", rank0_only=False) + return local_path + + +def _is_hf_checkpoint_candidate(checkpoint_path: str) -> bool: + return checkpoint_path.startswith(_HF_URI_PREFIX) or _looks_like_hf_repo_id(checkpoint_path) + + +def _make_backend_args(checkpoint_path: str, credential_path: str | None) -> dict[str, str | None] | None: + if checkpoint_path.startswith("s3://"): + return { + "backend": "s3", + "s3_credential_path": credential_path, + } + return None + + +def _list_safetensors_files( + checkpoint_path: str, + backend_args: dict[str, str | None] | None, +) -> list[str]: + return list( + easy_io.list_dir_or_file( + checkpoint_path, + list_dir=False, + list_file=True, + suffix="safetensors", + recursive=False, + backend_args=backend_args, + ) + ) + + +def _get_local_rank_and_size(device_mesh: DeviceMesh) -> tuple[int, int]: + """Get the local rank and size of a device mesh. + + Args: + device_mesh: The device mesh to get the attributes from. + + Returns: + A tuple of (local rank, size). 
+ """ + return device_mesh.get_local_rank(), device_mesh.size() + + +def _shard_tensor_on_fsdp_mesh( + tensor: torch.Tensor, + parallel_dims: ParallelDims | None, +) -> torch.Tensor: + """Slice ``tensor`` along dim 0 according to the FSDP ``dp_shard`` mesh. + + Returns the rank-local shard when ``dp_shard`` is enabled, otherwise the + full tensor (made contiguous). Requires that ``tensor.shape[0]`` is + divisible by ``dp_shard_size`` — this is a hard requirement of the even- + split semantics; uneven splits should go through :func:`_shard_first_dim`. + + Args: + tensor: The tensor to shard. + parallel_dims: Parallel dims object, or None for single-rank. + + Returns: + Contiguous rank-local shard (or full tensor if dp_shard is disabled). + """ + if parallel_dims is None or not parallel_dims.dp_shard_enabled: + return tensor.contiguous() + + fsdp_rank, fsdp_size = _get_local_rank_and_size(parallel_dims.dp_shard_mesh) + if tensor.shape[0] % fsdp_size != 0: + raise ValueError(f"Shard shape {tensor.shape} is not divisible by dp_shard_size {fsdp_size} on dim 0") + shard = tensor.chunk(chunks=fsdp_size, dim=0)[fsdp_rank] + return shard.contiguous() + + +def _get_dp_shard_mesh(parallel_dims: ParallelDims | None) -> DeviceMesh | None: + """Get the dp_shard mesh from the parallel dimensions. + + Args: + parallel_dims: The parallel dimensions to use for the conversion. + + Returns: + The dp_shard mesh, or None if dp_shard is not enabled. + """ + if parallel_dims is not None and parallel_dims.dp_shard_enabled: + return parallel_dims.dp_shard_mesh + else: + return None + + +def _build_model_key_by_tail(state_dict: dict) -> dict[str, str]: + """Build a ``tail → model_key`` lookup table for suffix-based key matching. + + For each model key, strip the longest matching prefix in + ``_VLM_KEY_PREFIXES`` and record ``tail -> model_key``. The longest + prefix yields the shortest, most specific tail. 
The trailing empty + prefix in ``_VLM_KEY_PREFIXES`` ensures keys with no known prefix map + to themselves as their own tail. + """ + table: dict[str, str] = {} + for model_key in state_dict: + for pfx in _VLM_KEY_PREFIXES: + if model_key.startswith(pfx): + tail = model_key[len(pfx) :] + if tail and tail not in table: + table[tail] = model_key + break + return table + + +def _is_moe_vlm(model: torch.nn.Module) -> bool: + """Detect whether an HF VLM is a Mixture-of-Experts model. + + MoE VLMs (Qwen3-VL-30B-A3B, Qwen3-VL-235B-A22B) need replicated-gate + + FSDP-fused-expert shard rules that load_vlm_model does NOT yet implement. + Callers use this to raise NotImplementedError before sharding. + + Detection sources (any one is sufficient): + - ``model.config.text_config.num_experts`` (if present and non-None) + - ``model.config.text_config.num_local_experts`` (if present and non-None) + - Same attributes on ``model.config`` directly (text-only fallback) + - Any state-dict key containing ``.mlp.experts.`` + """ + text_cfg = getattr(model.config, "text_config", None) or model.config + for attr in ("num_experts", "num_local_experts"): + value = getattr(text_cfg, attr, None) + if value is not None and value != 0: + return True + for name in model.state_dict().keys(): + if ".mlp.experts." in name: + return True + return False + + +def _make_name_converter( + state_dict: dict, + hf_conv_map: dict[str, str] | None, +) -> Callable[[str], str]: + """Return a callable that maps checkpoint keys to model keys. + + Two strategies, matching cosmos-rl's flow: + 1. If ``hf_conv_map`` is non-empty (transformers v4 pre-computed pattern + mapping), apply each pattern/replacement as a regex substitution and + return on the first match (no further fallback). + 2. Otherwise (transformers v5 or no map), use a direct-match against the + model's state dict, then a longest-prefix-stripped suffix lookup + through ``_VLM_KEY_PREFIXES``. 
Names that match nothing are returned + unchanged (the caller is responsible for filtering / raising). + """ + model_key_by_tail = _build_model_key_by_tail(state_dict) + + def convert(name: str) -> str: + if hf_conv_map: + for pattern, replacement in hf_conv_map.items(): + if re.search(pattern, name): + return re.sub(pattern, replacement, name) + return name + if name in state_dict: + return name + for pfx in _VLM_KEY_PREFIXES: + if name.startswith(pfx): + tail = name[len(pfx) :] + if tail and tail in model_key_by_tail: + return model_key_by_tail[tail] + return name + + return convert + + +class MultiRankCheckpointLoader: + """Multi-rank loader for model weights stored as safetensors files. + + Files in the checkpoint directory are statically partitioned across the + ranks of the ``dp_shard`` sub-mesh by ``file_idx % world_size``. Each + rank reads its assigned files locally and the per-tensor data is later + broadcast (via :meth:`broadcast_tensor`) so every rank ends up with the + full tensor before sharding. + + When constructed with ``dp_shard_mesh=None`` the loader degrades to a + single-rank fallback: ``world_size = 1``, every rank reads every file, + and broadcasts are no-ops. + + Renamed from cosmos-rl's ``MultiRankWeightLoader`` and extended to load + from S3 / GCS via easy_io and to support Cosmos3 VFM models. + https://github.com/nvidia-cosmos/cosmos-rl/blob/main/cosmos_rl/utils/multi_rank_weight_loader.py + """ + + # Mapping from dtype to integer for broadcasting + DTYPE_TO_INT = { + torch.float32: 0, + torch.float16: 1, + torch.bfloat16: 2, + torch.int64: 3, + torch.int32: 4, + torch.int8: 5, + torch.uint8: 6, + torch.float8_e4m3fn: 7, + torch.float8_e5m2: 8, + } + # Mapping from integer to dtype for broadcasting + INT_TO_DTYPE = {v: k for k, v in DTYPE_TO_INT.items()} + + def __init__(self, dp_shard_mesh: DeviceMesh | None): + """Initialize the multi-rank weight loader. 
+ + Args: + dp_shard_mesh: 1-D ``dp_shard`` mesh, or None if dp_shard is not + enabled. Callers should obtain this via + :func:`_get_dp_shard_mesh` so the ``parallel_dims is None`` and + ``dp_shard <= 1`` cases collapse to the single-rank fallback. + """ + if dp_shard_mesh is None: + self.group = None + self.rank = 0 + self.world_size = 1 + else: + self.group = dp_shard_mesh.get_group() + self.rank = dp_shard_mesh.get_local_rank() + self.world_size = dp_shard_mesh.size() + + def load_files_parallel( + self, + checkpoint_path: str, + credential_path: str | None, + loading_device: torch.device, + ) -> tuple[ + dict[str, torch.Tensor], + dict[str, tuple[list, int]], + set[str], + ]: + """ + Load safetensors files in parallel across ranks. + + Args: + checkpoint_path: Path to the model directory. Local paths and S3 + URIs are tried first; if no safetensors are found, explicit + ``hf://org/model`` Hub URIs and bare ``org/model`` repo IDs + fall back to Hugging Face. + credential_path: Path to the credential file for S3/GCS. + loading_device: Device to load tensors on. + + Returns: + Tuple of (rank_tensors, rank_tensor_metadata, weights_of_ckpt_names): + - rank_tensors: Dict mapping tensor names to tensors loaded by this rank. + - rank_tensor_metadata: Dict mapping tensor names to (shape, dtype_int) tuples. + - weights_of_ckpt_names: Set of all tensor names found by this rank. 
+ """ + rank_tensors = {} # {tensor_name: tensor_data} for this rank + rank_tensor_metadata = {} # {tensor_name: (shape, dtype)} for this rank + weights_of_ckpt_names = set() + + backend_args = _make_backend_args(checkpoint_path, credential_path) + + log.info(f"Loading safetensors files from: {checkpoint_path}", rank0_only=False) + log.info(f"Credential path: {credential_path}", rank0_only=False) + list_error: Exception | None = None + if checkpoint_path.startswith(_HF_URI_PREFIX): + safetensors_files: list[str] = [] + else: + try: + safetensors_files = _list_safetensors_files(checkpoint_path, backend_args) + except Exception as exc: + if not _is_hf_checkpoint_candidate(checkpoint_path): + raise + list_error = exc + safetensors_files = [] + + if not safetensors_files: + if _is_hf_checkpoint_candidate(checkpoint_path): + original_checkpoint_path = checkpoint_path + # Multi-rank: serialize the actual download through global rank 0 + # so we don't race on the shared HF cache. snapshot_download's + # per-blob locks are unreliable on NFS/lustre under concurrent + # access, and hitting HF from N ranks simultaneously also risks + # rate-limiting; without this gate the snapshot dir can end up + # with only config.json and the listing below fails. + # + # We do NOT need N downloads + N barriers. The Slurm job mounts + # one shared HF cache (HF_HOME), so once rank 0 finishes the + # download all other ranks share that cache. The second call to + # _download_hf_checkpoint() on non-zero ranks therefore hits the + # populated cache and just resolves the local snapshot path + # (snapshot_download is idempotent on cache hits — no re-download, + # no network). Two barriers total: + # 1. After rank 0's actual download — others wait so they see + # a complete cache before resolving. + # 2. After non-zero ranks' cache-hit path resolution — keeps + # ranks aligned before subsequent collective ops below. 
+ if dist.is_initialized() and dist.get_world_size() > 1: + if dist.get_rank() == 0: + checkpoint_path = _download_hf_checkpoint(checkpoint_path) + dist.barrier() + if dist.get_rank() != 0: + checkpoint_path = _download_hf_checkpoint(checkpoint_path) + dist.barrier() + else: + checkpoint_path = _download_hf_checkpoint(checkpoint_path) + backend_args = None + log.info( + "No local/S3 safetensors found; falling back to Hugging Face checkpoint " + f"{original_checkpoint_path} -> {checkpoint_path}", + rank0_only=False, + ) + safetensors_files = _list_safetensors_files(checkpoint_path, backend_args) + elif list_error is not None: + raise list_error + + if not safetensors_files: + raise FileNotFoundError(f"No .safetensors files found in checkpoint path: {checkpoint_path}") + + for file_idx, file_path in enumerate(safetensors_files): + file_rank = file_idx % self.world_size + if self.rank == file_rank: + log.info(f"Loading safetensors file: {file_path}", rank0_only=False) + full_path = easy_io.join_path(checkpoint_path, file_path, backend_args=backend_args) + # Download the file + weights_data = easy_io.get(full_path, backend_args=backend_args) + state_dict = load_safetensors(weights_data) + for name, tensor in state_dict.items(): + # Names are stored RAW here; per-checkpoint name + # conversion (see _make_name_converter / the + # convert_weight_from_*_hf functions) is applied later + # by the caller after broadcast. + weights_of_ckpt_names.add(name) + rank_tensors[name] = tensor.to(device=loading_device) + rank_tensor_metadata[name] = ( + list(tensor.shape), + self.DTYPE_TO_INT.get(tensor.dtype, 0), + ) + + return rank_tensors, rank_tensor_metadata, weights_of_ckpt_names + + def gather_tensor_names_and_build_mapping( + self, weights_of_ckpt_names: set[str], rank_tensors: dict[str, torch.Tensor] + ) -> tuple[set[str], dict[str, int]]: + """ + Gather all tensor names from all ranks and build a tensor-to-rank mapping. 
+ + Args: + weights_of_ckpt_names: Set of tensor names found by this rank. + rank_tensors: Dict of tensors loaded by this rank. + + Returns: + Tuple of (all_tensor_names, tensor_to_rank_map): + - all_tensor_names: Set of all tensor names across all ranks. + - tensor_to_rank_map: Dict mapping tensor names to the rank that loaded them. + """ + if self.world_size > 1: + # all_gather_object requires output list to be pre-initialized with world_size + all_tensor_names_lists: list[list[str] | None] = [None] * self.world_size + dist.all_gather_object(all_tensor_names_lists, list(weights_of_ckpt_names), group=self.group) + # Flatten the list and create a set + all_tensor_names = set() + for names_list in all_tensor_names_lists: + if names_list is not None: + all_tensor_names.update(names_list) + + # Build tensor-to-rank mapping: gather which rank has which tensors + # Create a dict mapping tensor_name -> rank for this rank + local_tensor_to_rank = {name: self.rank for name in rank_tensors.keys()} + all_tensor_to_rank_dicts: list[dict[str, int] | None] = [None] * self.world_size + dist.all_gather_object(all_tensor_to_rank_dicts, local_tensor_to_rank, group=self.group) + + # Merge all dicts into a global ``tensor_name -> rank`` map. + # Duplicates aren't expected (each tensor lives in exactly one + # file, which is owned by exactly one rank), but if they do + # occur the lowest rank wins. 
+ tensor_to_rank_map = {} + for rank_idx, tensor_dict in enumerate(all_tensor_to_rank_dicts): + if tensor_dict is not None: + for tensor_name in tensor_dict: + if tensor_name not in tensor_to_rank_map: + tensor_to_rank_map[tensor_name] = rank_idx + else: + all_tensor_names = weights_of_ckpt_names + tensor_to_rank_map = {name: 0 for name in rank_tensors.keys()} + + return all_tensor_names, tensor_to_rank_map + + def broadcast_tensor( + self, + name: str, + tensor_rank: int, + rank_tensors: dict[str, torch.Tensor], + rank_tensor_metadata: dict[str, tuple[list, int]], + device: torch.device, + ) -> torch.Tensor: + """ + Broadcast a tensor from the rank that has it to all ranks. + + Args: + name: Name of the tensor to broadcast. + tensor_rank: Rank that has the tensor. + rank_tensors: Dict of tensors loaded by this rank. + rank_tensor_metadata: Dict of tensor metadata (shape, dtype) for this rank. + device: Device to create tensors on. + + Returns: + The broadcasted tensor (same on all ranks). 
+ """ + # Get tensor from the rank that has it + if self.rank == tensor_rank: + ckpt_tensor = rank_tensors[name] + tensor_shape, tensor_dtype_int = rank_tensor_metadata[name] + # Move tensor from CPU to GPU if needed (tensors are loaded to CPU to avoid OOM) + ckpt_tensor = ckpt_tensor.to(device=device) + else: + ckpt_tensor = None + tensor_shape = [] + tensor_dtype_int = 0 + + # Broadcast tensor metadata (shape, dtype) from the rank that has it + if self.world_size > 1: + # Ensure all ranks participate in broadcast + if self.rank == tensor_rank: + shape_len = len(tensor_shape) + shape_len_tensor = torch.tensor([shape_len], dtype=torch.long, device=device) + shape_tensor = torch.tensor(tensor_shape, dtype=torch.long, device=device) + dtype_int_tensor = torch.tensor([tensor_dtype_int], dtype=torch.long, device=device) + else: + shape_len_tensor = torch.zeros(1, dtype=torch.long, device=device) + shape_tensor = None # Will be created after knowing shape_len + dtype_int_tensor = torch.zeros(1, dtype=torch.long, device=device) + + # Broadcast shape length first + dist.broadcast(shape_len_tensor, group=self.group, group_src=tensor_rank) + shape_len = shape_len_tensor.item() + + # Create shape_tensor with correct size for all ranks + if self.rank != tensor_rank: + shape_tensor = torch.zeros(shape_len, dtype=torch.long, device=device) + + # Broadcast shape values + dist.broadcast(shape_tensor, group=self.group, group_src=tensor_rank) + + # Broadcast dtype + dist.broadcast(dtype_int_tensor, group=self.group, group_src=tensor_rank) + + if self.rank != tensor_rank: + tensor_shape = shape_tensor.cpu().tolist() + tensor_dtype = self.INT_TO_DTYPE.get(dtype_int_tensor.item(), torch.float32) + ckpt_tensor = torch.empty(tensor_shape, dtype=tensor_dtype, device=device) + + # Broadcast the actual tensor data + dist.broadcast(ckpt_tensor, group=self.group, group_src=tensor_rank) + + # Ensure ckpt_tensor is not None + if ckpt_tensor is None: + raise ValueError( + f"Failed to get tensor 
{name} on rank {self.rank}. " + f"tensor_rank={tensor_rank}, world_size={self.world_size}, " + f"group={self.group}" + ) + + return ckpt_tensor + + def iterate_tensors( + self, + all_tensor_names: set[str], + tensor_to_rank_map: dict[str, int], + rank_tensors: dict[str, torch.Tensor], + rank_tensor_metadata: dict[str, tuple[list, int]], + device: torch.device, + ) -> Iterator[tuple[str, torch.Tensor]]: + """ + Iterate over all tensors, broadcasting them as needed. + + Args: + all_tensor_names: Set of all tensor names across all ranks. + tensor_to_rank_map: Dict mapping tensor names to the rank that loaded them. + rank_tensors: Dict of tensors loaded by this rank. + rank_tensor_metadata: Dict of tensor metadata (shape, dtype) for this rank. + device: Device to create tensors on. + + Yields: + Tuple of (tensor_name, tensor) for each tensor. + """ + for name in sorted(all_tensor_names): + tensor_rank = tensor_to_rank_map.get(name) + if tensor_rank is None: + continue + + tensor = self.broadcast_tensor(name, tensor_rank, rank_tensors, rank_tensor_metadata, device) + yield name, tensor + + +def convert_weight_from_qwen3_hf( + tensor: torch.Tensor, + name: str, + parallel_dims: ParallelDims | None, +) -> tuple[str | None, torch.Tensor | None]: + """Map Qwen3 VL / LLM HF weights to the Cosmos3 VFM layout and shard them. + + Steps: + + 1. Rewrite the ``model.language_model.`` prefix to ``model.`` (so keys + from the VL checkpoint variant collapse onto the LLM key namespace). + 2. Classify the resulting ``dest_name`` against two pattern lists: + + - **used_patterns** — embeddings, norms, attention/MLP projections, + fused ``mlp.experts.{gate_up_proj,down_proj}``, and + ``mlp.gate.weight``. Matching keys are kept. + - **discarded_patterns** — currently just ``model.visual.*`` (the + vision tower is loaded separately). Matching keys are dropped + and the function returns ``(None, None)``. + + A key that matches neither raises :class:`ValueError`. + 3. 
Shard kept tensors along dim 0 on the FSDP ``dp_shard`` mesh via + :func:`_shard_tensor_on_fsdp_mesh`. Expert parallelism is **not** + handled here — fused expert tensors flow through the same FSDP + sharding as the rest, which is correct only for ``ep == 1``; the + moe-mesh sharding path will need to be added back when EP support + lands. + + Args: + tensor: Raw HF tensor. + name: HF parameter name (with ``model.`` / ``model.language_model.`` + prefix as it appears in the safetensors checkpoint). + parallel_dims: Parallel dims; ``None`` skips sharding. + + Returns: + Tuple ``(dest_name, sharded_tensor)`` in the Cosmos3 layout, or + ``(None, None)`` if the tensor is intentionally discarded. + """ + dest_name = name.replace("model.language_model.", "model.") + + used_patterns = [ + r"^lm_head\.weight$", + r"^model\.embed_tokens\.weight$", + r"^model\.norm\.weight$", + r"^model\.layers\.(\d+)\.(input_layernorm|post_attention_layernorm)\.weight$", + r"^model\.layers\.(\d+)\.self_attn\.(q_norm|k_norm|v_norm)\.weight$", + r"^model\.layers\.(\d+)\.self_attn\.(q_proj|k_proj|v_proj|o_proj)\.weight$", + r"^model\.layers\.(\d+)\.mlp\.(gate_proj|up_proj|down_proj)\.weight$", + r"^model\.layers\.(\d+)\.mlp\.experts\.(gate_up_proj|down_proj)$", + r"^model\.layers\.(\d+)\.mlp\.gate\.weight$", + ] + + discarded_patterns = [ + r"^model\.visual\.", + ] + + def _is_used_pattern(dest_name: str) -> bool: + for used_pattern in used_patterns: + if re.search(used_pattern, dest_name) is not None: + return True + + for discarded_pattern in discarded_patterns: + if re.search(discarded_pattern, dest_name) is not None: + return False + + raise ValueError(f"Unexpected weight found in checkpoint: {dest_name}") + + if _is_used_pattern(dest_name): + return dest_name, _shard_tensor_on_fsdp_mesh(tensor, parallel_dims) + + return None, None + + +def convert_weight_from_nemotron_vl_hf( + tensor: torch.Tensor, + name: str, + parallel_dims: ParallelDims | None, +) -> tuple[str | None, torch.Tensor | 
None]: + """Map Nemotron VLM HF keys (56 hybrid blocks) to Cosmos3 VFM MoT keys (28 paired layers). + + The Nemotron 3 Dense VL checkpoint (NVIDIA-Nemotron-3-Dense-VL-2B-BF16-Alignment) + uses a hybrid layout with 56 alternating attention and MLP blocks, where: + + - Even-indexed blocks (0, 2, 4, ...) contain attention (``mixer.q/k/v/o_proj``) + - Odd-indexed blocks (1, 3, 5, ...) contain MLP (``mixer.up/down_proj``) + - Each block has a ``norm.weight`` (pre-attention or post-attention layer norm) + + The MoT model uses a standard layout with 28 paired layers, each containing both + attention and MLP sub-modules. + + Weight mapping (HF → MoT):: + + model.visual.*, model.projector.*, model.multi_modal_projector.* + → skipped (vision weights, loaded separately) + + model.lm_head.weight / lm_head.weight → lm_head.weight + model.language_model.embeddings.weight → model.embed_tokens.weight + model.language_model.norm_f.weight → model.norm.weight + + model.language_model.layers.{2i}.norm.weight + → model.layers.{i}.input_layernorm.weight + model.language_model.layers.{2i+1}.norm.weight + → model.layers.{i}.post_attention_layernorm.weight + + model.language_model.layers.{2i}.mixer.{q,k,v,o}_proj.weight + → model.layers.{i}.self_attn.{q,k,v,o}_proj.weight + + model.language_model.layers.{2i+1}.mixer.{up,down}_proj.weight + → model.layers.{i}.mlp.{up,down}_proj.weight + """ + if name.startswith("model.visual.") or name.startswith("model.projector."): + return None, None + if name.startswith("model.multi_modal_projector."): + return None, None + + dest_name: str | None = None + if name == "lm_head.weight" or name == "model.lm_head.weight": + dest_name = "lm_head.weight" + elif name == "model.language_model.embeddings.weight": + dest_name = "model.embed_tokens.weight" + elif name == "model.language_model.norm_f.weight": + dest_name = "model.norm.weight" + else: + # Layer norm: even idx → pre-attention (input_layernorm), odd idx → post-attention + m = 
re.match(r"model\.language_model\.layers\.(\d+)\.norm\.weight", name) + if m is not None: + idx = int(m.group(1)) + paired = idx // 2 + if idx % 2 == 0: + dest_name = f"model.layers.{paired}.input_layernorm.weight" + else: + dest_name = f"model.layers.{paired}.post_attention_layernorm.weight" + else: + # Attention projections: must be at even indices + m = re.match( + r"model\.language_model\.layers\.(\d+)\.mixer\.(q_proj|k_proj|v_proj|o_proj)\.weight", + name, + ) + if m is not None: + idx = int(m.group(1)) + if idx % 2 != 0: + raise ValueError(f"Expected attention block at even layer index, got {name}") + paired = idx // 2 + dest_name = f"model.layers.{paired}.self_attn.{m.group(2)}.weight" + else: + # MLP projections: must be at odd indices + m = re.match( + r"model\.language_model\.layers\.(\d+)\.mixer\.(up_proj|down_proj)\.weight", + name, + ) + if m is not None: + idx = int(m.group(1)) + if idx % 2 != 1: + raise ValueError(f"Expected MLP block at odd layer index, got {name}") + paired = idx // 2 + dest_name = f"model.layers.{paired}.mlp.{m.group(2)}.weight" + + if dest_name is None: + raise ValueError(f"Unexpected Nemotron checkpoint tensor: {name}") + + return dest_name, _shard_tensor_on_fsdp_mesh(tensor, parallel_dims) + + +def convert_weight_from_nemotron_llm_hf( + tensor: torch.Tensor, + name: str, + parallel_dims: ParallelDims | None, +) -> tuple[str | None, torch.Tensor | None]: + """Map Nemotron pure-LLM HF keys (CosmosNemotronForCausalLM) to MoT language model keys. + + The Nemotron 3 LLM checkpoint (NVIDIA-Nemotron-3-2B-BF16) uses a standard + decoder-only layout with 28 layers, each containing attention and MLP. The key + names are already close to the MoT model's expected layout, so most keys pass + through with minimal renaming. 
+ + Weight mapping (HF → MoT):: + + model.embeddings.weight → model.embed_tokens.weight + lm_head.weight → lm_head.weight + model.norm.weight → model.norm.weight + + model.layers.{i}.input_layernorm.weight → (unchanged) + model.layers.{i}.post_attention_layernorm.weight → (unchanged) + model.layers.{i}.self_attn.{q,k,v,o}_proj.weight → (unchanged) + model.layers.{i}.mlp.{up,down}_proj.weight → (unchanged) + """ + if name == "model.embeddings.weight": + dest_name = "model.embed_tokens.weight" + elif name in ("lm_head.weight", "model.lm_head.weight"): + dest_name = "lm_head.weight" + elif name == "model.norm.weight": + dest_name = "model.norm.weight" + elif re.match(r"model\.layers\.\d+\.(input_layernorm|post_attention_layernorm)\.weight", name): + dest_name = name + elif re.match(r"model\.layers\.\d+\.self_attn\.(q_proj|k_proj|v_proj|o_proj)\.weight", name): + dest_name = name + elif re.match(r"model\.layers\.\d+\.mlp\.(up_proj|down_proj)\.weight", name): + dest_name = name + else: + raise ValueError(f"Unexpected Nemotron LLM checkpoint tensor: {name}") + + return dest_name, _shard_tensor_on_fsdp_mesh(tensor, parallel_dims) + + +def _shard_first_dim(tensor: torch.Tensor, world_size: int, rank: int) -> torch.Tensor: + """Slice a tensor along dim 0 for FSDP sharding. + + Matches cosmos-rl weight_converter.py:71-79 semantics: even splits use + tensor_split; uneven splits use ceil-divide with the last rank getting + the remainder (may be smaller than average). This layout must match + FSDP2's local_view shape per rank — caller asserts shape equality. 
+ """ + tensor = tensor.contiguous() + row_size = tensor.shape[0] + if world_size == 1: + return tensor + if row_size % world_size == 0: + return tensor.tensor_split(world_size, dim=0)[rank].contiguous() + avg = (row_size + world_size - 1) // world_size + start = rank * avg + end = min(start + avg, row_size) + return tensor[start:end].contiguous() + + +def detect_vlm_checkpoint_format(all_tensor_names: set[str]) -> str: + """Detect the checkpoint family from its tensor key set. + + Detection rules (first match wins): + + - ``"nemotron_3_dense_vl"`` — any key shaped like + ``model.language_model.layers.*.mixer.q_proj.*``. This is the hybrid + 56-block layout where attention and MLP live in alternating blocks + under ``mixer.``. + - ``"nemotron_3_llm"`` — checkpoints that expose + ``model.embeddings.weight`` (Nemotron's pure LLM key for the input + embedding; Qwen3 uses ``model.embed_tokens.weight``). + - ``"qwen3"`` — default; covers Qwen3 VL and Qwen3 LLM (dense and MoE). + + The resulting tag is consumed by :func:`load_language_model` to dispatch + to the matching ``convert_weight_from_*_hf`` converter. + """ + for n in all_tensor_names: + if "model.language_model.layers." in n and ".mixer.q_proj" in n: + return "nemotron_3_dense_vl" + if "model.embeddings.weight" in all_tensor_names: + return "nemotron_3_llm" + return "qwen3" + + +def load_language_model( + model: torch.nn.Module, + checkpoint_path: str, + credential_path: str | None, + parallel_dims: ParallelDims | None, + checkpoint_format: str | None = None, +) -> set[str]: + """ + Universal language model loading function using SafeTensors (.safetensors) format. + Handles key remapping for "model.language_model." -> "model." by default. + + Args: + model: The language model to load weights into. + checkpoint_path: Path to checkpoint containing .safetensors files. 
Local + paths and S3 URIs are tried first; if no safetensors are found, + explicit ``hf://org/model`` Hub URIs and bare ``org/model`` repo IDs + fall back to Hugging Face. + credential_path: Path to S3 credentials, or None for local/HF. + parallel_dims: ParallelDims object to use for parallel loading. + If None, the loading is done in a single rank. + checkpoint_format: ``"qwen3"``, ``"nemotron_3_dense_vl"``, ``"nemotron_3_llm"``, or None to auto-detect. + + Returns: + Set of model state-dict keys successfully loaded from the checkpoint. + """ + if not INTERNAL: + from cosmos3._src.imaginaire.utils.checkpoint_db import download_checkpoint, sanitize_uri + + checkpoint_path = download_checkpoint(sanitize_uri(checkpoint_path)) + + start_time = time.time() + log.info(f"load_language_model: loading weights from {checkpoint_path}") + + lm_state_dict = {} + for name, tensor in model.state_dict().items(): + # Remove the original module (torch compiled module) and checkpoint wrapped module prefixes. 
+ final_name = name.replace("_orig_mod.", "").replace("_checkpoint_wrapped_module.", "") + lm_state_dict[final_name] = tensor + + # Initialize multi-rank weight loader + loader = MultiRankCheckpointLoader(_get_dp_shard_mesh(parallel_dims)) + + # Step 1: Load files in parallel + rank_tensors, rank_tensor_metadata, weights_of_ckpt_names = loader.load_files_parallel( + checkpoint_path=checkpoint_path, + credential_path=credential_path, + loading_device="cpu", + ) + + # Step 2: Gather tensor names and build mapping + all_tensor_names, tensor_to_rank_map = loader.gather_tensor_names_and_build_mapping( + weights_of_ckpt_names, rank_tensors + ) + + resolved_format = checkpoint_format or detect_vlm_checkpoint_format(all_tensor_names) + log.info(f"Language model checkpoint format: {resolved_format}", rank0_only=False) + + # Step 3: Process each tensor + keys_loaded = set() + for name, tensor in loader.iterate_tensors( + all_tensor_names, + tensor_to_rank_map, + rank_tensors, + rank_tensor_metadata, + device="cuda", + ): + if resolved_format == "nemotron_3_dense_vl": + dest_name, dest_weight = convert_weight_from_nemotron_vl_hf( + tensor=tensor, + name=name, + parallel_dims=parallel_dims, + ) + elif resolved_format == "nemotron_3_llm": + dest_name, dest_weight = convert_weight_from_nemotron_llm_hf( + tensor=tensor, + name=name, + parallel_dims=parallel_dims, + ) + elif resolved_format == "qwen3": + dest_name, dest_weight = convert_weight_from_qwen3_hf( + tensor=tensor, + name=name, + parallel_dims=parallel_dims, + ) + else: + raise ValueError(f"Unexpected checkpoint format: {resolved_format}") + + if dest_name is None: + # This is due to the visual weights of VLM models. + continue + + # If the weight is not found in the language model's state dict, then the weight is + # unexpected. The unexpected weights should be from the visual part of the VLM (already + # handled by the previous check). All weights in the language part should be used by + # the Cosmos3 VFM. 
+ if dest_name not in lm_state_dict: + raise ValueError( + f"Unexpected weight found in checkpoint: {name}, " + f"language model's corresponding weight {dest_name} not found." + ) + + target_tensor = lm_state_dict[dest_name] + is_dist_tensor = isinstance(target_tensor, DTensor) + local_view = target_tensor.to_local() if is_dist_tensor else target_tensor + + if dest_weight.device != local_view.device: + dest_weight = dest_weight.to(local_view.device) + + assert local_view.shape == dest_weight.shape, ( + f"Shape mismatch: {local_view.shape} != {dest_weight.shape} " + f"for {dest_name} with original shape {target_tensor.shape}" + ) + with torch.no_grad(): + local_view.data.copy_(dest_weight) + + keys_loaded.add(dest_name) + + # Perform more error checking to ensure the checkpoint is valid. If the keys are missing, + # then the missing keys should be from the generation pathway. All keys from the + # understanding pathway must be present in the checkpoint. Additionally, for 2B and 4B + # dense Qwen VLMs, the `lm_head.weight` key is not present in the checkpoint. For these + # models, the input embedding and generation layer share the same params due to + # `tie_word_embeddings` being set to True in the configs. For the 0.6B LLM, 8B and 32B dense + # VLMs, and the 30B and 235B MoE VLMs, the `lm_head.weight` key is present in the + # checkpoint. + keys_missing = set(lm_state_dict.keys()) - keys_loaded + tie = getattr(model.config, "tie_word_embeddings", False) + real_keys_missing = {k for k in keys_missing if not ("_moe_gen" in k or (tie and "lm_head.weight" in k))} + if real_keys_missing: + raise ValueError( + f"load_language_model: {len(real_keys_missing)} required model " + f"parameter(s) not found in checkpoint '{checkpoint_path}'. 
" + f"First up to 10: {sorted(real_keys_missing)[:10]}" + ) + + log.info( + f"load_language_model: successfully loaded {len(keys_loaded)} tensors " + f"from {checkpoint_path} in {time.time() - start_time:.1f}s" + ) + return keys_loaded + + +def load_vlm_model( + model: torch.nn.Module, + checkpoint_path: str, + credential_path: str | None, + parallel_dims: ParallelDims | None, + tensor_names_to_skip: list[str] | None = None, + extra_skip_patterns: list[str] | None = None, +) -> set[str]: + """Load a HF VLM checkpoint (safetensors) into an FSDP-wrapped HFModel. + + Local paths and S3 URIs are tried first; if no safetensors are found, + explicit ``hf://org/model`` Hub URIs and bare ``org/model`` repo IDs fall + back to Hugging Face. + + Both ``tensor_names_to_skip`` and ``extra_skip_patterns`` are lists of + regex patterns applied to the RESOLVED model key (post-name_converter). + Phase-5 skips any model key matched by either list; Phase-6's + completeness check tolerates missing model keys matched by either + list. The two kwargs are semantically identical — separate names let + call sites distinguish "model-type fixed skips" (from + ``_tensor_names_to_skip_for``) from "overlay-specific skips" (from + ``VLMModel._init_vlm`` for the pretrain_weights_path_llm overlay). + + Cosmos-rl-style universal loader — no per-family hand-coded key mapping. + Resolves the FSDP shard sub-group via :func:`_get_dp_shard_mesh`, which + reads ``parallel_dims.dp_shard_mesh`` (the 1-D ``dp_shard`` sub-mesh + populated by ``ParallelDims.build_meshes()``). ``cp`` and ``cfgp`` live + in their own overlay meshes and do NOT participate in checkpoint sharding. + + Preconditions: + - ``parallelize()`` has been called on the HFModel (parameters are DTensors). + - ``HFModel.tie_embeddings()`` has been called before this function so that + tied ``lm_head.weight`` / ``embed_tokens.weight`` share DTensor storage. 
+ - When ``parallel_dims`` is provided AND ``parallel_dims.dp_shard > 1``, + ``parallel_dims.build_meshes()`` MUST have been called by the caller. + Otherwise ``dp_shard_mesh`` returns None and the loader silently falls + back to single-rank loading — every rank reads every file and slices + locally, which is correct for ``dp_shard <= 1`` but a silent perf / + correctness regression for FSDP runs. Pass ``parallel_dims=None`` + explicitly for the single-process / unit-test fallback. + + Raises: + NotImplementedError: for MoE VLMs (not yet supported — see spec §2.2). + ValueError: when the checkpoint is missing a required model parameter. + + Returns: + Set of model state-dict keys successfully loaded from the checkpoint. + """ + start_time = time.time() + log.info(f"Loading VLM weights in safetensors format from: {checkpoint_path}") + + # Phase 1: canonical model state dict with compile/FSDP wrapper prefixes stripped. + vlm_state_dict = { + name.replace("_orig_mod.", "").replace("_checkpoint_wrapped_module.", ""): tensor + for name, tensor in model.state_dict().items() + } + + # Phase 2+3: suffix-lookup table + name converter. + hf_conv_map = getattr(model, "_checkpoint_conversion_mapping", None) + name_converter = _make_name_converter( + vlm_state_dict, + hf_conv_map=hf_conv_map if hf_conv_map else None, + ) + + # Phase 4: MoE precheck — fail early rather than silently mis-shard. + if _is_moe_vlm(model): + raise NotImplementedError( + "load_vlm_model does not yet support MoE VLMs " + "(e.g. Qwen3-VL-30B-A3B, Qwen3-VL-235B-A22B). Expected follow-up MR " + "ports cosmos-rl's is_moe_mlp_fused_into_dp_shard / replicated-gate " + "handling. Use a dense VLM checkpoint (2B, 4B, 8B, 32B) until then." + ) + + # Detect fsdp_offload mode by inspecting a sample parameter's device. 
In + # offload mode, the FSDP-materialized local_views live on CPU; routing the + # loader's distributed broadcast through CUDA would materialize the full + # checkpoint tensor on GPU transiently (defeats the point of offload). Use + # a single-rank fallback in that case: every rank reads every file on CPU, + # slices locally, no broadcast. I/O-redundant but memory-safe, matching + # the pre-MR _load_vlm_weights behavior under offload. + sample_target = next(iter(vlm_state_dict.values())) if vlm_state_dict else None + sample_local = sample_target.to_local() if isinstance(sample_target, DTensor) else sample_target + offload_mode = sample_local is not None and sample_local.device.type == "cpu" + + # Pick the loader's group: single-rank (no broadcast) in offload mode to + # keep memory off-GPU; the dp_shard sub-mesh otherwise. + loader = MultiRankCheckpointLoader(_get_dp_shard_mesh(parallel_dims) if not offload_mode else None) + rank_tensors, rank_tensor_meta, ckpt_names = loader.load_files_parallel( + checkpoint_path=checkpoint_path, + credential_path=credential_path if credential_path else "", + loading_device="cpu", + ) + all_tensor_names, tensor_to_rank = loader.gather_tensor_names_and_build_mapping( + ckpt_names, + rank_tensors, + ) + + # Phase 5: per-tensor copy. Skip patterns match the MODEL key (post- + # name_converter), not the raw ckpt key — this matches cosmos-rl's + # semantics and avoids fragility with prefix variations. The two lists + # are concatenated; they share Phase-5 skip + Phase-6 tolerance + # semantics. + _all_skip_patterns = (tensor_names_to_skip or []) + (extra_skip_patterns or []) + skip_patterns = [re.compile(p) for p in _all_skip_patterns] + keys_loaded: set[str] = set() + skipped_model_keys: set[str] = set() + + # Broadcast/iterate device: CUDA (NCCL) unless we're in offload mode, in + # which case everything stays on CPU. 
In offload mode the loader group + # is single-rank, so iterate_tensors doesn't actually broadcast — device + # just controls where the tensor is yielded. + if offload_mode or not torch.cuda.is_available(): + target_device = "cpu" + else: + target_device = "cuda" + + # Resolve the shard axis for the FSDP slicing. Even in offload mode we + # still need the real (shard_rank, shard_size) from parallel_dims so each + # rank takes its own FSDP slice — we just route the LOAD/BROADCAST through + # world_size=1 (single-rank fallback) to avoid the GPU spike. + dp_shard_mesh = _get_dp_shard_mesh(parallel_dims) + if dp_shard_mesh is not None: + shard_rank = dp_shard_mesh.get_local_rank() + shard_size = dp_shard_mesh.size() + else: + shard_rank = 0 + shard_size = 1 + + for ckpt_name, tensor in loader.iterate_tensors( + all_tensor_names, + tensor_to_rank, + rank_tensors, + rank_tensor_meta, + device=target_device, + ): + dest_name = name_converter(ckpt_name) + + if any(p.fullmatch(dest_name) for p in skip_patterns): + skipped_model_keys.add(dest_name) + continue + + if dest_name not in vlm_state_dict: + continue # extra checkpoint key — ignore + + target = vlm_state_dict[dest_name] + is_dtensor = isinstance(target, DTensor) + local_view = target.to_local() if is_dtensor else target + + # Slice using the REAL FSDP shard_rank/shard_size derived from + # parallel_dims.dp_shard_mesh, NOT loader.rank/world_size. In + # offload mode those two differ: the loader runs single-rank + # (world_size=1) but FSDP still shards across N ranks. 
+ shard = _shard_first_dim(tensor, shard_size, shard_rank) + if shard.device != local_view.device: + shard = shard.to(local_view.device) + + if shard.shape != local_view.shape: + raise ValueError( + f"Shape mismatch for {dest_name}: local_view={tuple(local_view.shape)}, shard={tuple(shard.shape)}" + ) + with torch.no_grad(): + local_view.data.copy_(shard) + keys_loaded.add(dest_name) + + # Phase 6: completeness check with tied-embedding AND skip-list tolerance. + missing = set(vlm_state_dict) - keys_loaded - skipped_model_keys + + # Also tolerate missing model keys that match a skip pattern directly — + # handles the case where the ckpt doesn't contain the key at all, so the + # Phase 5 loop never saw it and skipped_model_keys didn't accumulate it. + missing = {k for k in missing if not any(p.fullmatch(k) for p in skip_patterns)} + tie = getattr(model.config, "tie_word_embeddings", False) + real_missing = {k for k in missing if not (tie and "lm_head.weight" in k)} + if real_missing: + raise ValueError( + f"load_vlm_model: {len(real_missing)} required model parameter(s) " + f"not found in checkpoint '{checkpoint_path}'. First up to 10: " + f"{sorted(real_missing)[:10]}" + ) + log.info( + f"load_vlm_model: loaded {len(keys_loaded)} tensors from {checkpoint_path} in {time.time() - start_time:.1f}s" + ) + return keys_loaded diff --git a/cosmos-inference/cosmos3/_src/vfm/models/utils/taylorseer.py b/cosmos-inference/cosmos3/_src/vfm/models/utils/taylorseer.py new file mode 100644 index 00000000..ec846e69 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/utils/taylorseer.py @@ -0,0 +1,180 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Adapted from https://github.com/Shenyi-Z/TaylorSeer/blob/main/TaylorSeers-xDiT/taylorseer_flux/taylorseer_utils/__init__.py + +import math +from typing import Dict + +import torch + + +def derivative_approximation(cache_dic: Dict, current: Dict, feature: torch.Tensor): + """ + Update the cached Taylor factors with finite-difference derivative approximations of ``feature``. + + :param cache_dic: Cache dictionary + :param current: Information of the current step + """ + difference_distance = current["activated_steps"][-1] - current["activated_steps"][-2] + + updated_taylor_factors = {} + updated_taylor_factors[0] = feature + + for i in range(cache_dic["max_order"]): + if ( + cache_dic["cache"][-1][current["stream"]][current["layer"]][current["module"]].get(i, None) is not None + ) and (current["step"] > cache_dic["first_enhance"] - 2): + updated_taylor_factors[i + 1] = ( + updated_taylor_factors[i] + - cache_dic["cache"][-1][current["stream"]][current["layer"]][current["module"]][i] + ) / difference_distance + else: + break + + cache_dic["cache"][-1][current["stream"]][current["layer"]][current["module"]] = updated_taylor_factors + + +def taylor_formula(cache_dic: Dict, current: Dict) -> torch.Tensor: + """ + Evaluate the cached Taylor expansion at the current step. 
+ + :param cache_dic: Cache dictionary + :param current: Information of the current step + """ + x = current["step"] - current["activated_steps"][-1] + # x = current['t'] - current['activated_times'][-1] + output = 0 + + for i in range(len(cache_dic["cache"][-1][current["stream"]][current["layer"]][current["module"]])): + output += ( + (1 / math.factorial(i)) + * cache_dic["cache"][-1][current["stream"]][current["layer"]][current["module"]][i] + * (x**i) + ) + + return output + + +def taylor_cache_init(cache_dic: Dict, current: Dict): + """ + Initialize the Taylor cache and allocate storage for the derivatives of each order. + + :param cache_dic: Cache dictionary + :param current: Information of the current step + """ + if (current["step"] == 0) and (cache_dic["taylor_cache"]): + cache_dic["cache"][-1][current["stream"]][current["layer"]][current["module"]] = {} + + +# Copied from https://github.com/Shenyi-Z/TaylorSeer/blob/main/TaylorSeers-xDiT/taylorseer_flux/cache_functions/force_scheduler.py + + +def force_scheduler(cache_dic, current): + if cache_dic["fresh_ratio"] == 0: + # FORA + linear_step_weight = 0.0 + else: + # TokenCache + linear_step_weight = 0.0 + step_factor = torch.tensor(1 - linear_step_weight + 2 * linear_step_weight * current["step"] / current["num_steps"]) + threshold = torch.round(cache_dic["fresh_threshold"] / step_factor) + + # No forced constraint is applied for sensitive steps, because the performance is already good enough. + # You may want to experiment with one. 
+ + cache_dic["cal_threshold"] = threshold + # return threshold + + +# Copied from https://github.com/Shenyi-Z/TaylorSeer/blob/main/TaylorSeers-xDiT/taylorseer_flux/cache_functions/cal_type.py + + +def cal_type(cache_dic, current): + """ + Determine the calculation type for this step. + """ + if (cache_dic["fresh_ratio"] == 0.0) and (not cache_dic["taylor_cache"]): + # FORA: Uniform + first_step = current["step"] == 0 + else: + # ToCa: First enhanced + first_step = current["step"] < cache_dic["first_enhance"] + + if not first_step: + fresh_interval = cache_dic["cal_threshold"] + else: + fresh_interval = cache_dic["fresh_threshold"] + + if (first_step) or (cache_dic["cache_counter"] == fresh_interval - 1): + current["type"] = "full" + cache_dic["cache_counter"] = 0 + current["activated_steps"].append(current["step"]) + force_scheduler(cache_dic, current) + + elif cache_dic["taylor_cache"]: + cache_dic["cache_counter"] += 1 + current["type"] = "Taylor" + + elif cache_dic["cache_counter"] % 2 == 1: # 0: ToCa-Aggressive-ToCa, 1: Aggressive-ToCa-Aggressive + cache_dic["cache_counter"] += 1 + current["type"] = "ToCa" + # 'cache_noise' 'ToCa' 'FORA' + elif cache_dic["Delta-DiT"]: + cache_dic["cache_counter"] += 1 + current["type"] = "Delta-Cache" + else: + cache_dic["cache_counter"] += 1 + current["type"] = "ToCa" + + +# Modified from https://github.com/Shenyi-Z/TaylorSeer/blob/main/TaylorSeers-xDiT/taylorseer_flux/cache_functions/cache_init.py + + +def cache_init(self, num_steps: int): + """ + Initialization for cache. 
+ """ + cache_dic = {} + cache = {} + cache_index = {} + cache[-1] = {} + cache_index[-1] = {} + cache_index["layer_index"] = {} + cache[-1]["layers_stream"] = {} + cache_dic["cache_counter"] = 0 + + for j in range(len(self.language_model.model.layers)): + cache[-1]["layers_stream"][j] = {} + cache_index[-1][j] = {} + + cache_dic["Delta-DiT"] = False + cache_dic["cache_type"] = "random" + cache_dic["cache_index"] = cache_index + cache_dic["cache"] = cache + cache_dic["fresh_ratio_schedule"] = "ToCa" + cache_dic["fresh_ratio"] = 0.0 + cache_dic["fresh_threshold"] = 3 + cache_dic["soft_fresh_weight"] = 0.0 + cache_dic["taylor_cache"] = True + cache_dic["max_order"] = 6 + cache_dic["first_enhance"] = 5 + + current = {} + current["activated_steps"] = [0] + current["step"] = 0 + current["num_steps"] = num_steps + + return cache_dic, current diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/configs/Nemotron-2B-Dense-VL.json b/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/configs/Nemotron-2B-Dense-VL.json new file mode 100644 index 00000000..3f1075eb --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/configs/Nemotron-2B-Dense-VL.json @@ -0,0 +1,31 @@ +{ + "vocab_size": 131072, + "tie_word_embeddings": false, + "hidden_size": 2048, + "intermediate_size": 9216, + "num_hidden_layers": 28, + "num_attention_heads": 16, + "head_dim": 128, + "num_key_value_heads": 8, + "mlp_hidden_act": "relu2", + "attention_bias": false, + "mlp_bias": false, + "initializer_range": 0.02, + "layer_norm_epsilon": 1e-05, + "residual_in_fp32": false, + "use_cache": true, + "num_logits_to_keep": 1, + "pad_token_id": 0, + "bos_token_id": 1, + "eos_token_id": 11, + "sliding_window": null, + "max_position_embeddings": 131072, + "attention_dropout": 0.0, + "hidden_dropout": 0.0, + "enable_rope": true, + "rope_scaling": null, + "rope_theta": 100000000, + "enable_mrope": true, + "mrope_section": [24, 20, 20], + "torch_dtype": "bfloat16" +} diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/configuration_nemotron_3_dense_vl.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/configuration_nemotron_3_dense_vl.py new file mode 100644 index 00000000..e4786731 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/configuration_nemotron_3_dense_vl.py @@ -0,0 +1,97 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Configuration for Nemotron 3 Dense VL text backbone (paired hybrid blocks -> standard layers).""" + +from transformers.configuration_utils import PretrainedConfig + + +class Nemotron3DenseVLTextConfig(PretrainedConfig): + """Text config for Nemotron-H style language model after pairing attn+MLP blocks (28 effective layers).""" + + model_type = "nemotron_3_dense_vl_text" + + def __init__( + self, + vocab_size: int = 131072, + tie_word_embeddings: bool = False, + hidden_size: int = 2048, + intermediate_size: int = 9216, + num_hidden_layers: int = 28, + num_attention_heads: int = 16, + head_dim: int = 128, + num_key_value_heads: int = 8, + mlp_hidden_act: str = "relu2", + attention_bias: bool = False, + mlp_bias: bool = False, + initializer_range: float = 0.02, + layer_norm_epsilon: float = 1e-5, + residual_in_fp32: bool = False, + use_cache: bool = True, + num_logits_to_keep: int = 1, + pad_token_id: int = 0, + bos_token_id: int = 1, + eos_token_id: int = 11, + sliding_window: int | None = None, + max_position_embeddings: int = 131072, + attention_dropout: float = 0.0, + hidden_dropout: float = 0.0, + enable_rope: bool = True, + rope_scaling: dict | None = None, + rope_theta: float = 100_000_000.0, + enable_mrope: bool = True, + mrope_section: list[int] | None = None, + torch_dtype: str = "bfloat16", + **kwargs, + ) -> None: + self.vocab_size = vocab_size + self.tie_word_embeddings = tie_word_embeddings + self.hidden_size = hidden_size + self.intermediate_size = intermediate_size + self.num_hidden_layers = num_hidden_layers + self.num_attention_heads = 
num_attention_heads + self.head_dim = head_dim + self.num_key_value_heads = num_key_value_heads + self.mlp_hidden_act = mlp_hidden_act + self.attention_bias = attention_bias + self.mlp_bias = mlp_bias + self.initializer_range = initializer_range + self.layer_norm_epsilon = layer_norm_epsilon + self.residual_in_fp32 = residual_in_fp32 + self.use_cache = use_cache + self.num_logits_to_keep = num_logits_to_keep + self.sliding_window = sliding_window + self.max_position_embeddings = max_position_embeddings + self.attention_dropout = attention_dropout + self.hidden_dropout = hidden_dropout + self.rope_scaling = rope_scaling + self.rope_theta = rope_theta + self.enable_rope = enable_rope + self.enable_mrope = enable_mrope + self.mrope_section = mrope_section if mrope_section is not None else [24, 20, 20] + self.torch_dtype = torch_dtype + self._attn_implementation = kwargs.pop("_attn_implementation", "eager") + super().__init__( + pad_token_id=pad_token_id, + bos_token_id=bos_token_id, + eos_token_id=eos_token_id, + tie_word_embeddings=tie_word_embeddings, + **kwargs, + ) + + @property + def rms_norm_eps(self) -> float: + """Alias for Qwen-style MoT code paths.""" + return self.layer_norm_epsilon diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/nemotron_3_dense_vl.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/nemotron_3_dense_vl.py new file mode 100644 index 00000000..2afd3f02 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/nemotron_3_dense_vl/nemotron_3_dense_vl.py @@ -0,0 +1,165 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Nemotron-H style text modules for Nemotron 3 Dense VL (ReLU^2 MLP, partial RoPE helper, mRoPE).""" + +from __future__ import annotations + +import functools + +import torch +import torch.nn.functional as F +from torch import nn +from transformers.activations import ACT2FN + +if "relu2" not in ACT2FN: + ACT2FN["relu2"] = lambda x: F.relu(x).square() +from transformers.modeling_rope_utils import dynamic_rope_update +from transformers.modeling_utils import PreTrainedModel + +from cosmos3._src.vfm.models.vlm.nemotron_3_dense_vl.configuration_nemotron_3_dense_vl import ( + Nemotron3DenseVLTextConfig, +) + + +def rotate_half(x: torch.Tensor) -> torch.Tensor: + x1 = x[..., : x.shape[-1] // 2] + x2 = x[..., x.shape[-1] // 2 :] + return torch.cat((-x2, x1), dim=-1) + + +def apply_rotary_pos_emb_partial( + q: torch.Tensor, + k: torch.Tensor, + cos: torch.Tensor, + sin: torch.Tensor, + unsqueeze_dim: int = 1, +) -> tuple[torch.Tensor, torch.Tensor]: + """Apply RoPE to the first rot_dim channels; remainder passes through (rot_dim == head_dim for 2B Dense).""" + cos = cos.unsqueeze(unsqueeze_dim) + sin = sin.unsqueeze(unsqueeze_dim) + rot_dim = cos.shape[-1] + q_rot, q_pass = q[..., :rot_dim], q[..., rot_dim:] + k_rot, k_pass = k[..., :rot_dim], k[..., rot_dim:] + q_embed = (q_rot * cos) + (rotate_half(q_rot) * sin) + k_embed = (k_rot * cos) + (rotate_half(k_rot) * sin) + return torch.cat((q_embed, q_pass), dim=-1), torch.cat((k_embed, k_pass), dim=-1) + + +class Nemotron3DenseVLRMSNorm(nn.Module): + def __init__(self, hidden_size: int, eps: float = 1e-5) -> None: 
+ super().__init__() + self.weight = nn.Parameter(torch.ones(hidden_size)) + self.variance_epsilon = eps + + def forward(self, hidden_states: torch.Tensor) -> torch.Tensor: + input_dtype = hidden_states.dtype + hidden_states = hidden_states.to(torch.float32) + variance = hidden_states.pow(2).mean(-1, keepdim=True) + hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon) + return (self.weight.to(torch.float32) * hidden_states).to(input_dtype) + + def extra_repr(self) -> str: + return f"{tuple(self.weight.shape)}, eps={self.variance_epsilon}" + + +class Nemotron3DenseVLMLP(nn.Module): + def __init__(self, config: Nemotron3DenseVLTextConfig, layer_idx: int | None = None) -> None: + super().__init__() + self.config = config + self.hidden_size = config.hidden_size + self.intermediate_size = config.intermediate_size + self.up_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=config.mlp_bias) + self.down_proj = nn.Linear(self.intermediate_size, self.hidden_size, bias=config.mlp_bias) + if config.mlp_hidden_act in ACT2FN: + self.act_fn = ACT2FN[config.mlp_hidden_act] + else: + self.act_fn = lambda x: F.relu(x).square() + + def forward(self, x: torch.Tensor) -> torch.Tensor: + return self.down_proj(self.act_fn(self.up_proj(x))) + + +class MultiModalRotaryEmbedding(nn.Module): + inv_freq: torch.Tensor + + def __init__(self, config: Nemotron3DenseVLTextConfig, device: torch.device | None = None) -> None: + super().__init__() + self.rope_type = "default" + self.max_seq_len_cached = config.max_position_embeddings + self.original_max_seq_len = config.max_position_embeddings + self.config = config + self.mrope_section = getattr(config, "mrope_section", [24, 20, 20]) + inv_freq, self.attention_scaling = self.compute_default_rope_parameters(self.config, device) + self.register_buffer("inv_freq", inv_freq, persistent=False) + self.register_buffer("original_inv_freq", inv_freq.clone(), persistent=False) + + @staticmethod + def 
compute_default_rope_parameters( + config: Nemotron3DenseVLTextConfig | None = None, + device: torch.device | None = None, + seq_len: int | None = None, + ) -> tuple[torch.Tensor, float]: + rope_theta = config.rope_theta + dim = config.head_dim + attention_factor = 1.0 + inv_freq = 1.0 / ( + rope_theta ** (torch.arange(0, dim, 2, dtype=torch.int64, device=device).to(dtype=torch.float) / dim) + ) + return inv_freq, attention_factor + + def apply_interleaved_mrope(self, freqs: torch.Tensor, mrope_section: list[int]) -> torch.Tensor: + freqs_t = freqs[0] + for dim, offset in enumerate((1, 2), start=1): + length = mrope_section[dim] * 3 + idx = slice(offset, length, 3) + freqs_t[..., idx] = freqs[dim, ..., idx] + return freqs_t + + @torch.no_grad() + @dynamic_rope_update + def forward(self, x: torch.Tensor, position_ids: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]: + if position_ids.ndim == 2: + position_ids = position_ids[None, ...].expand(3, position_ids.shape[0], -1) + inv_freq_expanded = self.inv_freq[None, None, :, None].float().expand(3, position_ids.shape[1], -1, 1) + position_ids_expanded = position_ids[:, :, None, :].float() + device_type = x.device.type if isinstance(x.device.type, str) and x.device.type != "mps" else "cpu" + with torch.autocast(device_type=device_type, enabled=False): + freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(2, 3) + freqs = self.apply_interleaved_mrope(freqs, self.mrope_section) + emb = torch.cat((freqs, freqs), dim=-1) + cos = emb.cos() * self.attention_scaling + sin = emb.sin() * self.attention_scaling + return cos.to(dtype=x.dtype), sin.to(dtype=x.dtype) + + def init_weights(self, buffer_device: torch.device | None = None) -> None: + inv_freq, self.attention_scaling = self.compute_default_rope_parameters(self.config, buffer_device) + self.register_buffer("inv_freq", inv_freq, persistent=False) + + +class Nemotron3DenseVLPreTrainedModel(PreTrainedModel): + config_class = 
Nemotron3DenseVLTextConfig + base_model_prefix = "model" + supports_gradient_checkpointing = True + _supports_flash_attn = True + _supports_sdpa = True + + def _init_weights(self, module: nn.Module, buffer_device: torch.device | None) -> None: + super()._init_weights(module) + if isinstance(module, MultiModalRotaryEmbedding): + module.init_weights(buffer_device=buffer_device) + + def init_weights(self, buffer_device: torch.device | None = None) -> None: + self.apply(functools.partial(self._init_weights, buffer_device=buffer_device)) diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-2B-Instruct.json b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-2B-Instruct.json new file mode 100644 index 00000000..0cd8c646 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-2B-Instruct.json @@ -0,0 +1,63 @@ +{ + "architectures": [ + "Qwen3VLForConditionalGeneration" + ], + "image_token_id": 151655, + "model_type": "qwen3_vl", + "text_config": { + "attention_bias": false, + "attention_dropout": 0.0, + "bos_token_id": 151643, + "dtype": "bfloat16", + "eos_token_id": 151645, + "head_dim": 128, + "hidden_act": "silu", + "hidden_size": 2048, + "initializer_range": 0.02, + "intermediate_size": 6144, + "max_position_embeddings": 262144, + "model_type": "qwen3_vl_text", + "num_attention_heads": 16, + "num_hidden_layers": 28, + "num_key_value_heads": 8, + "rms_norm_eps": 1e-06, + "rope_scaling": { + "mrope_interleaved": true, + "mrope_section": [ + 24, + 20, + 20 + ], + "rope_type": "default" + }, + "rope_theta": 5000000, + "tie_word_embeddings": true, + "use_cache": true, + "vocab_size": 151936 + }, + "tie_word_embeddings": true, + "transformers_version": "4.57.0.dev0", + "video_token_id": 151656, + "vision_config": { + "deepstack_visual_indexes": [ + 5, + 11, + 17 + ], + "depth": 24, + "hidden_act": "gelu_pytorch_tanh", + "hidden_size": 1024, + "in_channels": 3, + "initializer_range": 0.02, + "intermediate_size": 4096, + "model_type": "qwen3_vl", + "num_heads": 16, + "num_position_embeddings": 2304, + "out_hidden_size": 2048, + "patch_size": 16, + "spatial_merge_size": 2, + "temporal_patch_size": 2 + }, + "vision_end_token_id": 151653, + "vision_start_token_id": 151652 +} diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-32B-Instruct.json b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-32B-Instruct.json new file mode 100644 index 00000000..29e92509 --- 
/dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-32B-Instruct.json @@ -0,0 +1,62 @@ +{ + "architectures": [ + "Qwen3VLForConditionalGeneration" + ], + "image_token_id": 151655, + "model_type": "qwen3_vl", + "text_config": { + "attention_bias": false, + "attention_dropout": 0.0, + "bos_token_id": 151643, + "dtype": "bfloat16", + "eos_token_id": 151645, + "head_dim": 128, + "hidden_act": "silu", + "hidden_size": 5120, + "initializer_range": 0.02, + "intermediate_size": 25600, + "max_position_embeddings": 262144, + "model_type": "qwen3_vl_text", + "num_attention_heads": 64, + "num_hidden_layers": 64, + "num_key_value_heads": 8, + "rms_norm_eps": 1e-06, + "rope_scaling": { + "mrope_interleaved": true, + "mrope_section": [ + 24, + 20, + 20 + ], + "rope_type": "default" + }, + "rope_theta": 5000000, + "use_cache": true, + "vocab_size": 151936 + }, + "tie_word_embeddings": false, + "transformers_version": "4.57.0.dev0", + "video_token_id": 151656, + "vision_config": { + "deepstack_visual_indexes": [ + 8, + 16, + 24 + ], + "depth": 27, + "hidden_act": "gelu_pytorch_tanh", + "hidden_size": 1152, + "in_channels": 3, + "initializer_range": 0.02, + "intermediate_size": 4304, + "model_type": "qwen3_vl", + "num_heads": 16, + "num_position_embeddings": 2304, + "out_hidden_size": 5120, + "patch_size": 16, + "spatial_merge_size": 2, + "temporal_patch_size": 2 + }, + "vision_end_token_id": 151653, + "vision_start_token_id": 151652 +} diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-4B-Instruct.json b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-4B-Instruct.json new file mode 100644 index 00000000..e8add36d --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-4B-Instruct.json @@ -0,0 +1,63 @@ +{ + "architectures": [ + "Qwen3VLForConditionalGeneration" + ], + "image_token_id": 151655, + "model_type": "qwen3_vl", + "text_config": { + "attention_bias": 
false, + "attention_dropout": 0.0, + "bos_token_id": 151643, + "dtype": "bfloat16", + "eos_token_id": 151645, + "head_dim": 128, + "hidden_act": "silu", + "hidden_size": 2560, + "initializer_range": 0.02, + "intermediate_size": 9728, + "max_position_embeddings": 262144, + "model_type": "qwen3_vl_text", + "num_attention_heads": 32, + "num_hidden_layers": 36, + "num_key_value_heads": 8, + "rms_norm_eps": 1e-06, + "rope_scaling": { + "mrope_interleaved": true, + "mrope_section": [ + 24, + 20, + 20 + ], + "rope_type": "default" + }, + "rope_theta": 5000000, + "tie_word_embeddings": true, + "use_cache": true, + "vocab_size": 151936 + }, + "tie_word_embeddings": true, + "transformers_version": "4.57.0.dev0", + "video_token_id": 151656, + "vision_config": { + "deepstack_visual_indexes": [ + 5, + 11, + 17 + ], + "depth": 24, + "hidden_act": "gelu_pytorch_tanh", + "hidden_size": 1024, + "in_channels": 3, + "initializer_range": 0.02, + "intermediate_size": 4096, + "model_type": "qwen3_vl", + "num_heads": 16, + "num_position_embeddings": 2304, + "out_hidden_size": 2560, + "patch_size": 16, + "spatial_merge_size": 2, + "temporal_patch_size": 2 + }, + "vision_end_token_id": 151653, + "vision_start_token_id": 151652 + } diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-8B-Instruct.json b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-8B-Instruct.json new file mode 100644 index 00000000..0e2df614 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/Qwen3-VL-8B-Instruct.json @@ -0,0 +1,62 @@ +{ + "architectures": [ + "Qwen3VLForConditionalGeneration" + ], + "image_token_id": 151655, + "model_type": "qwen3_vl", + "text_config": { + "attention_bias": false, + "attention_dropout": 0.0, + "bos_token_id": 151643, + "dtype": "bfloat16", + "eos_token_id": 151645, + "head_dim": 128, + "hidden_act": "silu", + "hidden_size": 4096, + "initializer_range": 0.02, + "intermediate_size": 12288, + 
"max_position_embeddings": 262144, + "model_type": "qwen3_vl_text", + "num_attention_heads": 32, + "num_hidden_layers": 36, + "num_key_value_heads": 8, + "rms_norm_eps": 1e-06, + "rope_scaling": { + "mrope_interleaved": true, + "mrope_section": [ + 24, + 20, + 20 + ], + "rope_type": "default" + }, + "rope_theta": 5000000, + "use_cache": true, + "vocab_size": 151936 + }, + "tie_word_embeddings": false, + "transformers_version": "4.57.0.dev0", + "video_token_id": 151656, + "vision_config": { + "deepstack_visual_indexes": [ + 8, + 16, + 24 + ], + "depth": 27, + "hidden_act": "gelu_pytorch_tanh", + "hidden_size": 1152, + "in_channels": 3, + "initializer_range": 0.02, + "intermediate_size": 4304, + "model_type": "qwen3_vl", + "num_heads": 16, + "num_position_embeddings": 2304, + "out_hidden_size": 4096, + "patch_size": 16, + "spatial_merge_size": 2, + "temporal_patch_size": 2 + }, + "vision_end_token_id": 151653, + "vision_start_token_id": 151652 +} diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configs/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configuration_qwen3_vl.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configuration_qwen3_vl.py new file mode 100644 index 00000000..aba4aff0 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/configuration_qwen3_vl.py @@ -0,0 +1,298 @@ +# Copyright 2025 The Qwen Team and The HuggingFace Inc. team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# ----------------------------------------------------------------------------- +# Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. +# All rights reserved. +# +# This codebase constitutes NVIDIA proprietary technology and is strictly +# confidential. Any unauthorized reproduction, distribution, or disclosure +# of this code, in whole or in part, outside NVIDIA is strictly prohibited +# without prior written consent. +# +# For inquiries regarding the use of this code in other NVIDIA proprietary +# projects, please contact the Deep Imagination Research Team at +# dir@exchange.nvidia.com. +# ----------------------------------------------------------------------------- + +# Source Repository: https://github.com/huggingface/transformers +# This is adapted from src/transformers/models/qwen3_vl/configuration_qwen3_vl.py. 
+# Commit Hash: 41e5abac5cb49983a08ddef3e8645d6efd23c8f3 +from transformers.configuration_utils import PretrainedConfig +from transformers.modeling_rope_utils import rope_config_validation + + +class Qwen3VLVisionConfig(PretrainedConfig): + model_type = "qwen3_vl" + base_config_key = "vision_config" + + def __init__( + self, + depth=27, + hidden_size=1152, + hidden_act="gelu_pytorch_tanh", + intermediate_size=4304, + num_heads=16, + in_channels=3, + patch_size=16, + spatial_merge_size=2, + temporal_patch_size=2, + out_hidden_size=3584, + num_position_embeddings=2304, + deepstack_visual_indexes=[8, 16, 24], + initializer_range=0.02, + **kwargs, + ): + super().__init__(**kwargs) + + self.depth = depth + self.hidden_size = hidden_size + self.hidden_act = hidden_act + self.intermediate_size = intermediate_size + self.num_heads = num_heads + self.in_channels = in_channels + self.patch_size = patch_size + self.spatial_merge_size = spatial_merge_size + self.temporal_patch_size = temporal_patch_size + self.out_hidden_size = out_hidden_size + self.num_position_embeddings = num_position_embeddings + self.initializer_range = initializer_range + self.deepstack_visual_indexes = deepstack_visual_indexes + + +class Qwen3VLTextConfig(PretrainedConfig): + r""" + This is the configuration class to store the configuration of a [`Qwen3VLTextModel`]. It is used to instantiate a + Qwen3-VL model according to the specified arguments, defining the model architecture. Instantiating a configuration + with the defaults will yield a similar configuration to that of + Qwen3-VL-4B-Instruct [Qwen/Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct). + + Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the + documentation from [`PretrainedConfig`] for more information. + + Args: + vocab_size (`int`, *optional*, defaults to 151936): + Vocabulary size of the Qwen3VL model. 
Defines the number of different tokens that can be represented by the + `input_ids` passed when calling [`Qwen3VLModel`] + hidden_size (`int`, *optional*, defaults to 4096): + Dimension of the hidden representations. + intermediate_size (`int`, *optional*, defaults to 22016): + Dimension of the MLP representations. + num_hidden_layers (`int`, *optional*, defaults to 32): + Number of hidden layers in the Transformer decoder. + num_attention_heads (`int`, *optional*, defaults to 32): + Number of attention heads for each attention layer in the Transformer decoder. + num_key_value_heads (`int`, *optional*, defaults to 32): + This is the number of key_value heads that should be used to implement Grouped Query Attention. If + `num_key_value_heads=num_attention_heads`, the model will use Multi Head Attention (MHA), if + `num_key_value_heads=1` the model will use Multi Query Attention (MQA) otherwise GQA is used. When + converting a multi-head checkpoint to a GQA checkpoint, each group key and value head should be constructed + by meanpooling all the original heads within that group. For more details, check out [this + paper](https://huggingface.co/papers/2305.13245). If `None` is passed, it falls back to `num_attention_heads`. + head_dim (`int`, *optional*, defaults to 128): + The dimension of the head. If not specified, will default to `hidden_size // num_attention_heads`. + hidden_act (`str` or `function`, *optional*, defaults to `"silu"`): + The non-linear activation function (function or string) in the decoder. + max_position_embeddings (`int`, *optional*, defaults to 128000): + The maximum sequence length that this model might ever be used with. + initializer_range (`float`, *optional*, defaults to 0.02): + The standard deviation of the truncated_normal_initializer for initializing all weight matrices. + rms_norm_eps (`float`, *optional*, defaults to 1e-06): + The epsilon used by the rms normalization layers.
+ use_cache (`bool`, *optional*, defaults to `True`): + Whether or not the model should return the last key/values attentions (not used by all models). Only + relevant if `config.is_decoder=True`. + tie_word_embeddings (`bool`, *optional*, defaults to `False`): + Whether the model's input and output word embeddings should be tied. + rope_theta (`float`, *optional*, defaults to 5000000.0): + The base period of the RoPE embeddings. + rope_scaling (`Dict`, *optional*): + Dictionary containing the scaling configuration for the RoPE embeddings. NOTE: if you apply new rope type + and you expect the model to work on longer `max_position_embeddings`, we recommend you to update this value + accordingly. + Expected contents: + `rope_type` (`str`): + The sub-variant of RoPE to use. Can be one of ['default', 'linear', 'dynamic', 'yarn', 'longrope', + 'llama3'], with 'default' being the original RoPE implementation. + `factor` (`float`, *optional*): + Used with all rope types except 'default'. The scaling factor to apply to the RoPE embeddings. In + most scaling types, a `factor` of x will enable the model to handle sequences of length x * + original maximum pre-trained length. + `original_max_position_embeddings` (`int`, *optional*): + Used with 'dynamic', 'longrope' and 'llama3'. The original max position embeddings used during + pretraining. + `attention_factor` (`float`, *optional*): + Used with 'yarn' and 'longrope'. The scaling factor to be applied on the attention + computation. If unspecified, it defaults to value recommended by the implementation, using the + `factor` field to infer the suggested value. + `beta_fast` (`float`, *optional*): + Only used with 'yarn'. Parameter to set the boundary for extrapolation (only) in the linear + ramp function. If unspecified, it defaults to 32. + `beta_slow` (`float`, *optional*): + Only used with 'yarn'. Parameter to set the boundary for interpolation (only) in the linear + ramp function. If unspecified, it defaults to 1. 
+ `short_factor` (`list[float]`, *optional*): + Only used with 'longrope'. The scaling factor to be applied to short contexts (< + `original_max_position_embeddings`). Must be a list of numbers with the same length as the hidden + size divided by the number of attention heads divided by 2 + `long_factor` (`list[float]`, *optional*): + Only used with 'longrope'. The scaling factor to be applied to long contexts (> + `original_max_position_embeddings`). Must be a list of numbers with the same length as the hidden + size divided by the number of attention heads divided by 2 + `low_freq_factor` (`float`, *optional*): + Only used with 'llama3'. Scaling factor applied to low frequency components of the RoPE + `high_freq_factor` (`float`, *optional*): + Only used with 'llama3'. Scaling factor applied to high frequency components of the RoPE + attention_bias (`bool`, *optional*, defaults to `False`): + Whether to use a bias in the query, key, value and output projection layers during self-attention. + attention_dropout (`float`, *optional*, defaults to 0.0): + The dropout ratio for the attention probabilities.
+ + ```python + >>> from transformers import Qwen3VLTextModel, Qwen3VLTextConfig + + >>> # Initializing a Qwen3VL style configuration + >>> configuration = Qwen3VLTextConfig() + + >>> # Initializing a model from the Qwen3-VL-7B style configuration + >>> model = Qwen3VLTextModel(configuration) + + >>> # Accessing the model configuration + >>> configuration = model.config + ```""" + + model_type = "qwen3_vl_text" + base_config_key = "text_config" + + def __init__( + self, + vocab_size=151936, + hidden_size=4096, + intermediate_size=22016, + num_hidden_layers=32, + num_attention_heads=32, + num_key_value_heads=32, + head_dim=128, + hidden_act="silu", + max_position_embeddings=128000, + initializer_range=0.02, + rms_norm_eps=1e-6, + use_cache=True, + tie_word_embeddings=False, + rope_theta=5000000.0, + rope_scaling=None, + attention_bias=False, + attention_dropout=0.0, + **kwargs, + ): + self.vocab_size = vocab_size + self.max_position_embeddings = max_position_embeddings + self.hidden_size = hidden_size + self.intermediate_size = intermediate_size + self.num_hidden_layers = num_hidden_layers + self.num_attention_heads = num_attention_heads + + # for backward compatibility + if num_key_value_heads is None: + num_key_value_heads = num_attention_heads + + self.num_key_value_heads = num_key_value_heads + self.head_dim = head_dim + self.hidden_act = hidden_act + self.initializer_range = initializer_range + self.rms_norm_eps = rms_norm_eps + self.use_cache = use_cache + self.rope_theta = rope_theta + self.rope_scaling = rope_scaling + self.attention_bias = attention_bias + self.attention_dropout = attention_dropout + + rope_config_validation(self, ignore_keys={"mrope_section", "mrope_interleaved"}) + + super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs) + + +class Qwen3VLConfig(PretrainedConfig): + r""" + This is the configuration class to store the configuration of a [`Qwen3VLModel`]. 
It is used to instantiate a + Qwen3-VL model according to the specified arguments, defining the model architecture. Instantiating a configuration + with the defaults will yield a similar configuration to that of + Qwen3-VL-4B-Instruct [Qwen/Qwen3-VL-4B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct). + + Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the + documentation from [`PretrainedConfig`] for more information. + + + Args: + text_config (`Union[PretrainedConfig, dict]`, *optional*, defaults to `Qwen3VLTextConfig`): + The config object or dictionary of the text backbone. + vision_config (`Union[PretrainedConfig, dict]`, *optional*, defaults to `Qwen3VLVisionConfig`): + The config object or dictionary of the vision backbone. + image_token_id (`int`, *optional*, defaults to 151655): + The image token index to encode the image prompt. + video_token_id (`int`, *optional*, defaults to 151656): + The video token index to encode the video prompt. + vision_start_token_id (`int`, *optional*, defaults to 151652): + The start token index to encode the image prompt. + vision_end_token_id (`int`, *optional*, defaults to 151653): + The end token index to encode the image prompt. + tie_word_embeddings (`bool`, *optional*, defaults to `False`): + Whether to tie the word embeddings.
+ + ```python + >>> from transformers import Qwen3VLForConditionalGeneration, Qwen3VLConfig + + >>> # Initializing a Qwen3-VL style configuration + >>> configuration = Qwen3VLConfig() + + >>> # Initializing a model from the Qwen3-VL-4B style configuration + >>> model = Qwen3VLForConditionalGeneration(configuration) + + >>> # Accessing the model configuration + >>> configuration = model.config + ```""" + + model_type = "qwen3_vl" + sub_configs = {"vision_config": Qwen3VLVisionConfig, "text_config": Qwen3VLTextConfig} + keys_to_ignore_at_inference = ["past_key_values"] + + def __init__( + self, + text_config=None, + vision_config=None, + image_token_id=151655, + video_token_id=151656, + vision_start_token_id=151652, + vision_end_token_id=151653, + tie_word_embeddings=False, + **kwargs, + ): + if isinstance(vision_config, dict): + self.vision_config = self.sub_configs["vision_config"](**vision_config) + elif vision_config is None: + self.vision_config = self.sub_configs["vision_config"]() + + if isinstance(text_config, dict): + self.text_config = self.sub_configs["text_config"](**text_config) + elif text_config is None: + self.text_config = self.sub_configs["text_config"]() + + self.image_token_id = image_token_id + self.video_token_id = video_token_id + self.vision_start_token_id = vision_start_token_id + self.vision_end_token_id = vision_end_token_id + super().__init__(**kwargs, tie_word_embeddings=tie_word_embeddings) + + +__all__ = ["Qwen3VLConfig", "Qwen3VLTextConfig"] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/qwen3_vl.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/qwen3_vl.py new file mode 100644 index 00000000..da65d4ee --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/qwen3_vl.py @@ -0,0 +1,1651 @@ +# Copyright 2025 The Qwen Team and The HuggingFace Inc. team. All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# ----------------------------------------------------------------------------- +# Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. +# All rights reserved. +# +# This codebase constitutes NVIDIA proprietary technology and is strictly +# confidential. Any unauthorized reproduction, distribution, or disclosure +# of this code, in whole or in part, outside NVIDIA is strictly prohibited +# without prior written consent. +# +# For inquiries regarding the use of this code in other NVIDIA proprietary +# projects, please contact the Deep Imagination Research Team at +# dir@exchange.nvidia.com. +# ----------------------------------------------------------------------------- + +# Source Repository: https://github.com/huggingface/transformers +# This is adapted from src/transformers/models/qwen3_vl/modeling_qwen3_vl.py. 
+# Commit Hash: 41e5abac5cb49983a08ddef3e8645d6efd23c8f3 +"""PyTorch Qwen3-VL model.""" + +import functools +from dataclasses import dataclass +from typing import Any, Callable, Optional, Union + +import torch +import torch.nn as nn +import torch.nn.functional as F +from transformers.activations import ACT2FN +from transformers.cache_utils import Cache, DynamicCache +from transformers.generation import GenerationMixin +from transformers.modeling_flash_attention_utils import FlashAttentionKwargs +from transformers.modeling_outputs import BaseModelOutputWithPast, ModelOutput +from transformers.modeling_rope_utils import ROPE_INIT_FUNCTIONS, dynamic_rope_update +from transformers.modeling_utils import ALL_ATTENTION_FUNCTIONS, PreTrainedModel +from transformers.processing_utils import Unpack +from transformers.utils import is_torchdynamo_compiling +from transformers.utils.deprecation import deprecate_kwarg + +TransformersKwargs = Any + +# Import masking functions from utils for compatibility +from cosmos3._src.vfm.models.vlm.qwen3_vl.utils import ( + create_causal_mask, +) + +from .configuration_qwen3_vl import Qwen3VLConfig, Qwen3VLTextConfig, Qwen3VLVisionConfig + + +class Qwen3VLVisionMLP(nn.Module): + def __init__(self, config): + super().__init__() + self.hidden_size = config.hidden_size + self.intermediate_size = config.intermediate_size + self.linear_fc1 = nn.Linear(self.hidden_size, self.intermediate_size, bias=True) + self.linear_fc2 = nn.Linear(self.intermediate_size, self.hidden_size, bias=True) + self.act_fn = ACT2FN[config.hidden_act] + + def forward(self, hidden_state): + return self.linear_fc2(self.act_fn(self.linear_fc1(hidden_state))) + + +class Qwen3VLVisionPatchEmbed(nn.Module): + def __init__(self, config) -> None: + super().__init__() + self.patch_size = config.patch_size + self.temporal_patch_size = config.temporal_patch_size + self.in_channels = config.in_channels + self.embed_dim = config.hidden_size + + kernel_size = [self.temporal_patch_size, 
self.patch_size, self.patch_size] + self.proj = nn.Conv3d(self.in_channels, self.embed_dim, kernel_size=kernel_size, stride=kernel_size, bias=True) + + def forward( + self, hidden_states: torch.Tensor + ) -> torch.Tensor: # hidden_states: [N_patches,in_channels*temporal_patch_size*patch_size*patch_size] + target_dtype = self.proj.weight.dtype + hidden_states = hidden_states.view( + -1, self.in_channels, self.temporal_patch_size, self.patch_size, self.patch_size + ) # [N_patches,in_channels,temporal_patch_size,patch_size,patch_size] + hidden_states = self.proj(hidden_states.to(dtype=target_dtype)).view( + -1, self.embed_dim + ) # [N_patches,embed_dim] + return hidden_states # [N_patches,embed_dim] + + +class Qwen3VLVisionRotaryEmbedding(nn.Module): + inv_freq: torch.Tensor # fix linting for `register_buffer` + + def __init__(self, dim: int, theta: float = 10000.0) -> None: + super().__init__() + inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float) / dim)) + self.register_buffer("inv_freq", inv_freq, persistent=False) + + def forward(self, seqlen: int) -> torch.Tensor: + seq = torch.arange(seqlen, device=self.inv_freq.device, dtype=self.inv_freq.dtype) # [seqlen] + freqs = torch.outer(seq, self.inv_freq) # [seqlen,dim//2] + return freqs # [seqlen,dim//2] + + +class Qwen3VLVisionPatchMerger(nn.Module): + def __init__(self, config: Qwen3VLVisionConfig, use_postshuffle_norm=False) -> None: + super().__init__() + self.hidden_size = config.hidden_size * (config.spatial_merge_size**2) + self.use_postshuffle_norm = use_postshuffle_norm + self.norm = nn.LayerNorm(self.hidden_size if use_postshuffle_norm else config.hidden_size, eps=1e-6) + self.linear_fc1 = nn.Linear(self.hidden_size, self.hidden_size) + self.act_fn = nn.GELU() + self.linear_fc2 = nn.Linear(self.hidden_size, config.out_hidden_size) + + def forward(self, x: torch.Tensor) -> torch.Tensor: # x: [N_tokens,vision_hidden_size] + x = self.norm(x.view(-1, self.hidden_size) if 
self.use_postshuffle_norm else x).view( + -1, self.hidden_size + ) # [N_merged,hidden_size] + x = self.linear_fc2(self.act_fn(self.linear_fc1(x))) # [N_merged,out_hidden_size] + return x # [N_merged,out_hidden_size] + + +def rotate_half(x): + """Rotates half the hidden dims of the input.""" + x1 = x[..., : x.shape[-1] // 2] # [...,head_dim//2] + x2 = x[..., x.shape[-1] // 2 :] # [...,head_dim//2] + return torch.cat((-x2, x1), dim=-1) # [...,head_dim] + + +def apply_rotary_pos_emb_vision( + q: torch.Tensor, # [N_vision,num_heads,head_dim] + k: torch.Tensor, # [N_vision,num_heads,head_dim] + cos: torch.Tensor, # [N_vision,head_dim] + sin: torch.Tensor, # [N_vision,head_dim] +) -> tuple[torch.Tensor, torch.Tensor]: + orig_q_dtype = q.dtype + orig_k_dtype = k.dtype + q, k = q.float(), k.float() + cos, sin = cos.unsqueeze(-2).float(), sin.unsqueeze(-2).float() # [N_vision,1,head_dim] + q_embed = (q * cos) + (rotate_half(q) * sin) # [N_vision,num_heads,head_dim] + k_embed = (k * cos) + (rotate_half(k) * sin) # [N_vision,num_heads,head_dim] + q_embed = q_embed.to(orig_q_dtype) + k_embed = k_embed.to(orig_k_dtype) + return q_embed, k_embed # [N_vision,num_heads,head_dim], [N_vision,num_heads,head_dim] + + +def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor: + """ + This is the equivalent of torch.repeat_interleave(x, dim=1, repeats=n_rep). 
The hidden states go from (batch, + num_key_value_heads, seqlen, head_dim) to (batch, num_attention_heads, seqlen, head_dim) + """ + batch, num_key_value_heads, slen, head_dim = hidden_states.shape + if n_rep == 1: + return hidden_states + hidden_states = hidden_states[:, :, None, :, :].expand( + batch, num_key_value_heads, n_rep, slen, head_dim + ) # [B,num_kv_heads,n_rep,N,head_dim] + return hidden_states.reshape(batch, num_key_value_heads * n_rep, slen, head_dim) # [B,num_heads,N,head_dim] + + +def eager_attention_forward( + module: nn.Module, + query: torch.Tensor, # [B,num_heads,N_q,head_dim] + key: torch.Tensor, # [B,num_kv_heads,N_kv,head_dim] + value: torch.Tensor, # [B,num_kv_heads,N_kv,head_dim] + attention_mask: Optional[torch.Tensor], + scaling: float, + dropout: float = 0.0, + **kwargs: Unpack[TransformersKwargs], +): + key_states = repeat_kv(key, module.num_key_value_groups) # [B,num_heads,N_kv,head_dim] + value_states = repeat_kv(value, module.num_key_value_groups) # [B,num_heads,N_kv,head_dim] + + attn_weights = torch.matmul(query, key_states.transpose(2, 3)) * scaling # [B,num_heads,N_q,N_kv] + if attention_mask is not None: + causal_mask = attention_mask[:, :, :, : key_states.shape[-2]] # [B,1,N_q,N_kv] + attn_weights = attn_weights + causal_mask # [B,num_heads,N_q,N_kv] + + attn_weights = nn.functional.softmax(attn_weights, dim=-1, dtype=torch.float32).to( + query.dtype + ) # [B,num_heads,N_q,N_kv] + attn_weights = nn.functional.dropout(attn_weights, p=dropout, training=module.training) + attn_output = torch.matmul(attn_weights, value_states) # [B,num_heads,N_q,head_dim] + attn_output = attn_output.transpose(1, 2).contiguous() # [B,N_q,num_heads,head_dim] + + return attn_output, attn_weights + + +class Qwen3VLVisionAttention(nn.Module): + def __init__(self, config: Qwen3VLVisionConfig) -> None: + super().__init__() + self.dim = config.hidden_size + self.num_heads = config.num_heads + self.head_dim = self.dim // self.num_heads + 
self.num_key_value_groups = 1 # needed for eager attention + self.qkv = nn.Linear(self.dim, self.dim * 3, bias=True) + self.proj = nn.Linear(self.dim, self.dim) + self.scaling = self.head_dim**-0.5 + self.config = config + self.attention_dropout = 0.0 + self.is_causal = False + + def forward( + self, + hidden_states: torch.Tensor, # [N_vision,hidden_size] + cu_seqlens: torch.Tensor, + rotary_pos_emb: Optional[torch.Tensor] = None, + position_embeddings: Optional[tuple[torch.Tensor, torch.Tensor]] = None, + **kwargs, + ) -> torch.Tensor: # [N_vision,hidden_size] + seq_length = hidden_states.shape[0] + query_states, key_states, value_states = ( + self.qkv(hidden_states).reshape(seq_length, 3, self.num_heads, -1).permute(1, 0, 2, 3).unbind(0) + ) # each: [N_vision,num_heads,head_dim] + cos, sin = position_embeddings + query_states, key_states = apply_rotary_pos_emb_vision(query_states, key_states, cos, sin) + # each: [N_vision,num_heads,head_dim] + + query_states = query_states.transpose(0, 1).unsqueeze(0) # [1,num_heads,N_vision,head_dim] + key_states = key_states.transpose(0, 1).unsqueeze(0) # [1,num_heads,N_vision,head_dim] + value_states = value_states.transpose(0, 1).unsqueeze(0) # [1,num_heads,N_vision,head_dim] + + attention_interface: Callable = eager_attention_forward + if self.config._attn_implementation != "eager": + attention_interface = ALL_ATTENTION_FUNCTIONS[self.config._attn_implementation] + + if self.config._attn_implementation == "flash_attention_2": + # Flash Attention 2: Use cu_seqlens for variable length attention + max_seqlen = (cu_seqlens[1:] - cu_seqlens[:-1]).max() + attn_output, _ = attention_interface( + self, + query_states, + key_states, + value_states, + attention_mask=None, + scaling=self.scaling, + dropout=0.0 if not self.training else self.attention_dropout, + cu_seq_lens_q=cu_seqlens, + cu_seq_lens_k=cu_seqlens, + max_length_q=max_seqlen, + max_length_k=max_seqlen, + is_causal=False, + **kwargs, + ) + else: + # Other implementations: 
Process each chunk separately + lengths = cu_seqlens[1:] - cu_seqlens[:-1] + splits = [ + torch.split(tensor, lengths.tolist(), dim=2) for tensor in (query_states, key_states, value_states) + ] + + attn_outputs = [ + attention_interface( + self, + q, + k, + v, + attention_mask=None, + scaling=self.scaling, + dropout=0.0 if not self.training else self.attention_dropout, + is_causal=False, + **kwargs, + )[0] + for q, k, v in zip(*splits) + ] + attn_output = torch.cat(attn_outputs, dim=1) # [1,N_vision,num_heads,head_dim] + + attn_output = attn_output.reshape(seq_length, -1).contiguous() # [N_vision,hidden_size] + attn_output = self.proj(attn_output) # [N_vision,hidden_size] + return attn_output # [N_vision,hidden_size] + + +class Qwen3VLVisionBlock(nn.Module): + def __init__(self, config, attn_implementation: str = "sdpa") -> None: + super().__init__() + self.norm1 = nn.LayerNorm(config.hidden_size, eps=1e-6) + self.norm2 = nn.LayerNorm(config.hidden_size, eps=1e-6) + self.attn = Qwen3VLVisionAttention(config=config) + self.mlp = Qwen3VLVisionMLP(config=config) + + def forward( + self, + hidden_states: torch.Tensor, + cu_seqlens: torch.Tensor, + rotary_pos_emb: Optional[torch.Tensor] = None, + position_embeddings: Optional[tuple[torch.Tensor, torch.Tensor]] = None, + **kwargs, + ) -> torch.Tensor: + hidden_states = hidden_states + self.attn( + self.norm1(hidden_states), + cu_seqlens=cu_seqlens, + rotary_pos_emb=rotary_pos_emb, + position_embeddings=position_embeddings, + **kwargs, + ) + hidden_states = hidden_states + self.mlp(self.norm2(hidden_states)) + return hidden_states + + +class Qwen3VLTextRotaryEmbedding(nn.Module): + def __init__(self, config: Qwen3VLTextConfig): + super().__init__() + if hasattr(config, "rope_scaling") and config.rope_scaling is not None: + self.rope_type = config.rope_scaling.get("rope_type", "default") + else: + self.rope_type = "default" + self.max_seq_len_cached = config.max_position_embeddings + self.original_max_seq_len = 
config.max_position_embeddings + + self.config = config + self.rope_init_fn = ROPE_INIT_FUNCTIONS[self.rope_type] + + self.mrope_section = ( + config.rope_scaling.get("mrope_section", [24, 20, 20]) if config.rope_scaling is not None else [24, 20, 20] + ) + + def init_weights(self, buffer_device: torch.device | None = None) -> None: + inv_freq, self.attention_scaling = self.rope_init_fn(self.config, buffer_device) + self.register_buffer("inv_freq", inv_freq, persistent=False) + + def apply_interleaved_mrope(self, freqs, mrope_section): + """Apply interleaved MRoPE to 3D rotary embeddings. + Reorganizes frequency layout from chunked [TTT...HHH...WWW] to + interleaved [THTHWHTHW...TT], preserving frequency continuity. + args: + x: (3, bs, seq_len, head_dim // 2) + mrope_section: (3,) + returns: + x_t: (bs, seq_len, head_dim // 2) + """ + freqs_t = freqs[0] # just overwrite the first dimension T + for dim, offset in enumerate((1, 2), start=1): # H, W + length = mrope_section[dim] * 3 + idx = slice(offset, length, 3) + freqs_t[..., idx] = freqs[dim, ..., idx] + return freqs_t + + @torch.no_grad() + @dynamic_rope_update # power user: used with advanced RoPE types (e.g. dynamic rope) + def forward(self, x, position_ids): + assert self.inv_freq.dtype == torch.float32, f"inv_freq must be float32, but got {self.inv_freq.dtype}" + + # In contrast to other models, Qwen3VL has different position ids for the grids + # So we expand the inv_freq to shape (3, ...) 
+ if position_ids.ndim == 2: + position_ids = position_ids[None, ...].expand(3, position_ids.shape[0], -1) # [3,B,N] + inv_freq_expanded = ( + self.inv_freq[None, None, :, None].float().expand(3, position_ids.shape[1], -1, 1).to(x.device) + ) # [3,B,head_dim//2,1] + position_ids_expanded = position_ids[:, :, None, :].float() # [3,B,1,N] + + freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(2, 3) # [3,B,N,head_dim//2] + freqs = self.apply_interleaved_mrope(freqs, self.mrope_section) # [B,N,head_dim//2] + emb = torch.cat((freqs, freqs), dim=-1) # [B,N,head_dim] + cos = emb.cos() * self.attention_scaling # [B,N,head_dim] + sin = emb.sin() * self.attention_scaling # [B,N,head_dim] + + return cos.to(dtype=x.dtype), sin.to(dtype=x.dtype) # each: [B,N,head_dim] + + +class Qwen3VLTextRMSNorm(nn.Module): + def __init__(self, hidden_size: int, eps: float = 1e-6) -> None: + """ + Qwen3VLTextRMSNorm is equivalent to T5LayerNorm + """ + super().__init__() + self.weight = nn.Parameter(torch.ones(hidden_size)) + self.variance_epsilon = eps + + def forward(self, hidden_states: torch.Tensor) -> torch.Tensor: + input_dtype = hidden_states.dtype + hidden_states = hidden_states.to(torch.float32) + variance = hidden_states.pow(2).mean(-1, keepdim=True) + hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon) + return self.weight * hidden_states.to(input_dtype) + + def extra_repr(self) -> str: + return f"{tuple(self.weight.shape)}, eps={self.variance_epsilon}" + + +def apply_rotary_pos_emb(q, k, cos, sin, position_ids=None, unsqueeze_dim=1): + """Applies Rotary Position Embedding to the query and key tensors. + + Args: + q (`torch.Tensor`): The query tensor. + k (`torch.Tensor`): The key tensor. + cos (`torch.Tensor`): The cosine part of the rotary embedding. + sin (`torch.Tensor`): The sine part of the rotary embedding. + position_ids (`torch.Tensor`, *optional*): + Deprecated and unused. 
+ unsqueeze_dim (`int`, *optional*, defaults to 1): + The 'unsqueeze_dim' argument specifies the dimension along which to unsqueeze cos[position_ids] and + sin[position_ids] so that they can be properly broadcast to the dimensions of q and k. For example, note + that cos[position_ids] and sin[position_ids] have the shape [batch_size, seq_len, head_dim]. Then, if q and + k have the shape [batch_size, heads, seq_len, head_dim], then setting unsqueeze_dim=1 makes + cos[position_ids] and sin[position_ids] broadcastable to the shapes of q and k. Similarly, if q and k have + the shape [batch_size, seq_len, heads, head_dim], then set unsqueeze_dim=2. + Returns: + `tuple(torch.Tensor)` comprising the query and key tensors rotated using the Rotary Position Embedding. + """ + cos = cos.unsqueeze(unsqueeze_dim) # [B,1,N,head_dim] + sin = sin.unsqueeze(unsqueeze_dim) # [B,1,N,head_dim] + q_embed = (q * cos) + (rotate_half(q) * sin) # [B,num_heads,N,head_dim] + k_embed = (k * cos) + (rotate_half(k) * sin) # [B,num_kv_heads,N,head_dim] + return q_embed, k_embed # [B,num_heads,N,head_dim], [B,num_kv_heads,N,head_dim] + + +class Qwen3VLTextAttention(nn.Module): + """Multi-headed attention from 'Attention Is All You Need' paper""" + + def __init__(self, config: Qwen3VLTextConfig, layer_idx: int): + super().__init__() + self.config = config + self.layer_idx = layer_idx + self.head_dim = getattr(config, "head_dim", config.hidden_size // config.num_attention_heads) + self.num_key_value_groups = config.num_attention_heads // config.num_key_value_heads + self.scaling = self.head_dim**-0.5 + self.attention_dropout = config.attention_dropout + self.is_causal = True + + self.q_proj = nn.Linear( + config.hidden_size, config.num_attention_heads * self.head_dim, bias=config.attention_bias + ) + self.k_proj = nn.Linear( + config.hidden_size, config.num_key_value_heads * self.head_dim, bias=config.attention_bias + ) + self.v_proj = nn.Linear( + config.hidden_size,
config.num_key_value_heads * self.head_dim, bias=config.attention_bias + ) + self.o_proj = nn.Linear( + config.num_attention_heads * self.head_dim, config.hidden_size, bias=config.attention_bias + ) + self.q_norm = Qwen3VLTextRMSNorm(self.head_dim, eps=config.rms_norm_eps) # unlike olmo, only on the head dim! + self.k_norm = Qwen3VLTextRMSNorm( + self.head_dim, eps=config.rms_norm_eps + ) # thus post q_norm does not need reshape + + @deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58") + def forward( + self, + hidden_states: torch.Tensor, # [B,N,hidden_size] + position_embeddings: tuple[torch.Tensor, torch.Tensor], + attention_mask: Optional[torch.Tensor], + past_key_values: Optional[Cache] = None, + cache_position: Optional[torch.LongTensor] = None, + **kwargs: Unpack[FlashAttentionKwargs], + ) -> tuple[torch.Tensor, Optional[torch.Tensor]]: + input_shape = hidden_states.shape[:-1] + hidden_shape = (*input_shape, -1, self.head_dim) + + query_states = self.q_norm(self.q_proj(hidden_states).view(hidden_shape)).transpose( + 1, 2 + ) # [B,num_heads,N,head_dim] + key_states = self.k_norm(self.k_proj(hidden_states).view(hidden_shape)).transpose( + 1, 2 + ) # [B,num_kv_heads,N,head_dim] + value_states = self.v_proj(hidden_states).view(hidden_shape).transpose(1, 2) # [B,num_kv_heads,N,head_dim] + + cos, sin = position_embeddings + query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin) + # query_states: [B,num_heads,N,head_dim], key_states: [B,num_kv_heads,N,head_dim] + + if past_key_values is not None: + # sin and cos are specific to RoPE models; cache_position needed for the static cache + cache_kwargs = {"sin": sin, "cos": cos, "cache_position": cache_position} + key_states, value_states = past_key_values.update(key_states, value_states, self.layer_idx, cache_kwargs) + # key_states: [B,num_kv_heads,N_cached,head_dim], value_states: [B,num_kv_heads,N_cached,head_dim] + + attention_interface: Callable = 
eager_attention_forward + if self.config._attn_implementation != "eager": + attention_interface = ALL_ATTENTION_FUNCTIONS[self.config._attn_implementation] + + attn_output, attn_weights = attention_interface( + self, + query_states, + key_states, + value_states, + attention_mask, + dropout=0.0 if not self.training else self.attention_dropout, + scaling=self.scaling, + **kwargs, + ) # attn_output: [B,N,num_heads,head_dim] + + attn_output = attn_output.reshape(*input_shape, -1).contiguous() # [B,N,hidden_size] + attn_output = self.o_proj(attn_output) # [B,N,hidden_size] + return attn_output, attn_weights # [B,N,hidden_size], [B,num_heads,N,N_cached] or None + + +class Qwen3VLTextMLP(nn.Module): + def __init__(self, config): + super().__init__() + self.config = config + self.hidden_size = config.hidden_size + self.intermediate_size = config.intermediate_size + self.gate_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False) + self.up_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False) + self.down_proj = nn.Linear(self.intermediate_size, self.hidden_size, bias=False) + self.act_fn = ACT2FN[config.hidden_act] + + def forward(self, x): + down_proj = self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x)) + return down_proj + + +class Qwen3VLTextDecoderLayer(nn.Module): + def __init__(self, config: Qwen3VLTextConfig, layer_idx: int): + super().__init__() + self.hidden_size = config.hidden_size + + self.self_attn = Qwen3VLTextAttention(config=config, layer_idx=layer_idx) + + self.mlp = Qwen3VLTextMLP(config) + self.input_layernorm = Qwen3VLTextRMSNorm(config.hidden_size, eps=config.rms_norm_eps) + self.post_attention_layernorm = Qwen3VLTextRMSNorm(config.hidden_size, eps=config.rms_norm_eps) + + @deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58") + def forward( + self, + hidden_states: torch.Tensor, + position_embeddings: tuple[torch.Tensor, torch.Tensor], + attention_mask: Optional[torch.Tensor] = 
None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + use_cache: Optional[bool] = False, + cache_position: Optional[torch.LongTensor] = None, + output_attentions: Optional[bool] = False, + **kwargs: Unpack[TransformersKwargs], + ) -> tuple[torch.Tensor, Optional[torch.Tensor]]: + residual = hidden_states + hidden_states = self.input_layernorm(hidden_states) + # Self Attention + hidden_states, self_attn_weights = self.self_attn( + hidden_states=hidden_states, + attention_mask=attention_mask, + position_ids=position_ids, + past_key_values=past_key_values, + use_cache=use_cache, + cache_position=cache_position, + position_embeddings=position_embeddings, + **kwargs, + ) + hidden_states = residual + hidden_states + + # Fully Connected + residual = hidden_states + hidden_states = self.post_attention_layernorm(hidden_states) + hidden_states = self.mlp(hidden_states) + hidden_states = residual + hidden_states + + outputs = (hidden_states,) + if output_attentions: + outputs += (self_attn_weights,) + return outputs + + +@dataclass +class Qwen3VLModelOutputWithPast(ModelOutput): + r""" + past_key_values (`Cache`, *optional*, returned when `use_cache=True` is passed or when `config.use_cache=True`): + It is a [`~cache_utils.Cache`] instance. For more details, see our [kv cache guide](https://huggingface.co/docs/transformers/en/kv_cache). + + Contains pre-computed hidden-states (key and values in the self-attention blocks) that can be used (see + `past_key_values` input) to speed up sequential decoding. + rope_deltas (`torch.LongTensor` of shape `(batch_size, )`, *optional*): + The rope index difference between sequence length and multimodal rope. 
+ """ + + last_hidden_state: Optional[torch.FloatTensor] = None + past_key_values: Optional[Cache] = None + hidden_states: Optional[tuple[torch.FloatTensor]] = None + attentions: Optional[tuple[torch.FloatTensor]] = None + rope_deltas: Optional[torch.LongTensor] = None + + +class Qwen3VLPreTrainedModel(PreTrainedModel): + config: Qwen3VLConfig + base_model_prefix = "model" + supports_gradient_checkpointing = True + _no_split_modules = ["Qwen3VLTextDecoderLayer", "Qwen3VLVisionBlock"] + _skip_keys_device_placement = "past_key_values" + _supports_flash_attn = True + _supports_sdpa = True + + _can_compile_fullgraph = True + _supports_attention_backend = True + _can_record_outputs = { + "hidden_states": Qwen3VLTextDecoderLayer, + "attentions": Qwen3VLTextAttention, + } + + def _init_weights(self, module: nn.Module, buffer_device: torch.device | None) -> None: + """Initialize the weights.""" + super()._init_weights(module) + + if isinstance(module, Qwen3VLTextRotaryEmbedding): + module.init_weights(buffer_device=buffer_device) + + def init_weights(self, buffer_device: torch.device | None = None) -> None: + self.apply(functools.partial(self._init_weights, buffer_device=buffer_device)) + + +class Qwen3VLVisionModel(Qwen3VLPreTrainedModel): + config: Qwen3VLVisionConfig + _no_split_modules = ["Qwen3VLVisionBlock"] + + def __init__(self, config, *inputs, **kwargs) -> None: + super().__init__(config, *inputs, **kwargs) + self.spatial_merge_size = config.spatial_merge_size + self.patch_size = config.patch_size + self.spatial_merge_unit = self.spatial_merge_size * self.spatial_merge_size + + self.patch_embed = Qwen3VLVisionPatchEmbed( + config=config, + ) + + self.pos_embed = nn.Embedding(config.num_position_embeddings, config.hidden_size) + self.num_grid_per_side = int(config.num_position_embeddings**0.5) + + head_dim = config.hidden_size // config.num_heads + self.rotary_pos_emb = Qwen3VLVisionRotaryEmbedding(head_dim // 2) + + self.blocks = 
nn.ModuleList([Qwen3VLVisionBlock(config) for _ in range(config.depth)]) + self.merger = Qwen3VLVisionPatchMerger( + config=config, + use_postshuffle_norm=False, + ) + + self.deepstack_visual_indexes = config.deepstack_visual_indexes + self.deepstack_merger_list = nn.ModuleList( + [ + Qwen3VLVisionPatchMerger( + config=config, + use_postshuffle_norm=True, + ) + for _ in range(len(config.deepstack_visual_indexes)) + ] + ) + + self.gradient_checkpointing = False + + def rot_pos_emb(self, grid_thw: torch.Tensor) -> torch.Tensor: # grid_thw: [N_media,3] + merge_size = self.spatial_merge_size + + max_hw = int(grid_thw[:, 1:].max().item()) + freq_table = self.rotary_pos_emb(max_hw) # [max_hw,head_dim//4] + device = freq_table.device + + total_tokens = int(torch.prod(grid_thw, dim=1).sum().item()) + pos_ids = torch.empty((total_tokens, 2), dtype=torch.long, device=device) # [N_vision,2] + + offset = 0 + for num_frames, height, width in grid_thw: + merged_h, merged_w = height // merge_size, width // merge_size + + block_rows = torch.arange(merged_h, device=device) # block row indices + block_cols = torch.arange(merged_w, device=device) # block col indices + intra_row = torch.arange(merge_size, device=device) # intra-block row offsets + intra_col = torch.arange(merge_size, device=device) # intra-block col offsets + + # Compute full-resolution positions + row_idx = ( + block_rows[:, None, None, None] * merge_size + intra_row[None, None, :, None] + ) # [merged_h,1,merge_size,1] + col_idx = ( + block_cols[None, :, None, None] * merge_size + intra_col[None, None, None, :] + ) # [1,merged_w,1,merge_size] + + row_idx = row_idx.expand(merged_h, merged_w, merge_size, merge_size).reshape(-1) # [H*W] + col_idx = col_idx.expand(merged_h, merged_w, merge_size, merge_size).reshape(-1) # [H*W] + + coords = torch.stack((row_idx, col_idx), dim=-1) # [H*W,2] + + if num_frames > 1: + coords = coords.repeat(num_frames, 1) # [T*H*W,2] + + num_tokens = coords.shape[0] + pos_ids[offset : offset 
+ num_tokens] = coords + offset += num_tokens + + embeddings = freq_table[pos_ids] # [N_vision,2,head_dim//4] + embeddings = embeddings.flatten(1) # [N_vision,head_dim//2] + return embeddings # [N_vision,head_dim//2] + + def fast_pos_embed_interpolate(self, grid_thw): + grid_ts, grid_hs, grid_ws = grid_thw[:, 0], grid_thw[:, 1], grid_thw[:, 2] + + idx_list = [[] for _ in range(4)] + weight_list = [[] for _ in range(4)] + + for t, h, w in zip(grid_ts, grid_hs, grid_ws): + h_idxs = torch.linspace(0, self.num_grid_per_side - 1, h) + w_idxs = torch.linspace(0, self.num_grid_per_side - 1, w) + + h_idxs_floor = h_idxs.int() + w_idxs_floor = w_idxs.int() + h_idxs_ceil = (h_idxs.int() + 1).clip(max=self.num_grid_per_side - 1) + w_idxs_ceil = (w_idxs.int() + 1).clip(max=self.num_grid_per_side - 1) + + dh = h_idxs - h_idxs_floor + dw = w_idxs - w_idxs_floor + + base_h = h_idxs_floor * self.num_grid_per_side + base_h_ceil = h_idxs_ceil * self.num_grid_per_side + + indices = [ + (base_h[None].T + w_idxs_floor[None]).flatten(), + (base_h[None].T + w_idxs_ceil[None]).flatten(), + (base_h_ceil[None].T + w_idxs_floor[None]).flatten(), + (base_h_ceil[None].T + w_idxs_ceil[None]).flatten(), + ] + + weights = [ + ((1 - dh)[None].T * (1 - dw)[None]).flatten(), + ((1 - dh)[None].T * dw[None]).flatten(), + (dh[None].T * (1 - dw)[None]).flatten(), + (dh[None].T * dw[None]).flatten(), + ] + + for i in range(4): + idx_list[i].extend(indices[i].tolist()) + weight_list[i].extend(weights[i].tolist()) + + idx_tensor = torch.tensor(idx_list, dtype=torch.long, device=self.pos_embed.weight.device) # [4,N_vision] + weight_tensor = torch.tensor( + weight_list, dtype=self.pos_embed.weight.dtype, device=self.pos_embed.weight.device + ) # [4,N_vision] + pos_embeds = self.pos_embed(idx_tensor) * weight_tensor[:, :, None] # [4,N_vision,hidden_size] + patch_pos_embeds = pos_embeds[0] + pos_embeds[1] + pos_embeds[2] + pos_embeds[3] # [N_vision,hidden_size] + + patch_pos_embeds = patch_pos_embeds.split([h 
* w for h, w in zip(grid_hs, grid_ws)]) + # tuple of [H_i*W_i,hidden_size] per media item + + patch_pos_embeds_permute = [] + merge_size = self.config.spatial_merge_size + for pos_embed, t, h, w in zip(patch_pos_embeds, grid_ts, grid_hs, grid_ws): + pos_embed = pos_embed.repeat(t, 1) # [T*H*W,hidden_size] + pos_embed = ( + pos_embed.view(t, h // merge_size, merge_size, w // merge_size, merge_size, -1) + # [T,H/ms,ms,W/ms,ms,hidden_size] + .permute(0, 1, 3, 2, 4, 5) + # [T,H/ms,W/ms,ms,ms,hidden_size] + .flatten(0, 4) + # [T*H/ms*W/ms*ms*ms,hidden_size] == [T*H*W,hidden_size] + ) + patch_pos_embeds_permute.append(pos_embed) + patch_pos_embeds = torch.cat(patch_pos_embeds_permute) # [N_vision,hidden_size] + return patch_pos_embeds # [N_vision,hidden_size] + + def forward(self, hidden_states: torch.Tensor, grid_thw: torch.Tensor, **kwargs) -> torch.Tensor: + """ + Args: + hidden_states (`torch.Tensor` of shape `(seq_len, hidden_size)`): + The final hidden states of the model. + grid_thw (`torch.Tensor` of shape `(num_images_or_videos, 3)`): + The temporal, height and width of feature shape of each image in LLM. + + Returns: + `torch.Tensor`: hidden_states. 
+ """ + hidden_states = self.patch_embed(hidden_states) # [N_vision,embed_dim] + + pos_embeds = self.fast_pos_embed_interpolate(grid_thw) # [N_vision,hidden_size] + hidden_states = hidden_states + pos_embeds # [N_vision,hidden_size] + + rotary_pos_emb = self.rot_pos_emb(grid_thw) # [N_vision,head_dim//2] + + seq_len, _ = hidden_states.size() + hidden_states = hidden_states.reshape(seq_len, -1) # [N_vision,hidden_size] + rotary_pos_emb = rotary_pos_emb.reshape(seq_len, -1) # [N_vision,head_dim//2] + emb = torch.cat((rotary_pos_emb, rotary_pos_emb), dim=-1) # [N_vision,head_dim] + position_embeddings = (emb.cos(), emb.sin()) # each: [N_vision,head_dim] + + cu_seqlens = torch.repeat_interleave(grid_thw[:, 1] * grid_thw[:, 2], grid_thw[:, 0]).cumsum( + dim=0, + # Select dtype based on the following factors: + # - FA2 requires that cu_seqlens_q must have dtype int32 + # - torch.onnx.export requires that cu_seqlens_q must have same dtype as grid_thw + # See https://github.com/huggingface/transformers/pull/34852 for more information + dtype=grid_thw.dtype if torch.jit.is_tracing() else torch.int32, + ) + cu_seqlens = F.pad(cu_seqlens, (1, 0), value=0) # [N_media+1] + + deepstack_feature_lists = [] + for layer_num, blk in enumerate(self.blocks): + hidden_states = blk( + hidden_states, + cu_seqlens=cu_seqlens, + position_embeddings=position_embeddings, + **kwargs, + ) + if layer_num in self.deepstack_visual_indexes: + deepstack_feature = self.deepstack_merger_list[self.deepstack_visual_indexes.index(layer_num)]( + hidden_states + ) + deepstack_feature_lists.append(deepstack_feature) + + hidden_states = self.merger(hidden_states) # [N_merged,out_hidden_size] + + return hidden_states, deepstack_feature_lists # [N_merged,out_hidden_size], list of [N_merged,out_hidden_size] + + +class Qwen3VLTextModel(Qwen3VLPreTrainedModel): + config: Qwen3VLTextConfig + _no_split_modules = ["Qwen3VLTextDecoderLayer"] + + def __init__(self, config: Qwen3VLTextConfig): + 
super().__init__(config) + self.padding_idx = config.pad_token_id + self.vocab_size = config.vocab_size + + self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx) + self.layers = nn.ModuleList( + [Qwen3VLTextDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)] + ) + self.norm = Qwen3VLTextRMSNorm(config.hidden_size, eps=config.rms_norm_eps) + self.rotary_emb = Qwen3VLTextRotaryEmbedding(config=config) + self.gradient_checkpointing = False + + # Initialize weights and apply final processing + self.post_init() + + def forward( + self, + input_ids: Optional[torch.LongTensor] = None, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + inputs_embeds: Optional[torch.FloatTensor] = None, + use_cache: Optional[bool] = None, + output_attentions: Optional[bool] = None, + output_hidden_states: Optional[bool] = None, + return_dict: Optional[bool] = None, + cache_position: Optional[torch.LongTensor] = None, + # args for deepstack + visual_pos_masks: Optional[torch.Tensor] = None, + deepstack_visual_embeds: Optional[list[torch.Tensor]] = None, + **kwargs: Unpack[FlashAttentionKwargs], + ) -> Union[tuple, BaseModelOutputWithPast]: + r""" + visual_pos_masks (`torch.Tensor` of shape `(batch_size, seqlen)`, *optional*): + The mask of the visual positions. + deepstack_visual_embeds (`list[torch.Tensor]`, *optional*): + The deepstack visual embeddings. The shape is (num_layers, visual_seqlen, embed_dim). + The feature is extracted from the different visual encoder layers, and fed to the decoder + hidden states. It's from the paper DeepStack(https://arxiv.org/abs/2406.04334). 
+ """ + output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions + output_hidden_states = ( + output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states + ) + use_cache = use_cache if use_cache is not None else self.config.use_cache + return_dict = return_dict if return_dict is not None else self.config.use_return_dict + + if (input_ids is None) ^ (inputs_embeds is not None): + raise ValueError("You must specify exactly one of input_ids or inputs_embeds") + + # torch.jit.trace() doesn't support cache objects in the output + if use_cache and past_key_values is None and not torch.jit.is_tracing(): + past_key_values = DynamicCache(config=self.config) + + if inputs_embeds is None: + inputs_embeds = self.embed_tokens(input_ids) # [B,N,hidden_size] + + if cache_position is None: + past_seen_tokens = past_key_values.get_seq_length() if past_key_values is not None else 0 + cache_position = torch.arange( + past_seen_tokens, past_seen_tokens + inputs_embeds.shape[1], device=inputs_embeds.device + ) # [N] + + # the hard coded `3` is for temporal, height and width. 
+ if position_ids is None: + position_ids = cache_position.view(1, 1, -1).expand(3, inputs_embeds.shape[0], -1) # [3,B,N] + elif position_ids.ndim == 2: + position_ids = position_ids[None, ...].expand(3, position_ids.shape[0], -1) # [3,B,N] + + if position_ids.ndim == 3 and position_ids.shape[0] == 4: + text_position_ids = position_ids[0] + position_ids = position_ids[1:] + else: + text_position_ids = position_ids[0] + + attention_mask = create_causal_mask( + config=self.config, + input_embeds=inputs_embeds, + attention_mask=attention_mask, + cache_position=cache_position, + past_key_values=past_key_values, + position_ids=text_position_ids, + ) + + hidden_states = inputs_embeds # [B,N,hidden_size] + + # create position embeddings to be shared across the decoder layers + position_embeddings = self.rotary_emb(hidden_states, position_ids) # each: [B,N,head_dim] + + # Initialize collectors like Qwen3 + all_hidden_states = () if output_hidden_states else None + all_self_attns = () if output_attentions else None + + # decoder layers + for layer_idx, decoder_layer in enumerate(self.layers): + if output_hidden_states: + all_hidden_states += (hidden_states,) + + layer_outputs = decoder_layer( + hidden_states, + attention_mask=attention_mask, + position_ids=text_position_ids, + past_key_values=past_key_values, + cache_position=cache_position, + position_embeddings=position_embeddings, + output_attentions=output_attentions, + **kwargs, + ) + hidden_states = layer_outputs[0] + + if output_attentions: + all_self_attns += (layer_outputs[1],) + + # add visual features to the hidden states of first several layers + if deepstack_visual_embeds is not None and layer_idx in range(len(deepstack_visual_embeds)): + hidden_states = self._deepstack_process( + hidden_states, + visual_pos_masks, + deepstack_visual_embeds[layer_idx], + ) + + hidden_states = self.norm(hidden_states) # [B,N,hidden_size] + + # add hidden states from the last decoder layer + if output_hidden_states: + 
all_hidden_states += (hidden_states,) + + if not return_dict: + return tuple( + v + for v in [hidden_states, past_key_values if use_cache else None, all_hidden_states, all_self_attns] + if v is not None + ) + + return BaseModelOutputWithPast( + last_hidden_state=hidden_states, + past_key_values=past_key_values, + hidden_states=all_hidden_states, + attentions=all_self_attns, + ) + + def _deepstack_process( + self, + hidden_states: torch.Tensor, # [B,N,hidden_size] + visual_pos_masks: torch.Tensor, # [B,N] bool + visual_embeds: torch.Tensor, # [N_vision_tokens,hidden_size] + ): + visual_pos_masks = visual_pos_masks.to(hidden_states.device) + visual_embeds = visual_embeds.to(hidden_states.device, hidden_states.dtype) + local_this = hidden_states[visual_pos_masks, :].clone() + visual_embeds # [N_vision_tokens,hidden_size] + hidden_states[visual_pos_masks, :] = local_this + return hidden_states # [B,N,hidden_size] + + +class Qwen3VLModel(Qwen3VLPreTrainedModel): + base_model_prefix = "" + _checkpoint_conversion_mapping = {} + # Reference: fix gemma3 grad acc #37208 + accepts_loss_kwargs = False + config: Qwen3VLConfig + _no_split_modules = ["Qwen3VLTextDecoderLayer", "Qwen3VLVisionBlock"] + + def __init__(self, config): + super().__init__(config) + self.visual = Qwen3VLVisionModel._from_config(config.vision_config) + self.language_model = Qwen3VLTextModel._from_config(config.text_config) + self.rope_deltas = None # cache rope_deltas here + + # Initialize weights and apply final processing + self.post_init() + + def get_input_embeddings(self): + return self.language_model.get_input_embeddings() + + def set_input_embeddings(self, value): + self.language_model.set_input_embeddings(value) + + def set_decoder(self, decoder): + self.language_model = decoder + + def get_decoder(self): + return self.language_model + + def get_rope_index( + self, + input_ids: Optional[torch.LongTensor] = None, + image_grid_thw: Optional[torch.LongTensor] = None, + video_grid_thw: 
Optional[torch.LongTensor] = None, + attention_mask: Optional[torch.Tensor] = None, + ) -> tuple[torch.Tensor, torch.Tensor]: + """Different from the original implementation, Qwen3VL uses timestamps rather than absolute time position ids.""" + + # Since we use timestamps to separate videos, like , the video_grid_thw should also be split + if video_grid_thw is not None: + video_grid_thw = torch.repeat_interleave(video_grid_thw, video_grid_thw[:, 0], dim=0) + video_grid_thw[:, 0] = 1 + + spatial_merge_size = self.config.vision_config.spatial_merge_size + image_token_id = self.config.image_token_id + video_token_id = self.config.video_token_id + vision_start_token_id = self.config.vision_start_token_id + mrope_position_deltas = [] + if input_ids is not None and (image_grid_thw is not None or video_grid_thw is not None): + total_input_ids = input_ids + if attention_mask is None: + attention_mask = torch.ones_like(total_input_ids) + position_ids = torch.ones( + 3, + input_ids.shape[0], + input_ids.shape[1], + dtype=input_ids.dtype, + device=input_ids.device, + ) # [3,B,N] + image_index, video_index = 0, 0 + attention_mask = attention_mask.to(total_input_ids.device) + for i, input_ids in enumerate(total_input_ids): + input_ids = input_ids[attention_mask[i] == 1] + image_nums, video_nums = 0, 0 + vision_start_indices = torch.argwhere(input_ids == vision_start_token_id).squeeze(1) + vision_tokens = input_ids[vision_start_indices + 1] + image_nums = (vision_tokens == image_token_id).sum() + video_nums = (vision_tokens == video_token_id).sum() + input_tokens = input_ids.tolist() + llm_pos_ids_list: list = [] + st = 0 + remain_images, remain_videos = image_nums, video_nums + for _ in range(image_nums + video_nums): + if image_token_id in input_tokens and remain_images > 0: + ed_image = input_tokens.index(image_token_id, st) + else: + ed_image = len(input_tokens) + 1 + if video_token_id in input_tokens and remain_videos > 0: + ed_video = input_tokens.index(video_token_id, st) +
else: + ed_video = len(input_tokens) + 1 + if ed_image < ed_video: + t, h, w = ( + image_grid_thw[image_index][0], + image_grid_thw[image_index][1], + image_grid_thw[image_index][2], + ) + image_index += 1 + remain_images -= 1 + ed = ed_image + + else: + t, h, w = ( + video_grid_thw[video_index][0], + video_grid_thw[video_index][1], + video_grid_thw[video_index][2], + ) + video_index += 1 + remain_videos -= 1 + ed = ed_video + llm_grid_t, llm_grid_h, llm_grid_w = ( + t.item(), + h.item() // spatial_merge_size, + w.item() // spatial_merge_size, + ) + text_len = ed - st + + st_idx = llm_pos_ids_list[-1].max() + 1 if len(llm_pos_ids_list) > 0 else 0 + llm_pos_ids_list.append(torch.arange(text_len).view(1, -1).expand(3, -1) + st_idx) # [3,text_len] + + # t_index is always 0 because llm_grid_t is always 1 (we use timestamps to encode the temporal information for videos) + t_index = ( + torch.arange(llm_grid_t).view(-1, 1).expand(-1, llm_grid_h * llm_grid_w).flatten() + ) # [T*H*W] + h_index = ( + torch.arange(llm_grid_h).view(1, -1, 1).expand(llm_grid_t, -1, llm_grid_w).flatten() + ) # [T*H*W] + w_index = ( + torch.arange(llm_grid_w).view(1, 1, -1).expand(llm_grid_t, llm_grid_h, -1).flatten() + ) # [T*H*W] + llm_pos_ids_list.append(torch.stack([t_index, h_index, w_index]) + text_len + st_idx) # [3,T*H*W] + st = ed + llm_grid_t * llm_grid_h * llm_grid_w + + if st < len(input_tokens): + st_idx = llm_pos_ids_list[-1].max() + 1 if len(llm_pos_ids_list) > 0 else 0 + text_len = len(input_tokens) - st + llm_pos_ids_list.append(torch.arange(text_len).view(1, -1).expand(3, -1) + st_idx) # [3,text_len] + + llm_positions = torch.cat(llm_pos_ids_list, dim=1).reshape(3, -1) # [3,N_unmasked] + position_ids[..., i, attention_mask[i] == 1] = llm_positions.to(position_ids.device) + mrope_position_deltas.append(llm_positions.max() + 1 - len(total_input_ids[i])) + mrope_position_deltas = torch.tensor(mrope_position_deltas, device=input_ids.device).unsqueeze(1) # [B,1] + return 
position_ids, mrope_position_deltas # [3,B,N], [B,1] + else: + if attention_mask is not None: + position_ids = attention_mask.long().cumsum(-1) - 1 # [B,N] + position_ids.masked_fill_(attention_mask == 0, 1) + position_ids = position_ids.unsqueeze(0).expand(3, -1, -1).to(attention_mask.device) # [3,B,N] + max_position_ids = position_ids.max(0, keepdim=False)[0].max(-1, keepdim=True)[0] # [B,1] + mrope_position_deltas = max_position_ids + 1 - attention_mask.shape[-1] # [B,1] + else: + position_ids = ( + torch.arange(input_ids.shape[1], device=input_ids.device) + .view(1, 1, -1) + .expand(3, input_ids.shape[0], -1) + ) # [3,B,N] + mrope_position_deltas = torch.zeros( + [input_ids.shape[0], 1], + device=input_ids.device, + dtype=input_ids.dtype, + ) # [B,1] + + return position_ids, mrope_position_deltas # [3,B,N], [B,1] + + def get_video_features( + self, pixel_values_videos: torch.FloatTensor, video_grid_thw: Optional[torch.LongTensor] = None + ): + """ + Encodes videos into continuous embeddings that can be forwarded to the language model. The deepstack visual features are also returned. + + Args: + pixel_values_videos (`torch.FloatTensor` of shape `(batch_size, num_channels, image_size, image_size)`): + The tensors corresponding to the input videos. + video_grid_thw (`torch.LongTensor` of shape `(num_videos, 3)`, *optional*): + The temporal, height and width of feature shape of each video in LLM. + """ + # Same implementation as for images + return self.get_image_features(pixel_values_videos, video_grid_thw) + + def get_image_features(self, pixel_values: torch.FloatTensor, image_grid_thw: Optional[torch.LongTensor] = None): + """ + Encodes images into continuous embeddings that can be forwarded to the language model. The deepstack visual features are also returned. + + Args: + pixel_values (`torch.FloatTensor` of shape `(batch_size, num_channels, image_size, image_size)`): + The tensors corresponding to the input images. 
+ image_grid_thw (`torch.LongTensor` of shape `(num_images, 3)`, *optional*): + The temporal, height and width of feature shape of each image in LLM. + """ + pixel_values = pixel_values.type(self.visual.dtype) + image_embeds, deepstack_image_embeds = self.visual(pixel_values, grid_thw=image_grid_thw) + # image_embeds: [N_all_merged,out_hidden_size] + split_sizes = (image_grid_thw.prod(-1) // self.visual.spatial_merge_size**2).tolist() + image_embeds = torch.split(image_embeds, split_sizes) + # tuple of [N_merged_i,out_hidden_size] per image + return image_embeds, deepstack_image_embeds + + def get_placeholder_mask( + self, + input_ids: torch.LongTensor, + inputs_embeds: torch.FloatTensor, + image_features: Optional[torch.FloatTensor] = None, + video_features: Optional[torch.FloatTensor] = None, + ): + """ + Obtains multimodal placeholder mask from `input_ids` or `inputs_embeds`, and checks that the placeholder token count is + equal to the length of multimodal features. If the lengths are different, an error is raised. 
+ """ + if input_ids is None: + special_image_mask = inputs_embeds == self.get_input_embeddings()( + torch.tensor(self.config.image_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + special_image_mask = special_image_mask.all(-1) + special_video_mask = inputs_embeds == self.get_input_embeddings()( + torch.tensor(self.config.video_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + special_video_mask = special_video_mask.all(-1) + else: + special_image_mask = input_ids == self.config.image_token_id + special_video_mask = input_ids == self.config.video_token_id + + n_image_tokens = special_image_mask.sum() + special_image_mask = ( + special_image_mask.unsqueeze(-1).expand_as(inputs_embeds).to(inputs_embeds.device) + ) # [B,N,hidden_size] bool + if image_features is not None and inputs_embeds[special_image_mask].numel() != image_features.numel(): + raise ValueError( + f"Image features and image tokens do not match: tokens: {n_image_tokens}, features {image_features.shape[0]}" + ) + + n_video_tokens = special_video_mask.sum() + special_video_mask = ( + special_video_mask.unsqueeze(-1).expand_as(inputs_embeds).to(inputs_embeds.device) + ) # [B,N,hidden_size] bool + if video_features is not None and inputs_embeds[special_video_mask].numel() != video_features.numel(): + raise ValueError( + f"Videos features and video tokens do not match: tokens: {n_video_tokens}, features {video_features.shape[0]}" + ) + + return special_image_mask, special_video_mask # each: [B,N,hidden_size] bool + + def forward( + self, + input_ids: torch.LongTensor = None, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + inputs_embeds: Optional[torch.FloatTensor] = None, + pixel_values: Optional[torch.Tensor] = None, + pixel_values_videos: Optional[torch.FloatTensor] = None, + image_grid_thw: Optional[torch.LongTensor] = None, + video_grid_thw: Optional[torch.LongTensor] = None, + 
cache_position: Optional[torch.LongTensor] = None, + **kwargs: Unpack[TransformersKwargs], + ) -> Union[tuple, Qwen3VLModelOutputWithPast]: + r""" + image_grid_thw (`torch.LongTensor` of shape `(num_images, 3)`, *optional*): + The temporal, height and width of feature shape of each image in LLM. + video_grid_thw (`torch.LongTensor` of shape `(num_videos, 3)`, *optional*): + The temporal, height and width of feature shape of each video in LLM. + """ + if (input_ids is None) ^ (inputs_embeds is not None): + raise ValueError("You must specify exactly one of input_ids or inputs_embeds") + + if inputs_embeds is None: + inputs_embeds = self.get_input_embeddings()(input_ids) # [B,N,hidden_size] + + image_mask = None + video_mask = None + + if pixel_values is not None: + image_embeds, deepstack_image_embeds = self.get_image_features(pixel_values, image_grid_thw) + image_embeds = torch.cat(image_embeds, dim=0).to( + inputs_embeds.device, inputs_embeds.dtype + ) # [N_image_tokens,hidden_size] + image_mask, _ = self.get_placeholder_mask( + input_ids, inputs_embeds=inputs_embeds, image_features=image_embeds + ) # image_mask: [B,N,hidden_size] bool + inputs_embeds = inputs_embeds.masked_scatter(image_mask, image_embeds) # [B,N,hidden_size] + + if pixel_values_videos is not None: + video_embeds, deepstack_video_embeds = self.get_video_features(pixel_values_videos, video_grid_thw) + video_embeds = torch.cat(video_embeds, dim=0).to( + inputs_embeds.device, inputs_embeds.dtype + ) # [N_video_tokens,hidden_size] + _, video_mask = self.get_placeholder_mask( + input_ids, inputs_embeds=inputs_embeds, video_features=video_embeds + ) # video_mask: [B,N,hidden_size] bool + inputs_embeds = inputs_embeds.masked_scatter(video_mask, video_embeds) # [B,N,hidden_size] + + visual_pos_masks = None + deepstack_visual_embeds = None + if image_mask is not None and video_mask is not None: + # aggregate visual_pos_masks and deepstack_visual_embeds + image_mask = image_mask[..., 0] # [B,N] bool + 
video_mask = video_mask[..., 0] # [B,N] bool + visual_pos_masks = image_mask | video_mask # [B,N] bool + deepstack_visual_embeds = [] + image_mask_joint = image_mask[visual_pos_masks] # [N_visual_tokens] bool + video_mask_joint = video_mask[visual_pos_masks] # [N_visual_tokens] bool + for img_embed, vid_embed in zip(deepstack_image_embeds, deepstack_video_embeds): + embed_joint = img_embed.new_zeros(visual_pos_masks.sum(), img_embed.shape[-1]).to( + img_embed.device + ) # [N_visual_tokens,out_hidden_size] + embed_joint[image_mask_joint, :] = img_embed + embed_joint[video_mask_joint, :] = vid_embed + deepstack_visual_embeds.append(embed_joint) # [N_visual_tokens,out_hidden_size] + elif image_mask is not None: + image_mask = image_mask[..., 0] # [B,N] bool + visual_pos_masks = image_mask + deepstack_visual_embeds = deepstack_image_embeds + elif video_mask is not None: + video_mask = video_mask[..., 0] # [B,N] bool + visual_pos_masks = video_mask + deepstack_visual_embeds = deepstack_video_embeds + + if position_ids is None: + attention_mask_tensor = ( + attention_mask if not isinstance(attention_mask, dict) else attention_mask["full_attention"] + ) + if attention_mask_tensor is not None and attention_mask_tensor.ndim == 4: + attention_mask_tensor = torch.diagonal(attention_mask_tensor[:, 0], dim1=1, dim2=2) # [B,N] + # Only apply conversion for floating point tensors (inverted masks) + if attention_mask_tensor.dtype.is_floating_point: + attention_mask_tensor = attention_mask_tensor / torch.finfo(attention_mask_tensor.dtype).min + attention_mask_tensor = (1.0 - attention_mask_tensor).int() # [B,N] + + # Calculate RoPE index once per generation in the pre-fill stage only. 
+ # When compiling, we can't check tensor values thus we check only input length + # It is safe to assume that `length!=1` means we're in pre-fill because compiled + # models currently cannot do assisted decoding + prefill_compiled_stage = is_torchdynamo_compiling() and ( + (input_ids is not None and input_ids.shape[1] != 1) + or (inputs_embeds is not None and inputs_embeds.shape[1] != 1) + ) + prefill_noncompiled_stage = not is_torchdynamo_compiling() and ( + (cache_position is not None and cache_position[0] == 0) + or (past_key_values is None or past_key_values.get_seq_length() == 0) + ) + if (prefill_compiled_stage or prefill_noncompiled_stage) or self.rope_deltas is None: + position_ids, rope_deltas = self.get_rope_index( + input_ids, + image_grid_thw, + video_grid_thw, + attention_mask=attention_mask_tensor, + ) + self.rope_deltas = rope_deltas + # then use the prev pre-calculated rope-deltas to get the correct position ids + else: + batch_size, seq_length, _ = inputs_embeds.shape + delta = ( + (cache_position[0] + self.rope_deltas).to(inputs_embeds.device) if cache_position is not None else 0 + ) # [B,1] or scalar + position_ids = torch.arange(seq_length, device=inputs_embeds.device) # [N] + position_ids = position_ids.view(1, -1).expand(batch_size, -1) # [B,N] + if cache_position is not None: # otherwise `delta` is an int `0` + delta = delta.repeat_interleave(batch_size // delta.shape[0], dim=0) # [B,1] + position_ids = position_ids.add(delta) # [B,N] + position_ids = position_ids.unsqueeze(0).expand(3, -1, -1) # [3,B,N] + + outputs = self.language_model( + input_ids=None, + position_ids=position_ids, + attention_mask=attention_mask, + past_key_values=past_key_values, + inputs_embeds=inputs_embeds, + cache_position=cache_position, + visual_pos_masks=visual_pos_masks, + deepstack_visual_embeds=deepstack_visual_embeds, + **kwargs, + ) + + return Qwen3VLModelOutputWithPast( + last_hidden_state=outputs.last_hidden_state, + 
past_key_values=outputs.past_key_values, + rope_deltas=self.rope_deltas, + ) + + +@dataclass +class Qwen3VLCausalLMOutputWithPast(ModelOutput): + r""" + loss (`torch.FloatTensor` of shape `(1,)`, *optional*, returned when `labels` is provided): + Language modeling loss (for next-token prediction). + logits (`torch.FloatTensor` of shape `(batch_size, sequence_length, config.vocab_size)`): + Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax). + past_key_values (`Cache`, *optional*, returned when `use_cache=True` is passed or when `config.use_cache=True`): + It is a [`~cache_utils.Cache`] instance. For more details, see our [kv cache guide](https://huggingface.co/docs/transformers/en/kv_cache). + + Contains pre-computed hidden-states (key and values in the self-attention blocks) that can be used (see + `past_key_values` input) to speed up sequential decoding. + rope_deltas (`torch.LongTensor` of shape `(batch_size, )`, *optional*): + The rope index difference between sequence length and multimodal rope. 
+ """ + + loss: Optional[torch.FloatTensor] = None + logits: Optional[torch.FloatTensor] = None + past_key_values: Optional[Cache] = None + hidden_states: Optional[tuple[torch.FloatTensor]] = None + attentions: Optional[tuple[torch.FloatTensor]] = None + rope_deltas: Optional[torch.LongTensor] = None + + +class Qwen3VLForConditionalGeneration(Qwen3VLPreTrainedModel, GenerationMixin): + _checkpoint_conversion_mapping = {} + _tied_weights_keys = ["lm_head.weight"] + # Reference: fix gemma3 grad acc #37208 + accepts_loss_kwargs = False + config: Qwen3VLConfig + + def __init__(self, config): + super().__init__(config) + self.model = Qwen3VLModel(config) + self.lm_head = nn.Linear(config.text_config.hidden_size, config.text_config.vocab_size, bias=False) + + self.post_init() + + def get_input_embeddings(self): + return self.model.get_input_embeddings() + + def set_input_embeddings(self, value): + self.model.set_input_embeddings(value) + + def set_decoder(self, decoder): + self.model.set_decoder(decoder) + + def get_decoder(self): + return self.model.get_decoder() + + def get_video_features( + self, pixel_values_videos: torch.FloatTensor, video_grid_thw: Optional[torch.LongTensor] = None + ): + return self.model.get_video_features(pixel_values_videos, video_grid_thw) + + def get_image_features(self, pixel_values: torch.FloatTensor, image_grid_thw: Optional[torch.LongTensor] = None): + return self.model.get_image_features(pixel_values, image_grid_thw) + + # Make modules available through conditional class for BC + @property + def language_model(self): + return self.model.language_model + + @property + def visual(self): + return self.model.visual + + def forward( + self, + input_ids: torch.LongTensor = None, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + inputs_embeds: Optional[torch.FloatTensor] = None, + labels: Optional[torch.LongTensor] = None, + pixel_values: 
Optional[torch.Tensor] = None, + pixel_values_videos: Optional[torch.FloatTensor] = None, + image_grid_thw: Optional[torch.LongTensor] = None, + video_grid_thw: Optional[torch.LongTensor] = None, + cache_position: Optional[torch.LongTensor] = None, + logits_to_keep: Union[int, torch.Tensor] = 0, + **kwargs: Unpack[TransformersKwargs], + ) -> Union[tuple, Qwen3VLCausalLMOutputWithPast]: + r""" + labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*): + Labels for computing the masked language modeling loss. Indices should either be in `[0, ..., + config.vocab_size]` or -100 (see `input_ids` docstring). Tokens with indices set to `-100` are ignored + (masked), the loss is only computed for the tokens with labels in `[0, ..., config.vocab_size]`. + image_grid_thw (`torch.LongTensor` of shape `(num_images, 3)`, *optional*): + The temporal, height and width of feature shape of each image in LLM. + video_grid_thw (`torch.LongTensor` of shape `(num_videos, 3)`, *optional*): + The temporal, height and width of feature shape of each video in LLM. 
+ + Example: + TODO: Add example + """ + outputs = self.model( + input_ids=input_ids, + pixel_values=pixel_values, + pixel_values_videos=pixel_values_videos, + image_grid_thw=image_grid_thw, + video_grid_thw=video_grid_thw, + position_ids=position_ids, + attention_mask=attention_mask, + past_key_values=past_key_values, + inputs_embeds=inputs_embeds, + cache_position=cache_position, + **kwargs, + ) + + hidden_states = outputs[0] # [B,N,hidden_size] + + # Only compute necessary logits, and do not upcast them to float if we are not computing the loss + slice_indices = slice(-logits_to_keep, None) if isinstance(logits_to_keep, int) else logits_to_keep + logits = self.lm_head(hidden_states[:, slice_indices, :]) # [B,N_keep,vocab_size] + + loss = None + if labels is not None: + loss = self.loss_function(logits=logits, labels=labels, vocab_size=self.config.text_config.vocab_size) + + return Qwen3VLCausalLMOutputWithPast( + loss=loss, + logits=logits, + past_key_values=outputs.past_key_values, + rope_deltas=outputs.rope_deltas, + ) + + def prepare_inputs_for_generation( + self, + input_ids, + past_key_values=None, + attention_mask=None, + inputs_embeds=None, + cache_position=None, + position_ids=None, + use_cache=True, + pixel_values=None, + pixel_values_videos=None, + image_grid_thw=None, + video_grid_thw=None, + **kwargs, + ): + # Overwritten -- in specific circumstances we don't want to forward image inputs to the model + + model_inputs = super().prepare_inputs_for_generation( + input_ids, + past_key_values=past_key_values, + attention_mask=attention_mask, + inputs_embeds=inputs_embeds, + cache_position=cache_position, + position_ids=position_ids, + pixel_values=pixel_values, + pixel_values_videos=pixel_values_videos, + image_grid_thw=image_grid_thw, + video_grid_thw=video_grid_thw, + use_cache=use_cache, + **kwargs, + ) + + # Qwen3VL position_ids are prepared with rope_deltas in forward + model_inputs["position_ids"] = None + + if cache_position[0] != 0: + 
model_inputs["pixel_values"] = None + model_inputs["pixel_values_videos"] = None + + return model_inputs + + def _get_image_nums_and_video_nums( + self, + input_ids: Optional[torch.LongTensor], + inputs_embeds: Optional[torch.Tensor] = None, + ) -> tuple[torch.Tensor, torch.Tensor]: + """ + Get the number of images and videos for each sample to calculate the separation length of the sample tensor. + These parameters are not passed through the processor to avoid unpredictable impacts from interface modifications. + + Args: + input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`): + Indices of input sequence tokens in the vocabulary. + + Returns: + image_nums (`torch.LongTensor` of shape `(batch_size, num_images_sample)`) + video_nums (`torch.LongTensor` of shape `(batch_size, num_videos_sample)`) + """ + image_token_id = self.config.image_token_id + video_token_id = self.config.video_token_id + vision_start_token_id = self.config.vision_start_token_id + + if inputs_embeds is not None: + vision_start_mask = ( + inputs_embeds + == self.get_input_embeddings()( + torch.tensor(vision_start_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + )[..., 0] + image_mask = ( + inputs_embeds + == self.get_input_embeddings()( + torch.tensor(image_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + )[..., 0] + video_mask = ( + inputs_embeds + == self.get_input_embeddings()( + torch.tensor(video_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + )[..., 0] + else: + vision_start_mask = input_ids == vision_start_token_id + image_mask = input_ids == image_token_id + video_mask = input_ids == video_token_id + + vision_first_mask = torch.roll(vision_start_mask, shifts=1, dims=1) + image_nums = torch.sum(vision_first_mask & image_mask, dim=1) + video_nums = torch.sum(vision_first_mask & video_mask, dim=1) + + return image_nums, video_nums + + def _expand_inputs_for_generation( + self, + expand_size: int = 1, + is_encoder_decoder: bool = 
False, + input_ids: Optional[torch.LongTensor] = None, + **model_kwargs, + ) -> tuple[torch.LongTensor, dict[str, Any]]: + # Overwritten -- Support for expanding tensors without a batch size dimension + # e.g., pixel_values, image_grid_thw, pixel_values_videos, video_grid_thw, second_per_grid_t + # pixel_values.shape[0] is sum(seqlen_images for samples) + # image_grid_thw.shape[0] is sum(num_images for samples) + + if expand_size == 1: + return input_ids, model_kwargs + + visual_keys = ["pixel_values", "image_grid_thw", "pixel_values_videos", "video_grid_thw", "second_per_grid_ts"] + + def _expand_dict_for_generation_visual(dict_to_expand): + image_grid_thw = model_kwargs.get("image_grid_thw", None) + video_grid_thw = model_kwargs.get("video_grid_thw", None) + image_nums, video_nums = self._get_image_nums_and_video_nums( + input_ids, inputs_embeds=model_kwargs.get("inputs_embeds", None) + ) + + def _repeat_interleave_samples(x, lengths, repeat_times): + samples = torch.split(x, lengths) + repeat_args = [repeat_times] + [1] * (x.dim() - 1) + result = torch.cat([sample.repeat(*repeat_args) for sample in samples], dim=0) + return result + + for key in dict_to_expand: + if key == "pixel_values": + # split images into samples + samples = torch.split(image_grid_thw, list(image_nums)) + # compute the sequence length of images for each sample + lengths = [torch.prod(sample, dim=1).sum() for sample in samples] + dict_to_expand[key] = _repeat_interleave_samples( + dict_to_expand[key], lengths=lengths, repeat_times=expand_size + ) + elif key == "image_grid_thw": + # get the num of images for each sample + lengths = list(image_nums) + dict_to_expand[key] = _repeat_interleave_samples( + dict_to_expand[key], lengths=lengths, repeat_times=expand_size + ) + elif key == "pixel_values_videos": + samples = torch.split(video_grid_thw, list(video_nums)) + lengths = [torch.prod(sample, dim=1).sum() for sample in samples] + dict_to_expand[key] = _repeat_interleave_samples( + 
dict_to_expand[key], lengths=lengths, repeat_times=expand_size + ) + elif key == "video_grid_thw": + lengths = list(video_nums) + dict_to_expand[key] = _repeat_interleave_samples( + dict_to_expand[key], lengths=lengths, repeat_times=expand_size + ) + elif key == "second_per_grid_ts": + dict_to_expand[key] = _repeat_interleave_samples( + dict_to_expand[key], lengths=list(video_nums), repeat_times=expand_size + ) + return dict_to_expand + + def _expand_dict_for_generation(dict_to_expand): + for key in dict_to_expand: + if ( + key != "cache_position" + and dict_to_expand[key] is not None + and isinstance(dict_to_expand[key], torch.Tensor) + and key not in visual_keys + ): + dict_to_expand[key] = dict_to_expand[key].repeat_interleave(expand_size, dim=0) + return dict_to_expand + + model_kwargs = _expand_dict_for_generation_visual(model_kwargs) + + if input_ids is not None: + input_ids = input_ids.repeat_interleave(expand_size, dim=0) + + model_kwargs = _expand_dict_for_generation(model_kwargs) + + if is_encoder_decoder: + if model_kwargs.get("encoder_outputs") is None: + raise ValueError("If `is_encoder_decoder` is True, make sure that `encoder_outputs` is defined.") + model_kwargs["encoder_outputs"] = _expand_dict_for_generation(model_kwargs["encoder_outputs"]) + + return input_ids, model_kwargs + + +__all__ = [ + "Qwen3VLVisionModel", + "Qwen3VLForConditionalGeneration", + "Qwen3VLModel", + "Qwen3VLPreTrainedModel", + "Qwen3VLTextModel", +] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/utils.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/utils.py new file mode 100644 index 00000000..7d212875 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/utils.py @@ -0,0 +1,348 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Core masking functions extracted from transformers.masking_utils for BAGEL compatibility +# Original Copyright 2025 HuggingFace Inc. team. Licensed under the Apache License, Version 2.0 + +from typing import Callable, ClassVar, Optional, cast + +import torch +from transformers.cache_utils import Cache +from transformers.configuration_utils import PretrainedConfig +from transformers.tokenization_utils_base import PreTrainedTokenizerBase +from transformers.utils import logging + +# from transformers.utils.generic import GeneralInterface +from transformers.utils.import_utils import is_torch_greater_or_equal + +logger = logging.get_logger(__name__) + +_is_torch_greater_or_equal_than_2_6 = is_torch_greater_or_equal("2.6", accept_dev=True) + +_SYSTEM_PROMPT_IMAGE = "You are a helpful assistant who will generate images from a give prompt." +_SYSTEM_PROMPT_VIDEO = "You are a helpful assistant who will generate videos from a give prompt." +_SYSTEM_PROMPT_TRANSFER = ( + "You are a helpful assistant that generates images or videos following the user's instructions" + " and control signals (edge maps, blur, depth, or segmentation)." +) +_SYSTEM_PROMPT_IMAGE_EDITING = "You are a helpful assistant who will edit images based on the user's instructions." 
+ + +def tokenize_caption( + caption: str, + tokenizer: PreTrainedTokenizerBase, + is_video: bool = False, + use_system_prompt: bool = False, + system_prompt: Optional[str] = None, +) -> list[int]: + """Tokenize a text caption into token IDs using the Qwen2 chat template. + + Wraps the caption in a chat-style conversation (with a "user" role) and applies + the tokenizer's chat template to produce the final token ID sequence, including + any special tokens (e.g., BOS, role markers, generation prompt). + + Args: + caption: The text caption to tokenize. + tokenizer: A HuggingFace ``PreTrainedTokenizerBase`` (e.g. Qwen2Tokenizer or Fast tokenizer). + is_video: If True (and use_system_prompt=True), uses the video system prompt; + otherwise uses the image system prompt. Ignored when ``system_prompt`` is + provided. + use_system_prompt: If True, prepends a system prompt message to the conversation + before the user caption. Ignored when ``system_prompt`` is provided. + system_prompt: When supplied, this exact string is used as the system prompt, + overriding both ``is_video`` and ``use_system_prompt``. + + Returns: + List of token IDs representing the full chat-formatted caption. + """ + conversations = [] + if system_prompt is not None: + conversations.append({"role": "system", "content": system_prompt}) + elif use_system_prompt: + _system_prompt = _SYSTEM_PROMPT_VIDEO if is_video else _SYSTEM_PROMPT_IMAGE + conversations.append({"role": "system", "content": _system_prompt}) + conversations.append({"role": "user", "content": caption}) + + tokenizer_output = tokenizer.apply_chat_template( + conversations, + tokenize=True, + add_generation_prompt=True, + add_vision_id=False, + return_dict=False, + ) + return cast(list[int], tokenizer_output) + + +def causal_mask_function(batch_idx: int, head_idx: int, q_idx: int, kv_idx: int) -> bool: + """ + This creates a basic lower-diagonal causal mask. 
+ """ + return kv_idx <= q_idx + + +def sliding_window_overlay(sliding_window: int) -> Callable: + """ + This is an overlay depicting a sliding window pattern. Add it on top of a causal mask for a proper sliding + window mask. + """ + + def inner_mask(batch_idx: int, head_idx: int, q_idx: int, kv_idx: int) -> bool: + return kv_idx > q_idx - sliding_window + + return inner_mask + + +def and_masks(*mask_functions: Callable) -> Callable: + """Returns a mask function that is the intersection of provided mask functions""" + if not all(callable(arg) for arg in mask_functions): + raise RuntimeError(f"All inputs should be callable mask_functions: {mask_functions}") + + def and_mask(batch_idx, head_idx, q_idx, kv_idx): + result = q_idx.new_ones((), dtype=torch.bool) + for mask in mask_functions: + result = result & mask(batch_idx, head_idx, q_idx, kv_idx).to(result.device) + return result + + return and_mask + + +def sliding_window_causal_mask_function(sliding_window: int) -> Callable: + """ + This returns the mask function used to create a sliding window mask. + """ + return and_masks(sliding_window_overlay(sliding_window), causal_mask_function) + + +def padding_mask_function(padding_mask: torch.Tensor) -> Callable: + """ + This returns the mask function corresponding to a 2D padding mask. + """ + + def inner_mask(batch_idx: int, head_idx: int, q_idx: int, kv_idx: int) -> bool: + return padding_mask[batch_idx, kv_idx] + + return inner_mask + + +def _vmap_for_bhqkv(mask_function: Callable, bh_indices: bool = True) -> Callable: + """ + Used to vmap our mask_functions over the q_idx and kv_idx dimensions of the inputs. 
+ """ + # We vmap the function 2 times, broadcasting the [q_idx, kv_idx] dimensions + dimensions = [(None, None, None, 0), (None, None, 0, None)] + if bh_indices: + # We extend broadcasting over the [batch_idx, head_idx] dimensions + dimensions.extend([(None, 0, None, None), (0, None, None, None)]) + + for dims in dimensions: + mask_function = torch.vmap(mask_function, in_dims=dims, out_dims=0) + return mask_function + + +def prepare_padding_mask( + attention_mask: Optional[torch.Tensor], kv_length: int, kv_offset: int, _slice: bool = True +) -> Optional[torch.Tensor]: + """ + From the 2D attention mask, prepare the correct padding mask to use by potentially padding it, and slicing + according to the `kv_offset` if `_slice` is `True`. + """ + local_padding_mask = attention_mask + if attention_mask is not None: + # Pad it if necessary + if (padding_length := kv_length + kv_offset - attention_mask.shape[-1]) > 0: + local_padding_mask = torch.nn.functional.pad(attention_mask, (0, padding_length)) + # For flex, we should not slice them, only use an offset + if _slice: + # Equivalent to: `local_padding_mask = attention_mask[:, kv_offset : kv_offset + kv_length]`, + # but without data-dependent slicing (i.e. torch.compile friendly) + mask_indices = torch.arange(kv_length, device=local_padding_mask.device) + mask_indices += kv_offset + local_padding_mask = local_padding_mask[:, mask_indices] + return local_padding_mask + + +def eager_mask( + batch_size: int, + cache_position: torch.Tensor, + kv_length: int, + kv_offset: int = 0, + mask_function: Callable = causal_mask_function, + attention_mask: Optional[torch.Tensor] = None, + dtype: torch.dtype = torch.float32, + **kwargs, +) -> torch.Tensor: + """ + Create a 4D float mask of shape `(batch_size, 1, query_length, kv_length)` where a value of 0 indicates that + the element should take part in the attention computation, and -inf (minimum value for the given `dtype`) that + it should not. 
+ """ + # Potentially pad the 2D mask, and slice it correctly + padding_mask = prepare_padding_mask(attention_mask, kv_length, kv_offset) + + # Similar to `kv_arange = torch.arange(start=kv_offset, end=kv_offset + kv_length, device=cache_position.device)` + # but without data-dependent slicing (i.e. torch.compile friendly) + kv_arange = torch.arange(kv_length, device=cache_position.device) + kv_arange += kv_offset + + # Create the 4D mask easily + causal_mask = _vmap_for_bhqkv(mask_function, bh_indices=False)( + None, None, cache_position, kv_arange + ) # [q_len,kv_length] + causal_mask = causal_mask[None, None, :, :].expand(batch_size, -1, -1, -1) # [B,1,q_len,kv_length] + if padding_mask is not None: + causal_mask = causal_mask * padding_mask[:, None, None, :] # [B,1,q_len,kv_length] + + min_dtype = torch.finfo(dtype).min + # we need 0s where the tokens should be taken into account, and -inf otherwise + mask = torch.where( + causal_mask, torch.tensor(0.0, device=causal_mask.device, dtype=dtype), min_dtype + ) # [B,1,q_len,kv_length] + return mask + + +# class AttentionMaskInterface(GeneralInterface): +class AttentionMaskInterface: + # Class instance object for mask interfaces + _global_mapping: ClassVar = { + "eager": eager_mask, + } + + +# Global AttentionMaskInterface shared by all models +ALL_MASK_ATTENTION_FUNCTIONS: AttentionMaskInterface = AttentionMaskInterface() + + +def _preprocess_mask_arguments( + config: PretrainedConfig, + input_embeds: torch.Tensor, + attention_mask: Optional[torch.Tensor], + cache_position: torch.Tensor, + past_key_values: Optional[Cache], + position_ids: Optional[torch.Tensor], + layer_idx: Optional[int], +) -> tuple[bool, Optional[torch.Tensor], None, int, int]: + """ + Perform some common pre-processing of the mask arguments we get from the modeling code. 
+ """ + # If the mask is already 4D, simply return as-is + if isinstance(attention_mask, torch.Tensor) and len(attention_mask.shape) == 4: + return True, attention_mask, None, None, None + + # For TGI/vLLM backends or other custom attention: we don't need a mask + if config._attn_implementation not in ALL_MASK_ATTENTION_FUNCTIONS._global_mapping: + return True, None, None, None, None + + # Move the mask to correct device, and potentially switch dtype for efficiency + if attention_mask is not None and attention_mask.ndim == 2: + attention_mask = attention_mask.to(device=cache_position.device, dtype=torch.bool) + + # If using a cache, it can give all information about mask sizes based on seen tokens + if past_key_values is not None: + kv_length, kv_offset = past_key_values.get_mask_sizes(cache_position, layer_idx) + # Otherwise, the sizes are simply the input sizes + else: + kv_length, kv_offset = input_embeds.shape[1], 0 + + return False, attention_mask, None, kv_length, kv_offset + + +def create_causal_mask( + config: PretrainedConfig, + input_embeds: torch.Tensor, + attention_mask: Optional[torch.Tensor], + cache_position: torch.Tensor, + past_key_values: Optional[Cache], + position_ids: Optional[torch.Tensor] = None, + **kwargs, +) -> Optional[torch.Tensor]: + """ + Create a standard causal mask based on the attention implementation used (stored in the config). 
+ """ + # For hybrid cache structure, use the full_attention layers + layer_idx = 0 + + early_exit, attention_mask, packed_sequence_mask, kv_length, kv_offset = _preprocess_mask_arguments( + config, input_embeds, attention_mask, cache_position, past_key_values, position_ids, layer_idx + ) + if early_exit: + return attention_mask + + batch_size, dtype = input_embeds.shape[0], input_embeds.dtype + mask_factory_function = causal_mask_function + mask_interface = ALL_MASK_ATTENTION_FUNCTIONS._global_mapping[config._attn_implementation] + + # Potentially add the padding 2D mask + if attention_mask is not None: + mask_factory_function = and_masks(mask_factory_function, padding_mask_function(attention_mask)) + + # We now create the mask + causal_mask = mask_interface( + batch_size=batch_size, + cache_position=cache_position, + kv_length=kv_length, + kv_offset=kv_offset, + mask_function=mask_factory_function, + attention_mask=attention_mask, + dtype=dtype, + config=config, + ) + return causal_mask + + +def create_sliding_window_causal_mask( + config: PretrainedConfig, + input_embeds: torch.Tensor, + attention_mask: Optional[torch.Tensor], + cache_position: torch.Tensor, + past_key_values: Optional[Cache], + position_ids: Optional[torch.Tensor] = None, + **kwargs, +) -> Optional[torch.Tensor]: + """ + Create a sliding window causal mask based on the attention implementation used (stored in the config). 
+ """ + # For hybrid cache structure, use the sliding_attention layers + layer_idx = 0 + + early_exit, attention_mask, packed_sequence_mask, kv_length, kv_offset = _preprocess_mask_arguments( + config, input_embeds, attention_mask, cache_position, past_key_values, position_ids, layer_idx + ) + if early_exit: + return attention_mask + + sliding_window = getattr(config, "sliding_window", None) + if sliding_window is None: + raise ValueError("Could not find a `sliding_window` argument in the config, or it is not set") + + batch_size, dtype = input_embeds.shape[0], input_embeds.dtype + mask_factory_function = sliding_window_causal_mask_function(sliding_window) + mask_interface = ALL_MASK_ATTENTION_FUNCTIONS._global_mapping[config._attn_implementation] + + # Potentially add the padding 2D mask + if attention_mask is not None: + mask_factory_function = and_masks(mask_factory_function, padding_mask_function(attention_mask)) + + # We now create the mask + causal_mask = mask_interface( + batch_size=batch_size, + cache_position=cache_position, + kv_length=kv_length, + kv_offset=kv_offset, + mask_function=mask_factory_function, + attention_mask=attention_mask, + dtype=dtype, + config=config, + ) + return causal_mask diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/video_processing_qwen3_vl.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/video_processing_qwen3_vl.py new file mode 100644 index 00000000..d5d8ebc2 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl/video_processing_qwen3_vl.py @@ -0,0 +1,297 @@ +# Copyright 2025 The Qwen Team and The HuggingFace Inc. team. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# ----------------------------------------------------------------------------- +# Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. +# All rights reserved. +# +# This codebase constitutes NVIDIA proprietary technology and is strictly +# confidential. Any unauthorized reproduction, distribution, or disclosure +# of this code, in whole or in part, outside NVIDIA is strictly prohibited +# without prior written consent. +# +# For inquiries regarding the use of this code in other NVIDIA proprietary +# projects, please contact the Deep Imagination Research Team at +# dir@exchange.nvidia.com. +# ----------------------------------------------------------------------------- + +# Source Repository: https://github.com/huggingface/transformers +# This is adapted from src/transformers/models/qwen3_vl/video_processing_qwen3_vl.py. 
+# Commit Hash: 41e5abac5cb49983a08ddef3e8645d6efd23c8f3 +"""video processor class for Qwen3-VL.""" + +import math +from typing import Optional, Union + +import numpy as np +import torch +from transformers.feature_extraction_utils import BatchFeature +from transformers.image_utils import ChannelDimension, PILImageResampling, SizeDict, get_image_size +from transformers.processing_utils import Unpack, VideosKwargs +from transformers.utils import TensorType, add_start_docstrings, logging +from transformers.video_processing_utils import BASE_VIDEO_PROCESSOR_DOCSTRING, BaseVideoProcessor +from transformers.video_utils import VideoMetadata, group_videos_by_shape, reorder_videos + +logger = logging.get_logger(__name__) + + +def smart_resize( + num_frames: int, + height: int, + width: int, + temporal_factor: int = 2, + factor: int = 32, + min_pixels: int = 128 * 128, + max_pixels: int = 16 * 16 * 2 * 2 * 2 * 6144, +): + if num_frames < temporal_factor: + raise ValueError(f"t:{num_frames} must be larger than temporal_factor:{temporal_factor}") + if height < factor or width < factor: + raise ValueError(f"height:{height} or width:{width} must be larger than factor:{factor}") + elif max(height, width) / min(height, width) > 200: + raise ValueError( + f"absolute aspect ratio must be smaller than 200, got {max(height, width) / min(height, width)}" + ) + h_bar = round(height / factor) * factor + w_bar = round(width / factor) * factor + t_bar = round(num_frames / temporal_factor) * temporal_factor + + if t_bar * h_bar * w_bar > max_pixels: + beta = math.sqrt((num_frames * height * width) / max_pixels) + h_bar = max(factor, math.floor(height / beta / factor) * factor) + w_bar = max(factor, math.floor(width / beta / factor) * factor) + elif t_bar * h_bar * w_bar < min_pixels: + beta = math.sqrt(min_pixels / (num_frames * height * width)) + h_bar = math.ceil(height * beta / factor) * factor + w_bar = math.ceil(width * beta / factor) * factor + + return h_bar, w_bar + + +class 
Qwen3VLVideoProcessorInitKwargs(VideosKwargs): + patch_size: Optional[int] + temporal_patch_size: Optional[int] + merge_size: Optional[int] + min_frames: Optional[int] + max_frames: Optional[int] + + +@add_start_docstrings( + "Constructs a fast Qwen3-VL video processor that dynamically resizes videos based on the original videos.", + BASE_VIDEO_PROCESSOR_DOCSTRING, + """ + patch_size (`int`, *optional*, defaults to 16): + The spatial patch size of the vision encoder. + temporal_patch_size (`int`, *optional*, defaults to 2): + The temporal patch size of the vision encoder. + merge_size (`int`, *optional*, defaults to 2): + The merge size of the vision encoder to llm encoder. + """, +) +class Qwen3VLVideoProcessor(BaseVideoProcessor): + resample = PILImageResampling.BICUBIC + size = {"shortest_edge": 128 * 32 * 32, "longest_edge": 32 * 32 * 768} + image_mean = [0.5, 0.5, 0.5] + image_std = [0.5, 0.5, 0.5] + do_resize = True + do_rescale = True + do_normalize = True + do_convert_rgb = True + patch_size = 16 + temporal_patch_size = 2 + merge_size = 2 + fps = 2 + min_frames = 4 + max_frames = 768 + do_sample_frames = True + valid_kwargs = Qwen3VLVideoProcessorInitKwargs + model_input_names = ["pixel_values_videos", "video_grid_thw"] + + def __init__(self, **kwargs: Unpack[Qwen3VLVideoProcessorInitKwargs]): + super().__init__(**kwargs) + if self.size is not None and ( + self.size.get("shortest_edge", None) is None or self.size.get("longest_edge", None) is None + ): + raise ValueError("size must contain 'shortest_edge' and 'longest_edge' keys.") + + def _further_process_kwargs( + self, + size: Optional[SizeDict] = None, + **kwargs, + ) -> dict: + """ + Update kwargs that need further processing before being validated + Can be overridden by subclasses to customize the processing of kwargs. 
+ """ + if size is not None and ("shortest_edge" not in size or "longest_edge" not in size): + raise ValueError("size must contain 'shortest_edge' and 'longest_edge' keys.") + + return super()._further_process_kwargs(size=size, **kwargs) + + def sample_frames( + self, + metadata: VideoMetadata, + num_frames: Optional[int] = None, + fps: Optional[Union[int, float]] = None, + **kwargs, + ): + """ + Default sampling function which uniformly samples the desired number of frames between 0 and total number of frames. + If `fps` is passed along with metadata, `fps` frames per second are sampled uniformly. Arguments `num_frames` + and `fps` are mutually exclusive. + + Args: + video (`torch.Tensor`): + Video that needs to be sampled. + metadata (`VideoMetadata`): + Metadata of the video containing information about total duration, fps and total number of frames. + num_frames (`int`, *optional*): + Maximum number of frames to sample. Defaults to `self.num_frames`. + fps (`int` or `float`, *optional*): + Target frames to sample per second. Defaults to `self.fps`. + Returns: + np.ndarray: + Indices of the sampled frames. + """ + if fps is not None and num_frames is not None: + raise ValueError("`num_frames` and `fps` are mutually exclusive arguments, please use only one!") + + total_num_frames = metadata.total_num_frames + fps = fps if fps is not None else self.fps + + # If num_frames is not given but fps is, calculate num_frames from fps + if num_frames is None and fps is not None: + if metadata.fps is None: + metadata.fps = 24 + logger.warning_once( + "Asked to sample `fps` frames per second but no video metadata was provided which is required when sampling with `fps`. " + "Defaulting to `fps=24`. Please provide `video_metadata` for more accurate results." 
+ ) + num_frames = int(total_num_frames / metadata.fps * fps) + num_frames = min(min(max(num_frames, self.min_frames), self.max_frames), total_num_frames) + + if num_frames is None: + num_frames = min(max(total_num_frames, self.min_frames), self.max_frames) + + indices = np.linspace(0, total_num_frames - 1, num_frames).round().astype(int) + + return indices + + def _preprocess( + self, + videos: list[torch.Tensor], + do_convert_rgb: bool = True, + do_resize: bool = True, + size: Optional[SizeDict] = None, + interpolation: PILImageResampling = PILImageResampling.BICUBIC, + do_rescale: bool = True, + rescale_factor: float = 1 / 255.0, + do_normalize: bool = True, + image_mean: Optional[Union[float, list[float]]] = None, + image_std: Optional[Union[float, list[float]]] = None, + patch_size: Optional[int] = None, + temporal_patch_size: Optional[int] = None, + merge_size: Optional[int] = None, + return_tensors: Optional[Union[str, TensorType]] = None, + **kwargs, + ): + grouped_videos, grouped_videos_index = group_videos_by_shape(videos) + resized_videos_grouped = {} + + for shape, stacked_videos in grouped_videos.items(): + B, T, C, H, W = stacked_videos.shape + num_frames, height, width = T, H, W + if do_resize: + resized_height, resized_width = smart_resize( + num_frames=num_frames, + height=height, + width=width, + temporal_factor=temporal_patch_size, + factor=patch_size * merge_size, + min_pixels=size.shortest_edge, + max_pixels=size.longest_edge, + ) + stacked_videos = stacked_videos.view(B * T, C, H, W) # [B*T,C,H,W] + stacked_videos = self.resize( + stacked_videos, + size=SizeDict(height=resized_height, width=resized_width), + interpolation=interpolation, + ) # [B*T,C,resized_height,resized_width] + stacked_videos = stacked_videos.view( + B, T, C, resized_height, resized_width + ) # [B,T,C,resized_height,resized_width] + resized_videos_grouped[shape] = stacked_videos + resized_videos = reorder_videos(resized_videos_grouped, grouped_videos_index) + + # Group 
videos by size for further processing + # Needed in case do_resize is False, or resize returns videos with different sizes + grouped_videos, grouped_videos_index = group_videos_by_shape(resized_videos) + processed_videos_grouped = {} + processed_grids = {} + for shape, stacked_videos in grouped_videos.items(): + resized_height, resized_width = get_image_size(stacked_videos[0], channel_dim=ChannelDimension.FIRST) + + # Fused rescale and normalize + stacked_videos = self.rescale_and_normalize( + stacked_videos, do_rescale, rescale_factor, do_normalize, image_mean, image_std + ) + patches = stacked_videos + + # Check that videos have `num_frames` divisible by `temporal_patch_size` + if patches.shape[1] % temporal_patch_size != 0: + repeats = patches[:, -1:].repeat(1, temporal_patch_size - 1, 1, 1, 1) # [B,temporal_patch_size-1,C,H,W] + patches = torch.cat([patches, repeats], dim=1) # [B,T_padded,C,H,W] + batch_size, grid_t, channel = patches.shape[:3] + grid_t = grid_t // temporal_patch_size + grid_h, grid_w = resized_height // patch_size, resized_width // patch_size + + patches = patches.view( + batch_size, + grid_t, + temporal_patch_size, + channel, + grid_h // merge_size, + merge_size, + patch_size, + grid_w // merge_size, + merge_size, + patch_size, + ) # [B,grid_t,temporal_patch_size,C,grid_h//merge_size,merge_size,patch_size,grid_w//merge_size,merge_size,patch_size] + patches = patches.permute( + 0, 1, 4, 7, 5, 8, 3, 2, 6, 9 + ) # [B,grid_t,grid_h//merge_size,grid_w//merge_size,merge_size,merge_size,C,temporal_patch_size,patch_size,patch_size] + flatten_patches = patches.reshape( + batch_size, + grid_t * grid_h * grid_w, + channel * temporal_patch_size * patch_size * patch_size, + ) # [B,grid_t*grid_h*grid_w,C*temporal_patch_size*patch_size*patch_size] + + processed_videos_grouped[shape] = flatten_patches + processed_grids[shape] = [[grid_t, grid_h, grid_w]] * batch_size + + processed_videos = reorder_videos(processed_videos_grouped, grouped_videos_index) + 
processed_grids = reorder_videos(processed_grids, grouped_videos_index) + pixel_values_videos = torch.cat( + processed_videos, dim=0 + ) # [total_videos,N_tokens,C*temporal_patch_size*patch_size*patch_size] + video_grid_thw = torch.tensor(processed_grids) # [total_videos,3] + data = { + "pixel_values_videos": pixel_values_videos, + "video_grid_thw": video_grid_thw, + } + + return BatchFeature(data=data, tensor_type=return_tensors) + + +__all__ = ["Qwen3VLVideoProcessor"] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/__init__.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-235B-A22B-Instruct.json b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-235B-A22B-Instruct.json new file mode 100644 index 00000000..f1933484 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-235B-A22B-Instruct.json @@ -0,0 +1,68 @@ +{ + "architectures": [ + "Qwen3VLMoeForConditionalGeneration" + ], + "image_token_id": 151655, + "model_type": "qwen3_vl_moe", + "text_config": { + "attention_bias": false, + "attention_dropout": 0.0, + "bos_token_id": 151643, + "decoder_sparse_step": 1, + "dtype": "bfloat16", + "eos_token_id": 151645, + "head_dim": 128, + "hidden_act": "silu", + "hidden_size": 4096, + "initializer_range": 0.02, + "intermediate_size": 12288, + "max_position_embeddings": 262144, + "mlp_only_layers": [], + "model_type": "qwen3_vl_moe_text", + "moe_intermediate_size": 1536, + "norm_topk_prob": true, + "num_attention_heads": 64, + "num_experts": 128, + "num_experts_per_tok": 8, + "num_hidden_layers": 94, + "num_key_value_heads": 4, + "rms_norm_eps": 1e-06, + "rope_scaling": { + "mrope_interleaved": true, + "mrope_section": [ + 24, + 20, + 20 + ], + "rope_type": "default" + }, + "rope_theta": 5000000, + "use_cache": true, + "vocab_size": 151936 + }, + "tie_word_embeddings": false, + "transformers_version": "4.57.0.dev0", + "video_token_id": 151656, + "vision_config": { + "deepstack_visual_indexes": [ + 8, + 16, + 24 + ], + "depth": 27, + "hidden_act": "gelu_pytorch_tanh", + "hidden_size": 1152, + "in_channels": 3, + "initializer_range": 0.02, + "intermediate_size": 4304, + "model_type": "qwen3_vl_moe", + "num_heads": 16, + "num_position_embeddings": 2304, + "out_hidden_size": 4096, + "patch_size": 16, + "spatial_merge_size": 2, + "temporal_patch_size": 2 + }, + "vision_end_token_id": 151653, + "vision_start_token_id": 151652 +} diff --git 
a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-30B-A3B-Instruct.json b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-30B-A3B-Instruct.json new file mode 100644 index 00000000..23665bac --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/Qwen3-VL-30B-A3B-Instruct.json @@ -0,0 +1,68 @@ +{ + "architectures": [ + "Qwen3VLMoeForConditionalGeneration" + ], + "image_token_id": 151655, + "model_type": "qwen3_vl_moe", + "text_config": { + "attention_bias": false, + "attention_dropout": 0.0, + "bos_token_id": 151643, + "decoder_sparse_step": 1, + "dtype": "bfloat16", + "eos_token_id": 151645, + "head_dim": 128, + "hidden_act": "silu", + "hidden_size": 2048, + "initializer_range": 0.02, + "intermediate_size": 6144, + "max_position_embeddings": 262144, + "mlp_only_layers": [], + "model_type": "qwen3_vl_moe_text", + "moe_intermediate_size": 768, + "norm_topk_prob": true, + "num_attention_heads": 32, + "num_experts": 128, + "num_experts_per_tok": 8, + "num_hidden_layers": 48, + "num_key_value_heads": 4, + "rms_norm_eps": 1e-06, + "rope_scaling": { + "mrope_interleaved": true, + "mrope_section": [ + 24, + 20, + 20 + ], + "rope_type": "default" + }, + "rope_theta": 5000000, + "use_cache": true, + "vocab_size": 151936 + }, + "tie_word_embeddings": false, + "transformers_version": "4.57.0.dev0", + "video_token_id": 151656, + "vision_config": { + "deepstack_visual_indexes": [ + 8, + 16, + 24 + ], + "depth": 27, + "hidden_act": "gelu_pytorch_tanh", + "hidden_size": 1152, + "in_channels": 3, + "initializer_range": 0.02, + "intermediate_size": 4304, + "model_type": "qwen3_vl_moe", + "num_heads": 16, + "num_position_embeddings": 2304, + "out_hidden_size": 2048, + "patch_size": 16, + "spatial_merge_size": 2, + "temporal_patch_size": 2 + }, + "vision_end_token_id": 151653, + "vision_start_token_id": 151652 +} diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/__init__.py 
b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/__init__.py new file mode 100644 index 00000000..db6d6eb6 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configs/__init__.py @@ -0,0 +1,15 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configuration_qwen3_vl_moe.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configuration_qwen3_vl_moe.py new file mode 100644 index 00000000..e5d834a5 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/configuration_qwen3_vl_moe.py @@ -0,0 +1,330 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from transformers.configuration_utils import PretrainedConfig +from transformers.modeling_rope_utils import rope_config_validation + + +class Qwen3VLMoeTextConfig(PretrainedConfig): + r""" + This is the configuration class to store the configuration of a [`Qwen3VLMoeTextModel`]. It is used to instantiate a + Qwen3-VL-MOE model according to the specified arguments, defining the model architecture. Instantiating a configuration + with the defaults will yield a similar configuration to that of + Qwen3-VL-30B-A3B-Instruct [Qwen/Qwen3-VL-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct). + + Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the + documentation from [`PretrainedConfig`] for more information. + + Args: + vocab_size (`int`, *optional*, defaults to 151936): + Vocabulary size of the Qwen3-VL-MOE text model. Defines the number of different tokens that can be represented by the + `inputs_ids` passed when calling [`Qwen3VLMoeTextModel`] + hidden_size (`int`, *optional*, defaults to 2048): + Dimension of the hidden representations. + intermediate_size (`int`, *optional*, defaults to 5632): + Dimension of the MLP representations. + num_hidden_layers (`int`, *optional*, defaults to 24): + Number of hidden layers in the Transformer encoder. + num_attention_heads (`int`, *optional*, defaults to 16): + Number of attention heads for each attention layer in the Transformer encoder. + num_key_value_heads (`int`, *optional*, defaults to 16): + This is the number of key_value heads that should be used to implement Grouped Query Attention. If + `num_key_value_heads=num_attention_heads`, the model will use Multi Head Attention (MHA), if + `num_key_value_heads=1` the model will use Multi Query Attention (MQA) otherwise GQA is used. When + converting a multi-head checkpoint to a GQA checkpoint, each group key and value head should be constructed + by meanpooling all the original heads within that group. 
For more details check out [this + paper](https://arxiv.org/pdf/2305.13245.pdf). If it is not specified, will default to `num_attention_heads`. + hidden_act (`str` or `function`, *optional*, defaults to `"silu"`): + The non-linear activation function (function or string) in the decoder. + max_position_embeddings (`int`, *optional*, defaults to 128000): + The maximum sequence length that this model might ever be used with. + initializer_range (`float`, *optional*, defaults to 0.02): + The standard deviation of the truncated_normal_initializer for initializing all weight matrices. + rms_norm_eps (`float`, *optional*, defaults to 1e-06): + The epsilon used by the rms normalization layers. + use_cache (`bool`, *optional*, defaults to `True`): + Whether or not the model should return the last key/values attentions (not used by all models). Only + relevant if `config.is_decoder=True`. + tie_word_embeddings (`bool`, *optional*, defaults to `False`): + Whether the model's input and output word embeddings should be tied. + rope_theta (`float`, *optional*, defaults to 5000000.0): + The base period of the RoPE embeddings. + attention_bias (`bool`, *optional*, defaults to `False`): + Whether to use a bias in the query, key, value and output projection layers during self-attention. + attention_dropout (`float`, *optional*, defaults to 0.0): + The dropout ratio for the attention probabilities. + decoder_sparse_step (`int`, *optional*, defaults to 1): + The frequency of the MoE layer. + moe_intermediate_size (`int`, *optional*, defaults to 1408): + Intermediate size of the routed expert. + num_experts_per_tok (`int`, *optional*, defaults to 4): + Number of selected experts. + num_experts (`int`, *optional*, defaults to 60): + Number of routed experts. + norm_topk_prob (`bool`, *optional*, defaults to `True`): + Whether to normalize the topk probabilities. + router_aux_loss_coef (`float`, *optional*, defaults to 0.001): + The aux loss factor for the total loss. 
+ mlp_only_layers (`List[int]`, *optional*, defaults to `[]`): + Indicate which layers use Qwen3VLMoeMLP rather than Qwen3VLMoeSparseMoeBlock + The list contains layer index, from 0 to num_layers-1 if we have num_layers layers + If `mlp_only_layers` is empty, `decoder_sparse_step` is used to determine the sparsity. + rope_scaling (`Dict`, *optional*): + Dictionary containing the scaling configuration for the RoPE embeddings. NOTE: if you apply new rope type + and you expect the model to work on longer `max_position_embeddings`, we recommend you to update this value + accordingly. + Expected contents: + `rope_type` (`str`): + The sub-variant of RoPE to use. Can be one of ['default', 'linear', 'dynamic', 'yarn', 'longrope', + 'llama3'], with 'default' being the original RoPE implementation. + `factor` (`float`, *optional*): + Used with all rope types except 'default'. The scaling factor to apply to the RoPE embeddings. In + most scaling types, a `factor` of x will enable the model to handle sequences of length x * + original maximum pre-trained length. + `original_max_position_embeddings` (`int`, *optional*): + Used with 'dynamic', 'longrope' and 'llama3'. The original max position embeddings used during + pretraining. + `attention_factor` (`float`, *optional*): + Used with 'yarn' and 'longrope'. The scaling factor to be applied on the attention + computation. If unspecified, it defaults to value recommended by the implementation, using the + `factor` field to infer the suggested value. + `beta_fast` (`float`, *optional*): + Only used with 'yarn'. Parameter to set the boundary for extrapolation (only) in the linear + ramp function. If unspecified, it defaults to 32. + `beta_slow` (`float`, *optional*): + Only used with 'yarn'. Parameter to set the boundary for interpolation (only) in the linear + ramp function. If unspecified, it defaults to 1. + `short_factor` (`List[float]`, *optional*): + Only used with 'longrope'. 
The scaling factor to be applied to short contexts (< + `original_max_position_embeddings`). Must be a list of numbers with the same length as the hidden + size divided by the number of attention heads divided by 2 + `long_factor` (`List[float]`, *optional*): + Only used with 'longrope'. The scaling factor to be applied to long contexts (> + `original_max_position_embeddings`). Must be a list of numbers with the same length as the hidden + size divided by the number of attention heads divided by 2 + `low_freq_factor` (`float`, *optional*): + Only used with 'llama3'. Scaling factor applied to low frequency components of the RoPE + `high_freq_factor` (`float`, *optional*): + Only used with 'llama3'. Scaling factor applied to high frequency components of the RoPE + head_dim (`int`, *optional*): + The dimension of the head. If not specified, will default to `hidden_size // num_attention_heads`. + + ```python + >>> from transformers import Qwen3VLMoeForConditionalGeneration, Qwen3VLMoeConfig + + >>> # Initializing a Qwen3VLMoe style configuration + >>> configuration = Qwen3VLMoeConfig() + + >>> # Initializing a model from the Qwen3-VL-30B-A3B style configuration + >>> model = Qwen3VLMoeForConditionalGeneration(configuration) + + >>> # Accessing the model configuration + >>> configuration = model.config + ```""" + + model_type = "qwen3_vl_moe_text" + base_config_key = "text_config" + keys_to_ignore_at_inference = ["past_key_values"] + # Default tensor parallel plan for base model `Qwen3VLMoe` + base_model_tp_plan = { + "layers.*.self_attn.q_proj": "colwise", + "layers.*.self_attn.k_proj": "colwise", + "layers.*.self_attn.v_proj": "colwise", + "layers.*.self_attn.o_proj": "rowwise", + "layers.*.mlp.gate_proj": "colwise", + "layers.*.mlp.up_proj": "colwise", + "layers.*.mlp.down_proj": "rowwise", + } + base_model_pp_plan = { + "embed_tokens": (["input_ids"], ["inputs_embeds"]), + "layers": (["hidden_states", "attention_mask"], ["hidden_states"]), + "norm": 
(["hidden_states"], ["hidden_states"]), + } + + def __init__( + self, + vocab_size=151936, + hidden_size=2048, + intermediate_size=5632, + num_hidden_layers=24, + num_attention_heads=16, + num_key_value_heads=16, + hidden_act="silu", + max_position_embeddings=128000, + initializer_range=0.02, + rms_norm_eps=1e-6, + use_cache=True, + tie_word_embeddings=False, + rope_theta=5000000.0, + attention_bias=False, + attention_dropout=0.0, + decoder_sparse_step=1, + moe_intermediate_size=1408, + num_experts_per_tok=4, + num_experts=60, + norm_topk_prob=True, + router_aux_loss_coef=0.001, + mlp_only_layers=None, + rope_scaling=None, + head_dim=None, + **kwargs, + ): + self.vocab_size = vocab_size + self.max_position_embeddings = max_position_embeddings + self.hidden_size = hidden_size + self.intermediate_size = intermediate_size + self.num_hidden_layers = num_hidden_layers + self.num_attention_heads = num_attention_heads + + # for backward compatibility + if num_key_value_heads is None: + num_key_value_heads = num_attention_heads + + self.num_key_value_heads = num_key_value_heads + self.hidden_act = hidden_act + self.initializer_range = initializer_range + self.rms_norm_eps = rms_norm_eps + self.use_cache = use_cache + self.rope_theta = rope_theta + self.attention_bias = attention_bias + self.attention_dropout = attention_dropout + self.rope_scaling = rope_scaling + self.head_dim = head_dim or hidden_size // num_attention_heads + + rope_config_validation(self, ignore_keys={"mrope_section", "mrope_interleaved"}) + + # MoE arguments + self.decoder_sparse_step = decoder_sparse_step + self.moe_intermediate_size = moe_intermediate_size + self.num_experts_per_tok = num_experts_per_tok + self.num_experts = num_experts + self.norm_topk_prob = norm_topk_prob + self.router_aux_loss_coef = router_aux_loss_coef + self.mlp_only_layers = [] if mlp_only_layers is None else mlp_only_layers + + super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs) + + +class 
Qwen3VLMoeVisionConfig(PretrainedConfig): + model_type = "qwen3_vl_moe" + base_config_key = "vision_config" + + def __init__( + self, + depth=27, + hidden_size=1152, + hidden_act="gelu_pytorch_tanh", + intermediate_size=4304, + num_heads=16, + in_channels=3, + patch_size=16, + spatial_merge_size=2, + temporal_patch_size=2, + out_hidden_size=3584, + num_position_embeddings=2304, + deepstack_visual_indexes=[8, 16, 24], + initializer_range=0.02, + **kwargs, + ): + super().__init__(**kwargs) + + self.depth = depth + self.hidden_size = hidden_size + self.hidden_act = hidden_act + self.intermediate_size = intermediate_size + self.num_heads = num_heads + self.in_channels = in_channels + self.patch_size = patch_size + self.spatial_merge_size = spatial_merge_size + self.temporal_patch_size = temporal_patch_size + self.out_hidden_size = out_hidden_size + self.num_position_embeddings = num_position_embeddings + self.initializer_range = initializer_range + self.deepstack_visual_indexes = deepstack_visual_indexes + + +class Qwen3VLMoeConfig(PretrainedConfig): + r""" + This is the configuration class to store the configuration of a [`Qwen3VLMoeModel`]. It is used to instantiate a + Qwen3-VL-MOE model according to the specified arguments, defining the model architecture. Instantiating a configuration + with the defaults will yield a similar configuration to that of + Qwen3-VL-30B-A3B-Instruct [Qwen/Qwen3-VL-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct). + + Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the + documentation from [`PretrainedConfig`] for more information. + + + Args: + text_config (`Union[PreTrainedConfig, dict]`, *optional*, defaults to `Qwen3VLMoeTextConfig`): + The config object or dictionary of the text backbone. + vision_config (`Union[PreTrainedConfig, dict]`, *optional*, defaults to `Qwen3VLMoeVisionConfig`): + The config object or dictionary of the vision backbone. 
+ image_token_id (`int`, *optional*, defaults to 151655): + The image token index to encode the image prompt. + video_token_id (`int`, *optional*, defaults to 151656): + The video token index to encode the image prompt. + vision_start_token_id (`int`, *optional*, defaults to 151652): + The start token index to encode the image prompt. + vision_end_token_id (`int`, *optional*, defaults to 151653): + The end token index to encode the image prompt. + tie_word_embeddings (`bool`, *optional*, defaults to `False`): + Whether to tie the word embeddings. + + ```python + >>> from transformers import Qwen3VLMoeForConditionalGeneration, Qwen3VLMoeConfig + + >>> # Initializing a Qwen3-VL-MOE style configuration + >>> configuration = Qwen3VLMoeConfig() + + >>> # Initializing a model from the Qwen3-VL-30B-A3B style configuration + >>> model = Qwen3VLMoeForConditionalGeneration(configuration) + + >>> # Accessing the model configuration + >>> configuration = model.config + ```""" + + model_type = "qwen3_vl_moe" + sub_configs = {"vision_config": Qwen3VLMoeVisionConfig, "text_config": Qwen3VLMoeTextConfig} + keys_to_ignore_at_inference = ["past_key_values"] + + def __init__( + self, + text_config=None, + vision_config=None, + image_token_id=151655, + video_token_id=151656, + vision_start_token_id=151652, + vision_end_token_id=151653, + tie_word_embeddings=False, + **kwargs, + ): + if isinstance(vision_config, dict): + self.vision_config = self.sub_configs["vision_config"](**vision_config) + elif vision_config is None: + self.vision_config = self.sub_configs["vision_config"]() + + if isinstance(text_config, dict): + self.text_config = self.sub_configs["text_config"](**text_config) + elif text_config is None: + self.text_config = self.sub_configs["text_config"]() + + self.image_token_id = image_token_id + self.video_token_id = video_token_id + self.vision_start_token_id = vision_start_token_id + self.vision_end_token_id = vision_end_token_id + super().__init__(**kwargs, 
tie_word_embeddings=tie_word_embeddings) + + +__all__ = ["Qwen3VLMoeConfig", "Qwen3VLMoeTextConfig"] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/moe.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/moe.py new file mode 100644 index 00000000..f8c25d32 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/moe.py @@ -0,0 +1,261 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +from typing import Callable + +import torch +import torch.nn as nn +from transformers.activations import ACT2FN + +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.configuration_qwen3_vl_moe import ( + Qwen3VLMoeTextConfig, +) +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe_kernels import ( + TOKEN_GROUP_ALIGN_SIZE_M, + _generate_permute_indices, +) + + +def _run_experts_grouped_mm( + gate_up_proj: torch.Tensor, # [num_experts,hidden_size,2*moe_intermediate_size] + down_proj: torch.Tensor, # [num_experts,moe_intermediate_size,hidden_size] + act_fn: Callable[[torch.Tensor], torch.Tensor], + x: torch.Tensor, # [num_tokens,hidden_size] (tokens sorted by expert) + num_tokens_per_expert: torch.Tensor, # [num_experts] + scores: torch.Tensor, # [padded_len] +) -> torch.Tensor: # [num_tokens,hidden_size] + """ + This function runs the gate/up/down projection in a grouped matrix multiplication fashion. 
+ + Args: + gate_up_proj (torch.Tensor): (num_experts, hidden_size, 2 * moe_intermediate_size) + down_proj (torch.Tensor): (num_experts, moe_intermediate_size, hidden_size) + x (torch.Tensor): (batch_size * seq_len, hidden_size) + num_tokens_per_expert (torch.Tensor): (num_experts,) + scores (torch.Tensor): (num_tokens,) + + Returns: + torch.Tensor: (batch_size * seq_len, hidden_size) + """ + offsets = torch.cumsum(num_tokens_per_expert, dim=0, dtype=torch.int32) # [num_experts] + h = torch._grouped_mm(x, gate_up_proj, offs=offsets) # [num_tokens,2*moe_intermediate_size] + h = torch.chunk(h, chunks=2, dim=-1) # 2x [num_tokens,moe_intermediate_size] + h = act_fn(h[0]) * h[1] * scores.unsqueeze(-1) # [num_tokens,moe_intermediate_size] + return torch._grouped_mm(h, down_proj, offs=offsets) # [num_tokens,hidden_size] + + +class Qwen3VLMoeTextExpertsGroupedMm(nn.Module): + def __init__(self, config): + super().__init__() + self.gate_up_proj = nn.Parameter( + torch.empty(config.num_experts, config.hidden_size, 2 * config.moe_intermediate_size) + ) + self.down_proj = nn.Parameter(torch.empty(config.num_experts, config.moe_intermediate_size, config.hidden_size)) + self.act_fn = ACT2FN[config.hidden_act] + + self.num_experts = config.num_experts + self.moe_intermediate_size = config.moe_intermediate_size + self.hidden_size = config.hidden_size + self.top_k = config.num_experts_per_tok + + def forward( + self, + hidden_states: torch.Tensor, # [num_tokens,hidden_size] + topk_scores: torch.Tensor, # [num_tokens,top_k] + expert_indices: torch.Tensor, # [num_tokens,top_k] + num_tokens_per_expert: torch.Tensor, # [num_experts] + ) -> torch.Tensor: # [num_tokens,hidden_size] + """ + This module obtains the output of the experts by routing the tokens + to the experts and then performing a weighted sum of the output of the experts. 
+ + Args: + hidden_states (torch.Tensor): (batch_size * seq_len, hidden_size) + topk_scores (torch.Tensor): (batch_size * seq_len, top_k) + expert_indices (torch.Tensor): (batch_size * seq_len, top_k) + + Returns: + torch.Tensor: (batch_size * seq_len, hidden_size) + """ + num_tokens, dim = hidden_states.shape + topk_scores_sorted, token_indices_sorted = self._reorder_tokens( + topk_scores, + expert_indices, + ) + # topk_scores_sorted: [num_tokens*top_k] + # token_indices_sorted: [num_tokens*top_k] + + # Build padded permutation indices + num_experts = num_tokens_per_expert.shape[0] + alignment = TOKEN_GROUP_ALIGN_SIZE_M + padded_size = num_tokens * self.top_k + num_experts * alignment + padded_size = ((padded_size + alignment - 1) // alignment) * alignment + + permuted_indices, padded_num_tokens_per_expert = _generate_permute_indices( + num_tokens_per_expert, + num_experts, + padded_size, + alignment, + ) + + # Compose: permuted_indices indexes into sorted order, + # token_indices_sorted maps sorted→original. 
Compose them: + sentinel = torch.tensor([num_tokens], device=hidden_states.device) # for padding slots + token_indices_ext = torch.cat([token_indices_sorted, sentinel]) + combined_indices = token_indices_ext[permuted_indices.long()] + combined_indices = combined_indices.unsqueeze(-1).expand(-1, dim) + + # Pad scores with a zero sentinel so padding slots contribute nothing + scores_ext = torch.cat([topk_scores_sorted, topk_scores_sorted.new_zeros(1)]) + combined_scores = scores_ext[permuted_indices.long()] # [padded_len] + + # Single gather (with a zero-padded sentinel row) + input_padded = torch.cat([hidden_states, hidden_states.new_zeros(1, dim)]) + routed_input = input_padded.gather(dim=0, index=combined_indices) + + # Run experts + routed_output = _run_experts_grouped_mm( + self.gate_up_proj, + self.down_proj, + self.act_fn, + routed_input, + padded_num_tokens_per_expert, + combined_scores, + ) + + output_padded = torch.zeros_like(input_padded) + output_padded.scatter_add_(dim=0, index=combined_indices, src=routed_output) + return output_padded[:-1] + + def _reorder_tokens( + self, + topk_scores: torch.Tensor, # [num_tokens,top_k] + expert_indices: torch.Tensor, # [num_tokens,top_k] + ) -> tuple[torch.Tensor, torch.Tensor]: + """Reorder tokens into expert-grouped order via argsort. + + Returns: + topk_scores_sorted: [num_tokens*top_k] scores in expert-grouped order. + token_indices_sorted: [num_tokens*top_k] original token indices in + expert-grouped order. 
+ """ + token_indices_sorted = torch.argsort(expert_indices.view(-1), stable=True) # [num_tokens*top_k] + topk_scores_sorted = topk_scores.view(-1)[token_indices_sorted] # [num_tokens*top_k] + token_indices_sorted = token_indices_sorted // self.top_k # [num_tokens*top_k] + return topk_scores_sorted, token_indices_sorted + + def init_weights(self, buffer_device: torch.device): + nn.init.normal_(self.gate_up_proj, mean=0.0, std=0.02) + nn.init.normal_(self.down_proj, mean=0.0, std=0.02) + + +class Qwen3VLMoeTextExpertsNaive(nn.Module): + def __init__(self, config): + super().__init__() + self.gate_up_proj = nn.Parameter( + torch.empty(config.num_experts, config.hidden_size, 2 * config.moe_intermediate_size) + ) + self.down_proj = nn.Parameter(torch.empty(config.num_experts, config.moe_intermediate_size, config.hidden_size)) + self.act_fn = ACT2FN[config.hidden_act] + + self.num_experts = config.num_experts + self.moe_intermediate_size = config.moe_intermediate_size + self.hidden_size = config.hidden_size + + def forward( + self, + hidden_states: torch.Tensor, # [num_tokens,hidden_size] + topk_scores: torch.Tensor, # [num_tokens,top_k] + expert_indices: torch.Tensor, # [num_tokens,top_k] + num_tokens_per_expert: torch.Tensor, # [num_experts] + ) -> torch.Tensor: # [num_tokens,hidden_size] + """ + When training, it is more efficient to loop over the experts and compute the output for each expert, + as otherwise the memory would explode. + + For inference we can sacrifice some memory and compute the output for all experts at once by repeating the inputs.
+ + Args: + hidden_states (torch.Tensor): (batch_size * seq_len, hidden_size) + topk_scores (torch.Tensor): (batch_size * seq_len, top_k) + expert_indices (torch.Tensor): (batch_size * seq_len, top_k) + num_tokens_per_expert (torch.Tensor): (num_experts,), unused by this implementation + + Returns: + torch.Tensor: (batch_size * seq_len, hidden_size) + """ + del num_tokens_per_expert + assert hidden_states.ndim == 2, "hidden_states must be of shape (batch_size * seq_len, hidden_size)" + assert hidden_states.shape[1] == self.hidden_size, ( + "hidden_states must be of shape (batch_size * seq_len, hidden_size)" + ) + routing_weights = torch.zeros( + hidden_states.shape[0], + self.num_experts, + dtype=hidden_states.dtype, + device=hidden_states.device, + ) # [num_tokens,num_experts] + routing_weights = routing_weights.scatter_(1, expert_indices, topk_scores) # [num_tokens,num_experts] + + if self.training: + next_states = torch.zeros_like(hidden_states) # [num_tokens,hidden_size] + with torch.no_grad(): + expert_mask = torch.nn.functional.one_hot( + expert_indices, num_classes=self.num_experts + ) # [num_tokens,top_k,num_experts] + expert_mask = expert_mask.permute(2, 1, 0) # [num_experts,top_k,num_tokens] + # we sum over the top_k and token dimensions to find which experts + # are hit this time around + expert_hit = torch.greater(expert_mask.sum(dim=(-1, -2)), 0).nonzero() + for expert_idx in expert_hit[:]: + with torch.no_grad(): + _, token_idx = torch.where(expert_mask[expert_idx[0]]) + current_state = hidden_states[token_idx] # [num_expert_tokens,hidden_size] + gate_up = current_state @ self.gate_up_proj[expert_idx] # [1,num_expert_tokens,2*moe_intermediate_size] + gate, up = gate_up.chunk(2, dim=-1) # 2x [1,num_expert_tokens,moe_intermediate_size] + gated_output = up * self.act_fn(gate) # [1,num_expert_tokens,moe_intermediate_size] + out = gated_output @ self.down_proj[expert_idx] # [1,num_expert_tokens,hidden_size] + weighted_output = out[0] * routing_weights[token_idx, expert_idx, None] +
assert weighted_output.dtype == hidden_states.dtype + next_states.index_add_(0, token_idx, weighted_output) + else: + hidden_states = hidden_states.repeat(self.num_experts, 1) # [num_experts*num_tokens,hidden_size] + hidden_states = hidden_states.view( + self.num_experts, -1, self.hidden_size + ) # [num_experts,num_tokens,hidden_size] + gate_up = torch.bmm(hidden_states, self.gate_up_proj) # [num_experts,num_tokens,2*moe_intermediate_size] + gate, up = gate_up.chunk( + 2, dim=-1 + ) # not supported for DTensors # 2x [num_experts,num_tokens,moe_intermediate_size] + next_states = torch.bmm((up * self.act_fn(gate)), self.down_proj) # [num_experts,num_tokens,hidden_size] + next_states = next_states * routing_weights.transpose(0, 1).unsqueeze( + dim=-1 + ) # [num_experts,num_tokens,hidden_size] + next_states = next_states.sum(dim=0) # [num_tokens,hidden_size] + return next_states + + def init_weights(self, buffer_device: torch.device): + nn.init.normal_(self.gate_up_proj, mean=0.0, std=0.02) + nn.init.normal_(self.down_proj, mean=0.0, std=0.02) + + +def create_text_experts(config: Qwen3VLMoeTextConfig, implementation_type: str = "naive") -> nn.Module: + if implementation_type == "naive": + return Qwen3VLMoeTextExpertsNaive(config) + elif implementation_type == "grouped_mm": + return Qwen3VLMoeTextExpertsGroupedMm(config) + else: + raise ValueError(f"Invalid implementation: {implementation_type}") diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/moe_bench.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/moe_bench.py new file mode 100644 index 00000000..ff36b535 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/moe_bench.py @@ -0,0 +1,455 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Speed benchmark for Qwen3VLMoeTextExpertsGroupedMm. + +Usage: + # Default benchmark (forward only, compiled, bf16): + python -m cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe_bench + + # Forward + backward: + python -m cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe_bench --backward + + # Compare grouped_mm vs naive: + python -m cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe_bench --compare + + # Disable torch.compile: + python -m cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe_bench --no-compile + + # Custom sweep: + python -m cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe_bench --backward \ + --hidden-size 4096 \ + --moe-intermediate-size 1536 + + # Capture a torch profiler trace (Chrome trace JSON): + python -m cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe_bench --profile + + # Profile to a custom directory: + python -m cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe_bench --profile --profile-dir ./my_traces + + # All options: + python -m cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe_bench --help +""" + +import itertools +import os +from dataclasses import dataclass, field +from pathlib import Path +from typing import Literal + +import torch +import tyro + +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.configuration_qwen3_vl_moe import ( + Qwen3VLMoeTextConfig, +) +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe import ( + Qwen3VLMoeTextExpertsGroupedMm, + Qwen3VLMoeTextExpertsNaive, +) + 
+ +@dataclass +class BenchConfig: + """Benchmark Qwen3VLMoeTextExpertsGroupedMm.""" + + num_tokens: list[int] = field(default_factory=lambda: [16384, 32768]) + num_experts: list[int] = field(default_factory=lambda: [128]) + top_k: list[int] = field(default_factory=lambda: [8]) + hidden_size: list[int] = field(default_factory=lambda: [2048]) + moe_intermediate_size: list[int] = field(default_factory=lambda: [768]) + num_warmup: int = 10 + num_iters: int = 100 + backward: bool = False + """Also benchmark backward pass.""" + compare: bool = False + """Compare grouped_mm vs naive.""" + compile: bool = True + """Wrap module with torch.compile before benchmarking.""" + profile: bool = False + """Capture a torch profiler trace after benchmarking.""" + profile_dir: str = "./profiles" + """Directory to write Chrome trace JSON files.""" + dtype: Literal["bf16", "fp32"] = "bf16" + + +@dataclass +class BenchResult: + num_tokens: int + num_experts: int + top_k: int + hidden_size: int + moe_intermediate_size: int + fwd_ms: float + bwd_ms: float + peak_mem_mb: float + + +def _make_inputs( + num_tokens: int, + config: Qwen3VLMoeTextConfig, + device: torch.device, + dtype: torch.dtype, +) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]: + hidden_states = torch.randn(num_tokens, config.hidden_size, device=device, dtype=dtype) + + expert_indices = torch.stack( + [torch.randperm(config.num_experts, device=device)[: config.num_experts_per_tok] for _ in range(num_tokens)] + ).to(torch.int64) + + topk_scores = torch.rand(num_tokens, config.num_experts_per_tok, device=device, dtype=dtype) + topk_scores = topk_scores / topk_scores.sum(dim=-1, keepdim=True) + + num_tokens_per_expert = torch.zeros(config.num_experts, dtype=torch.int32, device=device) + for idx in expert_indices.view(-1): + num_tokens_per_expert[idx] += 1 + + return hidden_states, topk_scores, expert_indices, num_tokens_per_expert + + +def bench_forward( + module: torch.nn.Module, + hidden_states: 
torch.Tensor, + topk_scores: torch.Tensor, + expert_indices: torch.Tensor, + num_tokens_per_expert: torch.Tensor, + num_warmup: int = 20, + num_iters: int = 100, +) -> float: + for _ in range(num_warmup): + with torch.no_grad(): + module(hidden_states, topk_scores, expert_indices, num_tokens_per_expert) + torch.cuda.synchronize() + + start = torch.cuda.Event(enable_timing=True) + end = torch.cuda.Event(enable_timing=True) + + start.record() + for _ in range(num_iters): + with torch.no_grad(): + module(hidden_states, topk_scores, expert_indices, num_tokens_per_expert) + end.record() + torch.cuda.synchronize() + + return start.elapsed_time(end) / num_iters + + +def bench_backward( + module: torch.nn.Module, + hidden_states: torch.Tensor, + topk_scores: torch.Tensor, + expert_indices: torch.Tensor, + num_tokens_per_expert: torch.Tensor, + num_warmup: int = 20, + num_iters: int = 100, +) -> float: + for _ in range(num_warmup): + h = hidden_states.detach().requires_grad_(True) + out = module(h, topk_scores, expert_indices, num_tokens_per_expert) + out.sum().backward() + torch.cuda.synchronize() + + start = torch.cuda.Event(enable_timing=True) + end = torch.cuda.Event(enable_timing=True) + + start.record() + for _ in range(num_iters): + h = hidden_states.detach().requires_grad_(True) + out = module(h, topk_scores, expert_indices, num_tokens_per_expert) + out.sum().backward() + end.record() + torch.cuda.synchronize() + + return start.elapsed_time(end) / num_iters + + +def profile_run( + module: torch.nn.Module, + hidden_states: torch.Tensor, + topk_scores: torch.Tensor, + expert_indices: torch.Tensor, + num_tokens_per_expert: torch.Tensor, + output_path: str, + include_backward: bool = False, + num_warmup: int = 5, + num_active: int = 3, +) -> None: + """Run a few iterations under the torch profiler and export a Chrome trace.""" + + def _step() -> None: + if include_backward: + h = hidden_states.detach().requires_grad_(True) + out = module(h, topk_scores, expert_indices, 
num_tokens_per_expert) + out.sum().backward() + else: + with torch.no_grad(): + module(hidden_states, topk_scores, expert_indices, num_tokens_per_expert) + + for _ in range(num_warmup): + _step() + torch.cuda.synchronize() + + with torch.profiler.profile( + activities=[ + torch.profiler.ProfilerActivity.CPU, + torch.profiler.ProfilerActivity.CUDA, + ], + record_shapes=True, + with_stack=True, + ) as prof: + for _ in range(num_active): + _step() + torch.cuda.synchronize() + + prof.export_chrome_trace(output_path) + print(f"\nProfile trace saved to {output_path}") + print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=25)) + + +def run_single( + num_tokens: int, + num_experts: int, + top_k: int, + hidden_size: int, + moe_intermediate_size: int, + include_backward: bool, + num_warmup: int, + num_iters: int, + use_compile: bool, + dtype: torch.dtype = torch.bfloat16, + trace_path: str | None = None, +) -> BenchResult: + device = torch.device("cuda") + config = Qwen3VLMoeTextConfig( + hidden_size=hidden_size, + moe_intermediate_size=moe_intermediate_size, + num_experts=num_experts, + num_experts_per_tok=top_k, + hidden_act="silu", + ) + module = Qwen3VLMoeTextExpertsGroupedMm(config).to(device=device, dtype=dtype) + module.init_weights(device) + if use_compile: + module = torch.compile(module, fullgraph=True, dynamic=True) + + hidden_states, topk_scores, expert_indices, num_tokens_per_expert = _make_inputs(num_tokens, config, device, dtype) + + torch.cuda.reset_peak_memory_stats(device) + + fwd_ms = bench_forward( + module, + hidden_states, + topk_scores, + expert_indices, + num_tokens_per_expert, + num_warmup=num_warmup, + num_iters=num_iters, + ) + + bwd_ms = 0.0 + if include_backward: + bwd_ms = bench_backward( + module, + hidden_states, + topk_scores, + expert_indices, + num_tokens_per_expert, + num_warmup=num_warmup, + num_iters=num_iters, + ) + + peak_mem_mb = torch.cuda.max_memory_allocated(device) / (1024 * 1024) + + if trace_path is not None: + 
profile_run( + module, + hidden_states, + topk_scores, + expert_indices, + num_tokens_per_expert, + output_path=trace_path, + include_backward=include_backward, + ) + + return BenchResult( + num_tokens=num_tokens, + num_experts=num_experts, + top_k=top_k, + hidden_size=hidden_size, + moe_intermediate_size=moe_intermediate_size, + fwd_ms=fwd_ms, + bwd_ms=bwd_ms, + peak_mem_mb=peak_mem_mb, + ) + + +def run_comparison( + num_tokens: int, + config: Qwen3VLMoeTextConfig, + num_warmup: int, + num_iters: int, + use_compile: bool, + dtype: torch.dtype = torch.bfloat16, + trace_dir: str | None = None, +) -> None: + """Run grouped_mm vs naive side-by-side and report speedup.""" + device = torch.device("cuda") + + naive = Qwen3VLMoeTextExpertsNaive(config).to(device=device, dtype=dtype) + grouped = Qwen3VLMoeTextExpertsGroupedMm(config).to(device=device, dtype=dtype) + naive.init_weights(device) + grouped.load_state_dict(naive.state_dict()) + if use_compile: + grouped = torch.compile(grouped, fullgraph=True, dynamic=False) + + hidden_states, topk_scores, expert_indices, num_tokens_per_expert = _make_inputs(num_tokens, config, device, dtype) + + naive_ms = bench_forward( + naive, + hidden_states, + topk_scores, + expert_indices, + num_tokens_per_expert, + num_warmup=num_warmup, + num_iters=num_iters, + ) + grouped_ms = bench_forward( + grouped, + hidden_states, + topk_scores, + expert_indices, + num_tokens_per_expert, + num_warmup=num_warmup, + num_iters=num_iters, + ) + + with torch.no_grad(): + out_naive = naive(hidden_states, topk_scores, expert_indices, num_tokens_per_expert) + out_grouped = grouped(hidden_states, topk_scores, expert_indices, num_tokens_per_expert) + rel_err = (out_naive - out_grouped).norm() / out_naive.norm() + + print(f" naive: {naive_ms:8.3f} ms") + print(f" grouped: {grouped_ms:8.3f} ms") + print(f" speedup: {naive_ms / grouped_ms:8.2f}x") + print(f" rel error: {rel_err.item():.6e}") + + if trace_dir is not None: + tag = ( + 
f"T{num_tokens}_E{config.num_experts}_K{config.num_experts_per_tok}" + f"_H{config.hidden_size}_I{config.moe_intermediate_size}" + ) + for name, mod in [("naive", naive), ("grouped", grouped)]: + path = os.path.join(trace_dir, f"compare_{name}_{tag}.json") + profile_run( + mod, + hidden_states, + topk_scores, + expert_indices, + num_tokens_per_expert, + output_path=path, + ) + + +def main(args: BenchConfig) -> None: + dtype_map = {"bf16": torch.bfloat16, "fp32": torch.float32} + dtype = dtype_map[args.dtype] + + profile_dir: str | None = None + if args.profile: + profile_dir = args.profile_dir + Path(profile_dir).mkdir(parents=True, exist_ok=True) + + gpu_name = torch.cuda.get_device_name(0) + print(f"GPU: {gpu_name}") + print(f"dtype: {args.dtype}, compile: {args.compile}") + print(f"warmup: {args.num_warmup}, iters: {args.num_iters}") + if profile_dir: + print(f"profile dir: {profile_dir}") + print() + + if args.compare: + for num_tokens, num_experts, top_k, hidden_size, moe_intermediate_size in itertools.product( + args.num_tokens, + args.num_experts, + args.top_k, + args.hidden_size, + args.moe_intermediate_size, + ): + config = Qwen3VLMoeTextConfig( + hidden_size=hidden_size, + moe_intermediate_size=moe_intermediate_size, + num_experts=num_experts, + num_experts_per_tok=top_k, + hidden_act="silu", + ) + header = ( + f"tokens={num_tokens} experts={num_experts} top_k={top_k} " + f"hidden={hidden_size} intermediate={moe_intermediate_size}" + ) + print(header) + run_comparison( + num_tokens=num_tokens, + config=config, + num_warmup=args.num_warmup, + num_iters=args.num_iters, + use_compile=args.compile, + dtype=dtype, + trace_dir=profile_dir, + ) + print() + return + + header = ( + f"{'tokens':>8} {'experts':>7} {'top_k':>5} {'hidden':>6} " + f"{'interm':>6} {'fwd_ms':>8} {'bwd_ms':>8} {'peak_MB':>9}" + ) + print(header) + print("-" * len(header)) + + for num_tokens, num_experts, top_k, hidden_size, moe_intermediate_size in itertools.product( + args.num_tokens, + 
args.num_experts, + args.top_k, + args.hidden_size, + args.moe_intermediate_size, + ): + trace_path = None + if profile_dir is not None: + mode = "fwd_bwd" if args.backward else "fwd" + tag = f"T{num_tokens}_E{num_experts}_K{top_k}_H{hidden_size}_I{moe_intermediate_size}" + trace_path = os.path.join(profile_dir, f"{mode}_{tag}.json") + result = run_single( + num_tokens=num_tokens, + num_experts=num_experts, + top_k=top_k, + hidden_size=hidden_size, + moe_intermediate_size=moe_intermediate_size, + include_backward=args.backward, + num_warmup=args.num_warmup, + num_iters=args.num_iters, + use_compile=args.compile, + dtype=dtype, + trace_path=trace_path, + ) + bwd_str = f"{result.bwd_ms:8.3f}" if args.backward else " N/A" + print( + f"{result.num_tokens:>8} {result.num_experts:>7} {result.top_k:>5} " + f"{result.hidden_size:>6} {result.moe_intermediate_size:>6} " + f"{result.fwd_ms:>8.3f} {bwd_str} {result.peak_mem_mb:>9.1f}" + ) + + +if __name__ == "__main__": + main(tyro.cli(BenchConfig)) diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/moe_kernels.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/moe_kernels.py new file mode 100644 index 00000000..98841b35 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/moe_kernels.py @@ -0,0 +1,216 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Callable, Literal + +import torch +import triton +import triton.language as tl + +# Set the token group alignment size for experts in MoE. This is implemented by +# padding each expert size to the next multiple of TOKEN_GROUP_ALIGN_SIZE_M. + +# Valid values are: 8, 16, or 32. +# Different values are needed for different cases: + +# * For bf16, 8 is enough (16 byte alignment / 2 bytes per elem = 8 elements). +# * For fp8, 16 byte alignment / 1 byte per elem = 16 elements. +# * For mxfp8, we need 32 (or block_size) because scaling block size is (1 x 32), +# so when doing per-token-group quantization on each logically distinct subtensor, +# we need to ensure the contracting dim is divisible by block_size. +# In the backward pass, grad_weight = (grad_output_t @ input).t() has gemm dims +# of (N, M) @ (M, K) so M is the contracting dim, and group offsets are along M, +# so we need 32 element alignment. 
+TOKEN_GROUP_ALIGN_SIZE_M = 16 +ValidTokenGroupAlignmentSize = Literal[8, 16, 32] + + +def _permute( + x: torch.Tensor, + num_tokens_per_expert: torch.Tensor, + num_experts: int, + alignment: int = TOKEN_GROUP_ALIGN_SIZE_M, +): + x_padded_size = x.shape[0] + num_experts * alignment + padded_max_len = ((x_padded_size + alignment - 1) // alignment) * alignment + + with torch.no_grad(): + ( + permuted_indices, + padded_num_tokens_per_expert, + ) = _generate_permute_indices( + num_tokens_per_expert=num_tokens_per_expert, + num_experts=num_experts, + max_len=padded_max_len, + alignment=alignment, + ) + + x = torch.vstack((x, x.new_zeros(x.shape[-1]))) + input_shape = x.shape + x = x[permuted_indices, :] + + return input_shape, x, permuted_indices, padded_num_tokens_per_expert + + +def _unpermute(out, input_shape, permuted_indices): + out_unpermuted = out.new_empty(input_shape) + out_unpermuted[permuted_indices, :] = out + return out_unpermuted[:-1] + + +def indices_padding_wrapper(func: Callable) -> Callable: + """ + In order to use torch._grouped_mm, we need to make sure the number of + tokens each expert gets is a multiple of TOKEN_GROUP_ALIGN_SIZE_M. The + _generate_permute_indices helper achieves this via padding, + without incurring synchronization between device and host.
+ """ + + def wrapper( + gate_up_proj: torch.Tensor, + down_proj: torch.Tensor, + act_fn: Callable[[torch.Tensor], torch.Tensor], + x: torch.Tensor, + num_tokens_per_expert: torch.Tensor, + ) -> torch.Tensor: + num_experts = num_tokens_per_expert.shape[0] + + input_shape, x, permuted_indices, padded_num_tokens_per_expert = _permute(x, num_tokens_per_expert, num_experts) + + out = func(gate_up_proj, down_proj, act_fn, x, padded_num_tokens_per_expert) + + out = _unpermute(out, input_shape, permuted_indices) + return out + + return wrapper + + +@triton.jit +def _fill_indices_kernel( + num_tokens_per_expert_ptr, + start_index_values_ptr, + write_offsets_ptr, + output_ptr, + num_experts: tl.constexpr, + BLOCK_SIZE: tl.constexpr, # Number of threads per block +): + pid = tl.program_id(axis=0) + num_programs = tl.num_programs(axis=0) + + # map programs (blocks) to the experts and loop (grid stride) if needed + for expert_id in range(pid, num_experts, num_programs): + # read this experts write offset + write_offset = tl.load(write_offsets_ptr + expert_id) + + # load number of tokens for this expert + start_index = tl.load(start_index_values_ptr + expert_id) + length = tl.load(num_tokens_per_expert_ptr + expert_id) + + # each thread in block processes tokens in parallel + offsets = tl.arange(0, BLOCK_SIZE) + + # tokens are processed in chunks of BLOCK_SIZE + for chunk_start in range(0, length, BLOCK_SIZE): + chunk_offsets = chunk_start + offsets + + # mask valid indices + mask = chunk_offsets < length + + values = start_index + chunk_offsets + + # destination + dest_indices = write_offset + chunk_offsets + + # store + tl.store(output_ptr + dest_indices, values, mask=mask) + + +def _fill_indices_wrapper( + num_tokens_per_expert: torch.Tensor, + start_index_values: torch.Tensor, + write_offsets: torch.Tensor, + num_experts: int, + max_len: int, + block_size: int = 128, + max_blocks: int = 1024, # cap on total number of blocks to launch +): + # preallocate output + 
permuted_indices = torch.full((max_len,), -1, dtype=torch.int32, device=num_tokens_per_expert.device) + + # write offsets is per local expert... + num_blocks = min(num_experts, max_blocks) + # grid = one block per expert unless capped and then we loop... + grid = (num_blocks,) + + # launch kernel + _fill_indices_kernel[grid]( + num_tokens_per_expert, + start_index_values, + write_offsets, + permuted_indices, + num_experts, + BLOCK_SIZE=block_size, + ) + return permuted_indices + + +def _generate_permute_indices( + num_tokens_per_expert: torch.Tensor, + num_experts: int, + max_len: int, + alignment: int, +): + """ + Prepare permutation indices and the number of tokens for each expert. + + Args: + num_tokens_per_expert: number of tokens for each expert. + num_experts: number of experts. + max_len: maximum length of the output index vector. + alignment: alignment applied to each element of `m_sizes`; also the minimum padded size for experts that receive zero tokens. + + Returns: + permuted_indices: Tensor of length max_len that maps original token order to the expert-grouped order; padding slots are filled with -1. + m_sizes: aligned number of tokens for each expert (padded to alignment boundary). + Note: the cumulative offsets of m_sizes are not returned; callers compute them with torch.cumsum when needed.
+ + Explanatory details: + `tokens_per_expert_group` is of shape (num_ranks * experts_per_rank,), for example: + From: | rank 0 | rank 1 | + To: | E0 | E1 | E2 | E3 | E0 | E1 | E2 | E3 | + | 4 | 2 | 1 | 3 | 1 | 2 | 3 | 4 | + """ + start_index_values = torch.cumsum(num_tokens_per_expert, dim=0) - num_tokens_per_expert + + # pad out empty experts to alignment requirement + m_sizes = torch.clamp_min(num_tokens_per_expert, alignment) + + # align the chunk sizes (cdiv) + m_sizes = (m_sizes.to(torch.int32) + alignment - 1) // alignment * alignment + + # additional prefix sum to get write offset of each expert in permuted_indices + # write offsets is per local expert, not global + write_offsets = torch.cumsum(m_sizes, dim=0) - m_sizes + + # Select the implementation to use + permuted_indices = _fill_indices_wrapper( + num_tokens_per_expert=num_tokens_per_expert, + start_index_values=start_index_values, + write_offsets=write_offsets, + num_experts=num_experts, + max_len=max_len, + ) + + return permuted_indices, m_sizes diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/qwen3_vl_moe.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/qwen3_vl_moe.py new file mode 100644 index 00000000..96d5fc0b --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm/qwen3_vl_moe/qwen3_vl_moe.py @@ -0,0 +1,2071 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +import functools +from dataclasses import dataclass +from typing import Any, Callable, NamedTuple, Optional, Union + +import torch +import torch.nn as nn +import torch.nn.functional as F +from transformers.activations import ACT2FN +from transformers.cache_utils import Cache, DynamicCache +from transformers.generation import GenerationMixin +from transformers.integrations import use_kernel_forward_from_hub +from transformers.masking_utils import create_causal_mask +from transformers.modeling_flash_attention_utils import FlashAttentionKwargs +from transformers.modeling_layers import GradientCheckpointingLayer +from transformers.modeling_outputs import BaseModelOutputWithPast, ModelOutput +from transformers.modeling_rope_utils import ROPE_INIT_FUNCTIONS, dynamic_rope_update +from transformers.modeling_utils import ALL_ATTENTION_FUNCTIONS, PreTrainedModel +from transformers.processing_utils import Unpack +from transformers.utils import TransformersKwargs, is_torchdynamo_compiling +from transformers.utils.deprecation import deprecate_kwarg + +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.configuration_qwen3_vl_moe import ( + Qwen3VLMoeConfig, + Qwen3VLMoeTextConfig, + Qwen3VLMoeVisionConfig, +) +from cosmos3._src.vfm.models.vlm.qwen3_vl_moe.moe import ( + create_text_experts, +) + +# Small additive constant to prevent log(0) in router entropy computation. +ENTROPY_EPSILON = 1e-9 + + +# Avoid torch.combinations here: during FSDP/lazy init this module can be built +# under a meta-device context, and torch.combinations internally calls +# masked_select, which does not have a meta kernel. 
+def _make_coactivation_pairs(top_k: int, device: torch.device | str | None = None) -> torch.Tensor: + target_device = torch.device(device) if device is not None else torch.device("cpu") + if target_device.type == "meta": + target_device = torch.device("cpu") + + pairs = [(i, j) for i in range(top_k) for j in range(i + 1, top_k)] + if not pairs: + return torch.empty((0, 2), dtype=torch.long, device=target_device) + return torch.tensor(pairs, dtype=torch.long, device=target_device) + + +# We need to use namedtuple instead of dataclass because it is picklable. +class LBLMetadata(NamedTuple): + """Metadata for load balancing loss computation.""" + + # The number of tokens routed to each expert for this rank. + num_tokens_per_expert: torch.Tensor + + # The total number of tokens in the batch. + num_tokens: torch.Tensor + + # The average probability of routing to each expert for this rank. + mean_router_prob_per_expert: torch.Tensor + + +@use_kernel_forward_from_hub("RMSNorm") +class Qwen3VLMoeTextRMSNorm(nn.Module): + def __init__(self, hidden_size, eps=1e-6): + """ + Qwen3VLMoeTextRMSNorm is equivalent to T5LayerNorm + """ + super().__init__() + self.weight = nn.Parameter(torch.ones(hidden_size)) + self.variance_epsilon = eps + + def forward(self, hidden_states): + input_dtype = hidden_states.dtype + hidden_states = hidden_states.to(torch.float32) + variance = hidden_states.pow(2).mean(-1, keepdim=True) + hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon) + return self.weight * hidden_states.to(input_dtype) + + def extra_repr(self): + return f"{tuple(self.weight.shape)}, eps={self.variance_epsilon}" + + +class Qwen3VLMoeTextSparseMoeBlock(nn.Module): + def __init__(self, config): + super().__init__() + self.config = config + self.hidden_size = config.hidden_size + self.num_experts = config.num_experts + self.top_k = config.num_experts_per_tok + self.gate = nn.Linear(config.hidden_size, config.num_experts, bias=False) + self.experts = 
create_text_experts(config, implementation_type="grouped_mm") + + # ── Heatmap tracking ────────────────────────────────────────────────────── + # Token counts read and reset by ExpertHeatmap on its own schedule. + # persistent=False so these are never saved to checkpoints. + self.register_buffer( + "total_tokens_per_expert", + torch.zeros(config.num_experts, dtype=torch.int64), + persistent=False, + ) + self.register_buffer( + "total_tokens", + torch.zeros(1, dtype=torch.int64), + persistent=False, + ) + + # ── Stability tracking ─────────────────────────────────────────────────── + # Separate token-count buffers owned and reset by MoEStabilityCallback, + # so it is fully independent of ExpertHeatmap's reset cycle. + self.register_buffer( + "stability_tokens_per_expert", + torch.zeros(config.num_experts, dtype=torch.int64), + persistent=False, + ) + self.register_buffer( + "stability_total_tokens", + torch.zeros(1, dtype=torch.int64), + persistent=False, + ) + # Sum of per-token router entropy H = -sum(p_i * log p_i) across all tokens + # seen since the last reset. Divided by stability_total_tokens in the + # callback to get the mean entropy, then normalized by log(N) for [0, 1]. + # float64 to avoid precision loss when accumulating over many steps. + self.register_buffer( + "sum_token_entropy", + torch.zeros(1, dtype=torch.float64), + persistent=False, + ) + # Sum of per-token soft-effective-experts exp(H(p_t)) across all tokens + # seen since the last reset. Divided by stability_total_tokens in the + # callback to get mean_t exp(H(p_t)), the average per-token perplexity + # of the router. Note: this is NOT exp of the mean entropy — by Jensen, + # mean_t exp(H_t) >= exp(mean_t H_t), and the difference matters when + # per-token entropies are heterogeneous (e.g. mix of sharp and broad + # router distributions). Owned and reset by MoEStabilityCallback. + # float64 to avoid precision loss when accumulating over many steps. 
+ self.register_buffer( + "sum_per_token_soft_eff", + torch.zeros(1, dtype=torch.float64), + persistent=False, + ) + + # ── Specialization tracking ─────────────────────────────────────────────── + # N×N symmetric matrix counting how often each expert pair (i, j) appears + # together in the top-K selection for the same token. Only the upper triangle + # (i < j) is written; read and reset by MoESpecializationCallback. + self.register_buffer( + "coactivation_counts", + torch.zeros(config.num_experts, config.num_experts, dtype=torch.int64), + persistent=False, + ) + # Precomputed C(top_k, 2) slot-index pairs used by the co-activation counting + # kernel in forward(). Registered as a buffer so it moves to the correct device + # with the module; persistent=False since it's derived from config constants. + self.register_buffer( + "_coact_pairs", + _make_coactivation_pairs(config.num_experts_per_tok), + persistent=False, + ) + + def _update_moe_callback_stats( + self, + num_tokens_per_expert: torch.Tensor, + num_tokens: torch.Tensor, + routing_weights: torch.Tensor, + expert_indices: torch.Tensor, + ) -> None: + # ── Heatmap + stability buffers ────────────────────────────────────── + # Accumulate into both buffer sets so each callback can reset independently. + self.total_tokens_per_expert.add_(num_tokens_per_expert) + self.total_tokens.add_(num_tokens) + self.stability_tokens_per_expert.add_(num_tokens_per_expert) + self.stability_total_tokens.add_(num_tokens) + + # Per-token router entropy H_t = -sum_i p_i * log(p_i). + # Summed (not meaned) so the callback can normalize by any window length. + # 1e-9 prevents log(0) for near-zero probabilities. + token_entropy = -torch.sum( + routing_weights * torch.log(routing_weights + ENTROPY_EPSILON), dim=-1 + ) # [num_tokens] + self.sum_token_entropy.add_(token_entropy.sum().to(torch.float64)) + # Per-token soft effective experts = exp(H(p_t)), bounded in [1, N]. 
+ # We accumulate the sum here (not the mean) so the callback can compute + # mean_t exp(H_t) over any reset window. Kept separate from + # sum_token_entropy because exp(mean H) != mean exp(H) in general. + self.sum_per_token_soft_eff.add_(token_entropy.exp().sum().to(torch.float64)) + + # ── Co-activation counting ──────────────────────────────────────────── + # For every ordered pair (k1, k2) of top-K slots with k1 < k2, find the + # expert assigned to each slot and increment coactivation_counts[i, j] + # where i = min(expert_k1, expert_k2), j = max(...) to keep counts in the + # upper triangle only (avoids double-counting the pair). + # Vectorized over all C(K,2) pairs in one scatter_add_ call to avoid + # C(K,2) separate kernel launches (28 for top_k=8). + # _coact_pairs: [C(K,2), 2] — precomputed slot index pairs (k1, k2) with k1 < k2 + e1 = expert_indices[:, self._coact_pairs[:, 0]] # [num_tokens, C(K,2)] + e2 = expert_indices[:, self._coact_pairs[:, 1]] # [num_tokens, C(K,2)] + lo = torch.minimum(e1, e2) + hi = torch.maximum(e1, e2) + flat_idx = (lo * self.num_experts + hi).to(torch.int64) # [num_tokens, C(K,2)] + flat_counts = torch.zeros( + self.num_experts * self.num_experts, + dtype=self.coactivation_counts.dtype, + device=self.coactivation_counts.device, + ) + flat_idx = flat_idx.reshape(-1) + flat_counts.scatter_add_(0, flat_idx, torch.ones_like(flat_idx, dtype=flat_counts.dtype)) + self.coactivation_counts.view(-1).add_(flat_counts) + + def forward(self, hidden_states: torch.Tensor) -> tuple[torch.Tensor, LBLMetadata]: + """ + This function performs the MoE computation, including routing, dispatch, GEMMs and combine. + + Args: + hidden_states (torch.Tensor): (num_tokens, hidden_size) + + Returns: + torch.Tensor: (num_tokens, hidden_size) + - routed_out: Output of the MoE computation. + LBLMetadata: Load balancing loss metadata. 
+ """ + assert hidden_states.ndim == 2, "hidden_states must be of shape (num_tokens, hidden_size)" + num_tokens = hidden_states.shape[0] + + router_logits = self.gate(hidden_states) # [num_tokens,num_experts] + routing_weights = torch.nn.functional.softmax( + router_logits, dim=-1, dtype=torch.float32 + ) # [num_tokens,num_experts] + expert_weights, expert_indices = torch.topk(routing_weights, self.top_k, dim=-1) + # expert_weights: [num_tokens,top_k], expert_indices: [num_tokens,top_k] + + expert_weights = expert_weights / expert_weights.sum(dim=-1, keepdim=True) # [num_tokens,top_k] + expert_weights = expert_weights.to(hidden_states.dtype) # [num_tokens,top_k] + + num_tokens_per_expert = torch.histc( + expert_indices.to(dtype=torch.int32).view(-1), + bins=self.num_experts, + min=0, + max=self.num_experts - 1, + ) # [num_experts] + + routed_out = self.experts( + hidden_states=hidden_states, + topk_scores=expert_weights, + expert_indices=expert_indices, + num_tokens_per_expert=num_tokens_per_expert, + ) # [num_tokens,hidden_size] + + num_tokens_per_expert = num_tokens_per_expert.to(dtype=torch.int64) # [num_experts] + num_tokens = torch.tensor( + [num_tokens], + dtype=torch.int64, + device=num_tokens_per_expert.device, + ) # [1] + + # Compute the average probability of routing to these experts. + # Summing over all experts should be equal to 1. 
+ mean_router_prob_per_expert = torch.mean(routing_weights, dim=0) # [num_experts] + + lbl_metadata = LBLMetadata( + num_tokens_per_expert=num_tokens_per_expert, + num_tokens=num_tokens, + mean_router_prob_per_expert=mean_router_prob_per_expert, + ) + + with torch.no_grad(): + self._update_moe_callback_stats( + num_tokens_per_expert=num_tokens_per_expert, + num_tokens=num_tokens, + routing_weights=routing_weights, + expert_indices=expert_indices, + ) + + return routed_out, lbl_metadata + + def get_total_tokens_per_expert(self, reset: bool = True) -> torch.Tensor: + with torch.no_grad(): + total_tokens = self.total_tokens_per_expert.detach().clone() + if reset: + self.total_tokens_per_expert.zero_() + return total_tokens + + def get_total_tokens(self, reset: bool = True) -> torch.Tensor: + with torch.no_grad(): + total_tokens = self.total_tokens.detach().clone() + if reset: + self.total_tokens.zero_() + return total_tokens + + def get_stability_tokens_per_expert(self, reset: bool = True) -> torch.Tensor: + with torch.no_grad(): + val = self.stability_tokens_per_expert.detach().clone() + if reset: + self.stability_tokens_per_expert.zero_() + return val + + def get_stability_total_tokens(self, reset: bool = True) -> torch.Tensor: + with torch.no_grad(): + val = self.stability_total_tokens.detach().clone() + if reset: + self.stability_total_tokens.zero_() + return val + + def get_sum_token_entropy(self, reset: bool = True) -> torch.Tensor: + with torch.no_grad(): + val = self.sum_token_entropy.detach().clone() + if reset: + self.sum_token_entropy.zero_() + return val + + def get_sum_per_token_soft_eff(self, reset: bool = True) -> torch.Tensor: + with torch.no_grad(): + val = self.sum_per_token_soft_eff.detach().clone() + if reset: + self.sum_per_token_soft_eff.zero_() + return val + + def get_coactivation_counts(self, reset: bool = True) -> torch.Tensor: + with torch.no_grad(): + val = self.coactivation_counts.detach().clone() + if reset: + 
self.coactivation_counts.zero_() + return val + + def init_weights(self, buffer_device: torch.device | None = None): + self.register_buffer( + "total_tokens_per_expert", + torch.zeros(self.num_experts, dtype=torch.int64, device=buffer_device), + persistent=False, + ) + self.register_buffer( + "total_tokens", + torch.zeros(1, dtype=torch.int64, device=buffer_device), + persistent=False, + ) + self.register_buffer( + "stability_tokens_per_expert", + torch.zeros(self.num_experts, dtype=torch.int64, device=buffer_device), + persistent=False, + ) + self.register_buffer( + "stability_total_tokens", + torch.zeros(1, dtype=torch.int64, device=buffer_device), + persistent=False, + ) + self.register_buffer( + "sum_token_entropy", + torch.zeros(1, dtype=torch.float64, device=buffer_device), + persistent=False, + ) + self.register_buffer( + "sum_per_token_soft_eff", + torch.zeros(1, dtype=torch.float64, device=buffer_device), + persistent=False, + ) + self.register_buffer( + "coactivation_counts", + torch.zeros(self.num_experts, self.num_experts, dtype=torch.int64, device=buffer_device), + persistent=False, + ) + self.register_buffer( + "_coact_pairs", + _make_coactivation_pairs(self.top_k, device=buffer_device), + persistent=False, + ) + + if hasattr(self.config, "initializer_range"): + std = self.config.initializer_range + else: + std = getattr(self.config.get_text_config(), "initializer_range", 0.02) + + nn.init.normal_(self.gate.weight, mean=0.0, std=std) + nn.init.normal_(self.experts.gate_up_proj, mean=0.0, std=std) + nn.init.normal_(self.experts.down_proj, mean=0.0, std=std) + + +def rotate_half(x): + """Rotates half the hidden dims of the input.""" + x1 = x[..., : x.shape[-1] // 2] + x2 = x[..., x.shape[-1] // 2 :] + return torch.cat((-x2, x1), dim=-1) + + +def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor: + """ + This is the equivalent of torch.repeat_interleave(x, dim=1, repeats=n_rep). 
The hidden states go from (batch, + num_key_value_heads, seqlen, head_dim) to (batch, num_attention_heads, seqlen, head_dim) + """ + batch, num_key_value_heads, slen, head_dim = hidden_states.shape + if n_rep == 1: + return hidden_states + hidden_states = hidden_states[:, :, None, :, :].expand( + batch, num_key_value_heads, n_rep, slen, head_dim + ) # [B,num_kv_heads,n_rep,N,head_dim] + return hidden_states.reshape(batch, num_key_value_heads * n_rep, slen, head_dim) # [B,num_heads,N,head_dim] + + +def eager_attention_forward( + module: nn.Module, + query: torch.Tensor, # [B,num_heads,N,head_dim] + key: torch.Tensor, # [B,num_kv_heads,N,head_dim] + value: torch.Tensor, # [B,num_kv_heads,N,head_dim] + attention_mask: Optional[torch.Tensor], + scaling: float, + dropout: float = 0.0, + **kwargs: Unpack[TransformersKwargs], +): + key_states = repeat_kv(key, module.num_key_value_groups) # [B,num_heads,N,head_dim] + value_states = repeat_kv(value, module.num_key_value_groups) # [B,num_heads,N,head_dim] + + attn_weights = torch.matmul(query, key_states.transpose(2, 3)) * scaling # [B,num_heads,N,N] + if attention_mask is not None: + causal_mask = attention_mask[:, :, :, : key_states.shape[-2]] + attn_weights = attn_weights + causal_mask # [B,num_heads,N,N] + + attn_weights = nn.functional.softmax(attn_weights, dim=-1, dtype=torch.float32).to(query.dtype) # [B,num_heads,N,N] + attn_weights = nn.functional.dropout(attn_weights, p=dropout, training=module.training) + attn_output = torch.matmul(attn_weights, value_states) # [B,num_heads,N,head_dim] + attn_output = attn_output.transpose(1, 2).contiguous() # [B,N,num_heads,head_dim] + + return attn_output, attn_weights + + +def apply_rotary_pos_emb(q, k, cos, sin, position_ids=None, unsqueeze_dim=1): + """Applies Rotary Position Embedding to the query and key tensors. + + Args: + q (`torch.Tensor`): The query tensor. + k (`torch.Tensor`): The key tensor. + cos (`torch.Tensor`): The cosine part of the rotary embedding. 
+ sin (`torch.Tensor`): The sine part of the rotary embedding. + position_ids (`torch.Tensor`, *optional*): + Deprecated and unused. + unsqueeze_dim (`int`, *optional*, defaults to 1): + The 'unsqueeze_dim' argument specifies the dimension along which to unsqueeze cos[position_ids] and + sin[position_ids] so that they can be properly broadcast to the dimensions of q and k. For example, note + that cos[position_ids] and sin[position_ids] have the shape [batch_size, seq_len, head_dim]. Then, if q and + k have the shape [batch_size, heads, seq_len, head_dim], then setting unsqueeze_dim=1 makes + cos[position_ids] and sin[position_ids] broadcastable to the shapes of q and k. Similarly, if q and k have + the shape [batch_size, seq_len, heads, head_dim], then set unsqueeze_dim=2. + Returns: + `tuple(torch.Tensor)` comprising the query and key tensors rotated using the Rotary Position Embedding. + """ + cos = cos.unsqueeze(unsqueeze_dim) # [B,1,N,head_dim] + sin = sin.unsqueeze(unsqueeze_dim) # [B,1,N,head_dim] + q_embed = (q * cos) + (rotate_half(q) * sin) # [B,num_heads,N,head_dim] + k_embed = (k * cos) + (rotate_half(k) * sin) # [B,num_kv_heads,N,head_dim] + return q_embed, k_embed + + +class Qwen3VLMoeTextAttention(nn.Module): + """Multi-headed attention from 'Attention Is All You Need' paper""" + + def __init__(self, config: Qwen3VLMoeTextConfig, layer_idx: int): + super().__init__() + self.config = config + self.layer_idx = layer_idx + self.head_dim = getattr(config, "head_dim", config.hidden_size // config.num_attention_heads) + self.num_key_value_groups = config.num_attention_heads // config.num_key_value_heads + self.scaling = self.head_dim**-0.5 + self.attention_dropout = config.attention_dropout + self.is_causal = True + + self.q_proj = nn.Linear( + config.hidden_size, config.num_attention_heads * self.head_dim, bias=config.attention_bias + ) + self.k_proj = nn.Linear( + config.hidden_size, config.num_key_value_heads * self.head_dim, 
bias=config.attention_bias + ) + self.v_proj = nn.Linear( + config.hidden_size, config.num_key_value_heads * self.head_dim, bias=config.attention_bias + ) + self.o_proj = nn.Linear( + config.num_attention_heads * self.head_dim, config.hidden_size, bias=config.attention_bias + ) + self.q_norm = Qwen3VLMoeTextRMSNorm( + self.head_dim, eps=config.rms_norm_eps + ) # unlike olmo, only on the head dim! + self.k_norm = Qwen3VLMoeTextRMSNorm( + self.head_dim, eps=config.rms_norm_eps + ) # thus post q_norm does not need reshape + + @deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58") + def forward( + self, + hidden_states: torch.Tensor, + position_embeddings: tuple[torch.Tensor, torch.Tensor], + attention_mask: Optional[torch.Tensor], + past_key_values: Optional[Cache] = None, + cache_position: Optional[torch.LongTensor] = None, + **kwargs: Unpack[FlashAttentionKwargs], + ) -> tuple[torch.Tensor, Optional[torch.Tensor]]: + input_shape = hidden_states.shape[:-1] + hidden_shape = (*input_shape, -1, self.head_dim) + + query_states = self.q_norm(self.q_proj(hidden_states).view(hidden_shape)).transpose( + 1, 2 + ) # [B,num_heads,N,head_dim] + key_states = self.k_norm(self.k_proj(hidden_states).view(hidden_shape)).transpose( + 1, 2 + ) # [B,num_kv_heads,N,head_dim] + value_states = self.v_proj(hidden_states).view(hidden_shape).transpose(1, 2) # [B,num_kv_heads,N,head_dim] + + cos, sin = position_embeddings + query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin) + # query_states: [B,num_heads,N,head_dim], key_states: [B,num_kv_heads,N,head_dim] + + if past_key_values is not None: + # sin and cos are specific to RoPE models; cache_position needed for the static cache + cache_kwargs = {"sin": sin, "cos": cos, "cache_position": cache_position} + key_states, value_states = past_key_values.update(key_states, value_states, self.layer_idx, cache_kwargs) + + attention_interface: Callable = eager_attention_forward + if 
self.config._attn_implementation != "eager": + attention_interface = ALL_ATTENTION_FUNCTIONS[self.config._attn_implementation] + + attn_output, attn_weights = attention_interface( + self, + query_states, + key_states, + value_states, + attention_mask, + dropout=0.0 if not self.training else self.attention_dropout, + scaling=self.scaling, + **kwargs, + ) + # attn_output: [B,N,num_heads,head_dim] + + attn_output = attn_output.reshape(*input_shape, -1).contiguous() # [B,N,hidden_size] + attn_output = self.o_proj(attn_output) # [B,N,hidden_size] + return attn_output, attn_weights + + +class Qwen3VLMoeTextMLP(nn.Module): + def __init__(self, config): + super().__init__() + self.config = config + self.hidden_size = config.hidden_size + self.intermediate_size = config.intermediate_size + self.gate_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False) + self.up_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=False) + self.down_proj = nn.Linear(self.intermediate_size, self.hidden_size, bias=False) + self.act_fn = ACT2FN[config.hidden_act] + + def forward(self, x): + down_proj = self.down_proj(self.act_fn(self.gate_proj(x)) * self.up_proj(x)) + return down_proj + + +class Qwen3VLMoeTextDecoderLayer(GradientCheckpointingLayer): + def __init__(self, config: Qwen3VLMoeTextConfig, layer_idx: int): + super().__init__() + self.hidden_size = config.hidden_size + + self.self_attn = Qwen3VLMoeTextAttention(config, layer_idx) + + if (layer_idx not in config.mlp_only_layers) and ( + config.num_experts > 0 and (layer_idx + 1) % config.decoder_sparse_step == 0 + ): + self.mlp = Qwen3VLMoeTextSparseMoeBlock(config) + else: + self.mlp = Qwen3VLMoeTextMLP(config) + + self.input_layernorm = Qwen3VLMoeTextRMSNorm(config.hidden_size, eps=config.rms_norm_eps) + self.post_attention_layernorm = Qwen3VLMoeTextRMSNorm(config.hidden_size, eps=config.rms_norm_eps) + + @deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58") + def forward( + self, 
+ hidden_states: torch.Tensor, + position_embeddings: tuple[torch.Tensor, torch.Tensor], + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + cache_position: Optional[torch.LongTensor] = None, + **kwargs: Unpack[FlashAttentionKwargs], + ) -> torch.FloatTensor: + """ + Args: + hidden_states (`torch.FloatTensor`): input to the layer of shape `(batch * seq_len, embed_dim)` + attention_mask (`torch.FloatTensor`, *optional*): attention mask of size + `(batch, sequence_length)` where padding elements are indicated by 0. + output_attentions (`bool`, *optional*): + Whether or not to return the attentions tensors of all attention layers. See `attentions` under + returned tensors for more detail. + output_router_logits (`bool`, *optional*): + Whether or not to return the logits of all the routers. They are useful for computing the router loss, + and should not be returned during inference. + use_cache (`bool`, *optional*): + If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding + (see `past_key_values`). + past_key_values (`Cache`, *optional*): cached past key and value projection states + cache_position (`torch.LongTensor` of shape `(sequence_length)`, *optional*): + Indices depicting the position of the input sequence tokens in the sequence. + position_embeddings (`tuple[torch.FloatTensor, torch.FloatTensor]`, *optional*): + Tuple containing the cosine and sine positional embeddings of shape `(batch_size, seq_len, head_dim)`, + with `head_dim` being the embedding dimension of each attention head. 
+ kwargs (`dict`, *optional*): + Arbitrary kwargs to be ignored, used for FSDP and other methods that inject code + into the model + + Returns: + torch.Tensor: (batch_size * seq_len, hidden_size) + """ + residual = hidden_states + + hidden_states = self.input_layernorm(hidden_states) + + # Self Attention + hidden_states, _ = self.self_attn( + hidden_states=hidden_states, + position_embeddings=position_embeddings, + attention_mask=attention_mask, + position_ids=position_ids, + past_key_values=past_key_values, + cache_position=cache_position, + **kwargs, + ) + hidden_states = residual + hidden_states + + # Fully Connected + residual = hidden_states + hidden_states = self.post_attention_layernorm(hidden_states) + hidden_states = self.mlp(hidden_states) + hidden_states = residual + hidden_states + + return hidden_states + + +class Qwen3VLMoePreTrainedModel(PreTrainedModel): + config: Qwen3VLMoeConfig + base_model_prefix = "model" + supports_gradient_checkpointing = True + _no_split_modules = ["Qwen3VLMoeTextDecoderLayer", "Qwen3VLMoeVisionBlock"] + _skip_keys_device_placement = ["past_key_values"] + _supports_flash_attn = True + _supports_sdpa = True + _supports_flex_attn = True + _can_compile_fullgraph = False # MoE models don't work with torch.compile (`torch.where(condition)` not supported) + _supports_attention_backend = True + _can_record_outputs = { + "hidden_states": Qwen3VLMoeTextDecoderLayer, + "attentions": Qwen3VLMoeTextAttention, + } + + def _init_weights(self, module: nn.Module, buffer_device: torch.device | None): + """Initialize the weights.""" + super()._init_weights(module) + + if isinstance(module, Qwen3VLMoeTextSparseMoeBlock): + module.init_weights(buffer_device=buffer_device) + elif isinstance(module, Qwen3VLMoeTextRotaryEmbedding): + module.init_weights(buffer_device=buffer_device) + + def init_weights(self, buffer_device: torch.device | None = None) -> None: + self.apply(functools.partial(self._init_weights, buffer_device=buffer_device)) + + 
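Review note: `_init_weights` above dispatches to `Qwen3VLMoeTextSparseMoeBlock.init_weights`, which re-registers the statistics buffers, including the two separate sums `sum_token_entropy` and `sum_per_token_soft_eff`. The sketch below is a pure-Python illustration (no torch) of why the two sums are not interchangeable. The helper names `token_entropy` and `summarize` and the sample router distributions are hypothetical and not part of this file; only `ENTROPY_EPSILON` mirrors the module constant.

```python
import math

ENTROPY_EPSILON = 1e-9  # same value as the module-level constant in this file


def token_entropy(probs):
    """Entropy of one token's router distribution: H = -sum(p * log(p + eps))."""
    return -sum(p * math.log(p + ENTROPY_EPSILON) for p in probs)


def summarize(router_probs):
    """Return (exp of the mean entropy, mean of the per-token exp(entropy)).

    The two running sums play the role of the sum_token_entropy and
    sum_per_token_soft_eff buffers; dividing by the token count is what a
    callback would do when it reads them.
    """
    entropies = [token_entropy(p) for p in router_probs]
    sum_token_entropy = sum(entropies)
    sum_per_token_soft_eff = sum(math.exp(h) for h in entropies)
    n = len(router_probs)
    return math.exp(sum_token_entropy / n), sum_per_token_soft_eff / n


# Hypothetical data: one sharply routed token and one uniformly routed token
# over 4 experts, so the per-token entropies are heterogeneous.
router_probs = [
    [0.97, 0.01, 0.01, 0.01],
    [0.25, 0.25, 0.25, 0.25],
]
exp_of_mean, mean_of_exp = summarize(router_probs)
print(f"exp(mean H) = {exp_of_mean:.3f}, mean exp(H) = {mean_of_exp:.3f}")
```

With heterogeneous per-token entropies the second value is strictly larger than the first, which is the Jensen gap described in the buffer comments; both stay within [1, N] for N experts.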
+class Qwen3VLMoeVisionMLP(nn.Module): + def __init__(self, config): + super().__init__() + self.hidden_size = config.hidden_size + self.intermediate_size = config.intermediate_size + self.linear_fc1 = nn.Linear(self.hidden_size, self.intermediate_size, bias=True) + self.linear_fc2 = nn.Linear(self.intermediate_size, self.hidden_size, bias=True) + self.act_fn = ACT2FN[config.hidden_act] + + def forward(self, hidden_state): + return self.linear_fc2(self.act_fn(self.linear_fc1(hidden_state))) + + +class Qwen3VLMoeVisionPatchEmbed(nn.Module): + def __init__(self, config) -> None: + super().__init__() + self.patch_size = config.patch_size + self.temporal_patch_size = config.temporal_patch_size + self.in_channels = config.in_channels + self.embed_dim = config.hidden_size + + kernel_size = [self.temporal_patch_size, self.patch_size, self.patch_size] + self.proj = nn.Conv3d(self.in_channels, self.embed_dim, kernel_size=kernel_size, stride=kernel_size, bias=True) + + def forward( + self, hidden_states: torch.Tensor + ) -> torch.Tensor: # hidden_states: [N_patches,in_channels*temporal_patch_size*patch_size*patch_size] + target_dtype = self.proj.weight.dtype + hidden_states = hidden_states.view( + -1, self.in_channels, self.temporal_patch_size, self.patch_size, self.patch_size + ) # [N_patches,in_channels,temporal_patch_size,patch_size,patch_size] + hidden_states = self.proj(hidden_states.to(dtype=target_dtype)).view( + -1, self.embed_dim + ) # [N_patches,embed_dim] + return hidden_states + + +class Qwen3VLMoeVisionRotaryEmbedding(nn.Module): + def __init__(self, dim: int, theta: float = 10000.0) -> None: + super().__init__() + inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=torch.float) / dim)) + self.register_buffer("inv_freq", inv_freq, persistent=False) + + def forward(self, seqlen: int) -> torch.Tensor: + seq = torch.arange(seqlen, device=self.inv_freq.device, dtype=self.inv_freq.dtype) # [seqlen] + freqs = torch.outer(seq, self.inv_freq) # [seqlen,dim//2] + 
return freqs # [seqlen,dim//2] + + +class Qwen3VLMoeVisionPatchMerger(nn.Module): + def __init__(self, config: Qwen3VLMoeVisionConfig, use_postshuffle_norm=False) -> None: + super().__init__() + self.hidden_size = config.hidden_size * (config.spatial_merge_size**2) + self.use_postshuffle_norm = use_postshuffle_norm + self.norm = nn.LayerNorm(self.hidden_size if use_postshuffle_norm else config.hidden_size, eps=1e-6) + self.linear_fc1 = nn.Linear(self.hidden_size, self.hidden_size) + self.act_fn = nn.GELU() + self.linear_fc2 = nn.Linear(self.hidden_size, config.out_hidden_size) + + def forward(self, x: torch.Tensor) -> torch.Tensor: # x: [N_patches,hidden_size] (before merge) + x = self.norm(x.view(-1, self.hidden_size) if self.use_postshuffle_norm else x).view( + -1, self.hidden_size + ) # [N_merged,merged_hidden_size] + x = self.linear_fc2(self.act_fn(self.linear_fc1(x))) # [N_merged,out_hidden_size] + return x + + +def apply_rotary_pos_emb_vision( + q: torch.Tensor, # [N,num_heads,head_dim] + k: torch.Tensor, # [N,num_heads,head_dim] + cos: torch.Tensor, # [N,head_dim] + sin: torch.Tensor, # [N,head_dim] +) -> tuple[torch.Tensor, torch.Tensor]: + orig_q_dtype = q.dtype + orig_k_dtype = k.dtype + q, k = q.float(), k.float() + cos, sin = cos.unsqueeze(-2).float(), sin.unsqueeze(-2).float() # [N,1,head_dim] + q_embed = (q * cos) + (rotate_half(q) * sin) # [N,num_heads,head_dim] + k_embed = (k * cos) + (rotate_half(k) * sin) # [N,num_heads,head_dim] + q_embed = q_embed.to(orig_q_dtype) + k_embed = k_embed.to(orig_k_dtype) + return q_embed, k_embed + + +class Qwen3VLMoeVisionAttention(nn.Module): + def __init__(self, config: Qwen3VLMoeVisionConfig) -> None: + super().__init__() + self.dim = config.hidden_size + self.num_heads = config.num_heads + self.head_dim = self.dim // self.num_heads + self.num_key_value_groups = 1 # needed for eager attention + self.qkv = nn.Linear(self.dim, self.dim * 3, bias=True) + self.proj = nn.Linear(self.dim, self.dim) + self.scaling = 
self.head_dim**-0.5 + self.config = config + self.attention_dropout = 0.0 + self.is_causal = False + + def forward( + self, + hidden_states: torch.Tensor, + cu_seqlens: torch.Tensor, + rotary_pos_emb: Optional[torch.Tensor] = None, + position_embeddings: Optional[tuple[torch.Tensor, torch.Tensor]] = None, + **kwargs, + ) -> torch.Tensor: + seq_length = hidden_states.shape[0] + query_states, key_states, value_states = ( + self.qkv(hidden_states).reshape(seq_length, 3, self.num_heads, -1).permute(1, 0, 2, 3).unbind(0) + ) + # query_states, key_states, value_states: [N,num_heads,head_dim] + cos, sin = position_embeddings + query_states, key_states = apply_rotary_pos_emb_vision(query_states, key_states, cos, sin) + # query_states, key_states: [N,num_heads,head_dim] + + query_states = query_states.transpose(0, 1).unsqueeze(0) # [1,num_heads,N,head_dim] + key_states = key_states.transpose(0, 1).unsqueeze(0) # [1,num_heads,N,head_dim] + value_states = value_states.transpose(0, 1).unsqueeze(0) # [1,num_heads,N,head_dim] + + attention_interface: Callable = eager_attention_forward + if self.config._attn_implementation != "eager": + attention_interface = ALL_ATTENTION_FUNCTIONS[self.config._attn_implementation] + + if self.config._attn_implementation == "flash_attention_2": + # Flash Attention 2: Use cu_seqlens for variable length attention + max_seqlen = (cu_seqlens[1:] - cu_seqlens[:-1]).max() + attn_output, _ = attention_interface( + self, + query_states, + key_states, + value_states, + attention_mask=None, + scaling=self.scaling, + dropout=0.0 if not self.training else self.attention_dropout, + cu_seq_lens_q=cu_seqlens, + cu_seq_lens_k=cu_seqlens, + max_length_q=max_seqlen, + max_length_k=max_seqlen, + is_causal=False, + **kwargs, + ) + else: + # Other implementations: Process each chunk separately + lengths = cu_seqlens[1:] - cu_seqlens[:-1] + splits = [ + torch.split(tensor, lengths.tolist(), dim=2) for tensor in (query_states, key_states, value_states) + ] + + 
attn_outputs = [ + attention_interface( + self, + q, + k, + v, + attention_mask=None, + scaling=self.scaling, + dropout=0.0 if not self.training else self.attention_dropout, + is_causal=False, + **kwargs, + )[0] + for q, k, v in zip(*splits) + ] + attn_output = torch.cat(attn_outputs, dim=1) # [1,N,num_heads,head_dim] + + attn_output = attn_output.reshape(seq_length, -1).contiguous() # [N,hidden_size] + attn_output = self.proj(attn_output) # [N,hidden_size] + return attn_output + + +class Qwen3VLMoeVisionBlock(GradientCheckpointingLayer): + def __init__(self, config, attn_implementation: str = "sdpa") -> None: + super().__init__() + self.norm1 = nn.LayerNorm(config.hidden_size, eps=1e-6) + self.norm2 = nn.LayerNorm(config.hidden_size, eps=1e-6) + self.attn = Qwen3VLMoeVisionAttention(config=config) + self.mlp = Qwen3VLMoeVisionMLP(config=config) + + def forward( + self, + hidden_states: torch.Tensor, + cu_seqlens: torch.Tensor, + rotary_pos_emb: Optional[torch.Tensor] = None, + position_embeddings: Optional[tuple[torch.Tensor, torch.Tensor]] = None, + **kwargs, + ) -> torch.Tensor: + hidden_states = hidden_states + self.attn( + self.norm1(hidden_states), + cu_seqlens=cu_seqlens, + rotary_pos_emb=rotary_pos_emb, + position_embeddings=position_embeddings, + **kwargs, + ) + hidden_states = hidden_states + self.mlp(self.norm2(hidden_states)) + return hidden_states + + +class Qwen3VLMoeVisionModel(Qwen3VLMoePreTrainedModel): + config: Qwen3VLMoeVisionConfig + _no_split_modules = ["Qwen3VLMoeVisionBlock"] + + def __init__(self, config, *inputs, **kwargs) -> None: + super().__init__(config, *inputs, **kwargs) + self.spatial_merge_size = config.spatial_merge_size + self.patch_size = config.patch_size + self.spatial_merge_unit = self.spatial_merge_size * self.spatial_merge_size + + self.patch_embed = Qwen3VLMoeVisionPatchEmbed( + config=config, + ) + + self.pos_embed = nn.Embedding(config.num_position_embeddings, config.hidden_size) + self.num_grid_per_side = 
int(config.num_position_embeddings**0.5) + + head_dim = config.hidden_size // config.num_heads + self.rotary_pos_emb = Qwen3VLMoeVisionRotaryEmbedding(head_dim // 2) + + self.blocks = nn.ModuleList([Qwen3VLMoeVisionBlock(config) for _ in range(config.depth)]) + self.merger = Qwen3VLMoeVisionPatchMerger( + config=config, + use_postshuffle_norm=False, + ) + + self.deepstack_visual_indexes = config.deepstack_visual_indexes + self.deepstack_merger_list = nn.ModuleList( + [ + Qwen3VLMoeVisionPatchMerger( + config=config, + use_postshuffle_norm=True, + ) + for _ in range(len(config.deepstack_visual_indexes)) + ] + ) + + self.gradient_checkpointing = False + + def rot_pos_emb(self, grid_thw: torch.Tensor) -> torch.Tensor: + merge_size = self.spatial_merge_size + + max_hw = int(grid_thw[:, 1:].max().item()) + freq_table = self.rotary_pos_emb(max_hw) # [max_hw,head_dim//4] + device = freq_table.device + + total_tokens = int(torch.prod(grid_thw, dim=1).sum().item()) + pos_ids = torch.empty((total_tokens, 2), dtype=torch.long, device=device) # [total_tokens,2] + + offset = 0 + for num_frames, height, width in grid_thw: + merged_h, merged_w = height // merge_size, width // merge_size + + block_rows = torch.arange(merged_h, device=device) # block row indices + block_cols = torch.arange(merged_w, device=device) # block col indices + intra_row = torch.arange(merge_size, device=device) # intra-block row offsets + intra_col = torch.arange(merge_size, device=device) # intra-block col offsets + + # Compute full-resolution positions + row_idx = block_rows[:, None, None, None] * merge_size + intra_row[None, None, :, None] + col_idx = block_cols[None, :, None, None] * merge_size + intra_col[None, None, None, :] + + row_idx = row_idx.expand(merged_h, merged_w, merge_size, merge_size).reshape(-1) # [H*W] + col_idx = col_idx.expand(merged_h, merged_w, merge_size, merge_size).reshape(-1) # [H*W] + + coords = torch.stack((row_idx, col_idx), dim=-1) # [H*W,2] + + if num_frames > 1: + coords = 
coords.repeat(num_frames, 1) # [T*H*W,2] + + num_tokens = coords.shape[0] + pos_ids[offset : offset + num_tokens] = coords + offset += num_tokens + + embeddings = freq_table[pos_ids] # [total_tokens,2,head_dim//4] + embeddings = embeddings.flatten(1) # [total_tokens,head_dim//2] + return embeddings + + def fast_pos_embed_interpolate(self, grid_thw): + grid_ts, grid_hs, grid_ws = grid_thw[:, 0], grid_thw[:, 1], grid_thw[:, 2] + + idx_list = [[] for _ in range(4)] + weight_list = [[] for _ in range(4)] + + for t, h, w in zip(grid_ts, grid_hs, grid_ws): + h_idxs = torch.linspace(0, self.num_grid_per_side - 1, h) + w_idxs = torch.linspace(0, self.num_grid_per_side - 1, w) + + h_idxs_floor = h_idxs.int() + w_idxs_floor = w_idxs.int() + h_idxs_ceil = (h_idxs.int() + 1).clip(max=self.num_grid_per_side - 1) + w_idxs_ceil = (w_idxs.int() + 1).clip(max=self.num_grid_per_side - 1) + + dh = h_idxs - h_idxs_floor + dw = w_idxs - w_idxs_floor + + base_h = h_idxs_floor * self.num_grid_per_side + base_h_ceil = h_idxs_ceil * self.num_grid_per_side + + indices = [ + (base_h[None].T + w_idxs_floor[None]).flatten(), + (base_h[None].T + w_idxs_ceil[None]).flatten(), + (base_h_ceil[None].T + w_idxs_floor[None]).flatten(), + (base_h_ceil[None].T + w_idxs_ceil[None]).flatten(), + ] + + weights = [ + ((1 - dh)[None].T * (1 - dw)[None]).flatten(), + ((1 - dh)[None].T * dw[None]).flatten(), + (dh[None].T * (1 - dw)[None]).flatten(), + (dh[None].T * dw[None]).flatten(), + ] + + for i in range(4): + idx_list[i].extend(indices[i].tolist()) + weight_list[i].extend(weights[i].tolist()) + + idx_tensor = torch.tensor(idx_list, dtype=torch.long, device=self.pos_embed.weight.device) # [4,total_patches] + weight_tensor = torch.tensor( + weight_list, dtype=self.pos_embed.weight.dtype, device=self.pos_embed.weight.device + ) # [4,total_patches] + pos_embeds = self.pos_embed(idx_tensor) * weight_tensor[:, :, None] # [4,total_patches,hidden_size] + patch_pos_embeds = pos_embeds[0] + pos_embeds[1] + 
pos_embeds[2] + pos_embeds[3] # [total_patches,hidden_size] + + patch_pos_embeds = patch_pos_embeds.split([h * w for h, w in zip(grid_hs, grid_ws)]) + + patch_pos_embeds_permute = [] + merge_size = self.config.spatial_merge_size + for pos_embed, t, h, w in zip(patch_pos_embeds, grid_ts, grid_hs, grid_ws): + pos_embed = pos_embed.repeat(t, 1) # [T*H*W,hidden_size] + pos_embed = ( + pos_embed.view(t, h // merge_size, merge_size, w // merge_size, merge_size, -1) + # [T,H//merge,merge,W//merge,merge,hidden_size] + .permute(0, 1, 3, 2, 4, 5) + # [T,H//merge,W//merge,merge,merge,hidden_size] + .flatten(0, 4) + # [T*H//merge*W//merge*merge*merge,hidden_size] = [T*H*W,hidden_size] + ) + patch_pos_embeds_permute.append(pos_embed) + patch_pos_embeds = torch.cat(patch_pos_embeds_permute) # [total_patches,hidden_size] + return patch_pos_embeds + + def forward(self, hidden_states: torch.Tensor, grid_thw: torch.Tensor, **kwargs) -> torch.Tensor: + """ + Args: + hidden_states (`torch.Tensor` of shape `(seq_len, hidden_size)`): + The final hidden states of the model. + grid_thw (`torch.Tensor` of shape `(num_images_or_videos, 3)`): + The temporal, height and width of feature shape of each image in LLM. + + Returns: + `torch.Tensor`: hidden_states. 
+ """ + hidden_states = self.patch_embed(hidden_states) # [total_patches,embed_dim] + + pos_embeds = self.fast_pos_embed_interpolate(grid_thw) # [total_patches,hidden_size] + hidden_states = hidden_states + pos_embeds # [total_patches,hidden_size] + + rotary_pos_emb = self.rot_pos_emb(grid_thw) # [total_patches,head_dim//2] + + seq_len, _ = hidden_states.size() + hidden_states = hidden_states.reshape(seq_len, -1) # [total_patches,hidden_size] + rotary_pos_emb = rotary_pos_emb.reshape(seq_len, -1) # [total_patches,head_dim//2] + emb = torch.cat((rotary_pos_emb, rotary_pos_emb), dim=-1) # [total_patches,head_dim] + position_embeddings = (emb.cos(), emb.sin()) # 2x [total_patches,head_dim] + + cu_seqlens = torch.repeat_interleave(grid_thw[:, 1] * grid_thw[:, 2], grid_thw[:, 0]).cumsum( + dim=0, + # Select dtype based on the following factors: + # - FA2 requires that cu_seqlens_q must have dtype int32 + # - torch.onnx.export requires that cu_seqlens_q must have same dtype as grid_thw + # See https://github.com/huggingface/transformers/pull/34852 for more information + dtype=grid_thw.dtype if torch.jit.is_tracing() else torch.int32, + ) + cu_seqlens = F.pad(cu_seqlens, (1, 0), value=0) + + deepstack_feature_lists = [] + for layer_num, blk in enumerate(self.blocks): + hidden_states = blk( + hidden_states, + cu_seqlens=cu_seqlens, + position_embeddings=position_embeddings, + **kwargs, + ) + if layer_num in self.deepstack_visual_indexes: + deepstack_feature = self.deepstack_merger_list[self.deepstack_visual_indexes.index(layer_num)]( + hidden_states + ) + deepstack_feature_lists.append(deepstack_feature) + + hidden_states = self.merger(hidden_states) + + return hidden_states, deepstack_feature_lists + + +class Qwen3VLMoeTextRotaryEmbedding(nn.Module): + def __init__(self, config: Qwen3VLMoeTextConfig): + super().__init__() + if hasattr(config, "rope_scaling") and config.rope_scaling is not None: + self.rope_type = config.rope_scaling.get("rope_type", "default") + else: + 
self.rope_type = "default" + self.max_seq_len_cached = config.max_position_embeddings + self.original_max_seq_len = config.max_position_embeddings + self.mrope_section = config.rope_scaling.get("mrope_section", [24, 20, 20]) + + self.config = config + self.rope_init_fn = ROPE_INIT_FUNCTIONS[self.rope_type] + + def init_weights(self, buffer_device: torch.device | None = None) -> None: + inv_freq, self.attention_scaling = self.rope_init_fn(self.config, buffer_device) + self.register_buffer("inv_freq", inv_freq, persistent=False) + + def apply_interleaved_mrope(self, freqs, mrope_section): + """Apply interleaved MRoPE to 3D rotary embeddings. + Reorganizes frequency layout from chunked [TTT...HHH...WWW] to + interleaved [THTHWHTHW...TT], preserving frequency continuity. + args: + x: (3, bs, seq_len, head_dim // 2) + mrope_section: (3,) + returns: + x_t: (bs, seq_len, head_dim // 2) + """ + freqs_t = freqs[0] # just overwrite the first dimension T + for dim, offset in enumerate((1, 2), start=1): # H, W + length = mrope_section[dim] * 3 + idx = slice(offset, length, 3) + freqs_t[..., idx] = freqs[dim, ..., idx] + return freqs_t + + @torch.no_grad() + @dynamic_rope_update # power user: used with advanced RoPE types (e.g. dynamic rope) + def forward(self, x, position_ids): + assert self.inv_freq.dtype == torch.float32, f"inv_freq must be float32, but got {self.inv_freq.dtype}" + + # In contrast to other models, Qwen3VLMoe has different position ids for the grids + # So we expand the inv_freq to shape (3, ...) 
+ if position_ids.ndim == 2: + position_ids = position_ids[None, ...].expand(3, position_ids.shape[0], -1) # [3,B,N] + inv_freq_expanded = ( + self.inv_freq[None, None, :, None].float().expand(3, position_ids.shape[1], -1, 1) + ) # [3,B,head_dim//2,1] + position_ids_expanded = position_ids[:, :, None, :].float() # [3,B,1,N] + + freqs = (inv_freq_expanded.float() @ position_ids_expanded.float()).transpose(2, 3) # [3,B,N,head_dim//2] + freqs = self.apply_interleaved_mrope(freqs, self.mrope_section) # [B,N,head_dim//2] + emb = torch.cat((freqs, freqs), dim=-1) # [B,N,head_dim] + cos = emb.cos() * self.attention_scaling # [B,N,head_dim] + sin = emb.sin() * self.attention_scaling # [B,N,head_dim] + + return cos.to(dtype=x.dtype), sin.to(dtype=x.dtype) + + +class Qwen3VLMoeTextModel(Qwen3VLMoePreTrainedModel): + config: Qwen3VLMoeTextConfig + _no_split_modules = ["Qwen3VLMoeTextDecoderLayer"] + + def __init__(self, config: Qwen3VLMoeTextConfig): + super().__init__(config) + self.padding_idx = config.pad_token_id + self.vocab_size = config.vocab_size + + self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size, self.padding_idx) + self.layers = nn.ModuleList( + [Qwen3VLMoeTextDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)] + ) + self.norm = Qwen3VLMoeTextRMSNorm(config.hidden_size, eps=config.rms_norm_eps) + self.rotary_emb = Qwen3VLMoeTextRotaryEmbedding(config=config) + self.gradient_checkpointing = False + + # Initialize weights and apply final processing + self.post_init() + + def forward( + self, + input_ids: Optional[torch.LongTensor] = None, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + inputs_embeds: Optional[torch.FloatTensor] = None, + use_cache: Optional[bool] = None, + cache_position: Optional[torch.LongTensor] = None, + # args for deepstack + visual_pos_masks: Optional[torch.Tensor] = None, + 
deepstack_visual_embeds: Optional[list[torch.Tensor]] = None, + **kwargs: Unpack[FlashAttentionKwargs], + ) -> Union[tuple, BaseModelOutputWithPast]: + r""" + visual_pos_masks (`torch.Tensor` of shape `(batch_size, seqlen)`, *optional*): + The mask of the visual positions. + deepstack_visual_embeds (`list[torch.Tensor]`, *optional*): + The deepstack visual embeddings. The shape is (num_layers, visual_seqlen, embed_dim). + The feature is extracted from the different visual encoder layers, and fed to the decoder + hidden states. It's from the paper DeepStack(https://arxiv.org/abs/2406.04334). + """ + if (input_ids is None) ^ (inputs_embeds is not None): + raise ValueError("You must specify exactly one of input_ids or inputs_embeds") + + # torch.jit.trace() doesn't support cache objects in the output + if use_cache and past_key_values is None and not torch.jit.is_tracing(): + past_key_values = DynamicCache(config=self.config) + + if inputs_embeds is None: + inputs_embeds = self.embed_tokens(input_ids) + + if cache_position is None: + past_seen_tokens = past_key_values.get_seq_length() if past_key_values is not None else 0 + cache_position = torch.arange( + past_seen_tokens, past_seen_tokens + inputs_embeds.shape[1], device=inputs_embeds.device + ) + + # the hard coded `3` is for temporal, height and width. 
+ if position_ids is None: + position_ids = cache_position.view(1, 1, -1).expand(3, inputs_embeds.shape[0], -1) + elif position_ids.ndim == 2: + position_ids = position_ids[None, ...].expand(3, position_ids.shape[0], -1) + + if position_ids.ndim == 3 and position_ids.shape[0] == 4: + text_position_ids = position_ids[0] + position_ids = position_ids[1:] + else: + text_position_ids = position_ids[0] + + attention_mask = create_causal_mask( + config=self.config, + input_embeds=inputs_embeds, + attention_mask=attention_mask, + cache_position=cache_position, + past_key_values=past_key_values, + position_ids=text_position_ids, + ) + + hidden_states = inputs_embeds + + # create position embeddings to be shared across the decoder layers + position_embeddings = self.rotary_emb(hidden_states, position_ids) + + # decoder layers + for layer_idx, decoder_layer in enumerate(self.layers): + layer_outputs = decoder_layer( + hidden_states, + attention_mask=attention_mask, + position_ids=text_position_ids, + past_key_values=past_key_values, + cache_position=cache_position, + position_embeddings=position_embeddings, + **kwargs, + ) + hidden_states = layer_outputs + + # add visual features to the hidden states of first several layers + if deepstack_visual_embeds is not None and layer_idx in range(len(deepstack_visual_embeds)): + hidden_states = self._deepstack_process( + hidden_states, + visual_pos_masks, + deepstack_visual_embeds[layer_idx], + ) + + hidden_states = self.norm(hidden_states) + + return BaseModelOutputWithPast( + last_hidden_state=hidden_states, + past_key_values=past_key_values, + ) + + def _deepstack_process( + self, hidden_states: torch.Tensor, visual_pos_masks: torch.Tensor, visual_embeds: torch.Tensor + ): + visual_pos_masks = visual_pos_masks.to(hidden_states.device) + visual_embeds = visual_embeds.to(hidden_states.device, hidden_states.dtype) + local_this = hidden_states[visual_pos_masks, :].clone() + visual_embeds + hidden_states[visual_pos_masks, :] = 
local_this + return hidden_states + + +@dataclass +class Qwen3VLMoeCausalLMOutputWithPast(ModelOutput): + r""" + loss (`torch.FloatTensor` of shape `(1,)`, *optional*, returned when `labels` is provided): + Language modeling loss (for next-token prediction). + logits (`torch.FloatTensor` of shape `(batch_size, sequence_length, config.vocab_size)`): + Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax). + past_key_values (`Cache`, *optional*, returned when `use_cache=True` is passed or when `config.use_cache=True`): + It is a [`~cache_utils.Cache`] instance. For more details, see our [kv cache guide](https://huggingface.co/docs/transformers/en/kv_cache). + + Contains pre-computed hidden-states (key and values in the self-attention blocks) that can be used (see + `past_key_values` input) to speed up sequential decoding. + rope_deltas (`torch.LongTensor` of shape `(batch_size, )`, *optional*): + The rope index difference between sequence length and multimodal rope. + """ + + loss: Optional[torch.FloatTensor] = None + logits: Optional[torch.FloatTensor] = None + past_key_values: Optional[Cache] = None + hidden_states: Optional[tuple[torch.FloatTensor]] = None + attentions: Optional[tuple[torch.FloatTensor]] = None + rope_deltas: Optional[torch.LongTensor] = None + aux_loss: Optional[torch.FloatTensor] = None + + +@dataclass +class Qwen3VLMoeModelOutputWithPast(ModelOutput): + r""" + past_key_values (`Cache`, *optional*, returned when `use_cache=True` is passed or when `config.use_cache=True`): + It is a [`~cache_utils.Cache`] instance. For more details, see our [kv cache guide](https://huggingface.co/docs/transformers/en/kv_cache). + + Contains pre-computed hidden-states (key and values in the self-attention blocks) that can be used (see + `past_key_values` input) to speed up sequential decoding. 
+ rope_deltas (`torch.LongTensor` of shape `(batch_size, )`, *optional*): + The rope index difference between sequence length and multimodal rope. + """ + + last_hidden_state: Optional[torch.FloatTensor] = None + past_key_values: Optional[Cache] = None + hidden_states: Optional[tuple[torch.FloatTensor]] = None + attentions: Optional[tuple[torch.FloatTensor]] = None + rope_deltas: Optional[torch.LongTensor] = None + + +class Qwen3VLMoeModel(Qwen3VLMoePreTrainedModel): + base_model_prefix = "" + _checkpoint_conversion_mapping = {} + # Reference: fix gemma3 grad acc #37208 + accepts_loss_kwargs = False + config: Qwen3VLMoeConfig + _no_split_modules = ["Qwen3VLMoeTextDecoderLayer", "Qwen3VLMoeVisionBlock"] + + def __init__(self, config): + super().__init__(config) + self.visual = Qwen3VLMoeVisionModel._from_config(config.vision_config) + self.language_model = Qwen3VLMoeTextModel._from_config(config.text_config) + self.rope_deltas = None # cache rope_deltas here + + # Initialize weights and apply final processing + self.post_init() + + def get_input_embeddings(self): + return self.language_model.get_input_embeddings() + + def set_input_embeddings(self, value): + self.language_model.set_input_embeddings(value) + + def set_decoder(self, decoder): + self.language_model = decoder + + def get_decoder(self): + return self.language_model + + def get_rope_index( + self, + input_ids: Optional[torch.LongTensor] = None, + image_grid_thw: Optional[torch.LongTensor] = None, + video_grid_thw: Optional[torch.LongTensor] = None, + attention_mask: Optional[torch.Tensor] = None, + ) -> tuple[torch.Tensor, torch.Tensor]: + """Different from the original implementation, Qwen3VLMoe uses timestamps rather than absolute time position ids.""" + + # Since we use timestamps to separate videos, the video_grid_thw should also be split + if video_grid_thw is not None: + video_grid_thw = torch.repeat_interleave(video_grid_thw, video_grid_thw[:, 0], dim=0) + video_grid_thw[:, 0] = 1 + +
spatial_merge_size = self.config.vision_config.spatial_merge_size + image_token_id = self.config.image_token_id + video_token_id = self.config.video_token_id + vision_start_token_id = self.config.vision_start_token_id + mrope_position_deltas = [] + if input_ids is not None and (image_grid_thw is not None or video_grid_thw is not None): + total_input_ids = input_ids + if attention_mask is None: + attention_mask = torch.ones_like(total_input_ids) + position_ids = torch.ones( + 3, + input_ids.shape[0], + input_ids.shape[1], + dtype=input_ids.dtype, + device=input_ids.device, + ) # [3,B,N] + image_index, video_index = 0, 0 + attention_mask = attention_mask.to(total_input_ids.device) + for i, input_ids in enumerate(total_input_ids): + input_ids = input_ids[attention_mask[i] == 1] + image_nums, video_nums = 0, 0 + vision_start_indices = torch.argwhere(input_ids == vision_start_token_id).squeeze(1) + vision_tokens = input_ids[vision_start_indices + 1] + image_nums = (vision_tokens == image_token_id).sum() + video_nums = (vision_tokens == video_token_id).sum() + input_tokens = input_ids.tolist() + llm_pos_ids_list: list = [] + st = 0 + remain_images, remain_videos = image_nums, video_nums + for _ in range(image_nums + video_nums): + if image_token_id in input_tokens and remain_images > 0: + ed_image = input_tokens.index(image_token_id, st) + else: + ed_image = len(input_tokens) + 1 + if video_token_id in input_tokens and remain_videos > 0: + ed_video = input_tokens.index(video_token_id, st) + else: + ed_video = len(input_tokens) + 1 + if ed_image < ed_video: + t, h, w = ( + image_grid_thw[image_index][0], + image_grid_thw[image_index][1], + image_grid_thw[image_index][2], + ) + image_index += 1 + remain_images -= 1 + ed = ed_image + + else: + t, h, w = ( + video_grid_thw[video_index][0], + video_grid_thw[video_index][1], + video_grid_thw[video_index][2], + ) + video_index += 1 + remain_videos -= 1 + ed = ed_video + llm_grid_t, llm_grid_h, llm_grid_w = ( + t.item(), + 
h.item() // spatial_merge_size, + w.item() // spatial_merge_size, + ) + text_len = ed - st + + st_idx = llm_pos_ids_list[-1].max() + 1 if len(llm_pos_ids_list) > 0 else 0 + llm_pos_ids_list.append(torch.arange(text_len).view(1, -1).expand(3, -1) + st_idx) # [3,text_len] + + # t_index is always 0 because llm_grid_t is always 1 (we use timestamps to encode the temporal information for videos) + t_index = ( + torch.arange(llm_grid_t).view(-1, 1).expand(-1, llm_grid_h * llm_grid_w).flatten() + ) # [T*H*W] + h_index = ( + torch.arange(llm_grid_h).view(1, -1, 1).expand(llm_grid_t, -1, llm_grid_w).flatten() + ) # [T*H*W] + w_index = ( + torch.arange(llm_grid_w).view(1, 1, -1).expand(llm_grid_t, llm_grid_h, -1).flatten() + ) # [T*H*W] + llm_pos_ids_list.append(torch.stack([t_index, h_index, w_index]) + text_len + st_idx) # [3,T*H*W] + st = ed + llm_grid_t * llm_grid_h * llm_grid_w + + if st < len(input_tokens): + st_idx = llm_pos_ids_list[-1].max() + 1 if len(llm_pos_ids_list) > 0 else 0 + text_len = len(input_tokens) - st + llm_pos_ids_list.append(torch.arange(text_len).view(1, -1).expand(3, -1) + st_idx) + + llm_positions = torch.cat(llm_pos_ids_list, dim=1).reshape(3, -1) # [3,N] + position_ids[..., i, attention_mask[i] == 1] = llm_positions.to(position_ids.device) + mrope_position_deltas.append(llm_positions.max() + 1 - len(total_input_ids[i])) + mrope_position_deltas = torch.tensor(mrope_position_deltas, device=input_ids.device).unsqueeze(1) # [B,1] + return position_ids, mrope_position_deltas # [3,B,N], [B,1] + else: + if attention_mask is not None: + position_ids = attention_mask.long().cumsum(-1) - 1 # [B,N] + position_ids.masked_fill_(attention_mask == 0, 1) + position_ids = position_ids.unsqueeze(0).expand(3, -1, -1).to(attention_mask.device) # [3,B,N] + max_position_ids = position_ids.max(0, keepdim=False)[0].max(-1, keepdim=True)[0] # [B,1] + mrope_position_deltas = max_position_ids + 1 - attention_mask.shape[-1] # [B,1] + else: + position_ids = ( + 
torch.arange(input_ids.shape[1], device=input_ids.device) + .view(1, 1, -1) + .expand(3, input_ids.shape[0], -1) + ) # [3,B,N] + mrope_position_deltas = torch.zeros( + [input_ids.shape[0], 1], + device=input_ids.device, + dtype=input_ids.dtype, + ) # [B,1] + + return position_ids, mrope_position_deltas # [3,B,N], [B,1] + + def get_video_features( + self, pixel_values_videos: torch.FloatTensor, video_grid_thw: Optional[torch.LongTensor] = None + ): + """ + Encodes videos into continuous embeddings that can be forwarded to the language model. The deepstack visual features are also returned. + + Args: + pixel_values_videos (`torch.FloatTensor` of shape `(batch_size, num_channels, image_size, image_size)`): + The tensors corresponding to the input videos. + video_grid_thw (`torch.LongTensor` of shape `(num_videos, 3)`, *optional*): + The temporal, height and width of feature shape of each video in LLM. + """ + # Same implementation as for images + return self.get_image_features(pixel_values_videos, video_grid_thw) + + def get_image_features(self, pixel_values: torch.FloatTensor, image_grid_thw: Optional[torch.LongTensor] = None): + """ + Encodes images into continuous embeddings that can be forwarded to the language model. The deepstack visual features are also returned. + + Args: + pixel_values (`torch.FloatTensor` of shape `(batch_size, num_channels, image_size, image_size)`): + The tensors corresponding to the input images. + image_grid_thw (`torch.LongTensor` of shape `(num_images, 3)`, *optional*): + The temporal, height and width of feature shape of each image in LLM. 
+ """ + pixel_values = pixel_values.type(self.visual.dtype) + image_embeds, deepstack_image_embeds = self.visual(pixel_values, grid_thw=image_grid_thw) + split_sizes = (image_grid_thw.prod(-1) // self.visual.spatial_merge_size**2).tolist() + image_embeds = torch.split(image_embeds, split_sizes) + return image_embeds, deepstack_image_embeds + + def get_placeholder_mask( + self, + input_ids: torch.LongTensor, + inputs_embeds: torch.FloatTensor, + image_features: Optional[torch.FloatTensor] = None, + video_features: Optional[torch.FloatTensor] = None, + ): + """ + Obtains multimodal placeholder mask from `input_ids` or `inputs_embeds`, and checks that the placeholder token count is + equal to the length of multimodal features. If the lengths are different, an error is raised. + """ + if input_ids is None: + special_image_mask = inputs_embeds == self.get_input_embeddings()( + torch.tensor(self.config.image_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + special_image_mask = special_image_mask.all(-1) + special_video_mask = inputs_embeds == self.get_input_embeddings()( + torch.tensor(self.config.video_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + special_video_mask = special_video_mask.all(-1) + else: + special_image_mask = input_ids == self.config.image_token_id + special_video_mask = input_ids == self.config.video_token_id + + n_image_tokens = special_image_mask.sum() + special_image_mask = special_image_mask.unsqueeze(-1).expand_as(inputs_embeds).to(inputs_embeds.device) + if image_features is not None and inputs_embeds[special_image_mask].numel() != image_features.numel(): + raise ValueError( + f"Image features and image tokens do not match: tokens: {n_image_tokens}, features {image_features.shape[0]}" + ) + + n_video_tokens = special_video_mask.sum() + special_video_mask = special_video_mask.unsqueeze(-1).expand_as(inputs_embeds).to(inputs_embeds.device) + if video_features is not None and inputs_embeds[special_video_mask].numel() != 
video_features.numel(): + raise ValueError( + f"Videos features and video tokens do not match: tokens: {n_video_tokens}, features {video_features.shape[0]}" + ) + + return special_image_mask, special_video_mask + + def forward( + self, + input_ids: torch.LongTensor = None, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + inputs_embeds: Optional[torch.FloatTensor] = None, + pixel_values: Optional[torch.Tensor] = None, + pixel_values_videos: Optional[torch.FloatTensor] = None, + image_grid_thw: Optional[torch.LongTensor] = None, + video_grid_thw: Optional[torch.LongTensor] = None, + cache_position: Optional[torch.LongTensor] = None, + **kwargs: Unpack[TransformersKwargs], + ) -> Union[tuple, Qwen3VLMoeModelOutputWithPast]: + r""" + image_grid_thw (`torch.LongTensor` of shape `(num_images, 3)`, *optional*): + The temporal, height and width of feature shape of each image in LLM. + video_grid_thw (`torch.LongTensor` of shape `(num_videos, 3)`, *optional*): + The temporal, height and width of feature shape of each video in LLM. 
+ """ + if (input_ids is None) ^ (inputs_embeds is not None): + raise ValueError("You must specify exactly one of input_ids or inputs_embeds") + + if inputs_embeds is None: + inputs_embeds = self.get_input_embeddings()(input_ids) + + image_mask = None + video_mask = None + + if pixel_values is not None: + image_embeds, deepstack_image_embeds = self.get_image_features(pixel_values, image_grid_thw) + image_embeds = torch.cat(image_embeds, dim=0).to(inputs_embeds.device, inputs_embeds.dtype) + image_mask, _ = self.get_placeholder_mask( + input_ids, inputs_embeds=inputs_embeds, image_features=image_embeds + ) + inputs_embeds = inputs_embeds.masked_scatter(image_mask, image_embeds) + + if pixel_values_videos is not None: + video_embeds, deepstack_video_embeds = self.get_video_features(pixel_values_videos, video_grid_thw) + video_embeds = torch.cat(video_embeds, dim=0).to(inputs_embeds.device, inputs_embeds.dtype) + _, video_mask = self.get_placeholder_mask( + input_ids, inputs_embeds=inputs_embeds, video_features=video_embeds + ) + inputs_embeds = inputs_embeds.masked_scatter(video_mask, video_embeds) + + visual_pos_masks = None + deepstack_visual_embeds = None + if image_mask is not None and video_mask is not None: + # aggregate visual_pos_masks and deepstack_visual_embeds + image_mask = image_mask[..., 0] + video_mask = video_mask[..., 0] + visual_pos_masks = image_mask | video_mask + deepstack_visual_embeds = [] + image_mask_joint = image_mask[visual_pos_masks] + video_mask_joint = video_mask[visual_pos_masks] + for img_embed, vid_embed in zip(deepstack_image_embeds, deepstack_video_embeds): + embed_joint = img_embed.new_zeros(visual_pos_masks.sum(), img_embed.shape[-1]).to(img_embed.device) + embed_joint[image_mask_joint, :] = img_embed + embed_joint[video_mask_joint, :] = vid_embed + deepstack_visual_embeds.append(embed_joint) + elif image_mask is not None: + image_mask = image_mask[..., 0] + visual_pos_masks = image_mask + deepstack_visual_embeds = 
deepstack_image_embeds + elif video_mask is not None: + video_mask = video_mask[..., 0] + visual_pos_masks = video_mask + deepstack_visual_embeds = deepstack_video_embeds + + if position_ids is None: + attention_mask_tensor = ( + attention_mask if not isinstance(attention_mask, dict) else attention_mask["full_attention"] + ) + if attention_mask_tensor is not None and attention_mask_tensor.ndim == 4: + attention_mask_tensor = torch.diagonal(attention_mask_tensor[:, 0], dim1=1, dim2=2) + # Only apply conversion for floating point tensors (inverted masks) + if attention_mask_tensor.dtype.is_floating_point: + attention_mask_tensor = attention_mask_tensor / torch.finfo(attention_mask_tensor.dtype).min + attention_mask_tensor = (1.0 - attention_mask_tensor).int() + + # Calculate RoPE index once per generation in the pre-fill stage only. + # When compiling, we can't check tensor values, so we check only the input length. + # It is safe to assume that `length!=1` means we're in pre-fill because compiled + # models currently cannot do assisted decoding + prefill_compiled_stage = is_torchdynamo_compiling() and ( + (input_ids is not None and input_ids.shape[1] != 1) + or (inputs_embeds is not None and inputs_embeds.shape[1] != 1) + ) + prefill_noncompiled_stage = not is_torchdynamo_compiling() and ( + (cache_position is not None and cache_position[0] == 0) + or (past_key_values is None or past_key_values.get_seq_length() == 0) + ) + if (prefill_compiled_stage or prefill_noncompiled_stage) or self.rope_deltas is None: + position_ids, rope_deltas = self.get_rope_index( + input_ids, + image_grid_thw, + video_grid_thw, + attention_mask=attention_mask_tensor, + ) + self.rope_deltas = rope_deltas + # then use the previously calculated rope_deltas to get the correct position ids + else: + batch_size, seq_length, _ = inputs_embeds.shape + delta = ( + (cache_position[0] + self.rope_deltas).to(inputs_embeds.device) if cache_position is not None else 0 + ) + position_ids =
torch.arange(seq_length, device=inputs_embeds.device) # [N] + position_ids = position_ids.view(1, -1).expand(batch_size, -1) # [B,N] + if cache_position is not None: # otherwise `delta` is an int `0` + delta = delta.repeat_interleave(batch_size // delta.shape[0], dim=0) + position_ids = position_ids.add(delta) # [B,N] + position_ids = position_ids.unsqueeze(0).expand(3, -1, -1) # [3,B,N] + + outputs = self.language_model( + input_ids=None, + position_ids=position_ids, + attention_mask=attention_mask, + past_key_values=past_key_values, + inputs_embeds=inputs_embeds, + cache_position=cache_position, + visual_pos_masks=visual_pos_masks, + deepstack_visual_embeds=deepstack_visual_embeds, + **kwargs, + ) + + return Qwen3VLMoeModelOutputWithPast( + last_hidden_state=outputs.last_hidden_state, + past_key_values=outputs.past_key_values, + rope_deltas=self.rope_deltas, + ) + + +def load_balancing_loss_func( + gate_logits: Union[torch.Tensor, tuple[torch.Tensor], None], + num_experts: Optional[int] = None, + top_k=2, + attention_mask: Optional[torch.Tensor] = None, +) -> Union[torch.Tensor, int]: + r""" + Computes auxiliary load balancing loss as in Switch Transformer - implemented in PyTorch. + + See Switch Transformer (https://huggingface.co/papers/2101.03961) for more details. This function implements the loss + function presented in equations (4) - (6) of the paper. It aims at penalizing cases where the routing between + experts is too unbalanced. + + Args: + gate_logits: + Logits from the `gate`, should be a tuple of model.config.num_hidden_layers tensors of + shape [batch_size X sequence_length, num_experts]. + num_experts: + Number of experts. + top_k: + The number of experts to route per-token, can also be interpreted as the `top-k` routing + parameter. + attention_mask (`torch.Tensor`, *optional*): + The attention_mask used in forward function + shape [batch_size X sequence_length] if not None. + + Returns: + The auxiliary loss.
+ """ + if gate_logits is None or not isinstance(gate_logits, tuple): + return 0 + + if isinstance(gate_logits, tuple): + compute_device = gate_logits[0].device + concatenated_gate_logits = torch.cat([layer_gate.to(compute_device) for layer_gate in gate_logits], dim=0) + # concatenated_gate_logits: [num_layers*B*N,num_experts] + + routing_weights = torch.nn.functional.softmax(concatenated_gate_logits, dim=-1) # [num_layers*B*N,num_experts] + + _, selected_experts = torch.topk(routing_weights, top_k, dim=-1) # [num_layers*B*N,top_k] + + expert_mask = torch.nn.functional.one_hot(selected_experts, num_experts) # [num_layers*B*N,top_k,num_experts] + + if attention_mask is None: + # Compute the percentage of tokens routed to each experts + tokens_per_expert = torch.mean(expert_mask.float(), dim=0) # [top_k,num_experts] + + # Compute the average probability of routing to these experts + router_prob_per_expert = torch.mean(routing_weights, dim=0) # [num_experts] + else: + batch_size, sequence_length = attention_mask.shape + num_hidden_layers = concatenated_gate_logits.shape[0] // (batch_size * sequence_length) + + # Compute the mask that masks all padding tokens as 0 with the same shape of expert_mask + expert_attention_mask = ( + attention_mask[None, :, :, None, None] + .expand((num_hidden_layers, batch_size, sequence_length, top_k, num_experts)) + .reshape(-1, top_k, num_experts) + .to(compute_device) + ) # [num_layers*B*N,top_k,num_experts] + + # Compute the percentage of tokens routed to each experts + tokens_per_expert = torch.sum(expert_mask.float() * expert_attention_mask, dim=0) / torch.sum( + expert_attention_mask, dim=0 + ) # [top_k,num_experts] + + # Compute the mask that masks all padding tokens as 0 with the same shape of tokens_per_expert + router_per_expert_attention_mask = ( + attention_mask[None, :, :, None] + .expand((num_hidden_layers, batch_size, sequence_length, num_experts)) + .reshape(-1, num_experts) + .to(compute_device) + ) # 
[num_layers*B*N,num_experts] + + # Compute the average probability of routing to these experts + router_prob_per_expert = torch.sum(routing_weights * router_per_expert_attention_mask, dim=0) / torch.sum( + router_per_expert_attention_mask, dim=0 + ) # [num_experts] + + overall_loss = torch.sum(tokens_per_expert * router_prob_per_expert.unsqueeze(0)) + return overall_loss * num_experts + + +class Qwen3VLMoeForConditionalGeneration(Qwen3VLMoePreTrainedModel, GenerationMixin): + _checkpoint_conversion_mapping = {} + _tied_weights_keys = ["lm_head.weight"] + # Reference: fix gemma3 grad acc #37208 + accepts_loss_kwargs = False + config: Qwen3VLMoeConfig + + def __init__(self, config): + super().__init__(config) + self.model = Qwen3VLMoeModel(config) + self.lm_head = nn.Linear(config.text_config.hidden_size, config.text_config.vocab_size, bias=False) + + self.post_init() + + def get_input_embeddings(self): + return self.model.get_input_embeddings() + + def set_input_embeddings(self, value): + self.model.set_input_embeddings(value) + + def set_decoder(self, decoder): + self.model.set_decoder(decoder) + + def get_decoder(self): + return self.model.get_decoder() + + def get_video_features( + self, pixel_values_videos: torch.FloatTensor, video_grid_thw: Optional[torch.LongTensor] = None + ): + return self.model.get_video_features(pixel_values_videos, video_grid_thw) + + def get_image_features(self, pixel_values: torch.FloatTensor, image_grid_thw: Optional[torch.LongTensor] = None): + return self.model.get_image_features(pixel_values, image_grid_thw) + + # Make modules available through conditional class for BC + @property + def language_model(self): + return self.model.language_model + + @property + def visual(self): + return self.model.visual + + def forward( + self, + input_ids: torch.LongTensor = None, + attention_mask: Optional[torch.Tensor] = None, + position_ids: Optional[torch.LongTensor] = None, + past_key_values: Optional[Cache] = None, + inputs_embeds: 
Optional[torch.FloatTensor] = None, + labels: Optional[torch.LongTensor] = None, + pixel_values: Optional[torch.Tensor] = None, + pixel_values_videos: Optional[torch.FloatTensor] = None, + image_grid_thw: Optional[torch.LongTensor] = None, + video_grid_thw: Optional[torch.LongTensor] = None, + cache_position: Optional[torch.LongTensor] = None, + logits_to_keep: Union[int, torch.Tensor] = 0, + **kwargs: Unpack[TransformersKwargs], + ) -> Union[tuple, Qwen3VLMoeCausalLMOutputWithPast]: + r""" + labels (`torch.LongTensor` of shape `(batch_size, sequence_length)`, *optional*): + Labels for computing the masked language modeling loss. Indices should either be in `[0, ..., + config.vocab_size]` or -100 (see `input_ids` docstring). Tokens with indices set to `-100` are ignored + (masked), the loss is only computed for the tokens with labels in `[0, ..., config.vocab_size]`. + image_grid_thw (`torch.LongTensor` of shape `(num_images, 3)`, *optional*): + The temporal, height and width of feature shape of each image in LLM. + video_grid_thw (`torch.LongTensor` of shape `(num_videos, 3)`, *optional*): + The temporal, height and width of feature shape of each video in LLM. 
+ + Example: + ```python + >>> from PIL import Image + >>> import requests + >>> from transformers import AutoProcessor, Qwen3VLMoeForConditionalGeneration + + >>> model = Qwen3VLMoeForConditionalGeneration.from_pretrained("Qwen/Qwen3-VL-30B-A3B-Instruct", dtype="auto", device_map="auto") + >>> processor = AutoProcessor.from_pretrained("Qwen/Qwen3-VL-30B-A3B-Instruct") + + >>> messages = [ + { + "role": "user", + "content": [ + { + "type": "image", + "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg", + }, + {"type": "text", "text": "Describe this image in short."}, + ], + } + ] + + >>> # Preparation for inference + >>> inputs = processor.apply_chat_template( + messages, + tokenize=True, + add_generation_prompt=True, + return_dict=True, + return_tensors="pt" + ) + >>> inputs = inputs.to(model.device) + + >>> # Generate + >>> generated_ids = model.generate(**inputs, max_new_tokens=128) + >>> generated_ids_trimmed = [ + out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids) + ] + >>> processor.batch_decode(generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0] + "A woman in a plaid shirt sits on a sandy beach at sunset, smiling as she gives a high-five to a yellow Labrador Retriever wearing a harness. The ocean waves roll in the background." 
+ ```""" + + outputs = self.model( + input_ids=input_ids, + pixel_values=pixel_values, + pixel_values_videos=pixel_values_videos, + image_grid_thw=image_grid_thw, + video_grid_thw=video_grid_thw, + position_ids=position_ids, + attention_mask=attention_mask, + past_key_values=past_key_values, + inputs_embeds=inputs_embeds, + cache_position=cache_position, + **kwargs, + ) + + hidden_states = outputs[0] # [B,N,hidden_size] + + # Only compute necessary logits, and do not upcast them to float if we are not computing the loss + slice_indices = slice(-logits_to_keep, None) if isinstance(logits_to_keep, int) else logits_to_keep + logits = self.lm_head(hidden_states[:, slice_indices, :]) # [B,N_kept,vocab_size] + + loss = None + if labels is not None: + loss = self.loss_function(logits=logits, labels=labels, vocab_size=self.config.text_config.vocab_size) + + aux_loss = None + if kwargs.get("output_router_logits", False): + aux_loss = load_balancing_loss_func( + outputs.router_logits, + self.config.text_config.num_experts, + self.config.text_config.num_experts_per_tok, + attention_mask, + ) + if labels is not None: + loss += self.config.text_config.router_aux_loss_coef * aux_loss.to( + loss.device + ) # make sure to reside in the same device + + return Qwen3VLMoeCausalLMOutputWithPast( + loss=loss, + aux_loss=aux_loss, + logits=logits, + past_key_values=outputs.past_key_values, + rope_deltas=outputs.rope_deltas, + ) + + def prepare_inputs_for_generation( + self, + input_ids, + past_key_values=None, + attention_mask=None, + inputs_embeds=None, + cache_position=None, + position_ids=None, + use_cache=True, + pixel_values=None, + pixel_values_videos=None, + image_grid_thw=None, + video_grid_thw=None, + **kwargs, + ): + # Overwritten -- in specific circumstances we don't want to forward image inputs to the model + + model_inputs = super().prepare_inputs_for_generation( + input_ids, + past_key_values=past_key_values, + attention_mask=attention_mask, + inputs_embeds=inputs_embeds, 
+ cache_position=cache_position, + position_ids=position_ids, + pixel_values=pixel_values, + pixel_values_videos=pixel_values_videos, + image_grid_thw=image_grid_thw, + video_grid_thw=video_grid_thw, + use_cache=use_cache, + **kwargs, + ) + + # Qwen3VLMoe position_ids are prepared with rope_deltas in forward + model_inputs["position_ids"] = None + + if cache_position[0] != 0: + model_inputs["pixel_values"] = None + model_inputs["pixel_values_videos"] = None + + return model_inputs + + def _get_image_nums_and_video_nums( + self, + input_ids: Optional[torch.LongTensor], + inputs_embeds: Optional[torch.Tensor] = None, + ) -> tuple[torch.Tensor, torch.Tensor]: + """ + Get the number of images and videos for each sample to calculate the separation length of the sample tensor. + These parameters are not passed through the processor to avoid unpredictable impacts from interface modifications. + + Args: + input_ids (`torch.LongTensor` of shape `(batch_size, sequence_length)`): + Indices of input sequence tokens in the vocabulary. 
+ + Returns: + image_nums (`torch.LongTensor` of shape `(batch_size, num_images_sample)`) + video_nums (`torch.LongTensor` of shape `(batch_size, num_videos_sample)`) + """ + image_token_id = self.config.image_token_id + video_token_id = self.config.video_token_id + vision_start_token_id = self.config.vision_start_token_id + + if inputs_embeds is not None: + vision_start_mask = ( + inputs_embeds + == self.get_input_embeddings()( + torch.tensor(vision_start_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + )[..., 0] + image_mask = ( + inputs_embeds + == self.get_input_embeddings()( + torch.tensor(image_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + )[..., 0] + video_mask = ( + inputs_embeds + == self.get_input_embeddings()( + torch.tensor(video_token_id, dtype=torch.long, device=inputs_embeds.device) + ) + )[..., 0] + else: + vision_start_mask = input_ids == vision_start_token_id + image_mask = input_ids == image_token_id + video_mask = input_ids == video_token_id + + vision_first_mask = torch.roll(vision_start_mask, shifts=1, dims=1) + image_nums = torch.sum(vision_first_mask & image_mask, dim=1) + video_nums = torch.sum(vision_first_mask & video_mask, dim=1) + + return image_nums, video_nums + + def _expand_inputs_for_generation( + self, + expand_size: int = 1, + is_encoder_decoder: bool = False, + input_ids: Optional[torch.LongTensor] = None, + **model_kwargs, + ) -> tuple[torch.LongTensor, dict[str, Any]]: + # Overwritten -- Support for expanding tensors without a batch size dimension + # e.g., pixel_values, image_grid_thw, pixel_values_videos, video_grid_thw, second_per_grid_t + # pixel_values.shape[0] is sum(seqlen_images for samples) + # image_grid_thw.shape[0] is sum(num_images for samples) + + if expand_size == 1: + return input_ids, model_kwargs + + visual_keys = ["pixel_values", "image_grid_thw", "pixel_values_videos", "video_grid_thw", "second_per_grid_ts"] + + def _expand_dict_for_generation_visual(dict_to_expand): + 
image_grid_thw = model_kwargs.get("image_grid_thw", None) + video_grid_thw = model_kwargs.get("video_grid_thw", None) + image_nums, video_nums = self._get_image_nums_and_video_nums( + input_ids, inputs_embeds=model_kwargs.get("inputs_embeds", None) + ) + + def _repeat_interleave_samples(x, lengths, repeat_times): + samples = torch.split(x, lengths) + repeat_args = [repeat_times] + [1] * (x.dim() - 1) + result = torch.cat([sample.repeat(*repeat_args) for sample in samples], dim=0) + return result + + for key in dict_to_expand: + if key == "pixel_values": + # split images into samples + samples = torch.split(image_grid_thw, list(image_nums)) + # compute the sequence length of images for each sample + lengths = [torch.prod(sample, dim=1).sum() for sample in samples] + dict_to_expand[key] = _repeat_interleave_samples( + dict_to_expand[key], lengths=lengths, repeat_times=expand_size + ) + elif key == "image_grid_thw": + # get the num of images for each sample + lengths = list(image_nums) + dict_to_expand[key] = _repeat_interleave_samples( + dict_to_expand[key], lengths=lengths, repeat_times=expand_size + ) + elif key == "pixel_values_videos": + samples = torch.split(video_grid_thw, list(video_nums)) + lengths = [torch.prod(sample, dim=1).sum() for sample in samples] + dict_to_expand[key] = _repeat_interleave_samples( + dict_to_expand[key], lengths=lengths, repeat_times=expand_size + ) + elif key == "video_grid_thw": + lengths = list(video_nums) + dict_to_expand[key] = _repeat_interleave_samples( + dict_to_expand[key], lengths=lengths, repeat_times=expand_size + ) + elif key == "second_per_grid_ts": + dict_to_expand[key] = _repeat_interleave_samples( + dict_to_expand[key], lengths=list(video_nums), repeat_times=expand_size + ) + return dict_to_expand + + def _expand_dict_for_generation(dict_to_expand): + for key in dict_to_expand: + if ( + key != "cache_position" + and dict_to_expand[key] is not None + and isinstance(dict_to_expand[key], torch.Tensor) + and key not in 
visual_keys + ): + dict_to_expand[key] = dict_to_expand[key].repeat_interleave(expand_size, dim=0) + return dict_to_expand + + model_kwargs = _expand_dict_for_generation_visual(model_kwargs) + + if input_ids is not None: + input_ids = input_ids.repeat_interleave(expand_size, dim=0) + + model_kwargs = _expand_dict_for_generation(model_kwargs) + + if is_encoder_decoder: + if model_kwargs.get("encoder_outputs") is None: + raise ValueError("If `is_encoder_decoder` is True, make sure that `encoder_outputs` is defined.") + model_kwargs["encoder_outputs"] = _expand_dict_for_generation(model_kwargs["encoder_outputs"]) + + return input_ids, model_kwargs + + +__all__ = [ + "Qwen3VLMoeVisionModel", + "Qwen3VLMoeForConditionalGeneration", + "Qwen3VLMoeModel", + "Qwen3VLMoePreTrainedModel", + "Qwen3VLMoeTextModel", +] diff --git a/cosmos-inference/cosmos3/_src/vfm/models/vlm_model.py b/cosmos-inference/cosmos3/_src/vfm/models/vlm_model.py new file mode 100644 index 00000000..5ab3ec49 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/models/vlm_model.py @@ -0,0 +1,549 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""VLMModel: config-instantiable ImaginaireModel for VLM training. 
+ +Config usage (in vfm/configs/base/vlm/config.py): + config.model = LazyCall(VLMModel)( + policy=config.policy, checkpoint=config.checkpoint, train=config.train + ) + +Phase 0 — bootstrap via the legacy VLM init path, ParallelDims, and async_safe_ce. +Phase 1 — ParallelDims switches to vfm/utils/parallelism.py. +Phase 2 — legacy init replaced by direct HFModel path (_init_vlm); async_safe_ce + replaced by vfm/algorithm/loss/cross_entropy.py::cross_entropy_loss. +Phase 3 — init_flash_attn_meta ported to vfm/utils/flash_attn.py; + config unified under vfm/configs/base/vlm/config.py. +""" + +import os +import re +from collections.abc import Callable +from functools import partial + +import torch +import torch.nn as nn + +from cosmos3._src.imaginaire.lazy_config import instantiate +from cosmos3._src.imaginaire.model import ImaginaireModel +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.algorithm.loss.cross_entropy import cross_entropy_loss +from cosmos3._src.vfm.configs.base.defaults.model_config import VLMModelConfig +from cosmos3._src.vfm.models.hf_model import HFModel +from cosmos3._src.vfm.models.parallelize_vlm import parallelize +from cosmos3._src.vfm.utils.parallelism import ParallelDims +from projects.cosmos3.vlm.utils.constant import IGNORE_INDEX +from projects.cosmos3.vlm.utils.create_position_ids import get_position_ids + +# Model-type dispatch sets. Using hf_config.model_type (stable HF-defined string) +# rather than model_name_or_path avoids the brittleness of substring-matching a local +# filesystem path that VLMModel._init_vlm has already rewritten (see _init_vlm: the +# downloader returns a local cache path, so the configured model name is lost). +# +# ``qwen3_vl_moe`` is listed here as forward-compat — MoE dispatch in every family +# helper below is already wired for the 30B-A3B / 235B-A22B variants. 
End-to-end +# training still fails earlier at load_vlm_model's MoE precheck +# (safetensors_loader.py _is_moe_vlm / NotImplementedError) because sharded MoE +# weight loading is unimplemented; see spec §2.2. Removing ``qwen3_vl_moe`` here +# would regress the family helpers the moment MoE load support lands. +_QWEN_VL_TYPES = {"qwen2_5_vl", "qwen3_vl", "qwen3_vl_moe"} +# InternVL variants register both "internvl" and "internvl_chat" as model_type +# in the upstream InternVL HF policy registry. +_INTERNVL_TYPES = {"internvl", "internvl_chat"} + + +def _get_overlay_config(model_type: str) -> tuple[list[str], Callable[[str], bool]]: + """Return ``(skip_patterns, is_lm_key)`` for the pretrain_weights_path_llm overlay. + + ``skip_patterns`` are regex patterns for resolved model keys that are expected to + be absent from the LLM overlay checkpoint (visual encoder + projector); they are + passed as ``extra_skip_patterns`` to ``load_vlm_model`` so Phase-6 completeness + check tolerates them. Every OTHER missing model key still raises. + + ``is_lm_key`` is a predicate that decides whether a key returned in ``keys_loaded`` + counts as a "language-model parameter" for VLMModel's post-overlay sanity check. + Implemented as the inverse of ``skip_patterns`` — a loaded key counts as an LM key + iff it does NOT match any of the visual/projector skip regexes. This mirrors + exactly what ``load_vlm_model``'s Phase-5 skip logic does, so the two checks can + never disagree under HF state-dict layout variations (e.g. ``model.model.*`` + vs. ``model.language_model.*``). + + Family-specific because non-LM params differ across VLM families (projectors may + live outside ``model.visual.*``). Raises ``NotImplementedError`` for unsupported + families — safer than silently mis-skipping. Add a new entry when onboarding a + new VLM family. 
+ + MoE note: ``qwen3_vl_moe`` is accepted here but end-to-end MoE training still + fails earlier at ``load_vlm_model``'s MoE precheck (see module docstring on + ``_QWEN_VL_TYPES``). + """ + if model_type in _QWEN_VL_TYPES: + # Qwen2.5-VL / Qwen3-VL dense + MoE: the visual encoder AND merger/projector + # both live under a ``visual.*`` subtree (merger is a submodule of visual — + # see Qwen3VLForConditionalGeneration / Qwen2_5_VLForConditionalGeneration). + # Every non-visual resolved key counts as an LM key (language_model layers, + # norm, embed_tokens, top-level lm_head). + # + # The ``(?:model\.)*`` prefix makes both the loader-side Phase-5 skip AND + # the VLMModel-side LM predicate tolerate three layouts uniformly: + # + # 1. Bare (Qwen2.5-VL official HF class) — ``visual.merger.*``, + # ``model.embed_tokens.*`` / ``lm_head.weight``. See + # projects/cosmos3/vlm/scripts/convert_qwenvl_ckpt.py:101-118 which + # inspects ``state_dict()`` for keys starting with ``visual.merger``. + # 2. One wrapper (Qwen3-VL official HF class) — ``model.visual.*``, + # ``model.language_model.*`` / ``lm_head.weight``. + # 3. Two+ wrappers (HFModel-shim-wrapped callers) — ``model.model.visual + # .*`` etc., e.g. hf_model_test.py::test_vlm_load_hf_native_keys:644. + # + # A narrower regex (e.g. requiring a leading ``model.``) would either + # reject valid Qwen2.5 visual keys in Phase-6 completeness OR misclassify + # wrapper-layout visual keys as LM keys in the post-overlay safety check. + skip_patterns = [r"^(?:model\.)*visual\..*"] + compiled_skips = [re.compile(p) for p in skip_patterns] + return ( + skip_patterns, + lambda k: not any(r.match(k) for r in compiled_skips), + ) + # Nemotron / InternVL / etc: projectors live outside ``model.visual.*`` + # (e.g. ``model.multi_modal_projector.*``, ``model.projector.*``), and lm_head + # may be nested (``model.lm_head.weight``). 
The Qwen-shaped skip list would fail + # Phase-6 completeness on those families; the Qwen-shaped predicate would misreport + # a successful overlay as "0 language-model parameters". Fail loudly rather than + # silently. + raise NotImplementedError( + f"VLMModel: pretrain_weights_path_llm overlay not yet supported for " + f"model_type={model_type!r}. Supported types: {sorted(_QWEN_VL_TYPES)}. " + f"Add a new entry in _get_overlay_config() when onboarding a new VLM family " + f"(see docs/superpowers/specs/2026-04-20-vlm-pretrain-weights-path-llm-design.md §7)." + ) + + +def _get_vision_encoder_modules(model: nn.Module, model_type: str) -> list: + if model_type in _QWEN_VL_TYPES: + + # Returning only [patch_embed, blocks] is not enough: Qwen3-VL adds a learnable `pos_embed` + # (nn.Embedding — see qwen3_vl.py Qwen3VLVisionModel); leaving it trainable while + # freezing the rest of the vision encoder contradicts the intent of + # freeze_vision_encoder=True. `hasattr` gate preserves Qwen2.5-VL compatibility + # (no pos_embed there). + mods = [model.visual.patch_embed, model.visual.blocks] + if hasattr(model.visual, "pos_embed"): + mods.append(model.visual.pos_embed) + return mods + elif model_type in _INTERNVL_TYPES: + return [model.vision_model] + raise ValueError(f"freeze_vision_encoder not supported for model_type={model_type!r}") + + +def _get_mm_projector_modules(model: nn.Module, model_type: str) -> list: + if model_type == "qwen2_5_vl": + return [model.visual.merger] + elif model_type in {"qwen3_vl", "qwen3_vl_moe"}: + mods = [model.visual.merger] + if hasattr(model.visual, "deepstack_merger_list"): + mods.append(model.visual.deepstack_merger_list) + return mods + elif model_type in _INTERNVL_TYPES: + # Legacy InternVL helper used `model.model.model.multi_modal_projector` + # because it operated on a wrapped HFModel (ImaginaireModel -> HFModel -> + # raw HF InternVL). We receive the raw HF model directly + # (hf_model._model), so drop the two wrapper hops. 
Best-effort until L1 + # GPU validation on a real InternVL3_5 checkpoint. + return [model.model.multi_modal_projector] + raise ValueError(f"freeze_mm_projector not supported for model_type={model_type!r}") + + +def _get_llm_modules(model: nn.Module, model_type: str) -> list: + if model_type in _QWEN_VL_TYPES: + # model.language_model is a @property on Qwen3VLForConditionalGeneration / + # Qwen2_5_VLForConditionalGeneration that delegates to self.model.language_model + # — avoids accidentally freezing `visual` which also lives inside self.model. + # model.lm_head is a top-level submodule on the conditional-generation class. + return [model.language_model, model.lm_head] + elif model_type in _INTERNVL_TYPES: + # Legacy InternVL helper returned `[model.language_model, model.model.lm_head]` + # for the wrapped HFModel. Same raw-HF adjustment as mm_projector above: + # the raw HF InternVL class exposes `.language_model` at the top level but + # its `lm_head` lives one level deeper under `.model`. Best-effort until + # L1 validation. + return [model.language_model, model.model.lm_head] + raise ValueError(f"freeze_llm not supported for model_type={model_type!r}") + + +def _apply_freeze_config(model: nn.Module, model_type: str, cfg) -> int: + """Apply freeze config in-place. Returns trainable parameter-tensor count. + + ``cfg`` can be either a concrete ``OptimizerConfig`` instance (direct-constructor + path, e.g. unit tests) or a ``DictConfig`` / LazyCall-backed config (runtime path, + where the attrs ``__attrs_post_init__`` has not yet fired). The mutual-exclusivity + guard below duplicates the attrs validator so both paths fail loudly before any + parameter is frozen. + """ + # Extract optional VFM-only fields — OptimizerConfig lives in vlm/ and does not + # carry trainable_params / frozen_params, so always use getattr with None default. 
+ trainable_params = getattr(cfg, "trainable_params", None) + frozen_params = getattr(cfg, "frozen_params", None) + + # Defensive mutual-exclusivity guard — runs BEFORE any freeze, even on LazyCall path + if trainable_params is not None and frozen_params is not None: + raise ValueError("OptimizerConfig: set at most one of trainable_params or frozen_params, not both.") + + # Step 1 — legacy named flags via module-probing + if cfg.freeze_vision_encoder: + for m in _get_vision_encoder_modules(model, model_type): + for p in m.parameters(): + p.requires_grad = False + + if cfg.freeze_mm_projector: + for m in _get_mm_projector_modules(model, model_type): + for p in m.parameters(): + p.requires_grad = False + + if cfg.freeze_llm: + for m in _get_llm_modules(model, model_type): + for p in m.parameters(): + p.requires_grad = False + + # Step 2 — regex override (mutually exclusive; already validated above). + # + # `remove_duplicate=False` is required for tied weights. Qwen3 configs set + # `tie_word_embeddings=True`, so `hf_model.tie_embeddings()` makes + # `lm_head.weight` and `model.embed_tokens.weight` the same tensor. The default + # `named_parameters()` dedups by tensor id and keeps only the first traversed + # name (`model.embed_tokens.weight`); a regex aimed at `lm_head` would silently + # match nothing and user intent would be lost. Iterating with duplicates + # preserves both names so either can trigger a match. + if trainable_params is not None: + # OR-semantics across tied names: first freeze everything, then unfreeze + # any tensor whose *any* registered name matches. Cannot write + # `requires_grad = any(...)` directly because a second visit could flip + # True back to False on the same shared tensor. 
+ for p in model.parameters(): + p.requires_grad = False + for param_name, p in model.named_parameters(remove_duplicate=False): + if any(re.search(pat, param_name) for pat in trainable_params): + p.requires_grad = True + elif frozen_params is not None: + for param_name, p in model.named_parameters(remove_duplicate=False): + if any(re.search(pat, param_name) for pat in frozen_params): + p.requires_grad = False + + n = sum(p.requires_grad for p in model.parameters()) + if not any([cfg.freeze_vision_encoder, cfg.freeze_mm_projector, cfg.freeze_llm, trainable_params, frozen_params]): + log.warning("freeze config: no freeze mechanism set — all parameters are trainable (full fine-tune)") + assert n > 0, "freeze config left 0 trainable parameters — check patterns" + return n + + +class VLMModel(ImaginaireModel): + """Config-instantiable ImaginaireModel for VLM training. + + Args: + config: VLMModelConfig (policy, train, ema). + checkpoint: root CheckpointConfig (load_path, load_from_object_store). + """ + + def __init__(self, config: VLMModelConfig, checkpoint): + super().__init__() + from cosmos3._src.vfm.utils.flash_attn import init_flash_attn_meta + + self.config = config + # Expose model.precision so LowPrecisionCallback can read it (mirrors OmniMoTModel). + self.precision = getattr(torch, config.policy.parallelism.precision) + init_flash_attn_meta(config.train.deterministic) + self._init_vlm(config.policy, checkpoint, config.train) + + dp_group = None + cp_group = None + if self.parallel_dims is not None: + if self.parallel_dims.dp_shard_enabled: + dp_group = self.parallel_dims.dp_shard_mesh.get_group() + if self.parallel_dims.cp_enabled: + cp_group = self.parallel_dims.cp_mesh.get_group() + + self._loss_fn = partial( + cross_entropy_loss, + loss_scaling_factor=1.0, + dp_group=dp_group, + cp_group=cp_group, + ignore_index=IGNORE_INDEX, + ) + + def _init_vlm(self, policy, checkpoint, train) -> None: + """Initialize VLM without the legacy ModelRegistry (Phase 2+). 
+ + Sequence (ordering is critical — do not reorder): + a. Download HF weights from S3 to local cache. + b. Meta-init HFModel (params on meta, buffers on CPU via include_buffers=False). + c. Build ParallelDims + device mesh. + d. Apply FSDP2 via parallelize() — meta tensors are NOT auto-materialized. + e. Explicitly materialize meta tensors; move CPU buffers to CUDA. + f. Tie output embedding → input embedding if tie_word_embeddings=True. + g. Load pretrain weights into sharded CUDA tensors. + h. Apply gradient checkpointing if configured. + """ + from projects.cosmos3.vlm.utils.pretrained_models_downloader import ( + maybe_download_hf_model_from_s3, + ) + + load_pretrain_weights = checkpoint.load_path == "" + log.info(f"checkpoint.load_path: {checkpoint.load_path!r} | load_pretrain_weights: {load_pretrain_weights}") + + # ── a. Download HF model files (config + tokenizer; weights only if no ckpt) ── + local_path = maybe_download_hf_model_from_s3( + policy.model_name_or_path, + checkpoint.load_from_object_store.credentials, + checkpoint.load_from_object_store.bucket, + include_model_weights=load_pretrain_weights, + ) + # local_path is exposed below as self.model_name_or_path; the (frozen) policy + # config is not mutated. + + # ── b. Meta-init HFModel ── + hf_model = HFModel( + model_name_or_path=local_path, + dtype=train.master_torch_dtype, + attn_implementation="flash_attention_2", + ) + + # ── b.1. Early family-gate for pretrain_weights_path_llm ── + # Fail-fast on unsupported VLM families BEFORE any expensive work + # (parallelize, materialize, base-weight load, overlay download). + # ``hf_config.model_type`` is populated by HFModel's meta-init; no + # weights touched yet. Uses ``getattr`` default so the probe itself + # matches the later overlay guard at step g.2 exactly. + if getattr(policy, "pretrain_weights_path_llm", ""): + _get_overlay_config(hf_model.hf_config.model_type) + + # ── c. 
Build ParallelDims + device mesh ── + # Overlay-mesh design (see vfm/utils/parallelism.py): cp/cfgp do NOT + # consume FSDP rank slots, so dp_replicate * dp_shard == world_size + # alone. The VLM HFModel doesn't have a CP-aware attention path. + world_size = int(os.environ.get("WORLD_SIZE", 1)) + _dp_replicate = policy.parallelism.data_parallel_replicate_degree + # Single-process run: force dp_replicate=1 so ParallelDims doesn't + # auto-infer it to world_size (which would equal 1 anyway, but guards + # against environments where WORLD_SIZE is unset/inconsistent). + if not torch.distributed.is_initialized(): + _dp_replicate = 1 + + parallel_dims = ParallelDims( + world_size=world_size, + dp_shard=policy.parallelism.data_parallel_shard_degree, + dp_replicate=_dp_replicate, + cp=policy.parallelism.context_parallel_shard_degree, + enable_inference_mode=False, + ) + + # VLM does not currently support cp or cfgp. CP needs a CP-aware + # attention path (see ``vfm/models/mot/context_parallel_utils.py``) that + # is not wired into the VLM HFModel; CFGP is inference-only. + assert parallel_dims.cp == 1, f"VLM does not support CP (got cp={parallel_dims.cp})" + assert parallel_dims.cfgp == 1, f"VLM does not support CFGP (got cfgp={parallel_dims.cfgp})" + + if torch.distributed.is_initialized(): + parallel_dims.build_meshes(device_type="cuda") + + # Replicate-only (DDP) is not implemented in Phase 2's parallelize(). + # Raise early rather than running with no gradient synchronization and + # silently producing wrong training results. + if parallel_dims.dp_replicate_enabled and not parallel_dims.dp_shard_enabled: + raise NotImplementedError( + "VLMModel Phase 2 does not support replicate-only DDP " + "(dp_replicate > 1, dp_shard == 1). " + "Use dp_shard > 1 for FSDP2. DDP support is planned for Phase 3." + ) + + # ── d. 
Apply FSDP2 ── + if torch.distributed.is_initialized(): + parallelize( + hf_model, + parallel_dims, + policy.parallelism.precision, + train.fsdp_offload, + ) + + # ── e. Materialize meta tensors ── + # FSDP2 fully_shard does NOT auto-materialize meta tensors. We must + # explicitly allocate empty tensors so that load_weights() can copy + # into them via local_view.data.copy_(). + # - Normal (no offload): materialize to CUDA directly. + # - CPUOffloadPolicy: FSDP2 keeps params on CPU between fwd/bwd and + # moves them to GPU during forward. Materialize to CPU here so that + # load_weights() can write into a real (non-meta) CPU tensor; FSDP + # will move them to GPU on first forward. + # Reference: vlm/train.py:188-192 (same guard; offload handled by post_to_empty_hook). + if train.fsdp_offload: + hf_model._model._apply( + lambda t: torch.empty_like(t, device="cpu") if t.device.type == "meta" else t, + recurse=True, + ) + else: + hf_model._model._apply( + lambda t: torch.empty_like(t, device="cuda") if t.device.type == "meta" else t.to("cuda"), + recurse=True, + ) + + # ── f. Tie embeddings (replaces the legacy post_to_empty_hook) ── + hf_model.tie_embeddings() + + # ── g. Load pretrain weights ── + if load_pretrain_weights: + hf_model.load_weights( + checkpoint_path=local_path, + credential_path=None, # local path after download + parallel_dims=parallel_dims if torch.distributed.is_initialized() else None, + ) + + # ── g.2. Optional LLM overlay (pretrain_weights_path_llm) ── + # Overlay the language tower with a separate LLM checkpoint. + # Visual + projector params are preserved from the VLM load above + # (skipped via extra_skip_patterns). The existing name converter + # in load_vlm_model tail-matches raw LLM keys into + # model.language_model.*, so no temp-dir remap is needed. + # Mirrors legacy vlm/train.py:221-233 semantics. 
+ llm_path = getattr(policy, "pretrain_weights_path_llm", "") + if llm_path: + overlay_skip_patterns, is_lm_key = _get_overlay_config(hf_model.hf_config.model_type) + llm_local_path = maybe_download_hf_model_from_s3( + llm_path, + checkpoint.load_from_object_store.credentials, + checkpoint.load_from_object_store.bucket, + include_model_weights=True, + require_s3_exists=True, + ) + keys_loaded = hf_model.load_weights( + checkpoint_path=llm_local_path, + credential_path=None, + parallel_dims=parallel_dims if torch.distributed.is_initialized() else None, + extra_skip_patterns=overlay_skip_patterns, + ) + lm_loaded = {k for k in keys_loaded if is_lm_key(k)} + if not lm_loaded: + raise RuntimeError( + f"VLMModel overlay: loaded 0 language-model parameters from " + f"{llm_path!r} (local path: {llm_local_path!r}). The LLM " + "checkpoint did not match any language_model.* key in the " + "VLM; check model-family / layer-count compatibility." + ) + log.info(f"VLMModel: overlaid {len(lm_loaded)} language-model params from {llm_path}") + + # ── h. Gradient checkpointing ── + if policy.parallelism.use_activation_checkpointing: + hf_model.apply_gradient_checkpointing() + + self.model = hf_model + self.parallel_dims = parallel_dims + self.model_name_or_path = local_path + self.hf_config = hf_model.hf_config + + def on_train_start(self, memory_format) -> None: + """Called by trainer after model.to("cuda"). No device move needed here.""" + + def on_after_backward(self, iteration: int = 0) -> None: + """No-op — FSDP handles gradient synchronization internally.""" + + def init_optimizer_scheduler(self, optimizer_config, scheduler_config): + """Apply freeze config then build optimizer + scheduler. + + Freeze order: legacy flags (module-probing, dispatched on hf_config.model_type) + → regex override (trainable_params or frozen_params). All mechanisms are + optional — if none set, all params train.
+ + Dispatch uses ``self.hf_config.model_type`` (a stable HF string like + ``"qwen3_vl"``, ``"qwen2_5_vl"``, ``"internvl_chat"``) rather than + ``self.model_name_or_path`` because ``_init_vlm`` rewrites the latter with a + resolved local filesystem path that may not contain a recognizable + architecture substring. + + Names emitted by ``self.model._model.named_parameters()`` do NOT carry a + ``_model.`` prefix; they start directly with the submodule name + (e.g. ``model.visual.patch_embed.weight`` on Qwen3-VL). + """ + cfg = optimizer_config.config + n_trainable = _apply_freeze_config(self.model._model, self.hf_config.model_type, cfg) + log.info( + f"freeze config applied (model_type={self.hf_config.model_type}): {n_trainable} trainable parameter tensors" + ) + + # Single-part optimizer. The freeze config already selected the right subset of + # parameters (via legacy flags, trainable_params, or frozen_params); per-component + # LR multipliers (lr_multiplier={"vision_encoder": 0.1, "mm_projector": 1.0, ...}) + # are NOT restored by this change — that requires a separate multi-part optimizer + # rework on the VFM-unified path. + optimizer_config.model_parts = [self.model] + optimizer_config.model_part_names = ["llm"] + optimizer = instantiate(optimizer_config) + + # Build scheduler. 
+ scheduler_config.optimizers = optimizer + scheduler = instantiate(scheduler_config) + + return optimizer, scheduler + + def training_step(self, data: dict, iteration: int) -> tuple[dict, torch.Tensor]: + """position_ids → forward → CE loss.""" + position_ids = get_position_ids( + self.hf_config, + input_ids=data["input_ids"], + image_grid_thw=data.get("image_grid_thw"), + video_grid_thw=data.get("video_grid_thw"), + attention_mask=data.get("attention_mask"), + ) + if position_ids is not None: + data["position_ids"] = position_ids + + labels = data.pop("labels") + data.pop("attention_mask", None) + logits = self.model(**data) + loss = self._loss_fn(logits, labels) + + # loss_avg: DP-averaged loss for logging (matches cosmos-rl ReduceOp.AVG). + # Does not affect the backward scalar. Pick the same 1-D sub-mesh the + # legacy single-mesh ``ParallelDims.dp_mesh`` returned — dp_shard if + # sharding is on, else dp_replicate — so the reduction group is + # byte-identical to pre-merge behavior. + loss_avg = loss.detach().clone() + pd = getattr(self, "parallel_dims", None) + dp_mesh = pd.dp_mesh if pd is not None else None + if torch.distributed.is_initialized() and dp_mesh is not None: + sub_dim = "dp_shard" if pd.dp_shard_enabled else "dp_replicate" + torch.distributed.all_reduce( + loss_avg, op=torch.distributed.ReduceOp.AVG, group=dp_mesh[sub_dim].get_group() + ) + if not torch.distributed.is_initialized() or torch.distributed.get_rank() == 0: + log.info(f"train/loss_avg: {loss_avg.item():.5f} (iteration {iteration})") + + return {"loss": loss, "loss_avg": loss_avg, "labels": labels}, loss + + @torch.no_grad() + def validation_step(self, data: dict, iteration: int) -> tuple[dict, torch.Tensor]: + """Required: VLM experiments enable validation by default (pre_exp01x.py:607). 
+ ImaginaireTrainer.validate() calls this — must not raise NotImplementedError.""" + position_ids = get_position_ids( + self.hf_config, + input_ids=data["input_ids"], + image_grid_thw=data.get("image_grid_thw"), + video_grid_thw=data.get("video_grid_thw"), + attention_mask=data.get("attention_mask"), + ) + if position_ids is not None: + data["position_ids"] = position_ids + + labels = data.pop("labels") + data.pop("attention_mask", None) + logits = self.model(**data) + loss = self._loss_fn(logits, labels) + return {"loss": loss, "labels": labels}, loss diff --git a/cosmos-inference/cosmos3/_src/vfm/processors/__init__.py b/cosmos-inference/cosmos3/_src/vfm/processors/__init__.py new file mode 100644 index 00000000..7b8a1b97 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/processors/__init__.py @@ -0,0 +1,156 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import json +import os +import sys +from types import SimpleNamespace +from typing import Optional + +from transformers import PreTrainedTokenizerFast + +from cosmos3._src.vfm.processors.base import BaseVLMProcessor +from cosmos3._src.vfm.processors.nemotron3densevl_processor import Nemotron3DenseVLProcessor +from cosmos3._src.vfm.processors.nemotronvl_processor import NemotronVLProcessor +from cosmos3._src.vfm.processors.qwen3vl_processor import Qwen3VLProcessor +from cosmos3._src.vfm.tokenizers.tokenization_qwen2 import Qwen2Tokenizer +from cosmos3._src.vfm.utils.vlm.pretrained_models_downloader import maybe_download_hf_model_from_s3 + +_VARIANT_TO_CREDENTIALS = { + "s3": ("credentials/s3_training.secret", "bucket"), + "gcp": ("credentials/gcp_checkpoint.secret", "bucket"), + # "hf" => no S3 backing store: pass empty credentials/bucket so the downloader + # falls back to a direct HuggingFace Hub download (matches the legacy + # ``download_tokenizer_files(model_name, "hf")`` behavior on origin/main, which + # simply returned the model name and let from_pretrained pull from HF). + "hf": ("", ""), +} + +# S3 prefix under which HuggingFace model files are stored in the checkpoint buckets. +_LLM_S3_PREFIX = "cosmos3/pretrained/huggingface" + + +class LLMTokenizerProcessor(BaseVLMProcessor): + """Wrapper that adapts a bare LLM tokenizer to the ``BaseVLMProcessor`` API. + + Used by LLM-only (no-vision) tokenizer configs so that all augmentors and + model code can treat LLM-only and full VLM configs uniformly through the + same ``proc.tokenizer`` / ``proc.tokenize_text`` surface. The base class + handles ``tokenize_text`` / ``encode`` / ``decode``; we only need to wire + up ``self.processor`` so ``.tokenizer`` resolves. 
+ """ + + def __init__(self, tokenizer): + self.processor = SimpleNamespace(tokenizer=tokenizer) + + +def _patch_nemotron_llm_tokenizer_vision_tokens(destination_dir: str) -> None: + """Remap reserved placeholder tokens to vision special tokens in-place. + + The Nemotron LLM tokenizer reserves ```` / ```` + at IDs 20/21 -- the same slots the VLM tokenizer uses for + ``<|vision_start|>`` / ``<|vision_end|>``. Renaming them here keeps + every vision-token ID inside the original vocab_size (131072) so no + embedding-layer resize is needed during FSDP training. The function is + idempotent: re-applying it after the tokens are already renamed is a no-op. + """ + remap = {"": "<|vision_start|>", "": "<|vision_end|>"} + + tokenizer_json_path = os.path.join(destination_dir, "tokenizer.json") + if os.path.exists(tokenizer_json_path): + with open(tokenizer_json_path) as f: + data = json.load(f) + for entry in data.get("added_tokens", []): + if entry["content"] in remap: + entry["content"] = remap[entry["content"]] + vocab = data.get("model", {}).get("vocab", {}) + for old_name, new_name in remap.items(): + if old_name in vocab: + vocab[new_name] = vocab.pop(old_name) + with open(tokenizer_json_path, "w") as f: + json.dump(data, f) + + tokenizer_config_path = os.path.join(destination_dir, "tokenizer_config.json") + if os.path.exists(tokenizer_config_path): + with open(tokenizer_config_path) as f: + tc_data = json.load(f) + for entry in tc_data.get("added_tokens_decoder", {}).values(): + if entry.get("content") in remap: + entry["content"] = remap[entry["content"]] + with open(tokenizer_config_path, "w") as f: + json.dump(tc_data, f) + + +def _download_llm_tokenizer( + tokenizer_type: str, + credentials: str, + bucket: str, + cache_dir: Optional[str] = None, +) -> str: + return maybe_download_hf_model_from_s3( + tokenizer_type, + credentials=credentials, + bucket=bucket, + include_model_weights=False, + cache_dir=cache_dir, + s3_prefix=_LLM_S3_PREFIX, + ) + + +def 
build_processor( + tokenizer_type: str, + config_variant: Optional[str] = None, + credentials: Optional[str] = None, + bucket: Optional[str] = None, + cache_dir: Optional[str] = None, +): + if credentials is None or bucket is None: + if config_variant is None: + config_variant = "s3" + if config_variant not in _VARIANT_TO_CREDENTIALS: + raise ValueError(f"config_variant must be one of {list(_VARIANT_TO_CREDENTIALS)}, got {config_variant!r}") + variant_credentials, variant_bucket = _VARIANT_TO_CREDENTIALS[config_variant] + credentials = credentials if credentials is not None else variant_credentials + bucket = bucket if bucket is not None else variant_bucket + elif config_variant is not None: + raise ValueError("Provide either config_variant or (credentials, bucket), not both") + if "Qwen/Qwen3-VL" in tokenizer_type or "Siglip2-Qwen3-1.7B" in tokenizer_type: + return Qwen3VLProcessor(tokenizer_type, credentials=credentials, bucket=bucket, cache_dir=cache_dir) + elif "nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16" in tokenizer_type: + return NemotronVLProcessor(tokenizer_type, credentials=credentials, bucket=bucket, cache_dir=cache_dir) + elif "NVIDIA-Nemotron-3-Dense-VL" in tokenizer_type or "Qwen3-2B-ViT" in tokenizer_type: + return Nemotron3DenseVLProcessor(tokenizer_type, credentials=credentials, bucket=bucket, cache_dir=cache_dir) + elif "Qwen/Qwen3-0.6B" in tokenizer_type: + local_path = _download_llm_tokenizer(tokenizer_type, credentials, bucket, cache_dir) + return LLMTokenizerProcessor(Qwen2Tokenizer.from_pretrained(local_path)) + elif "Nemotron/NVIDIA-Nemotron-3-2B-BF16" in tokenizer_type: + local_path = _download_llm_tokenizer(tokenizer_type, credentials, bucket, cache_dir) + _patch_nemotron_llm_tokenizer_vision_tokens(local_path) + return LLMTokenizerProcessor(PreTrainedTokenizerFast.from_pretrained(local_path, trust_remote_code=True)) + else: + raise ValueError(f"Tokenizer type {tokenizer_type} not supported") + + +def build_processor_lazy(*args, **kwargs): 
+ """LazyCall wrapper that resolves ``build_processor`` on this module at call time. + + LazyCall captures its target at config-construction time, so a direct + ``L(build_processor)`` would freeze the original function reference and + bypass any later ``monkeypatch.setattr`` on this module's + ``build_processor`` attribute. This wrapper performs a fresh module-level + lookup on every call, so test fixtures patching ``build_processor`` are + honored when the config is instantiated. + """ + return sys.modules[__name__].build_processor(*args, **kwargs) diff --git a/cosmos-inference/cosmos3/_src/vfm/processors/base.py b/cosmos-inference/cosmos3/_src/vfm/processors/base.py new file mode 100644 index 00000000..44f20168 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/processors/base.py @@ -0,0 +1,204 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Base class and shared helpers for VFM/VLM processor wrappers. 
+ +Each concrete processor wraps a HuggingFace ``AutoProcessor`` for a specific +model family and exposes a small surface used by dataloaders and the training +model: + +* ``apply_chat_template`` -- model-specific message templating (per subclass) +* ``add_assistant_tokens_mask`` -- model-specific loss mask construction +* ``tokenizer`` -- the underlying HF tokenizer (uniform property) +* ``tokenize_text`` / ``encode`` / ``decode`` -- simple delegations + +This module hosts the parts that were truly common across subclasses so the +concrete files only contain model-specific logic. +""" + +import os +from typing import Dict, List, Optional + +from transformers.models.auto.processing_auto import AutoProcessor + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.models.vlm.qwen3_vl.utils import tokenize_caption +from cosmos3._src.vfm.utils.vlm.pretrained_models_downloader import maybe_download_hf_model_from_s3 + + +def convert_string_content_to_list_content(messages: List[Dict]) -> List[Dict]: + """Normalize chat messages so ``content`` is always a list of typed dicts. + + Many HF processors do not accept ``"content": str``; they expect each + message's content to be a list of ``{"type": ..., ...}`` entries. This + helper rewrites bare-string contents into a single ``{"type": "text", ...}`` + entry in place and returns the same list for convenience. + """ + for i, message in enumerate(messages): + if isinstance(message["content"], str): + messages[i]["content"] = [{"type": "text", "text": message["content"]}] + return messages + + +def maybe_parse_video_content( + messages: List[Dict], +) -> tuple[int, Optional[list[float]], Optional[list[int]], Optional[list[list[int]]]]: + """Scan messages for video entries and return their decoding metadata. + + Returns ``(num_video, fps_per_video, total_frames_per_video, frame_indices_per_video)``. + Logs a critical warning when a video entry omits ``fps``. 
+ """ + num_video = 0 + video_fps: list[float] = [] + video_total_num_frames: list[int] = [] + video_frames_indices: list[list[int]] = [] + for message in messages: + if isinstance(message["content"], list): + for sub_content in message["content"]: + if sub_content.get("type", "") == "video" and isinstance(sub_content["video"], list): + num_video += 1 + fps = sub_content.get("fps", None) + if fps is None: + log.critical( + f"fps is None for video {sub_content}. Better to set the fps explicitly", rank0_only=False + ) + video_fps.append(fps) + video_total_num_frames.append(len(sub_content["video"])) + video_frames_indices.append(list(range(video_total_num_frames[-1]))) + return num_video, video_fps, video_total_num_frames, video_frames_indices + + +def maybe_get_max_pixels_from_images_kwargs(messages: List[Dict]) -> tuple[Optional[int], Optional[int]]: + """Return ``(max_pixels, min_pixels)`` from the first image entry that sets ``max_pixels``.""" + for message in messages: + if isinstance(message["content"], list): + for sub_content in message["content"]: + if sub_content.get("type", "") == "image" and sub_content.get("max_pixels", None) is not None: + return sub_content["max_pixels"], sub_content.get("min_pixels", None) + return None, None + + +class BaseVLMProcessor: + """Shared skeleton for VFM/VLM processor wrappers. + + Subclasses inherit the S3-or-local model resolution, the + ``AutoProcessor`` load, and the extraction of common token IDs. They + are responsible only for: + + * the chat templating logic (``apply_chat_template``); + * the loss-mask construction (``add_assistant_tokens_mask``); + * any model-specific dataloader helper fields (e.g. ``patch_size``, + ``merge_size``, ``use_smart_resize``). + + A subclass that needs a different pad-id resolution (e.g. NemotronVL's + ```` convention) overrides :py:meth:`_resolve_pad_id`. 
A + subclass that needs a different vision-end marker sets the + ``VISION_END_TOKEN`` class attribute; the default ``None`` skips that + lookup entirely (used for tokenizers that lack a single-token marker). + """ + + # Override on subclasses to the model's vision-end token (e.g. ``""``). + # Leave as None when the tokenizer does not expose a single-token marker — + # ``vision_end_id`` will then be set to None and downstream consumers + # (e.g. ``debug_data_qwen.py``) will skip the check. + VISION_END_TOKEN: Optional[str] = None + + def __init__( + self, + name: str, + credentials: str = "./credentials/s3_training.secret", + bucket: str = "bucket", + cache_dir: Optional[str] = None, + ) -> None: + self.name = name + if os.path.isdir(name): + model_name_or_path_local = name + else: + model_name_or_path_local = maybe_download_hf_model_from_s3( + name, credentials, bucket, include_model_weights=False, cache_dir=cache_dir + ) + + self.processor = AutoProcessor.from_pretrained(model_name_or_path_local, trust_remote_code=True) + log.info("Successfully loaded processor from local cache") + + self.image_token_id = ( + self.processor.tokenizer.convert_tokens_to_ids(self.processor.image_token) + if hasattr(self.processor, "image_token") + else None + ) + self.video_token_id = ( + self.processor.tokenizer.convert_tokens_to_ids(self.processor.video_token) + if hasattr(self.processor, "video_token") + else None + ) + self.eos_id = self.processor.tokenizer.eos_token_id + self.pad_id = self._resolve_pad_id() + self.vision_end_id = ( + self.processor.tokenizer.convert_tokens_to_ids(self.VISION_END_TOKEN) + if self.VISION_END_TOKEN is not None + else None + ) + + # ------------------------------------------------------------------ + # Hooks for subclasses + # ------------------------------------------------------------------ + + def _resolve_pad_id(self): + """Return the pad token id. Default: ``pad_token_id`` falling back to ``eos_id``. 
+ + Override on subclasses whose model uses a non-standard pad token (e.g. + NemotronVL uses ````). + """ + pad = self.processor.tokenizer.pad_token_id + return pad if pad is not None else self.eos_id + + # ------------------------------------------------------------------ + # Shared interfaces + # ------------------------------------------------------------------ + + @property + def tokenizer(self): + """Expose the underlying HF tokenizer uniformly. + + Lets model and test code call ``proc.tokenizer`` regardless of which + concrete processor wrapper they received. + """ + return self.processor.tokenizer + + def tokenize_text( + self, + caption: str, + is_video: bool = False, + use_system_prompt: bool = False, + system_prompt: Optional[str] = None, + ) -> list[int]: + """Tokenize a text caption via the shared ``tokenize_caption`` helper. + + Keeps VFM diffusion augmentors and VLM dataloaders on the same code + path so a single processor instance serves both. + """ + return tokenize_caption( + caption, + self.processor.tokenizer, + is_video=is_video, + use_system_prompt=use_system_prompt, + system_prompt=system_prompt, + ) + + def encode(self, *args, **kwargs): + return self.processor.tokenizer.encode(*args, **kwargs) + + def decode(self, *args, **kwargs): + return self.processor.tokenizer.decode(*args, **kwargs) diff --git a/cosmos-inference/cosmos3/_src/vfm/processors/nemotron3densevl_processor.py b/cosmos-inference/cosmos3/_src/vfm/processors/nemotron3densevl_processor.py new file mode 100644 index 00000000..4c74f7d1 --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/processors/nemotron3densevl_processor.py @@ -0,0 +1,193 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Optional + +import numpy as np +import torch +from PIL import Image +from qwen_vl_utils.vision_process import smart_resize + +from cosmos3._src.vfm.processors.base import ( + BaseVLMProcessor, + convert_string_content_to_list_content, + maybe_parse_video_content, +) + + +class Nemotron3DenseVLProcessor( + BaseVLMProcessor +): + """Wrapper around the HuggingFace ``AutoProcessor`` for Nemotron3-Dense-VL / Qwen3-2B-ViT.""" + + # Nemotron3-Dense / Qwen3-2B-ViT does not expose a single vision-end token; + # leave ``vision_end_id`` unset (None) rather than the legacy ```` + # which silently resolved to the UNK token id. + VISION_END_TOKEN: Optional[str] = None + + def __init__( + self, + name: str = "Qwen/Qwen3-VL-2B-Init", + credentials: str = "./credentials/s3_training.secret", + bucket: str = "bucket", + cache_dir: Optional[str] = None, + ): + super().__init__(name=name, credentials=credentials, bucket=bucket, cache_dir=cache_dir) + # Helper attributes consumed by the dataloader video decoding path. 
+ shortest_edge = self.processor.image_processor.size["shortest_edge"] + self.min_height_width = int(np.sqrt(shortest_edge)) + self.patch_size = self.processor.video_processor.patch_size + self.temporal_patch_size = self.processor.video_processor.temporal_patch_size + self.merge_size = self.processor.video_processor.merge_size + self.use_smart_resize = True + + def apply_chat_template( + self, + messages, + add_generation_prompt=False, + return_tensors="pt", + tokenize=True, + **kwargs, + ): + """ + Returns: + inputs: dict + input_ids: torch.Tensor, shape: (N_token) + attention_mask: torch.Tensor, shape: (N_token) + texts: str, the raw text + image_sizes: torch.Tensor, shape (N_img, 2) + pixel_values: torch.Tensor, shape (N_img_patch, 3, 224, 224) + """ + assert tokenize, "tokenize must be True" + assert return_tensors == "pt", "return_tensors must be pt" + # Note: this tokenizer does not support "content": str; it always expects the "content" entry to be a list of dicts. + messages = convert_string_content_to_list_content(messages) + kwargs = {} + # Pre-resize images per-message using smart_resize so the resulting + # token count matches the configured min/max-pixel budget.
+ for message in messages: + if isinstance(message["content"], list): + for sub_content in message["content"]: + if sub_content.get("type", "") == "image": + image = sub_content["image"] + max_pixels = sub_content.get("max_pixels", self.processor.image_processor.size["longest_edge"]) + min_pixels = sub_content.get("min_pixels", self.processor.image_processor.size["shortest_edge"]) + assert isinstance(image, Image.Image), ( + "image must be a PIL.Image.Image; a list of images in one content entry is not supported" + ) + width, height = image.size + resized_height, resized_width = smart_resize( + height, + width, + factor=32, + min_pixels=min_pixels, + max_pixels=max_pixels, + ) + image = image.resize((resized_width, resized_height)) + sub_content["image"] = image + + num_video, video_fps, video_total_num_frames, video_frames_indices = maybe_parse_video_content(messages) + if num_video > 0: + assert num_video == 1, "only one video is supported for now" + fps = video_fps[0] + total_num_frames = video_total_num_frames[0] + frames_indices = video_frames_indices[0] + kwargs.update( + { + "do_sample_frames": False, + "video_metadata": dict(fps=fps, total_num_frames=total_num_frames, frames_indices=frames_indices), + } + ) + + inputs = self.processor.apply_chat_template( + messages, + tokenize=tokenize, + add_generation_prompt=add_generation_prompt, + return_dict=True, + return_tensors=return_tensors, + **kwargs, + ) + + # Convert batch features into single features + inputs["input_ids"] = inputs["input_ids"][0] # [N_token] + inputs["attention_mask"] = inputs["attention_mask"][0] # [N_token] + return inputs + + def add_assistant_tokens_mask(self, tokens): + """ + Add a mask to the assistant tokens. + This is used to mask out tokens that are not generated by the assistant (e.g., system prompts, user prompts, chat templates), so that only the tokens generated by the assistant contribute to the loss computation.
+ If there are multiple turns in the conversation, the mask will mask all the assistant tokens in each turn. + + Args: + tokens (Union[List[int], torch.Tensor]): The tokens to add the mask to. + Returns: + Union[List[bool], torch.Tensor]: The mask. True for tokens generated by the assistant (i.e. should apply loss on), False for tokens not generated by the assistant. + """ + if isinstance(tokens, torch.Tensor) and tokens.ndim == 2: + mask = torch.stack( + [self.add_assistant_tokens_mask(tokens[i]) for i in range(tokens.shape[0])] + ) # [B,N_token] + assert mask.shape == tokens.shape + return mask + np_tokens = tokens.cpu().numpy() if isinstance(tokens, torch.Tensor) else np.array(tokens) + assert np_tokens.ndim == 1 + + # Constants defining bos, eos and fixed offsets. + BOS_TOKEN = "<|im_start|>" + EOS_TOKEN = "<|im_end|>" + ROLE = "assistant" + # Offsets: skip the bos + "assistant\n" (always 3 tokens) and include the eos (+1) for supervision + START_OFFSET = 3 + END_OFFSET = 1 + + # Retrieve token IDs for the markers and the role. + bos_token_id = self.processor.tokenizer.convert_tokens_to_ids(BOS_TOKEN) + eos_token_id = self.processor.tokenizer.convert_tokens_to_ids(EOS_TOKEN) + role_id = self.processor.tokenizer.convert_tokens_to_ids(ROLE) + # ``role`` may tokenize into multiple sub-tokens (e.g. Qwen3-2B-ViT + # splits "assistant"); the multi-token branch below handles that case. + role_ids = self.processor.tokenizer.encode(ROLE, add_special_tokens=False) + think_start_id = self.processor.tokenizer.convert_tokens_to_ids("") + think_end_id = self.processor.tokenizer.convert_tokens_to_ids("") + + # Locate all positions where the start and end markers appear. + start_indices = np.where(np_tokens == bos_token_id)[0] + end_indices = np.where(np_tokens == eos_token_id)[0] + + # Initialize the mask with False values. 
+ masks = np.zeros_like(np_tokens, dtype=bool) + assert len(start_indices) == len(end_indices) + # For each pair of bos/eos, check if the role is 'assistant' + # and apply the mask accordingly. + for start, end in zip(start_indices, end_indices): + end_pos = None + if np_tokens[start + 1] == role_id: + # Mask tokens from after the assistant header (start+3) to include the end marker (end+1) + masks[start + START_OFFSET : end + END_OFFSET] = True + end_pos = start + START_OFFSET + elif all(np_tokens[start + 1 : start + 1 + len(role_ids)] == role_ids): + masks[start + START_OFFSET + len(role_ids) - 1 : end + END_OFFSET] = True + end_pos = start + START_OFFSET + len(role_ids) - 1 + if end_pos is not None and np_tokens[end_pos] == think_start_id: + masks[end_pos] = False + if np_tokens[end_pos + 1] == think_end_id: + masks[end_pos + 1] = False + + assert masks.shape == np_tokens.shape + if isinstance(tokens, torch.Tensor): + return torch.from_numpy(masks) + else: + return masks.tolist() diff --git a/cosmos-inference/cosmos3/_src/vfm/processors/nemotronvl_processor.py b/cosmos-inference/cosmos3/_src/vfm/processors/nemotronvl_processor.py new file mode 100644 index 00000000..f1e4d6ee --- /dev/null +++ b/cosmos-inference/cosmos3/_src/vfm/processors/nemotronvl_processor.py @@ -0,0 +1,484 @@ +# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Dict, List, Optional + +import numpy as np +import torch +from PIL import Image +from transformers.processing_utils import VideosKwargs +from transformers.video_utils import VideoMetadata + +from cosmos3._src.imaginaire.utils import log +from cosmos3._src.vfm.processors.base import BaseVLMProcessor, convert_string_content_to_list_content + +nemotron_chat_template = """ +{%- set ns = namespace(enable_thinking=false, has_sys_prompt=false, non_tool_system_content='', has_video=false, explicit_think_requested=false) -%} +{%- set msg = namespace(content='') -%} +{%- for message in messages -%} + {%- if message['role'] == 'system' -%} + {%- set ns.has_sys_prompt = true -%} + {# Extract system content without tool flags #} + {%- if message['content'] is string -%} + {%- set ns.non_tool_system_content = message['content'].replace('', '<_end_think>').replace('/think', '').replace('/no_think', '').replace('<_end_think>', '').strip() -%} + {%- else -%} + {%- set ns.non_tool_system_content = '' -%} + {%- for content in message['content'] -%} + {%- if content['type'] == 'text' -%} + {%- set ns.non_tool_system_content = ns.non_tool_system_content + content['text'].replace('', '<_end_think>').replace('/think', '').replace('/no_think', '').replace('<_end_think>', '') -%} + {%- endif -%} + {%- endfor -%} + {%- set ns.non_tool_system_content = ns.non_tool_system_content.strip() -%} + {%- endif -%} + {%- endif -%} + {# Check for video content in all messages #} + {%- if message['content'] is not string -%} + {%- for content in message['content'] -%} + {%- if content['type'] == 'video' or content['type'] == 'video_url' -%} + {%- set ns.has_video = true -%} + {%- endif -%} + {%- endfor -%} + {%- endif -%} + {%- if message['content'] is string -%} + {%- if message['role'] == 'user' or message['role'] == 'system' -%} + {%- if '/think' in message['content'].replace('', '') -%} + {%- set ns.enable_thinking = true -%} + {%- set ns.explicit_think_requested = true -%} + 
{%- elif '/no_think' in message['content'] -%} + {%- set ns.enable_thinking = false -%} + {%- endif -%} + {%- endif -%} + {%- else -%} + {%- for content in message['content'] -%} + {%- if content['type'] == 'text' -%} + {%- if message['role'] == 'user' or message['role'] == 'system' -%} + {%- if '/think' in content['text'].replace('</think>', '') -%} + {%- set ns.enable_thinking = true -%} + {%- set ns.explicit_think_requested = true -%} + {%- elif '/no_think' in content['text'] -%} + {%- set ns.enable_thinking = false -%} + {%- endif -%} + {%- endif -%} + {%- endif -%} + {%- endfor -%} + {%- endif -%} +{%- endfor -%} + +{{- bos_token -}} +{%- if messages[0]['role'] != 'system' -%} + {{- 'System\n' -}} +{%- else -%} + {{- 'System\n' + ns.non_tool_system_content }} +{%- endif -%} + +{%- if tools -%} + {%- if ns.non_tool_system_content != '' -%} + {{- '\n\n' -}} + {%- endif -%} + {{- 'You can use the following tools to assist the user if required:\n' -}} + {{- '[' -}} + {%- for tool in tools -%} + {{- (tool.function if tool.function is defined else tool) | tojson -}} + {{- ', ' if not loop.last else '' -}} + {%- endfor -%} + {{- ']\n\n' -}} + + {{- 'If you decide to call any tool(s), use the following format:\n' -}} + {{- '[{"name": "tool_name1", "arguments": "tool_args1"}, ' -}} + {{- '{"name": "tool_name2", "arguments": "tool_args2"}]\n\n' -}} + + {{- 'The user will execute tool-calls and return responses from tool(s) in this format:\n' -}} + {{- '[{"response": "tool_response1"}, ' -}} + {{- '{"response": "tool_response2"}]\n\n' -}} + + {{- 'Based on the tool responses, you can call additional tools if needed, ' -}} + {{- 'correct tool calls if any errors are found, or just respond to the user.'
-}} +{%- endif -%} +{{- '\n' -}} + +{%- set messages = messages[1:] if messages[0]['role'] == 'system' else messages -%} + +{# Prevent no user or assistant message #} +{%- if messages|length == 0 -%} + {%- set messages = [{'role': 'user', 'content': ''}] -%} +{%- endif -%} + +{%- for message in messages %} + {%- if message['content'] is string -%} + {%- set msg.content = message['content'].replace('</think>', '<_end_think>').replace('/think', '').replace('/no_think', '').replace('<_end_think>', '</think>').strip() -%} + {%- else -%} + {%- set msg.content = '' -%} + {%- set mm_content = '' -%} + {%- set counters = namespace(images=0, videos=0) -%} + + {%- for content in message['content'] -%} + {%- if content['type'] == 'image' -%} + {%- set counters.images = counters.images + 1 -%} + {%- elif content['type'] == 'video' -%} + {%- set counters.videos = counters.videos + 1 -%} + {%- elif content['type'] == 'text' -%} + {%- set msg.content = msg.content + content['text'] -%} + {%- endif -%} + {%- endfor -%} + {%- if '' in msg.content -%} + {%- set counters.images = 0 -%} + {%- endif -%} + {%- if '