Merged
61 commits
3f9459f
minor fixes to better support baapb server
keighrim Dec 14, 2025
873122e
adding select timepoints from targets
kelleyl Dec 19, 2025
0c40bac
Importing the summarizer, fixing its imports and hooking it up with a…
marcverhagen Dec 23, 2025
4c7c6a6
Added notes on how to add a CLI script and added the summarizer to th…
marcverhagen Jan 8, 2026
7cc973d
Removed some deprecated methods because they broke the coverage tests…
marcverhagen Jan 8, 2026
8b310eb
Type checker from the coverage test does not like string-valued type …
marcverhagen Jan 8, 2026
cca6714
More cleanup and fixes for code coverage tests
marcverhagen Jan 8, 2026
d2498f8
More changes to satisfy typing requirements from the code coverage tests
marcverhagen Jan 8, 2026
51983c5
Making code pass pytype in Python 3.10, 3.11 and 3.12 ; some cleanup.
marcverhagen Jan 12, 2026
53cc9df
Merge pull request #350 from clamsproject/346-summarizer
marcverhagen Jan 12, 2026
b3ad973
Replaced DocumentTypes type hint with DocumentTypesBase
marcverhagen Jan 14, 2026
73da13e
cleanup imports
keighrim Jan 29, 2026
9bdbbab
Some documentation thoughts and updates.
marcverhagen Jan 9, 2026
d803abc
Added CLI module to documentation
marcverhagen Jan 14, 2026
86578c7
Various documentation updates, including for the summarizer
marcverhagen Jan 22, 2026
d39f9f0
reverting incorrect heading level control in documentation
keighrim Jan 30, 2026
3548308
minor changes in sphinx conf.py..
keighrim Jan 30, 2026
02e1a9b
reverting "long" help msg for `mmif describe` command ...
keighrim Jan 30, 2026
7895721
fix minor typos
keighrim Jan 30, 2026
6b11dcd
update cli.rst, removing not-so-helpful, not-so-readable help msg sni…
keighrim Jan 30, 2026
9aac163
merged developer documents into CONTRIBUTING file
keighrim Jan 30, 2026
99c8338
added apidoc to automate package/module discovery for sphinx docs gen
keighrim Jan 30, 2026
72bfc32
removed temporary note file
keighrim Jan 30, 2026
0a26924
added caching local resolved path from http:// URI for speed
keighrim Feb 1, 2026
64ea411
Merge pull request #357 from clamsproject/347-slow-remote-file-access
keighrim Feb 1, 2026
aa8b7d8
minor documentation fix
keighrim Feb 2, 2026
39bbf42
removed "source" media counts from workflow ID prefix ...
keighrim Feb 2, 2026
660d778
Merge pull request #358 from clamsproject/326-fix-wfid
keighrim Feb 2, 2026
854739d
documented git workflow for developers
keighrim Feb 3, 2026
b5a32f9
some clarification regarding CLI subcmd auto-discovery
keighrim Feb 9, 2026
39ef2be
local test build for documentation no longer requires VERSION file
keighrim Feb 9, 2026
583174d
fixed ambiguous fn references in docstring
keighrim Feb 9, 2026
5eba9f6
removing advertiseing `make docs` in dev doc
keighrim Feb 10, 2026
0f1c49a
Merge pull request #356 from clamsproject/348-docs-krim
keighrim Feb 10, 2026
054a2ad
Merge branch 'develop' into 348-documentation
keighrim Feb 10, 2026
03a9112
Merge pull request #353 from clamsproject/351-fix-type-hint
keighrim Feb 10, 2026
5d36df6
fixing issues in pytest config and `make test` cmd
keighrim Feb 11, 2026
503abe2
Merge pull request #355 from clamsproject/348-documentation
keighrim Feb 11, 2026
5342210
updated `summarize` flag for consistency, remove redundant docstrings…
keighrim Feb 11, 2026
c166f70
added tests for summarize command
keighrim Feb 11, 2026
4df2d3f
replaced deprecated `argparser.FileType` with native implementation
keighrim Feb 11, 2026
d85526f
`summarize`'s log msgs are now using standard logging lib
keighrim Feb 11, 2026
ad5d0d3
fixed type hints in native CLI-IO hanlder
keighrim Feb 11, 2026
3d214d9
adding some minor fixes to main page of docs
keighrim Feb 11, 2026
a9893de
reflecting review comments
keighrim Feb 12, 2026
c5c3d29
re-wrote CLI IO handler, simplifying input types
keighrim Feb 12, 2026
a6db461
Merge pull request #360 from clamsproject/359-summarize-cli-flags
keighrim Feb 13, 2026
398eb21
Updated `mmif describe` implementation to be based on pydantic for be…
keighrim Feb 14, 2026
8266a2e
updated test cases for utils and clis
keighrim Feb 15, 2026
9ee0bd5
added human-friendly summary for pydantic classes in `describe --help`
keighrim Feb 16, 2026
978e389
Merge pull request #367 from clamsproject/366-pydantic-describe
keighrim Feb 23, 2026
96a8b2e
updated dev document
keighrim Feb 25, 2026
bb84725
sort frame numbers
kelleyl Feb 26, 2026
3ae9eca
adding `Z` suffix in timestamps, always defaulting to UTC (docker def)
keighrim Feb 26, 2026
f4d25d8
Merge pull request #372 from clamsproject/368-tz-in-timestamps
keighrim Feb 27, 2026
1a3e43e
Merge branch '345-target_frame_selection' into video_document_helper-…
kelleyl Mar 5, 2026
a6b1982
adding deduplication to frame extraction
kelleyl Mar 5, 2026
27054d9
Merge pull request #361 from clamsproject/348-more-docs-fix
keighrim Mar 10, 2026
09c7e32
added three basic modes for sampling from TF
keighrim Mar 10, 2026
d2dcf9e
Merge pull request #373 from clamsproject/video_document_helper-bug-fix
keighrim Mar 10, 2026
815b227
updated build GHA to use documentation hub
keighrim Mar 10, 2026
27 changes: 21 additions & 6 deletions .github/workflows/publish.yml
@@ -1,12 +1,27 @@
name: "📦 Publish (docs, PyPI)"
name: "📦 Publish (PyPI + docs)"

on:
push:
tags:
on:
push:
tags:
- '[0-9]+.[0-9]+.[0-9]+'

jobs:
package-and-upload:
name: "🤙 Call SDK publish workflow"
publish-pypi:
name: "📦 Build and upload to PyPI"
uses: clamsproject/.github/.github/workflows/sdk-publish.yml@main
secrets: inherit

publish-docs:
name: "📖 Build and publish docs"
needs: publish-pypi
uses: clamsproject/clamsproject.github.io/.github/workflows/sdk-docs.yml@main
with:
source_repo: clamsproject/mmif-python
source_ref: ${{ github.ref_name }}
project_name: mmif-python
version: ${{ github.ref_name }}
build_command: 'python3 build-tools/docs.py --build-ver ${{ github.ref_name }} --output-dir docs'
docs_output_dir: 'docs/${{ github.ref_name }}'
python_version: '3.11'
update_latest: true
secrets: inherit
7 changes: 6 additions & 1 deletion .gitignore
@@ -79,5 +79,10 @@ mmif/vocabulary

# Documentation build artifacts
documentation/cli_help.rst
documentation/whatsnew.rst
documentation/whatsnew.md
documentation/autodoc
docs-test

# environments
.venv*
venv*
99 changes: 96 additions & 3 deletions CONTRIBUTING.md
@@ -1,5 +1,72 @@
# Contributing to mmif-python

## Git Workflow

We follow a Gitflow-inspired branching model to maintain a stable `main` branch and a dynamic `develop` branch.

1. **Branch Roles**:
   - `main`: Reserved for stable, production-ready releases.
   - `develop`: The primary branch for ongoing development, feature integration, and bug fixes. This serves as the "staging" area for the next release.
2. **Issue Tracking**: Every contribution (bug fix or feature) must first be reported as a [GitHub Issue](https://github.com/clamsproject/mmif-python/issues). Issues should clearly define goals and, preferably, include an implementation plan.
3. **Branch Naming**: Create a dedicated working branch for each issue. Branches must be named using the format `NUM-short-description`, where `NUM` is the issue number (e.g., `113-fix-file-loading`).
4. **Pull Requests (PRs)**:
   - Once work is complete, open a PR targeting the `develop` branch.
   - **Communication**: High-level discussion and planning should occur in the issue thread. The PR conversation is strictly for code review and implementation-specific feedback.
5. **Releases**:
   - When `develop` is ready for a new release, open a PR from `develop` to `main` using the "release" PR template.
   - After merging the release candidate into `main`, manually tag the commit with the version number. This tag triggers the automated CI/CD pipeline for publishing.
6. **Branch Protection**: Both `main` and `develop` are protected branches. Direct pushes are disabled; all changes must be introduced via Pull Requests.

## CLI Scripts

The `mmif` command-line interface supports subcommands (e.g., `mmif source`, `mmif describe`). These are implemented as Python modules in `mmif/utils/cli/`.

### Adding a New CLI Script

To add a new CLI subcommand, create a Python module in `mmif/utils/cli/` with these three required functions:

1. **`prep_argparser(**kwargs)`** - Define and return an `argparse.ArgumentParser` instance for your subcommand. When called during discovery, the main CLI will pass `add_help=False` to this function to avoid duplicate help flags.

2. **`describe_argparser()`** - Return a tuple of two strings:
   - A one-line description (shown in `mmif --help`)
   - A more verbose description (shown in `mmif <subcommand> --help`)

3. **`main(args)`** - Execute the subcommand logic with the parsed arguments.
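
The three functions above could be laid out roughly as follows. This is only a minimal sketch: the module name `count_views.py` and its behavior (counting entries under a `views` key) are hypothetical and chosen for illustration, and plain `open()` is used for input here to keep the example self-contained.

```python
"""Hypothetical ``mmif/utils/cli/count_views.py`` (illustration only)."""
import argparse
import json
import sys


def describe_argparser():
    """Return (one-line description, more verbose description)."""
    oneliner = "count views in a MMIF file"
    verbose = oneliner + ". Reads MMIF JSON and prints the number of views."
    return oneliner, verbose


def prep_argparser(**kwargs):
    """Build the parser; **kwargs lets the main CLI pass add_help=False."""
    parser = argparse.ArgumentParser(description=describe_argparser()[1], **kwargs)
    parser.add_argument("MMIF_FILE", nargs="?", type=str, default=None,
                        help="input MMIF file path; STDIN is used when omitted")
    return parser


def main(args):
    """Run the subcommand with already-parsed arguments."""
    if args.MMIF_FILE is None:
        data = json.load(sys.stdin)
    else:
        with open(args.MMIF_FILE) as f:
            data = json.load(f)
    count = len(data.get("views", []))
    print(count)
    return count
```

Accepting `**kwargs` in `prep_argparser` is what allows the caller to forward `add_help=False` without the module needing to know about it.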

### Standard I/O Argument Pattern

To ensure a consistent user experience and avoid resource leaks, all CLI subcommands should adhere to the following I/O argument patterns using the `mmif.utils.cli.open_cli_io_arg` context manager (which replaces the deprecated `argparse.FileType`):

1. **Input**: Use a positional argument (usually named `MMIF_FILE`) that supports both file paths and STDIN.
   - In `prep_argparser`, use `nargs='?'`, `type=str`, and `default=None`.
   - In `main`, use `with open_cli_io_arg(args.MMIF_FILE, 'r', default_stdin=True) as input_file:`.
2. **Output**: Use the `-o`/`--output` flag for the output destination.
   - In `prep_argparser`, use `type=str` and `default=None`.
   - In `main`, use `with open_cli_io_arg(args.output, 'w', default_stdin=True) as output_file:`.
3. **Formatting**: Use the `-p`/`--pretty` flag as a boolean switch (`action='store_true'`) to toggle between compact and pretty-printed JSON/MMIF output.
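
The argument pattern above can be sketched as below. Note that `open_io` is a hypothetical stand-in written for this example so that it runs on its own; it is not the actual `open_cli_io_arg` implementation, and only approximates the behavior described above (a path opens a file, `None` falls back to a standard stream).

```python
import argparse
import contextlib
import json
import sys


@contextlib.contextmanager
def open_io(path, mode):
    """Hypothetical stand-in for ``open_cli_io_arg`` (assumption, not the real code).

    Yields an open file for ``path``, or STDIN/STDOUT when ``path`` is None.
    """
    if path is None:
        # Standard streams are yielded as-is and never closed here.
        yield sys.stdin if mode == "r" else sys.stdout
    else:
        with open(path, mode) as f:
            yield f


def prep_argparser(**kwargs):
    parser = argparse.ArgumentParser(**kwargs)
    parser.add_argument("MMIF_FILE", nargs="?", type=str, default=None)
    parser.add_argument("-o", "--output", type=str, default=None)
    parser.add_argument("-p", "--pretty", action="store_true")
    return parser


def main(args):
    with open_io(args.MMIF_FILE, "r") as input_file:
        data = json.load(input_file)
    with open_io(args.output, "w") as output_file:
        # --pretty toggles between indented and compact JSON.
        json.dump(data, output_file, indent=2 if args.pretty else None)
```

Keeping both arguments as plain strings with `default=None` means no file is opened at parse time, so nothing is left open if argument parsing fails later.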

> [!NOTE]
> CLI modules should typically act as thin wrappers. It is recommended to implement the core utility logic in other packages (e.g., `mmif.utils`) and import it into the CLI module. See existing modules like `summarize.py` (which imports from `mmif.utils.summarizer`) or `describe.py` for examples.

### How CLI Discovery Works

The CLI system automatically discovers subcommands at runtime. The entry point is configured in the build script (currently `setup.py`) as follows:

```python
entry_points={
    'console_scripts': [
        'mmif = mmif.__init__:cli',
    ],
},
```

The `cli()` function in `mmif/__init__.py` handles discovery and delegation. It uses `pkgutil.walk_packages` to find all modules at the top level of the `mmif.utils.cli` package. For the discovery logic to work, a "cli module" must implement the three functions outlined above.

This means adding a properly structured module within the CLI package is all that's needed—the module name will automatically be registered as a subcommand. No modifications to `setup.py` or other configuration files are required.
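
A rough sketch of how such discovery might look is shown below. It is not the actual `cli()` code: the function name `discover_subcommands` and the `REQUIRED` tuple are assumptions made for this example, which only illustrates walking a package with `pkgutil.walk_packages` and keeping the top-level modules that define the three required functions.

```python
import importlib
import pkgutil

# Names every CLI module is expected to define (per the requirements above).
REQUIRED = ("prep_argparser", "describe_argparser", "main")


def discover_subcommands(package):
    """Map subcommand name -> module for top-level modules of ``package``
    that define all three required functions (hypothetical helper)."""
    prefix = package.__name__ + "."
    found = {}
    for info in pkgutil.walk_packages(package.__path__, prefix=prefix):
        short_name = info.name[len(prefix):]
        # Skip subpackages and anything nested below the top level.
        if info.ispkg or "." in short_name:
            continue
        module = importlib.import_module(info.name)
        if all(callable(getattr(module, name, None)) for name in REQUIRED):
            found[short_name] = module
    return found
```

With something like this in place, each key of the returned dictionary could be registered as a subcommand name and its `prep_argparser(add_help=False)` used to build the corresponding subparser.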

> [!NOTE]
> Any "client" code (as opposed to the shell CLI) that wants to use a module in the `cli` package should be able to directly `from mmif.utils.cli import a_module`. However, for historical reasons, some CLI modules (e.g., `source.py`) are manually imported in `mmif/__init__.py` for backward compatibility with clients predating the discovery system.

## Documentation

The documentation for `mmif-python` is built using Sphinx and published to the [CLAMS documentation hub](https://github.com/clamsproject/website-test).
@@ -9,12 +76,38 @@ The documentation for `mmif-python` is built using Sphinx and published to the [
To build the documentation for the current checkout:

```bash
make doc
# OR
python3 build-tools/docs.py
```

The output will be in `documentation/_build/html`.
The output will be in `docs-test`. For more options, run `python build-tools/docs.py --help`.

> [!NOTE]
> Because the documentation build relies on a working `mmif` package, you must "build" the package before building the documentation. This can be done by running
> ```bash
> rm VERSION* # remove any existing VERSION file
> make devversion # creates a dummy VERSION file
> pip install -r requirements.dev # install dev dependencies
> python setup.py sdist # build the package (will download and auto-generate subpackages like `mmif.res` and `mmif.ver`)
> ```

> [!NOTE]
> Running `build-tools/docs.py` in "local testing" mode will overwrite any existing VERSION file with a dummy version.

### API Documentation (autodoc)

As of 2026 (starting with the release after 1.2.1), API documentation is **automatically generated** using `sphinx-apidoc`. When you run the documentation build:

1. The `run_apidoc()` function in `documentation/conf.py` runs automatically
2. It scans packages listed in `apidoc_package_names` (currently `mmif` and `mmif_docloc_http`)
3. RST files are generated in `documentation/autodoc/`
4. These files are **not tracked in git** - they're regenerated on each build

**When you add a new module or subpackage**, it will be automatically documented on the next build. No manual updates required.

**To add a new top-level package** (like `mmif_docloc_http`), add it to `apidoc_package_names` in `documentation/conf.py`.

**To exclude a subpackage** from documentation (like `mmif.res` or `mmif.ver`), add it to `apidoc_exclude_paths`.

**Module docstrings** in `__init__.py` files are used as package descriptions in the documentation. Keep them concise and informative.

### Building Documentation for Old Versions

21 changes: 8 additions & 13 deletions Makefile
@@ -36,17 +36,12 @@ publish: distclean version package test
$(generatedcode): dist/$(sdistname)*.tar.gz

docs:
@echo "WARNING: The 'docs' target is deprecated and will be removed."
@echo "The 'docs' directory is no longer used. Documentation is now hosted in the central CLAMS documentation hub."
@echo "Use 'make doc' for local builds or 'make doc-version' for specific versions."
@echo "Nothing is done."
@echo "The 'docs' target is deprecated and will be removed."
@echo "Documentation is now managed by 'build-tools/docs.py'."
@echo "Please run 'python3 build-tools/docs.py --help' for usage."

doc: # for single version sphinx - builds current source
python3 build-tools/docs.py

doc-version: # interactive build for specific version
@read -p "Enter version/tag to build (e.g., v1.0.0): " ver; \
[ -n "$$ver" ] && python3 build-tools/docs.py --build-ver $$ver
doc: docs
doc-version: docs

package: VERSION dist/$(sdistname)*.tar.gz

@@ -85,15 +80,15 @@ version: VERSION; cat VERSION
# since the GH api will return tags in chronological order, we can just grab the last one without sorting
AUTH_ARG := $(if $(GITHUB_TOKEN),-H "Authorization: token $(GITHUB_TOKEN)")

VERSION.dev: devver := $(shell curl --silent $(AUTH_ARG) "https://api.github.com/repos/clamsproject/mmif-python/git/refs/tags" | grep '"ref":' | sed -E 's/.+refs\/tags\/([0-9.]+)",/\1/g' | tail -n 1)
VERSION.dev: specver := $(shell curl --silent $(AUTH_ARG) "https://api.github.com/repos/clamsproject/mmif/git/refs/tags" | grep '"ref":' | grep -v 'py-' | sed -E 's/.+refs\/tags\/(spec-)?([0-9.]+)",/\2/g' | tail -n 1)
VERSION.dev: devver := $(shell curl --silent $(AUTH_ARG) "https://api.github.com/repos/clamsproject/mmif-python/git/refs/tags" | grep '"ref":' | sed -E 's/.+refs\/tags\/([0-9.]+)",/\1/g' | sort -V | tail -n 1)
VERSION.dev: specver := $(shell curl --silent $(AUTH_ARG) "https://api.github.com/repos/clamsproject/mmif/git/refs/tags" | grep '"ref":' | grep -v 'py-' | sed -E 's/.+refs\/tags\/(spec-)?([0-9.]+)",/\2/g' | sort -V | tail -n 1)
VERSION.dev:
@echo DEVVER: $(devver)
@echo SPECVER: $(specver)
@if [ $(call macro,$(devver)) = $(call macro,$(specver)) ] && [ $(call micro,$(devver)) = $(call micro,$(specver)) ] ; \
then \
if [[ $(devver) == *.dev* ]]; then echo $(call increase_dev,$(devver)) ; else echo $(call add_dev,$(call increase_patch, $(devver))); fi \
else echo $(call add_dev,$(specver)) ; fi \
else if [[ $(devver) == *.dev* ]]; then echo $(call increase_dev,$(devver)) ; else echo $(call add_dev,$(call increase_patch, $(devver))); fi ; fi \
> VERSION.dev

VERSION: version := $(shell git tag | sort -t. -k 1,1nr -k 2,2nr -k 3,3nr -k 4,4nr | head -n 1)
19 changes: 9 additions & 10 deletions README.md
@@ -1,21 +1,20 @@
## MultiMedia Interchange Format
[MMIF](https://mmif.clams.ai) is a JSON(-LD)-based data format designed for transferring annotation data between computational analysis applications in [CLAMS project](https://clams.ai).

[MMIF](https://mmif.clams.ai) is a JSON(-LD)-based data format designed for transferring annotation data between computational analysis applications of the [CLAMS project](https://clams.ai).


## mmif-python
`mmif-python` is a Python implementation of the MMIF data format.
`mmif-python` provides various helper classes and functions to handle MMIF JSON in Python,
including ;

1. de-/serialization of MMIF internal data structures to/from JSON
`mmif-python` is a Python implementation of the MMIF data format. It provides various helper classes and functions to handle MMIF JSON in Python, including:

1. serialization and de-serialization of MMIF internal data structures to/from JSON
2. validation of MMIF JSON
3. handling of CLAMS vocabulary types
4. navigation of MMIF object via various "search" methods (e.g. `mmif.get_all_views_contain(vocab_type))`)
4. navigation of MMIF objects via various "search" methods (e.g. `mmif.get_all_views_contain(vocab_type)`)

## For more ...

* [Version history and patch notes](https://github.com/clamsproject/mmif-python/blob/main/CHANGELOG.md)
* [MMIF Python API documentation](https://clamsproject.github.io/mmif-python)
* [MMIF Python API documentation](https://clamsproject.github.io/mmif-python/latest)
* [MMIF JSON specification and schema](https://clamsproject.github.io/mmif)

## For devs ...
* Build documentation: `python build-tools/docs.py --help`
* [Contributing guide](CONTRIBUTING.md)
25 changes: 25 additions & 0 deletions build-tools/docs.py
@@ -40,13 +40,38 @@ def run_sphinx_build(self, *args, cwd=None, check=True):
        return run_command([self.sphinx_build, *args], cwd=cwd, check=check)


def get_dummy_version():
    """Returns a dummy version based on current git branch and dirty status.
    Falls back to 'unknown' if not in a git repository."""
    try:
        branch = subprocess.check_output(["git", "rev-parse", "--abbrev-ref", "HEAD"],
                                         stderr=subprocess.DEVNULL, text=True).strip()
        dirty = subprocess.run(["git", "diff", "--quiet"],
                               stderr=subprocess.DEVNULL, check=False).returncode != 0
        return f"{branch}{'+dirty' if dirty else ''}"
    except (subprocess.CalledProcessError, FileNotFoundError):
        return "unknown"


def build_docs_local(source_dir: Path, output_dir: Path):
    """
    Builds documentation for the provided source directory.
    Assumes it's running in an environment with necessary tools.
    """
    print("--- Running in Local Build Mode ---")

    # Warning for user as VERSION file is critical
    if sys.stdin.isatty():
        import select
        print("\nWARNING: The 'VERSION' file will be overwritten with a dummy version for this local build.")
        print("Pausing for 3 seconds (press Enter to continue immediately)...")
        select.select([sys.stdin], [], [], 3)

    # Overwrite VERSION file with dummy version for local builds
    version = get_dummy_version()
    print(f"Generating dummy VERSION for local build: {version}")
    (source_dir / "VERSION").write_text(version)

    # 1. Generate source code and install in editable mode.
    print("\n--- Step 1: Generating source code and installing in editable mode ---")
    try:
3 changes: 2 additions & 1 deletion build-tools/requirements.docs.txt
@@ -1,3 +1,4 @@
sphinx>=7.0,<8.0
sphinx
furo
m2r2
autodoc-pydantic
37 changes: 0 additions & 37 deletions documentation/autodoc/mmif.serialize.rst

This file was deleted.

49 changes: 0 additions & 49 deletions documentation/autodoc/mmif.utils.rst

This file was deleted.

28 changes: 0 additions & 28 deletions documentation/autodoc/mmif.vocabulary.rst

This file was deleted.
