TL;DR: ISIC MultiAnnot++ is the largest public skin lesion segmentation dataset, collected from the ISIC Archive, with 17,684 masks for 14,967 images, including 2,394 images with multiple annotations, enabling research into inter-annotator agreement and segmentation preference modeling.
ISIC MultiAnnot++ (IMA++) is a large-scale dermoscopic dataset designed to facilitate multi-annotator skin lesion segmentation research. Collected from the ISIC Archive, it contains 17,684 segmentation masks across 14,967 images. Notably, a subset of 2,394 images features 2-5 segmentations per image provided by 16 distinct annotators. The dataset captures diverse segmentation styles influenced by annotator, tool choice, and skill level.
- Inter-Annotator Variability: Captures diverse segmentation styles driven by annotators (A00-A15), tool choice (T1-T3), and skill level (S1-S2).
- Realistic Scenario: Unlike most medical image segmentation datasets where every image is segmented by every annotator (i.e., a complete bipartite graph between annotators and images), IMA++ features an incomplete bipartite graph (i.e., every image is segmented by at least one annotator, but not all images are segmented by all annotators). This setup simulates real-world annotation scenarios where multiple annotators contribute to a subset of images.
- Scale: The largest public multi-annotator SLS dataset, enabling robust analysis of segmentation consistency and preference modeling.
- Tool-Specific Styles: Explicit metadata allows for analyzing how different annotation tools (T1-T3) influence segmentation boundaries.
- Consensus Masks: For the 2,394 images with multiple segmentations, we also provide two types of consensus masks: STAPLE (ST) and majority voting (MV).
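The incomplete annotator-image graph described above can be explored directly from the segmentation metadata. The following is a minimal sketch, assuming rows that carry the `ISIC_id` and `annotator` columns listed in the metadata description; the sample rows and the helper name `multi_annotated_images` are hypothetical and only illustrate the idea.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; real rows would come from IMAplusplus_seg_metadata.csv.
SAMPLE_CSV = """ISIC_id,annotator,tool,skill_level
ISIC_0000001,A00,T1,S1
ISIC_0000001,A03,T2,S2
ISIC_0000002,A00,T1,S1
ISIC_0000003,A05,T3,S2
ISIC_0000003,A07,T1,S1
ISIC_0000003,A00,T2,S1
"""


def multi_annotated_images(csv_file):
    """Map each image with two or more segmentations to its sorted annotator IDs."""
    annotators_per_image = defaultdict(list)
    for row in csv.DictReader(csv_file):
        annotators_per_image[row["ISIC_id"]].append(row["annotator"])
    return {
        isic_id: sorted(annotators)
        for isic_id, annotators in annotators_per_image.items()
        if len(annotators) >= 2
    }


subset = multi_annotated_images(io.StringIO(SAMPLE_CSV))
for isic_id, annotators in subset.items():
    print(isic_id, annotators)
```

To run this on the real file, pass an open file handle for the metadata CSV instead of the in-memory sample.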
The dataset (segmentation masks and metadata) is hosted on Zenodo. The raw images must be downloaded from the ISIC Archive.
The Zenodo repository contains:
- segs.zip: A ZIP archive of the 22,472 segmentation masks (original 17,684 + consensus labels).
- *_metadata.csv: Rich metadata files for images and masks.
- iaa_metrics_*.csv: Pre-computed inter-annotator agreement metrics.
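Once `segs.zip` has been downloaded, the masks can be unpacked with the Python standard library. This is a minimal sketch; the helper name `extract_masks`, the target directory, and the assumption that masks are stored as PNG files are illustrative and not taken from the repository.

```python
import zipfile
from pathlib import Path


def extract_masks(zip_path, target_dir):
    """Extract the whole archive and return the sorted names of the PNG entries."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        # Assumption: mask files use the .png extension.
        return sorted(
            name for name in archive.namelist() if name.lower().endswith(".png")
        )


# Example call (paths are placeholders):
# masks = extract_masks("segs.zip", "data/seg_masks")
```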
The underlying RGB skin lesion images are not included in the Zenodo repository. All 14,967 images are available as a dedicated collection in the ISIC Archive:
Download from ISIC Archive Collection
Alternatively, you can use the ISIC API to download images by their ISIC IDs (available both in the IMAplusplus_img_metadata.csv file and as a separate file, ISIC_ids.txt) using the code in this GitHub repository.
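For orientation, a download by ISIC ID might look roughly like the sketch below. It is not the repository's download code: the endpoint in `API_ROOT`, the `files.full.url` field of the JSON response, and the `.jpg` extension are all assumptions about the public ISIC API that should be checked against its current documentation, and the helper names are hypothetical.

```python
import json
import urllib.request
from pathlib import Path

# Assumed endpoint of the public ISIC API; verify before use.
API_ROOT = "https://api.isic-archive.com/api/v2/images"


def parse_isic_ids(text):
    """Return the non-empty, stripped lines of an ID list such as ISIC_ids.txt."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def image_metadata_url(isic_id):
    """Build the (assumed) per-image metadata URL for one ISIC ID."""
    return f"{API_ROOT}/{isic_id}/"


def download_image(isic_id, out_dir):
    """Fetch one image to out_dir; performs network requests when called."""
    with urllib.request.urlopen(image_metadata_url(isic_id)) as response:
        record = json.load(response)
    image_url = record["files"]["full"]["url"]  # assumed response layout
    out_path = Path(out_dir) / f"{isic_id}.jpg"  # assumed file extension
    out_path.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(image_url, out_path)
    return out_path


# Example (placeholder paths; not executed here):
# for isic_id in parse_isic_ids(Path("ISIC_ids.txt").read_text()):
#     download_image(isic_id, "data/images")
```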
IMAplusplus/
├── dataset_creation/                       # Dataset creation and preprocessing
│   ├── create_dataset.py                   # Creates dataset metadata and anonymizes annotators
│   ├── move_dataset.py                     # Copies images and masks to target directory
│   ├── constants.py                        # Tool and skill level mappings
│   └── config.yaml                         # Configuration for dataset creation
│
├── dataset_analysis/                       # Dataset quality assurance and visualization
│   ├── mask_qa.py                          # Validates masks for quality issues
│   ├── other_datasets_overlap.py           # Visualizes overlap with other datasets
│   ├── imaplusplus_annotator_overlap.py    # Visualizes annotator overlap
│   └── config.yaml                         # Configuration for analysis
│
├── multiannotator_analysis/                # Multi-annotator analysis and metrics
│   ├── create_multiannotator_subset.py     # Creates subset with multiple annotations
│   ├── create_consensus_masks.py           # Generates STAPLE and majority voting masks
│   ├── compute_IAA_metrics.py              # Computes inter-annotator agreement metrics
│   ├── compute_image_level_metrics.py      # Aggregates metrics per image
│   ├── visualization_scripts/              # Scripts for generating visualizations
│   └── config.yaml                         # Configuration for analysis
│
├── utils/                                  # Utility functions
│   ├── data.py                             # Data loading utilities
│   ├── metrics.py                          # Metric computation functions
│   └── md5.py                              # MD5 hash utilities
│
├── output/                                 # Generated outputs
│   ├── metadata/                           # CSV metadata files
│   ├── metrics/                            # CSV metrics files
│   ├── seg_masks/                          # Segmentation masks
│   └── visualizations/                     # Generated plots and figures
│
└── overall_script.sh                       # Main pipeline script
It is recommended to use a virtual environment (e.g., venv or conda). Install the required dependencies using the provided requirements.txt file to ensure reproducibility:
pip install -r requirements.txt
Key dependencies include:
- pandas: Data manipulation and CSV handling
- numpy: Numerical operations
- scikit-image: Image processing and mask operations
- SimpleITK: STAPLE consensus mask computation
- medpy: Medical image metrics (Dice, Jaccard, Hausdorff distance)
- omegaconf: Configuration management
- loguru: Logging
- tqdm: Progress bars
- matplotlib: Plotting
- upsetplot: UpSet plots for set visualization
Run the complete pipeline (requires raw data and mapping files) using the provided script:
bash overall_script.sh
Prerequisites: To run the full dataset creation pipeline, you need:
- Raw ISIC Images: Downloaded from ISIC Archive.
- Raw ISIC Segmentations: The original pool of segmentations.
- Metadata Mapping: Raw mapping files (not included in this repository due to privacy and licensing constraints) are required to map raw files to the IMA++ schema.
Note: The dataset_creation/ directory is provided for transparency to document the data processing pipeline described in our paper. Most users should skip the creation step and use the pre-packaged dataset from Zenodo.
Configuration for input/output paths is handled in the config.yaml of each module.
Individual Scripts
(Only needed if reproducing the dataset construction from raw files)
cd dataset_creation/
python create_dataset.py # Creates metadata from raw inputs
python move_dataset.py # Renames and moves files to target structure
Scripts to validate quality and visualize statistics.
cd dataset_analysis/
python mask_qa.py # Quality assurance: empty and full-image masks, disconnected regions, border-touching
python other_datasets_overlap.py # Generate overlap visualizations between IMA++ and other datasets (UpSet plot)
python imaplusplus_annotator_overlap.py # Generate annotator interaction plots (UpSet plot)
cd multiannotator_analysis/
python create_multiannotator_subset.py # Create multi-annotator subset
python create_consensus_masks.py # Generate consensus masks
python compute_IAA_metrics.py # Compute IAA metrics
python compute_image_level_metrics.py # Aggregate metrics per image
Configuration
Each module uses a config.yaml file for configuration. Key settings include:
- Paths: Source and target directories for images and masks
- Metadata: Paths to input and output metadata CSV files
- Processing options: Verbose logging, parallel processing settings
Example configuration structure:
# dataset_creation/config.yaml
orig_imgs_dirs:
  jpg: ["/path/to/images/"]
orig_segs_dir: "/path/to/masks/"
raw_img_metadata_path: "./original_metadata_files/raw_ISIC_images_metadata.csv.gz"
raw_seg_masks_metadata_path: "./original_metadata_files/raw_ISIC_segmasks_metadata.csv"
target_data_dir: "/path/to/output/"
Output Files
All metadata files are saved in output/metadata/:
- IMAplusplus_seg_metadata.csv: Complete segmentation metadata
- IMAplusplus_img_metadata.csv: Image metadata
- IMAplusplus_multiannotator_subset_seg_metadata.csv: Multi-annotator subset metadata
- IMAplusplus_seg_metadata_qa_results.csv: Quality assurance results
All metrics files are saved in output/metrics/:
- IMAplusplus_multiannotator_subset_IAA_metrics.csv: Pairwise IAA metrics
- IMAplusplus_multiannotator_subset_IAA_metrics_summary.csv: Summary statistics
- IMAplusplus_multiannotator_subset_image_level_metrics.csv: Per-image aggregated metrics
- IMAplusplus_seg_metadata_qa_results.csv: Quality assurance results
Each segmentation mask has the following metadata:
- ISIC_id: ISIC image identifier
- img_filename: Image filename
- seg_filename: Segmentation mask filename
- annotator: Anonymized annotator ID (A00, A01, ..., A15)
- tool: Segmentation tool (T1: manual pointlist, T2: semi-automated flood fill, T3: fully automated algorithm)
- skill_level: Annotator skill level (S1: expert, S2: novice)
- mskObjectID: Original mask object ID
- mask_md5: MD5 hash of the mask file
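The `mask_md5` field makes it possible to check that a downloaded mask matches the metadata. The sketch below assumes the stored value is the hexadecimal MD5 digest of the raw file bytes; the helper names `file_md5` and `mask_is_intact` are hypothetical and are not the functions in utils/md5.py.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mask_is_intact(mask_path, expected_md5):
    """Compare a file's digest with the mask_md5 value from the metadata row."""
    # Assumption: mask_md5 holds a hex digest; case and whitespace are normalized.
    return file_md5(mask_path) == expected_md5.strip().lower()
```

A typical use would loop over the metadata rows and call `mask_is_intact` with each row's `seg_filename` (resolved to a local path) and `mask_md5`.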
For images with multiple annotations, two consensus masks are generated:
- STAPLE (*_ST_ST_ST_ST.png): STAPLE algorithm consensus
- Majority Voting (*_MV_MV_MV_MV.png): Majority voting consensus
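To make the majority voting consensus concrete, here is a minimal NumPy sketch of pixel-wise voting over binary masks. It is an illustration, not the code in create_consensus_masks.py; in particular, treating ties as background is an assumption, and the function name `majority_vote` is hypothetical.

```python
import numpy as np


def majority_vote(masks):
    """Pixel-wise strict-majority consensus of same-shaped binary masks."""
    stack = np.stack([np.asarray(m) > 0 for m in masks])  # shape: (n, H, W)
    votes = stack.sum(axis=0)
    # Foreground where more than half of the masks agree.
    # Assumption: ties (possible with an even number of masks) become background.
    return (2 * votes > stack.shape[0]).astype(np.uint8)


a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
c = np.array([[1, 1, 0], [0, 0, 1]])
consensus = majority_vote([a, b, c])
print(consensus)
```

STAPLE differs from this in that it weights each annotator by an estimated performance level instead of counting votes equally.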
The following metrics are computed for all pairwise mask comparisons:
Overlap Metrics:
- Dice coefficient
- Jaccard coefficient
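For two binary masks A and B, the Dice coefficient is 2|A∩B| / (|A| + |B|) and the Jaccard coefficient is |A∩B| / |A∪B|. The sketch below computes both with plain NumPy; it does not reproduce the repository's implementation, and returning 1.0 when both masks are empty is an assumed convention.

```python
import numpy as np


def dice_and_jaccard(mask_a, mask_b):
    """Return (Dice, Jaccard) for two same-shaped binary masks."""
    a = np.asarray(mask_a) > 0
    b = np.asarray(mask_b) > 0
    intersection = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    total = a.sum() + b.sum()
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0, 1.0
    return float(2.0 * intersection / total), float(intersection / union)


mask_1 = np.array([[1, 1, 0], [0, 1, 0]])
mask_2 = np.array([[1, 0, 0], [0, 1, 1]])
dice, jaccard = dice_and_jaccard(mask_1, mask_2)
print(f"Dice={dice:.3f}, Jaccard={jaccard:.3f}")
```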
Boundary Metrics:
- Hausdorff distance (HD)
- 95th percentile Hausdorff distance (HD95)
- Average symmetric surface distance (ASSD)
- Normalized versions (by image diagonal length)
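Normalizing a boundary distance by the image diagonal makes values comparable across images of different resolutions. The following sketch shows a brute-force Hausdorff distance on small point sets and the normalization step; the point sets, the function names, and the choice of measuring the diagonal in pixels are assumptions for illustration, not the repository's implementation.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in points_b) for p in points_a)


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(
        directed_hausdorff(points_a, points_b),
        directed_hausdorff(points_b, points_a),
    )


def normalize_by_diagonal(distance, height, width):
    """Divide a pixel distance by the image diagonal length (assumed in pixels)."""
    return distance / math.hypot(height, width)


# Hypothetical boundary points (row, col) of two masks.
boundary_a = [(0, 0), (0, 3)]
boundary_b = [(0, 0), (4, 3)]
hd = hausdorff(boundary_a, boundary_b)
print(hd, normalize_by_diagonal(hd, 300, 400))
```

The brute-force search is quadratic in the number of boundary points, so it is suitable only for illustrating the definition on small inputs.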
The repository includes scripts to generate insightful visualizations of the dataset characteristics.
The distribution of segmentations across 16 annotators is long-tailed, with complex intersections.
Distribution of number of segmentations per image and factor-wise counts (Annotator, Tool, Skill).
If you use the IMA++ dataset in your research, please cite the following papers:
- Abhishek, K., Kawahara, J., Hamarneh, G. (2025). IMA++: ISIC Archive Multi-Annotator Dermoscopic Skin Lesion Segmentation Dataset. arXiv preprint arXiv:2512.21472, pages 1–11. https://doi.org/10.48550/arXiv.2512.21472
- Abhishek, K., Kawahara, J., Hamarneh, G. (2025). What Can We Learn from Inter-Annotator Variability in Skin Lesion Segmentation?. In: Medical Image Computing and Computer-Assisted Intervention (MICCAI) ISIC Skin Image Analysis Workshop (MICCAI ISIC). MICCAI 2025. Lecture Notes in Computer Science, vol 16149, pages 23–33. Springer, Cham. https://doi.org/10.1007/978-3-032-05825-6_3
- Abhishek, K., Kawahara, J., Hamarneh, G. (2025). Segmentation Style Discovery: Application to Skin Lesion Images. In: Medical Image Computing and Computer-Assisted Intervention (MICCAI) ISIC Skin Image Analysis Workshop (MICCAI ISIC). MICCAI 2024. Lecture Notes in Computer Science, vol 15274, pages 24–34. Springer, Cham. https://doi.org/10.1007/978-3-031-77610-6_3
The BibTeX entries for these papers are:
@Article{abhishek2025imaplusplus,
author = {Abhishek, Kumar and Kawahara, Jeremy and Hamarneh, Ghassan},
title = {IMA++: ISIC Archive Multi-Annotator Dermoscopic Skin Lesion Segmentation Dataset},
journal = {arXiv preprint arXiv:2512.21472},
year = {2025},
doi = {https://doi.org/10.48550/arXiv.2512.21472},
url = {https://arxiv.org/abs/2512.21472},
publisher = {arXiv},
pages = {1--11},
}
@InProceedings{abhishek2025what,
author = {Abhishek, Kumar and Kawahara, Jeremy and Hamarneh, Ghassan},
title = {What Can We Learn from Inter-Annotator Variability in Skin Lesion Segmentation?},
booktitle = {Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) ISIC Skin Image Analysis Workshop (MICCAI ISIC)},
pages = {23--33},
year = {2025},
doi = {https://doi.org/10.1007/978-3-032-05825-6_3},
url = {https://link.springer.com/chapter/10.1007/978-3-032-05825-6_3},
publisher = {Springer Nature Switzerland},
address = {Cham},
isbn = {9783032058256}
}
@InProceedings{abhishek2025segmentation,
author = {Abhishek, Kumar and Kawahara, Jeremy and Hamarneh, Ghassan},
title = {Segmentation Style Discovery: Application to Skin Lesion Images},
booktitle = {Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) ISIC Skin Image Analysis Workshop (MICCAI ISIC)},
pages = {24--34},
year = {2025},
doi = {https://doi.org/10.1007/978-3-031-77610-6_3},
url = {https://link.springer.com/chapter/10.1007/978-3-031-77610-6_3},
publisher = {Springer Nature Switzerland},
address = {Cham},
isbn = {9783031776106}
}
