Summary
An analysis of how D4D schema slots map to standard vocabularies, and how those mappings align (or conflict) with the Fairscape/ROCrate Pydantic models that already have a D4D conversion layer (fairscape_models/conversion/mapping/d4d.py).
D4D Slot URI Coverage
Of ~414 domain-specific attributes across the D4D modules, approximately 28% are mapped to standard vocabulary URIs via slot_uri:
| Module | Mapped/Total | Coverage |
| --- | --- | --- |
| Distribution | 3/3 | 100% |
| Ethics | 7/7 | 100% |
| Motivation | 8/9 | 89% |
| Base (core) | 41/65 | 63% |
| Maintenance | 7/13 | 54% |
| Uses | 4/10 | 40% |
| Variables | 9/27 | 33% |
| Composition | 20/62 | 32% |
| Preprocessing | 6/22 | 27% |
| Collection | 5/20 | 25% |
| Data Governance | 6/39 | 15% |
| Evaluation Summary | 0/122 | 0% |
| Human | 0/14 | 0% |
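
The per-module and overall coverage figures can be re-derived from the mapped/total counts. The following is a minimal sketch that hard-codes the counts from the table; the names `COVERAGE` and `coverage_pct` are illustrative and not part of the D4D tooling.

```python
# (mapped, total) slot counts per D4D module, copied from the coverage table.
COVERAGE = {
    "Distribution": (3, 3),
    "Ethics": (7, 7),
    "Motivation": (8, 9),
    "Base (core)": (41, 65),
    "Maintenance": (7, 13),
    "Uses": (4, 10),
    "Variables": (9, 27),
    "Composition": (20, 62),
    "Preprocessing": (6, 22),
    "Collection": (5, 20),
    "Data Governance": (6, 39),
    "Evaluation Summary": (0, 122),
    "Human": (0, 14),
}


def coverage_pct(mapped: int, total: int) -> float:
    """Return coverage as a percentage, guarding against empty modules."""
    return 100.0 * mapped / total if total else 0.0


total_mapped = sum(mapped for mapped, _ in COVERAGE.values())
total_slots = sum(total for _, total in COVERAGE.values())
overall = coverage_pct(total_mapped, total_slots)

# Modules with no slot_uri mappings at all.
unmapped_modules = [name for name, (mapped, _) in COVERAGE.items() if mapped == 0]

if __name__ == "__main__":
    for name, (mapped, total) in COVERAGE.items():
        print(f"{name:20s} {mapped:3d}/{total:<3d} {coverage_pct(mapped, total):5.1f}%")
    print(f"Overall: {total_mapped}/{total_slots} = {overall:.1f}%")
```

Summing the table rows this way gives 116 mapped slots out of 413, which is consistent with the "approximately 28%" figure quoted above.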
Vocabularies used in D4D
- dcterms: (Dublin Core), 70 mappings (dominant)
- schema: (Schema.org), 33 mappings
- dcat: (Data Catalog), 9 mappings
- prov:, skos:, qudt:, and DUO: have 1 mapping each
Notable gaps
- Evaluation Summary (122 slots, 0 mapped) — the largest module with zero URI mappings
- Human (14 slots, 0 mapped) — human subjects data with no standard vocab links
- Data Governance (39 slots, 15%) — governance/consent terms could map to DUO/ODRL
- No mappings to DATS, OBI, IAO, or other biomedical metadata standards
Fairscape URI Approach
Fairscape uses JSON-LD with @vocab: "https://schema.org/" as the default namespace, plus evi: "https://w3id.org/EVI#" for extensions and an rai: prefix for responsible-AI fields. Key mappings in Fairscape:
| Fairscape field | Effective URI | Standard |
| --- | --- | --- |
| name | schema:name | Schema.org |
| description | schema:description | Schema.org |
| @id | schema:identifier | Schema.org |
| dateCreated | schema:dateCreated | Schema.org |
| dateModified | schema:dateModified | Schema.org |
| contentUrl | schema:contentUrl | Schema.org |
| license | schema:license | Schema.org |
| author | schema:author | Schema.org |
| contentSize | schema:contentSize | Schema.org |
| evi:formats | https://w3id.org/EVI#formats | EVI (custom) |
| rai:dataUseCases | RAI namespace | Custom |
| rai:dataBiases | RAI namespace | Custom |
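
To illustrate how the "Effective URI" column follows from the JSON-LD context, here is a simplified term-expansion sketch. It is not a full JSON-LD processor and not Fairscape's own code; `CONTEXT` and `expand_term` are hypothetical names, and the rai: prefix is left out because its IRI is not given here.

```python
# Context mirroring the prefixes named above. The rai: prefix is omitted
# on purpose: its namespace IRI is not stated in this document.
CONTEXT = {
    "@vocab": "https://schema.org/",
    "evi": "https://w3id.org/EVI#",
}


def expand_term(term: str, context: dict[str, str] = CONTEXT) -> str:
    """Expand a compact JSON-LD key to a full IRI (simplified rules only)."""
    if term.startswith("@"):
        # JSON-LD keywords such as @id and @type are not expanded via @vocab.
        return term
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        # Known prefix: concatenate the namespace IRI and the local name.
        return context[prefix] + local
    if sep:
        # Unknown prefix or already-absolute IRI: leave unchanged.
        return term
    # Bare term: resolved against the default vocabulary.
    return context["@vocab"] + term


if __name__ == "__main__":
    for key in ("name", "contentUrl", "evi:formats", "rai:dataBiases"):
        print(f"{key} -> {expand_term(key)}")
```

Under these assumptions a bare key such as name resolves against the default vocabulary, while evi:formats resolves against the EVI namespace.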
D4D ↔ Fairscape Alignment at the URI Level
Both schemas use Schema.org for core metadata. Where they overlap:
| D4D slot_uri | Fairscape JSON-LD | Alignment |
| --- | --- | --- |
| schema:name | schema:name | ✅ Exact |
| schema:description | schema:description | ✅ Exact |
| schema:identifier | schema:identifier | ✅ Exact |
| schema:license | schema:license | ✅ Exact |
| schema:url | schema:url | ✅ Exact |
| dcterms:created | schema:dateCreated | ⚠️ Same concept, different vocab |
| dcterms:modified | schema:dateModified | ⚠️ Same concept, different vocab |
| dcterms:creator | schema:author | ⚠️ Same concept, different vocab |
| dcat:downloadURL | schema:contentUrl | ⚠️ Same concept, different vocab |
The core tension
D4D leans on Dublin Core (dcterms:) for provenance and dates, while Fairscape uses Schema.org for everything. Both are valid standard vocabularies, but this creates unnecessary mapping friction. For example:
- dcterms:created vs schema:dateCreated: semantically identical
- dcterms:creator vs schema:author: nearly identical
- dcat:downloadURL vs schema:contentUrl: same concept
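
One way to make this friction visible is a small lookup that classifies a D4D/Fairscape URI pair as exact, equivalent across vocabularies, or unaligned. This is a sketch only: the pairs come from the alignment table, and `CROSS_VOCAB_EQUIVALENTS` and `classify_alignment` are hypothetical names, not part of either codebase.

```python
# Cross-vocabulary pairs taken from the alignment table (D4D -> Fairscape).
CROSS_VOCAB_EQUIVALENTS = {
    "dcterms:created": "schema:dateCreated",
    "dcterms:modified": "schema:dateModified",
    "dcterms:creator": "schema:author",
    "dcat:downloadURL": "schema:contentUrl",
}


def classify_alignment(d4d_uri: str, fairscape_uri: str) -> str:
    """Classify how a D4D slot URI relates to a Fairscape property URI."""
    if d4d_uri == fairscape_uri:
        return "exact"
    if CROSS_VOCAB_EQUIVALENTS.get(d4d_uri) == fairscape_uri:
        return "equivalent"  # same concept, different vocabulary
    return "unaligned"


if __name__ == "__main__":
    pairs = [
        ("schema:name", "schema:name"),
        ("dcterms:created", "schema:dateCreated"),
        ("dcterms:creator", "schema:license"),
    ]
    for d4d_uri, fairscape_uri in pairs:
        print(d4d_uri, fairscape_uri, classify_alignment(d4d_uri, fairscape_uri))
```

A table like this has to be maintained by hand for every dcterms or dcat term D4D adopts, which is the kind of overhead a shared vocabulary choice would avoid.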
Fairscape's existing D4D mapping
fairscape_models/conversion/mapping/d4d.py already contains a ROCRATE_TO_D4D_MAPPING dict that maps D4D field names to ROCrate/Fairscape source keys. However, this mapping operates at the field name level, not at the URI level. A formal URI-level alignment (e.g., via SSSOM) would:
- Make the mapping vocabulary-aware and auditable
- Capture the dcterms↔schema.org equivalences explicitly
- Identify D4D slots that have no Fairscape equivalent (and vice versa)
- Enable automated interoperability tooling
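
As a rough illustration of what a URI-level alignment could look like, the sketch below writes the four cross-vocabulary pairs as SSSOM-style TSV rows using the core SSSOM columns. The choice of skos:exactMatch versus skos:closeMatch and the semapv:ManualMappingCuration justification are assumptions made for the example, not curated decisions, and the YAML metadata header that a complete SSSOM file carries is omitted.

```python
import csv
import io

# (subject_id, predicate_id, object_id); predicates are illustrative guesses.
MAPPINGS = [
    ("dcterms:created", "skos:exactMatch", "schema:dateCreated"),
    ("dcterms:modified", "skos:exactMatch", "schema:dateModified"),
    ("dcterms:creator", "skos:closeMatch", "schema:author"),
    ("dcat:downloadURL", "skos:exactMatch", "schema:contentUrl"),
]

COLUMNS = ["subject_id", "predicate_id", "object_id", "mapping_justification"]

# Assumed justification for hand-curated rows.
JUSTIFICATION = "semapv:ManualMappingCuration"


def to_sssom_tsv(mappings) -> str:
    """Serialize mapping triples as tab-separated rows with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for subject_id, predicate_id, object_id in mappings:
        writer.writerow([subject_id, predicate_id, object_id, JUSTIFICATION])
    return buf.getvalue()


if __name__ == "__main__":
    print(to_sssom_tsv(MAPPINGS), end="")
```

Because each row names its predicate explicitly, the difference between "semantically identical" and "nearly identical" pairs becomes something a reviewer can audit and tooling can query.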
Proposed Next Steps
- Add slot_uri mappings to the currently unmapped D4D modules, prioritizing Evaluation Summary (0%) and Human (0%)
- Consider harmonizing the dcterms vs schema.org choice or, at minimum, add exact_mappings cross-references between them
- Produce a formal SSSOM mapping between D4D slot URIs and Fairscape/ROCrate URIs
- Consider adding mappings to domain-relevant standards: DUO (consent), OBI (assays), IAO (information artifacts)
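
The first two steps could be tracked with a simple audit over slot definitions. The sketch below assumes slots are available as a Python dict shaped like LinkML slot definitions (with slot_uri and exact_mappings keys); the slot names are hypothetical placeholders, not actual D4D slots, and `audit_slots` is an illustrative helper.

```python
# Hypothetical slot definitions shaped like LinkML slots; names are made up.
SLOTS = {
    "created_on": {
        "slot_uri": "dcterms:created",
        "exact_mappings": ["schema:dateCreated"],
    },
    "created_by": {"slot_uri": "dcterms:creator"},
    "example_unmapped_slot": {},
}


def audit_slots(slots: dict[str, dict]) -> tuple[list[str], list[str]]:
    """Return (slots lacking slot_uri, slots with slot_uri but no exact_mappings)."""
    missing_uri = sorted(
        name for name, definition in slots.items() if not definition.get("slot_uri")
    )
    missing_xref = sorted(
        name
        for name, definition in slots.items()
        if definition.get("slot_uri") and not definition.get("exact_mappings")
    )
    return missing_uri, missing_xref


if __name__ == "__main__":
    no_uri, no_xref = audit_slots(SLOTS)
    print("Needs slot_uri:", no_uri)
    print("Needs cross-reference:", no_xref)
```

Run against the real schema, the first list would be dominated by Evaluation Summary and Human slots, and the second would surface dcterms-mapped slots that still lack a schema.org cross-reference.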