Open
34 commits
c1695c0
Updated datasets authorization
nitrosx May 5, 2026
1ee138d
New proposed dataset authorization matrix
nitrosx May 6, 2026
9880096
Merge branch 'updated_dataset_authorization' into update_dataset_auth…
nitrosx May 6, 2026
7f705a4
New proposed dataset authorization matrix (#2722)
nitrosx May 6, 2026
c4c6067
Update docs/developer-guide/authorization/authorization_datasets.md
nitrosx May 6, 2026
bfdf846
Update docs/developer-guide/authorization/authorization_datasets.md
nitrosx May 6, 2026
c63efa5
Update docs/developer-guide/authorization/authorization_datasets.md
nitrosx May 6, 2026
9fb6f8c
Update docs/developer-guide/authorization/authorization_datasets.md
nitrosx May 6, 2026
dcd88d8
Update docs/developer-guide/authorization/authorization_datasets.md
nitrosx May 6, 2026
e3fd4b5
Update docs/developer-guide/authorization/authorization_datasets.md
nitrosx May 6, 2026
a7bfd94
Update docs/developer-guide/authorization/authorization_datasets.md
nitrosx May 6, 2026
f96e6f1
Update docs/developer-guide/authorization/authorization_datasets.md
nitrosx May 6, 2026
536ea33
included feedback
nitrosx May 6, 2026
d69c20a
included feedback
nitrosx May 6, 2026
f9e0aa1
unforseen changes
nitrosx May 6, 2026
6c2e0f9
Added use cases to dataset permissions
nitrosx May 8, 2026
393bb72
Update authorization_datasets.md
nitrosx May 8, 2026
04e3623
Update authorization_datasets.md
nitrosx May 8, 2026
e523117
Apply suggestion from @alubbock
nitrosx May 20, 2026
1b7ec83
Apply suggestion from @sourcery-ai[bot]
nitrosx May 20, 2026
a27dd3b
Apply suggestion from @alubbock
nitrosx May 20, 2026
4839738
Apply suggestion from @alubbock
nitrosx May 20, 2026
11de758
Apply suggestion from @alubbock
nitrosx May 20, 2026
eaede84
Apply suggestion from @alubbock
nitrosx May 20, 2026
9dadf9b
Apply suggestion from @alubbock
nitrosx May 20, 2026
9ac821e
Apply suggestion from @alubbock
nitrosx May 20, 2026
94c4350
Apply suggestion from @alubbock
nitrosx May 20, 2026
ff61903
Apply suggestion from @alubbock
nitrosx May 20, 2026
61d6864
Apply suggestion from @alubbock
nitrosx May 20, 2026
d24f248
Apply suggestion from @alubbock
nitrosx May 20, 2026
5adaac7
Apply suggestion from @alubbock
nitrosx May 20, 2026
ca05366
Apply suggestion from @alubbock
nitrosx May 20, 2026
e1e6927
Apply suggestion from @alubbock
nitrosx May 20, 2026
24199c3
Update authorization_datasets.md
nitrosx May 20, 2026
276 changes: 276 additions & 0 deletions docs/developer-guide/authorization/authorization_datasets.md
@@ -0,0 +1,276 @@
# Datasets Authorization

This page lists the permissions available for datasets and all their endpoints.

## Endpoint authorization

- DatasetCreate
- DatasetRead
- DatasetUpdate
- DatasetDelete

## Instance authorization

- DatasetCreateOwnerNoPid
- DatasetCreateOwnerWithPid
- DatasetCreateAny
- DatasetReadManyPublic
- DatasetReadManyAccess
- DatasetReadAny
- DatasetUpdateOwner
- DatasetUpdateAny
- DatasetDeleteOwner
- DatasetDeleteAny

## Implementation

This section describes how the different levels of authorization translate into data conditions applied by the backend.

- Public
  - isPublished = true
- Access (conditions are combined with a logical _or_)
  - isPublished = true
  - the user belongs to the group listed in the _ownerGroup_ field
  - the user belongs to one of the groups listed in the _accessGroups_ field
  - _sharedWith_ contains the user's email (obsolete, it will be removed)
- Owner
  - the user belongs to the group listed in the _ownerGroup_ field
- Any
  - the user can perform the action on any dataset
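
The conditions above can be sketched as plain predicates. This is only an illustration, not the backend implementation: the function names are hypothetical, and the dataset is assumed to be a dictionary carrying the _isPublished_, _ownerGroup_, _accessGroups_ and _sharedWith_ fields named in the list.

```python
from typing import Any, Iterable, Mapping, Optional


def is_public(dataset: Mapping[str, Any]) -> bool:
    """Public: the dataset is published."""
    return bool(dataset.get("isPublished", False))


def is_owner(dataset: Mapping[str, Any], user_groups: Iterable[str]) -> bool:
    """Owner: the user belongs to the group listed in ownerGroup."""
    return dataset.get("ownerGroup") in set(user_groups)


def has_access(
    dataset: Mapping[str, Any],
    user_groups: Iterable[str],
    user_email: Optional[str] = None,
) -> bool:
    """Access: any one of the listed conditions is enough (logical or)."""
    groups = set(user_groups)
    return (
        is_public(dataset)
        or is_owner(dataset, groups)
        or bool(groups & set(dataset.get("accessGroups", [])))
        # Obsolete condition, kept only because the list above still mentions it.
        or (user_email is not None and user_email in dataset.get("sharedWith", []))
    )
```

The _Any_ level needs no predicate, because no data condition is applied.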

## Operation to endpoints map

- Create
  - POST Datasets
  - POST Datasets/isValid
- Read
  - GET Datasets
  - GET Datasets/fullquery
  - GET Datasets/fullfacet
  - GET Datasets/metadataKeys
  - GET Datasets/count
  - GET Datasets/findOne
  - GET Datasets/_pid_
  - GET Datasets/_pid_/datasetlifecycle
  - GET Datasets/_pid_/logbook
- Update
  - PATCH Datasets/_pid_
  - PUT Datasets/_pid_
  - POST Datasets/_pid_/appendToArrayField
  - PATCH Datasets/_pid_/datasetlifecycle
- Delete
  - DELETE Datasets/_pid_
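
As an illustration, the map above can be written as a lookup table. The names `ENDPOINT_OPERATIONS` and `operation_for` are hypothetical, and the sketch assumes the pid is URL-encoded so that it occupies a single path segment.

```python
from typing import Dict, List, Optional, Tuple

# Operation -> list of (HTTP method, path template); "_pid_" stands for a dataset pid.
ENDPOINT_OPERATIONS: Dict[str, List[Tuple[str, str]]] = {
    "Create": [
        ("POST", "Datasets"),
        ("POST", "Datasets/isValid"),
    ],
    "Read": [
        ("GET", "Datasets"),
        ("GET", "Datasets/fullquery"),
        ("GET", "Datasets/fullfacet"),
        ("GET", "Datasets/metadataKeys"),
        ("GET", "Datasets/count"),
        ("GET", "Datasets/findOne"),
        ("GET", "Datasets/_pid_"),
        ("GET", "Datasets/_pid_/datasetlifecycle"),
        ("GET", "Datasets/_pid_/logbook"),
    ],
    "Update": [
        ("PATCH", "Datasets/_pid_"),
        ("PUT", "Datasets/_pid_"),
        ("POST", "Datasets/_pid_/appendToArrayField"),
        ("PATCH", "Datasets/_pid_/datasetlifecycle"),
    ],
    "Delete": [
        ("DELETE", "Datasets/_pid_"),
    ],
}


def _matches(template: str, path: str) -> bool:
    """Compare segment by segment; a "_pid_" segment matches any value."""
    t_parts = template.split("/")
    p_parts = path.strip("/").split("/")
    return len(t_parts) == len(p_parts) and all(
        t == "_pid_" or t == p for t, p in zip(t_parts, p_parts)
    )


def operation_for(method: str, path: str) -> Optional[str]:
    """Return the operation an endpoint belongs to, or None if it is not listed.

    A literal route such as GET Datasets/count also matches GET Datasets/_pid_,
    but both belong to the same operation, so the first match is sufficient.
    """
    for operation, routes in ENDPOINT_OPERATIONS.items():
        for route_method, template in routes:
            if route_method == method.upper() and _matches(template, path):
                return operation
    return None
```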

## Authorization standard users

| Operation | Endpoint Authorization | Anonymous | Authenticated User | Notes |
| --------- | ---------------------- | --------- | ------------------ | ----- |
| Create | _DatasetCreate_ | __no__ | __no__ | |
| Read | _DatasetRead_ | Public<br/>_DatasetReadPublic_ | Has Access<br/>_DatasetReadAccess_ | |
| Update | _DatasetUpdate_ | __no__ | __no__ | |
| | | | | |
| Delete | _DatasetDelete_ | __no__ | __no__ | |

## Special permissions groups

- __Dataset Create Basic__ (DsCB)
  These groups are allowed to create datasets for any of the groups they belong to, but they are not allowed to assign the pid to the new dataset.
  Default: _#nogroup_
  Special values:
  - _#all_ : all groups are allowed to create datasets with the pid assigned by the system.
- __Dataset Create Extended__ (DsCE)
  These groups are allowed to create datasets for any of the groups they belong to, and they can assign the pid to the new dataset.
  Default: _#nogroup_
  Special values:
  - _#all_ : all groups are allowed to create datasets with an explicit pid.
- __Dataset Create Privileged__ (DsCP)
  These groups are allowed to create datasets for any group, and they can also assign the pid to the new dataset.
  Default: _#nogroup_
  Special values:
  - _#all_ : all groups can create datasets for any group with an explicit pid.
- __Dataset Read Privileged__ (DsRP)
  These groups are allowed to read all datasets regardless of ownership.
  Default: _#nogroup_
- __Dataset Update Basic__ (DsUB)
  These groups are allowed to update only the datasets they own.
  Default: _#nogroup_
  Special values:
  - _#DsCB_ : all groups listed in _Dataset Create Basic_ are allowed to update the datasets they own.
  - _#DsCE_ : all groups listed in _Dataset Create Extended_ are allowed to update the datasets they own.
- __Dataset Update Privileged__ (DsUP)
  These groups are allowed to update any dataset regardless of ownership.
  Default: _#nogroup_
  Special values:
  - _#DsCP_ : all groups listed in _Dataset Create Privileged_ are allowed to update any dataset.
- __Dataset Delete Basic__ (DsDB)
  These groups are allowed to delete only the datasets they own.
  Default: _#nogroup_
  Special values:
  - _#DsCB_ : all groups listed in _Dataset Create Basic_ are allowed to delete the datasets they own.
  - _#DsCE_ : all groups listed in _Dataset Create Extended_ are allowed to delete the datasets they own.
- __Dataset Delete Privileged__ (DsDP)
  These groups are allowed to delete any dataset regardless of ownership.
  Default: _#nogroup_
  Special values:
  - _#DsCP_ : all groups listed in _Dataset Create Privileged_ are allowed to delete any dataset.
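
The special values can be read as references that are expanded when the configuration is loaded. The sketch below is one possible interpretation, not the backend code: the function names are hypothetical, the configuration is assumed to be a dictionary keyed by the abbreviations above (DsCB, DsCE, ...), and references are expanded one level only, which is enough because the _Create_ groups accept only _#all_.

```python
from typing import Dict, Iterable, List, Set

NO_GROUP = "#nogroup"
ALL_GROUPS = "#all"


def resolve_special_groups(config: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Expand references such as "#DsCB" into the groups they point to."""
    resolved: Dict[str, Set[str]] = {}
    for name, entries in config.items():
        groups: Set[str] = set()
        for entry in entries:
            if entry == NO_GROUP:
                continue  # default value: nobody gets this permission
            if entry.startswith("#") and entry != ALL_GROUPS:
                # Reference to another special permissions group, e.g. "#DsCB".
                referenced = config.get(entry[1:], [])
                groups |= {g for g in referenced if g != NO_GROUP}
            else:
                groups.add(entry)  # a plain group name, or "#all"
        resolved[name] = groups
    return resolved


def in_special_group(user_groups: Iterable[str], special_group: Set[str]) -> bool:
    """True if "#all" is configured or the user shares a group with the list."""
    return ALL_GROUPS in special_group or bool(set(user_groups) & special_group)
```

For example, with `{"DsCB": ["beamline1"], "DsUB": ["#DsCB", "staff"]}` the _Dataset Update Basic_ list resolves to `beamline1` and `staff`.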

## Authorization special permissions groups

If a user belongs to a group that is listed in any special permissions group, the permissions listed in this table override the standard permissions.
When a cell is empty in the following table, the permissions listed in the standard users table apply.
A user can belong to multiple groups listed in multiple special permissions groups; in that case, the union of all the permissions is applied.

| Operation | Endpoint Authorization | Dataset Read Privileged | Dataset Create Basic | Dataset Create Extended | Dataset Create Privileged | Dataset Update Basic | Dataset Update Privileged | Admin | Dataset Delete Basic | Dataset Delete Privileged | Delete | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Create | _DatasetCreate_ | | Owner, w/o PID<br/>_DatasetCreateOwnerNoPid_ | Owner, w/ PID<br/>_DatasetCreateOwnerWithPid_ | Any<br/>_DatasetCreateAny_ | | | Any<br/>_DatasetCreateAny_ | | | | |
| Read | _DatasetRead_ | Any<br/>_DatasetReadAny_ | | | | | | Any<br/>_DatasetReadAny_ | | | | |
| Update | _DatasetUpdate_ | | | | | Owner<br/>_DatasetUpdateOwner_ | Any<br/>_DatasetUpdateAny_ | Any<br/>_DatasetUpdateAny_ | | | | |
| | | | | | | | | | | | | |
| Delete | _DatasetDelete_ | | | | | | | | Owner<br/>_DatasetDeleteOwner_ | Any<br/>_DatasetDeleteAny_ | Any<br/>_DatasetDeleteAny_ | |
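
The union rule can be illustrated by encoding the non-empty cells of the table as sets. The dictionary and function names below are hypothetical; only the permission names come from the table.

```python
from typing import Dict, Iterable, Set

# Non-empty cells of the table above, keyed by special permissions group.
SPECIAL_PERMISSIONS: Dict[str, Set[str]] = {
    "Dataset Read Privileged": {"DatasetReadAny"},
    "Dataset Create Basic": {"DatasetCreateOwnerNoPid"},
    "Dataset Create Extended": {"DatasetCreateOwnerWithPid"},
    "Dataset Create Privileged": {"DatasetCreateAny"},
    "Dataset Update Basic": {"DatasetUpdateOwner"},
    "Dataset Update Privileged": {"DatasetUpdateAny"},
    "Admin": {"DatasetCreateAny", "DatasetReadAny", "DatasetUpdateAny"},
    "Dataset Delete Basic": {"DatasetDeleteOwner"},
    "Dataset Delete Privileged": {"DatasetDeleteAny"},
    "Delete": {"DatasetDeleteAny"},
}


def special_permissions(memberships: Iterable[str]) -> Set[str]:
    """Union of the permissions of every special group the user is a member of."""
    granted: Set[str] = set()
    for group in memberships:
        granted |= SPECIAL_PERMISSIONS.get(group, set())
    return granted
```

Note that, as encoded in the table, _Admin_ does not grant any delete permission; deletion is granted only through the delete-related groups.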

## Priorities

This section lists the related special permissions groups in order of importance, from the least to the most privileged.
For each operation, a user acquires the permissions of the rightmost group in the list that they belong to.

- Read
  - Anonymous -> Authenticated -> Dataset Read Privileged -> Admin
- Create
  - Anonymous -> Authenticated -> Dataset Create Basic -> Dataset Create Extended -> Dataset Create Privileged -> Admin
- Update
  - Anonymous -> Authenticated -> Dataset Update Basic -> Dataset Update Privileged -> Admin
- Delete
  - Anonymous -> Authenticated -> Dataset Delete Basic -> Dataset Delete Privileged -> Delete
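
A minimal sketch of how the priority chains could be evaluated, assuming the user's memberships are already known as a set of level names. `PRIORITIES` and `effective_level` are hypothetical names used only for this example.

```python
from typing import Dict, Iterable, List

# Priority chains from the list above, least privileged first.
PRIORITIES: Dict[str, List[str]] = {
    "Read": ["Anonymous", "Authenticated", "Dataset Read Privileged", "Admin"],
    "Create": [
        "Anonymous",
        "Authenticated",
        "Dataset Create Basic",
        "Dataset Create Extended",
        "Dataset Create Privileged",
        "Admin",
    ],
    "Update": [
        "Anonymous",
        "Authenticated",
        "Dataset Update Basic",
        "Dataset Update Privileged",
        "Admin",
    ],
    "Delete": [
        "Anonymous",
        "Authenticated",
        "Dataset Delete Basic",
        "Dataset Delete Privileged",
        "Delete",
    ],
}


def effective_level(operation: str, memberships: Iterable[str]) -> str:
    """Return the rightmost level of the chain that the user belongs to."""
    member_of = set(memberships)
    chain = PRIORITIES[operation]
    level = chain[0]  # every caller is at least Anonymous
    for candidate in chain:
        if candidate in member_of:
            level = candidate
    return level
```

With this reading, a user who is both _Authenticated_ and _Admin_ resolves to _Admin_ for create, read and update, but only to _Authenticated_ for delete, because _Admin_ is not part of the delete chain.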

## Environment Variables

The following list presents the environment variables that should be configured to set up the special groups listed in the previous sections.
Each variable is a comma-separated list of the user groups that acquire the special permissions linked to the special group.

- __DATASET_READ_PRIVILEGED_GROUPS__: groups with __Dataset Read Privileged__ permissions
- __DATASET_CREATE_BASIC_GROUPS__: groups with __Dataset Create Basic__ permissions
- __DATASET_CREATE_EXTENDED_GROUPS__: groups with __Dataset Create Extended__ permissions
- __DATASET_CREATE_PRIVILEGED_GROUPS__: groups with __Dataset Create Privileged__ permissions
- __DATASET_UPDATE_BASIC_GROUPS__: groups with __Dataset Update Basic__ permissions
- __DATASET_UPDATE_PRIVILEGED_GROUPS__: groups with __Dataset Update Privileged__ permissions
- __DATASET_DELETE_BASIC_GROUPS__: groups with __Dataset Delete Basic__ permissions
- __DATASET_DELETE_PRIVILEGED_GROUPS__: groups with __Dataset Delete Privileged__ permissions
- __ADMIN_GROUPS__: groups with __Admin__ permissions. This variable affects all the sub-systems.
- __DELETE_GROUPS__: groups with __Delete__ permissions. This variable affects all the sub-systems.
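
As an illustration of the comma-separated format, the variables could be read as follows. This is a sketch, not the backend's configuration loader: the function names are hypothetical, and trimming whitespace around group names and treating an unset variable as an empty list are assumptions.

```python
import os
from typing import Dict, List, Mapping, Optional

SPECIAL_GROUP_VARIABLES = [
    "DATASET_READ_PRIVILEGED_GROUPS",
    "DATASET_CREATE_BASIC_GROUPS",
    "DATASET_CREATE_EXTENDED_GROUPS",
    "DATASET_CREATE_PRIVILEGED_GROUPS",
    "DATASET_UPDATE_BASIC_GROUPS",
    "DATASET_UPDATE_PRIVILEGED_GROUPS",
    "DATASET_DELETE_BASIC_GROUPS",
    "DATASET_DELETE_PRIVILEGED_GROUPS",
    "ADMIN_GROUPS",
    "DELETE_GROUPS",
]


def parse_group_list(raw: Optional[str]) -> List[str]:
    """Split a comma-separated value, dropping empty items and extra spaces."""
    return [item.strip() for item in (raw or "").split(",") if item.strip()]


def load_special_groups(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, List[str]]:
    """Read every special-group variable from the given environment."""
    env = os.environ if environ is None else environ
    return {name: parse_group_list(env.get(name)) for name in SPECIAL_GROUP_VARIABLES}
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests, for example `load_special_groups({"DATASET_CREATE_PRIVILEGED_GROUPS": "ingestor"})`.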

## Legacy

The legacy datasets special permissions environment variables are marked obsolete and will be removed in the future.
In the meantime, they are mapped to the matching new variable.
Here is the map:

- Create Dataset Groups ( CREATE_DATASET_GROUPS ) -> Dataset Create Basic
- Create Dataset with PID Group ( CREATE_DATASET_WITH_PID_GROUPS ) -> Dataset Create Extended
- Create Dataset Privileged ( CREATE_DATASET_PRIVILEGED_GROUPS ) -> Dataset Create Privileged
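
The mapping can be sketched as a simple rename step applied before the configuration is read. The function name is hypothetical, and the precedence rule used here (a new variable that is set and non-empty wins over its legacy counterpart) is an assumption, since this document does not state which one takes priority.

```python
from typing import Dict, Mapping

# Legacy variable -> new variable, as listed above.
LEGACY_VARIABLES: Dict[str, str] = {
    "CREATE_DATASET_GROUPS": "DATASET_CREATE_BASIC_GROUPS",
    "CREATE_DATASET_WITH_PID_GROUPS": "DATASET_CREATE_EXTENDED_GROUPS",
    "CREATE_DATASET_PRIVILEGED_GROUPS": "DATASET_CREATE_PRIVILEGED_GROUPS",
}


def apply_legacy_variables(environ: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the environment with legacy values copied to new names."""
    result = dict(environ)
    for legacy_name, new_name in LEGACY_VARIABLES.items():
        # Assumption: only fill the new variable when it is unset or empty.
        if legacy_name in result and not result.get(new_name):
            result[new_name] = result[legacy_name]
    return result
```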

## Use cases and configuration examples

This section includes a few of the many use cases that the community has come across and found informative.
Each use case provides the settings for each special permissions group in isolation.
In a production setup, each special permissions group will contain a list of multiple groups that is the union of the individual cases.

### Data ingestion with creation only

#### Description

We need a functional account that allows the ingestion process to create datasets for any group, so it can ingest datasets regardless of who owns them.

#### Configuration

##### Accounts

- username: ingestor
- group: ingestor

##### Special permissions groups

- DATASET_READ_PRIVILEGED_GROUPS = ""
- DATASET_CREATE_BASIC_GROUPS = ""
- DATASET_CREATE_EXTENDED_GROUPS = ""
- DATASET_CREATE_PRIVILEGED_GROUPS = "ingestor"
- DATASET_UPDATE_BASIC_GROUPS = ""
- DATASET_UPDATE_PRIVILEGED_GROUPS = ""
- DATASET_DELETE_BASIC_GROUPS = ""
- DATASET_DELETE_PRIVILEGED_GROUPS = ""
- ADMIN_GROUPS = ""
- DELETE_GROUPS = ""

### Data ingestion with creation and update

#### Description

We need a functional account that allows the ingestion process to create and update datasets for any group, so it can ingest datasets regardless of who owns them and also perform additional updates at a later time.

#### Configuration

##### Accounts

- username: ingestor
- group: ingestor

##### Special permissions groups

- DATASET_READ_PRIVILEGED_GROUPS = ""
- DATASET_CREATE_BASIC_GROUPS = ""
- DATASET_CREATE_EXTENDED_GROUPS = ""
- DATASET_CREATE_PRIVILEGED_GROUPS = "ingestor"
- DATASET_UPDATE_BASIC_GROUPS = ""
- DATASET_UPDATE_PRIVILEGED_GROUPS = "ingestor"
- DATASET_DELETE_BASIC_GROUPS = ""
- DATASET_DELETE_PRIVILEGED_GROUPS = ""
- ADMIN_GROUPS = ""
- DELETE_GROUPS = ""

### Post-ingestion tasks workflow

#### Description

We need to set up a workflow to run post-ingestion tasks. The process needs to be able to list any dataset that has a specific value in its keywords field, regardless of which group owns the dataset. Once the list is retrieved, the process performs the configured tasks (such as determining the end of the embargo period, or computing aggregations or statistics on the data) and saves the results back to the dataset as additional scientific metadata.

#### Configuration

##### Accounts

- username: post_ingestion_tasks
- group: post_ingestion_tasks

##### Special permissions groups

- DATASET_READ_PRIVILEGED_GROUPS = "post_ingestion_tasks"
- DATASET_CREATE_BASIC_GROUPS = ""
- DATASET_CREATE_EXTENDED_GROUPS = ""
- DATASET_CREATE_PRIVILEGED_GROUPS = ""
- DATASET_UPDATE_BASIC_GROUPS = ""
- DATASET_UPDATE_PRIVILEGED_GROUPS = "post_ingestion_tasks"
- DATASET_DELETE_BASIC_GROUPS = ""
- DATASET_DELETE_PRIVILEGED_GROUPS = ""
- ADMIN_GROUPS = ""
- DELETE_GROUPS = ""

### Automatic workflow to delete obsolete datasets

#### Description

We need to set up a workflow to delete datasets that are more than 10 years old and are marked with the keyword _obsolete_.
The process needs to list all datasets that contain the value _obsolete_ in the keywords field and have a creation time more than 10 years before today. Once the list has been retrieved, the process iterates through it and executes a delete command on each dataset.

#### Configuration

##### Accounts

- username: delete_obsolete_datasets
- group: delete_obsolete_datasets

##### Special permissions groups

- DATASET_READ_PRIVILEGED_GROUPS = "delete_obsolete_datasets"
- DATASET_CREATE_BASIC_GROUPS = ""
- DATASET_CREATE_EXTENDED_GROUPS = ""
- DATASET_CREATE_PRIVILEGED_GROUPS = ""
- DATASET_UPDATE_BASIC_GROUPS = ""
- DATASET_UPDATE_PRIVILEGED_GROUPS = ""
- DATASET_DELETE_BASIC_GROUPS = ""
- DATASET_DELETE_PRIVILEGED_GROUPS = "delete_obsolete_datasets"
- ADMIN_GROUPS = ""
- DELETE_GROUPS = ""
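
The selection step of this workflow can be sketched as follows. The example deliberately leaves out the HTTP calls: it assumes the datasets have already been listed (which requires _Dataset Read Privileged_) and returns the pids to pass to `DELETE Datasets/_pid_` (which requires _Dataset Delete Privileged_). The function names are hypothetical, and the field names `pid`, `keywords` and `creationTime`, with an ISO 8601 timestamp that includes a UTC offset, are assumptions about the dataset records.

```python
from datetime import datetime
from typing import Any, Iterable, List, Mapping


def years_before(moment: datetime, years: int) -> datetime:
    """Return the same date and time the given number of years earlier."""
    try:
        return moment.replace(year=moment.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return moment.replace(year=moment.year - years, day=28)


def select_obsolete(
    datasets: Iterable[Mapping[str, Any]],
    now: datetime,
    keyword: str = "obsolete",
    max_age_years: int = 10,
) -> List[str]:
    """Return the pids of datasets tagged with the keyword and older than the cutoff."""
    cutoff = years_before(now, max_age_years)
    selected: List[str] = []
    for dataset in datasets:
        # Assumption: creationTime carries an offset, so it compares with an aware `now`.
        created = datetime.fromisoformat(dataset["creationTime"])
        if keyword in dataset.get("keywords", []) and created < cutoff:
            selected.append(dataset["pid"])
    return selected
```

Keeping the selection separate from the delete calls makes it possible to review the list of pids before anything is removed.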