
Import Bio-ORACLE v3.0 ocean environmental layers (present + future climate) #53

@boettiger-lab-llm-agent

Dataset

Bio-ORACLE v3.0 — Global marine environmental rasters at 0.05° (~5.5 km) resolution, present-day baselines (2000–2020) and CMIP6 future projections (2020–2100) under 6 SSP scenarios.

Variables (19 + terrain)

| Group | Variables |
| --- | --- |
| Temperature | Sea water temperature (thetao), air temperature (tas) |
| Chemistry | Salinity (so), pH (ph), dissolved O₂ (o2), nitrate (no3), phosphate (po4), silicate (si), dissolved iron (dfe) |
| Productivity | Chlorophyll (chl), phytoplankton carbon (phyc), PAR (par), attenuation coefficient (kdpar) |
| Circulation | Sea water speed (sws), direction (swd), mixed layer depth (mlotst) |
| Ice/Cloud | Sea ice thickness (sithick), sea ice cover (siconc), cloud cover (clt) |
| Terrain | Bathymetry (min/mean/max), slope, aspect, topographic position index, terrain ruggedness, coastline, landmass |

Each variable has 6 statistics: max, mean, min, ltmax (long-term maximum), ltmin (long-term minimum), range.

Depth levels: surface (depthSurf), benthic (depthMax, depthMean, depthMin).

Temporal coverage:

  • Present: 2000–2010, 2010–2020 baselines
  • Future: decadal steps 2020–2100 under SSP1-1.9, SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP4-6.0, SSP5-8.5

Total datasets on ERDDAP server: ~356 (variable × depth × time period × scenario combinations).
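The ~356 figure can be checked against the server's dataset index (the `info/index.json` endpoint listed under Technical specs). The sketch below assumes ERDDAP's usual JSON table layout (`table.columnNames` / `table.rows`) with a `Dataset ID` column, and assumes that the variable name is the part of the dataset ID before the first underscore, as in `thetao_baseline_2000_2020_depthsurf`. The helper names are illustrative, not part of any existing tool.

```python
import json
from urllib.request import urlopen

INDEX_URL = "https://erddap.bio-oracle.org/erddap/info/index.json?page=1&itemsPerPage=2000"


def fetch_index(url=INDEX_URL):
    """Download the ERDDAP dataset index as a parsed JSON object."""
    with urlopen(url) as resp:
        return json.load(resp)


def dataset_ids(index):
    """Extract dataset IDs, assuming a 'Dataset ID' column in the JSON table."""
    table = index["table"]
    col = table["columnNames"].index("Dataset ID")
    return [row[col] for row in table["rows"]]


def group_by_variable(ids):
    """Group IDs by the prefix before the first underscore (assumed to be the variable)."""
    groups = {}
    for ds in ids:
        variable = ds.split("_", 1)[0]
        groups.setdefault(variable, []).append(ds)
    return groups
```

Calling `group_by_variable(dataset_ids(fetch_index()))` would give a per-variable inventory to compare against the scope proposed below.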

Technical specs

  • Resolution: 0.05° (~5.5 km), global 7200 × 3600 grid
  • CRS: WGS84 (EPSG:4326)
  • Native format: NetCDF-3, also served as GeoTIFF directly from ERDDAP
  • No authentication required; the ERDDAP REST API is fully curl/wget-friendly:
    # Full global GeoTIFF of mean sea surface temperature from the 2000–2020 baseline
    # dataset (time slice 2000-01-01). -o names the output file explicitly; with -O,
    # some curl versions include the query string in the saved filename.
    curl -o thetao_mean_baseline_depthsurf.tif "https://erddap.bio-oracle.org/erddap/griddap/thetao_baseline_2000_2020_depthsurf.geotif?thetao_mean%5B(2000-01-01T00:00:00Z)%5D%5B(-89.975):1:(89.975)%5D%5B(-179.975):1:(179.975)%5D"
  • Full dataset list: https://erddap.bio-oracle.org/erddap/info/index.json?page=1&itemsPerPage=2000
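Because every layer needs its own percent-encoded griddap URL, it may help to generate these rather than hand-edit them. The following is a minimal sketch that reproduces the URL shape used in the curl example above; the function name and default bounds are assumptions taken from that one example, and other datasets may use different time stamps or axes.

```python
from urllib.parse import quote

ERDDAP_BASE = "https://erddap.bio-oracle.org/erddap/griddap"


def griddap_geotiff_url(dataset_id, variable, time_iso,
                        lat=(-89.975, 89.975), lon=(-179.975, 179.975), stride=1):
    """Build a griddap GeoTIFF request for one variable and one time slice."""
    # griddap constraints: [(time)][(lat_min):stride:(lat_max)][(lon_min):stride:(lon_max)]
    time_part = f"[({time_iso})]"
    lat_part = f"[({lat[0]}):{stride}:({lat[1]})]"
    lon_part = f"[({lon[0]}):{stride}:({lon[1]})]"
    # Percent-encode only the square brackets; keep parentheses and colons readable.
    constraints = quote(time_part + lat_part + lon_part, safe="():")
    return f"{ERDDAP_BASE}/{dataset_id}.geotif?{variable}{constraints}"


url = griddap_geotiff_url(
    "thetao_baseline_2000_2020_depthsurf", "thetao_mean", "2000-01-01T00:00:00Z"
)
print(url)
```

Narrowing `lat` and `lon` gives a regional subset, which is a cheap way to test the pipeline before pulling full global layers.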

Proposed processing

Scope question

With ~356 datasets, importing everything is a large job. The likely approach is to be selective and prioritize:

  1. Present-day baselines (2000–2020) for the most ecologically informative variables: thetao, so, o2, ph, chl, no3, bathymetry
  2. Surface + benthic depth variants
  3. Future scenarios for a key variable (e.g., temperature) under SSP1-2.6 and SSP5-8.5
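To get a feel for how many layers the first two priorities imply, the combinations can be enumerated. This is a rough upper bound only: it assumes every prioritized variable is available at every depth level with all six statistics, which may not hold on the server, and it leaves bathymetry (a terrain layer) and the future scenarios out.

```python
from itertools import product

# Prioritized variables from item 1, excluding bathymetry (terrain is handled separately)
VARIABLES = ["thetao", "so", "o2", "ph", "chl", "no3"]
# Depth levels and statistics as listed in the Dataset section
DEPTHS = ["depthSurf", "depthMax", "depthMean", "depthMin"]
STATS = ["max", "mean", "min", "ltmax", "ltmin", "range"]


def baseline_layers(variables=VARIABLES, depths=DEPTHS, stats=STATS):
    """Enumerate (variable, depth, statistic) combinations for the baseline period."""
    return list(product(variables, depths, stats))


all_layers = baseline_layers()
mean_only = baseline_layers(stats=["mean"])
print(len(all_layers), "layers with all statistics")
print(len(mean_only), "layers with the mean only")
```

Restricting to one statistic cuts the count by a factor of six, so the choice of statistics matters as much as the choice of variables.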

Workflow

Each layer is a single-band global GeoTIFF, so the standard raster-workflow applies:

cng-datasets raster-workflow \
  --dataset bio-oracle/thetao-mean-surface \
  --source-url <erddap-geotif-url> \
  --bucket public-bio-oracle \
  --h3-resolution 8 \
  --value-column thetao_mean \
  --output-dir catalog/bio-oracle/k8s/thetao-mean-surface
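With many layers, the command above could be generated per layer instead of typed by hand. The sketch below only assembles the command string from the flags shown above; it does not know anything else about `cng-datasets`, and the naming scheme (`bio-oracle/<name>`, `catalog/bio-oracle/k8s/<name>`) is copied from that single example.

```python
import shlex


def raster_workflow_command(name, source_url, value_column,
                            bucket="public-bio-oracle", h3_resolution=8):
    """Assemble one raster-workflow invocation as a shell-safe string."""
    args = [
        "cng-datasets", "raster-workflow",
        "--dataset", f"bio-oracle/{name}",
        "--source-url", source_url,
        "--bucket", bucket,
        "--h3-resolution", str(h3_resolution),
        "--value-column", value_column,
        "--output-dir", f"catalog/bio-oracle/k8s/{name}",
    ]
    # shlex.join quotes the URL, whose parentheses and '?' would otherwise confuse the shell
    return shlex.join(args)


cmd = raster_workflow_command(
    "thetao-mean-surface",
    "https://erddap.bio-oracle.org/erddap/griddap/thetao_baseline_2000_2020_depthsurf.geotif"
    "?thetao_mean%5B(2000-01-01T00:00:00Z)%5D%5B(-89.975):1:(89.975)%5D%5B(-179.975):1:(179.975)%5D",
    "thetao_mean",
)
print(cmd)
```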

Bucket: public-bio-oracle (new bucket)
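One thing that may be worth checking before running the workflow is how `--h3-resolution 8` compares with the 0.05° source grid. The sketch below is back-of-envelope arithmetic, not part of the workflow: the average hexagon areas are rounded values from the H3 documentation's resolution table, and 111.32 km per degree is the usual equatorial approximation.

```python
import math

# Average H3 hexagon area in km^2 per resolution (rounded, from the H3 resolution table)
H3_AVG_AREA_KM2 = {4: 1770.35, 5: 252.90, 6: 36.129, 7: 5.161, 8: 0.737, 9: 0.105}
KM_PER_DEGREE = 111.32  # approximate length of one degree at the equator


def pixel_area_km2(resolution_deg=0.05, latitude_deg=0.0):
    """Approximate ground area of one raster pixel at a given latitude."""
    side_ns = resolution_deg * KM_PER_DEGREE
    side_ew = side_ns * math.cos(math.radians(latitude_deg))
    return side_ns * side_ew


def cells_per_pixel(h3_resolution, latitude_deg=0.0):
    """How many average H3 cells of this resolution fit in one raster pixel."""
    return pixel_area_km2(latitude_deg=latitude_deg) / H3_AVG_AREA_KM2[h3_resolution]


def closest_h3_resolution(latitude_deg=0.0):
    """H3 resolution whose average cell area is closest (in log ratio) to the pixel area."""
    area = pixel_area_km2(latitude_deg=latitude_deg)
    return min(H3_AVG_AREA_KM2, key=lambda r: abs(math.log(H3_AVG_AREA_KM2[r] / area)))


print(round(cells_per_pixel(8)), "resolution-8 cells per pixel at the equator")
print("closest resolution by area:", closest_h3_resolution())
```

At the equator this works out to roughly 42 resolution-8 cells per raster pixel, with resolution 6 closest in area. Resolution 8 may still be the right choice if it matches other H3-indexed layers in the catalog, but it replicates each pixel value many times rather than adding detail.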

Key use case

These layers are the canonical covariates for marine species distribution modeling (SDM). Combined with OBIS/GBIF occurrences (already in the catalog), they enable habitat suitability analysis for marine species directly in DuckDB via H3 joins.

Related issues / datasets

Labels: high-seas (Datasets for the High Seas app)
