ci: switch to FESOM-org docker image + add fesom2_xios end-to-end run-test #902
Merged
Replaces the externally-owned ghcr.io/suvarchal/fesom2-ci:latest with ghcr.io/fesom/fesom2_docker:fesom2_ci-master, the FESOM-org-owned image built from FESOM/FESOM2_Docker that ships OASIS3-MCT prebuilt at /oasis and XIOS 2.5 prebuilt at /xios. Adds an 'xios' CMake preset (FESOM_WITH_XIOS=ON, otherwise standalone defaults) and a corresponding matrix cell that copies /xios into the workspace, exports XIOS_ROOT, and exercises the FESOM_WITH_XIOS=ON code path which had no CI coverage before. The cell is build-only — XIOS output isn't run in CI, so its ctest step is allowed to fail (continue-on-error). fail-fast is set to false so a regression in one matrix cell doesn't cancel the others.
…ings The openmp workflow was the only one still pointing at the -nightly tag, which is only refreshed by a monthly cron, so the setuptools<80 fix from FESOM2_Docker#4 (which only rebuilt -master) hadn't reached it. Aligns with fesom2_recom.yml, fesom2_cavities.yml etc.
…n-test
The xios cell in fesom2_build_tests was build-only (no XIOS server in CI),
duplicating coverage now provided by fesom2_xios.yml which both compiles
and runs FESOM with FESOM_WITH_XIOS=ON on the pi mesh, mirroring the
fesom2_recom / fesom2_cavities / fesom2_main pattern.
Drops the xios preset from CMakePresets.json (no other consumer) and
removes the xios matrix entry + the now-unused 'Copy XIOS directory'
step + XIOS_ROOT env from fesom2_build_tests.yml.
The new fesom2_xios.yml uses ghcr.io/fesom/fesom2_docker:fesom2_test_refactoring-master
(which now ships XIOS at /xios after FESOM2_Docker#5), copies the prebuilt
XIOS into the workspace, runs ./configure.sh ubuntu -DFESOM_WITH_XIOS=ON,
mkrun pi test_pi -m docker, stages docs/xios_xml/{context,field_def,file_def}_fesom.xml
plus a minimal standalone iodef.xml (using_server=false, no oasis), runs
./job_docker_new, and verifies XIOS produced sst.fesom*.nc.
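For orientation, a minimal standalone iodef.xml of the kind described above could be generated roughly as follows. This is a sketch, not the file from this PR: using_server and using_oasis are standard XIOS parameters, but the model context id "fesom", the info_level value, and the helper name write_standalone_iodef are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def write_standalone_iodef(path="iodef.xml", context_src="./context_fesom.xml"):
    """Write a minimal iodef.xml for attached mode (no XIOS server, no OASIS)."""
    sim = ET.Element("simulation")

    # XIOS reads its own runtime parameters from the reserved "xios" context.
    xios = ET.SubElement(sim, "context", id="xios")
    var_def = ET.SubElement(xios, "variable_definition")
    params = ET.SubElement(var_def, "variable_group", id="parameters")
    for name, vtype, value in [
        ("using_server", "bool", "false"),  # attached mode: no separate server ranks
        ("using_oasis", "bool", "false"),   # standalone: no coupler
        ("info_level", "int", "0"),         # assumed value, keeps XIOS logging quiet
    ]:
        var = ET.SubElement(params, "variable", id=name, type=vtype)
        var.text = value

    # Assumption: the model context is named "fesom" and is defined in context_src.
    ET.SubElement(sim, "context", id="fesom", src=context_src)

    ET.ElementTree(sim).write(path, encoding="utf-8", xml_declaration=True)
    return path
```

Calling write_standalone_iodef() in the run directory after staging the reference XMLs would overwrite any coupled iodef.xml with the attached-mode variant.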
XIOS context_fesom.xml references axis_def_fesom.xml, domain_def_fesom.xml, and grid_def_fesom.xml in addition to field/file_def. The previous Stage XIOS XMLs step only copied three of them, so XIOS aborted at startup with 'Can not open <./axis_def_fesom.xml> file'. Copy them all and overwrite iodef.xml with the standalone variant.
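A pre-flight check along the following lines would surface a missing include before XIOS starts. It is a sketch under the assumption that includes are expressed as src attributes in the context file; the function name missing_includes is hypothetical and not part of the workflow.

```python
import os
import xml.etree.ElementTree as ET


def missing_includes(context_xml, run_dir="."):
    """Return the src= files referenced by an XIOS context file that are absent from run_dir."""
    root = ET.parse(context_xml).getroot()
    missing = []
    for elem in root.iter():
        src = elem.get("src")
        # Relative paths such as ./axis_def_fesom.xml are resolved against run_dir.
        if src and not os.path.isfile(os.path.join(run_dir, src)):
            missing.append(src)
    return missing
```

A non-empty return value from missing_includes("context_fesom.xml") lists exactly the files that still need to be copied into the run directory.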
io_xios.F90:274 unconditionally calls xios_set_axis_attr("std_dens", ...)
to populate the dMOC density coordinate, but the bundled axis_def_fesom.xml
only declared the vertical axes (nz, nz1). XIOS therefore threw CException
'axis std_dens not found' at xios_close_context_definition, aborting any
FESOM_WITH_XIOS=ON run that started from these reference XMLs.
Add std_dens as a top-level axis (not in the Z-axis group, since it carries
density not depth). Discovered by the new fesom2_xios.yml CI workflow.
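The same class of error can be caught statically by comparing the axis ids the Fortran side expects against those declared in the XML. The sketch below uses an illustrative axis definition; the group id and the bare attribute set are assumptions, since the actual contents of axis_def_fesom.xml are not shown here.

```python
import xml.etree.ElementTree as ET


def declared_axis_ids(axis_def_xml_text):
    """Collect every <axis id=...>, whether top-level or nested in an <axis_group>."""
    root = ET.fromstring(axis_def_xml_text)
    return {axis.get("id") for axis in root.iter("axis") if axis.get("id")}


# Illustrative layout only: std_dens sits at top level, outside the vertical-axis group.
EXAMPLE_AXIS_DEF = """
<axis_definition>
  <axis_group id="z_axes">
    <axis id="nz"/>
    <axis id="nz1"/>
  </axis_group>
  <axis id="std_dens"/>
</axis_definition>
"""

# Axes named in the commit message: the vertical axes plus the dMOC density axis.
REQUIRED_AXES = {"nz", "nz1", "std_dens"}
missing = REQUIRED_AXES - declared_axis_ids(EXAMPLE_AXIS_DEF)
print("missing axes:", sorted(missing))
```

With the std_dens line removed from the example, the set difference would contain std_dens, matching the failure seen at xios_close_context_definition.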
Replaces the bundled docs/xios_xml/file_def_fesom.xml at runtime with a 6-field subset (sst, a_ice, temp, salt, u, v) matching what mkfesom/settings/test_pi/setup.yml flags for the standard pi-mesh bit-identical CI check. The bundled file_def lists ~50 fields, several of which have stale grid_ref/prec/shape entries that crash xios_send_field with an opaque CException. Verified end-to-end on Levante (no-OASIS XIOS 2.5 + intel + openmpi, 2 ranks, pi mesh, 1 day, CORE2 forcing, woa18 climatology): all 6 xios-output netcdf files produced per rank, see SLURM job 24651037.
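The "all 6 files per rank" condition can be expressed as a short presence check. This is a sketch assuming the sst.fesom*.nc naming used by the workflow applies to every field; the function name and return shape are hypothetical.

```python
import glob
import os

# The six fields of the reduced file_def.
FIELDS = ("sst", "a_ice", "temp", "salt", "u", "v")


def fields_with_wrong_file_count(out_dir, n_ranks, fields=FIELDS):
    """Return {field: files_found} for every field whose file count differs from n_ranks."""
    wrong = {}
    for var in fields:
        found = glob.glob(os.path.join(out_dir, f"{var}.fesom*.nc"))
        if len(found) != n_ranks:
            wrong[var] = len(found)
    return wrong
```

An empty dictionary means every field produced one file per rank; any entry names the field and how many files were actually found.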
Adds an inline python check that mirrors mkfesom's fcheck but handles the per-rank XIOS file naming (<var>.fesom_<startyear>-<endyear>_<rank>.nc) that fcheck doesn't know about: glob all rank files per variable, masked concat, mean, compare to a hardcoded reference at abs<1e-3. Reference means were measured on Levante (intel + openmpi, 2-rank pi mesh, 1 day, CORE2 forcing, woa18 climatology, FESOM2_Docker#6 patched XIOS) so the CI's gfortran-side numbers may differ at the ULP level — the 1e-3 tolerance absorbs intel-vs-gfortran rounding on a first pass. Tighten to 1e-12 once the gfortran-side reference is known.
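The check described above can be sketched as follows. This is not the inline script from the workflow: the function names are hypothetical, the default loader assumes the netCDF4 and numpy packages are available in the container, and the reference values are left to the caller rather than reproduced here.

```python
import glob
import math
import os


def load_valid_values(path, var):
    """Read one per-rank file and return its unmasked, finite values as a flat list."""
    # Imported lazily so the comparison logic can be exercised without these packages.
    import numpy as np
    from netCDF4 import Dataset

    with Dataset(path) as ds:
        data = np.ma.masked_invalid(ds.variables[var][:])
    return data.compressed().tolist()


def global_mean(out_dir, var, loader=load_valid_values):
    """Concatenate the valid values of all rank files for var and return their mean."""
    paths = sorted(glob.glob(os.path.join(out_dir, f"{var}.fesom_*.nc")))
    values = []
    for path in paths:
        values.extend(loader(path, var))
    if not values:
        raise FileNotFoundError(f"no valid data found for {var!r} in {out_dir}")
    return math.fsum(values) / len(values)


def compare_to_reference(out_dir, reference, tol=1e-3, loader=load_valid_values):
    """Return {var: (mean, ref)} for every variable with |mean - ref| >= tol."""
    failures = {}
    for var, ref in reference.items():
        mean = global_mean(out_dir, var, loader)
        if abs(mean - ref) >= tol:
            failures[var] = (mean, ref)
    return failures
```

Passing the loader as a parameter keeps the glob, concatenate, and compare logic testable without real NetCDF output. Note that a plain concatenation counts any values duplicated across rank files more than once, which is acceptable only because the reference means are computed the same way.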
Ready for review. ty! :)
sebastianbeyer approved these changes on May 3, 2026
- Switches the CI image from `ghcr.io/suvarchal/fesom2-ci:latest` to the FESOM-org-owned `ghcr.io/fesom/fesom2_docker:fesom2_ci-master`, built from FESOM/FESOM2_Docker#3. Same toolchain (ubuntu 22.04 + GCC + OpenMPI + NetCDF/HDF5 + LAPACK + OASIS3-MCT prebuilt at `/oasis`); the recipe was reverse-engineered from suvarchal's image so behavior is identical for the existing five matrix cells (default, coupled, coupled_yac, recom, ifs_interface).
- Adds a `fesom2_xios.yml` end-to-end run-test mirroring the structure of `fesom2_recom.yml` / `fesom2_cavities.yml`. It runs inside `ghcr.io/fesom/fesom2_docker:fesom2_test_refactoring-master` (XIOS 2.5 install at `/xios` added in FESOM2_Docker#5; attached-mode finalize segv fixed in #6), compiles FESOM with `./configure.sh ubuntu -DFESOM_WITH_XIOS=ON`, runs `mkrun pi test_pi -m docker`, stages a minimal standalone `iodef.xml` (`using_server=false`, no OASIS) plus a 6-field `file_def_fesom.xml` (`sst`, `a_ice`, `temp`, `salt`, `u`, `v`, matching the field set of the standard `test_pi` bit-identical check), runs `./job_docker_new`, and verifies that all six XIOS-output netcdf files were produced per rank.
- Aligns `fesom2_openmp.yml` with the other run-test workflows by switching its container tag from `fesom2_test_refactoring-nightly` to `-master`. The nightly tag is only refreshed by a monthly cron, which let it lag behind master on the recent setuptools / pkg_resources fix from FESOM2_Docker#4.
- Adds a `std_dens` axis declaration to `docs/xios_xml/axis_def_fesom.xml`. `src/io_xios.F90:274` calls `xios_set_axis_attr("std_dens", ...)` unconditionally, to populate the dMOC density coordinate from `std_dens_N` / `std_dens` in `oce_dens_MOC`, so the axis must exist in the registry even when `ldiag_dMOC=.false.`. Without the declaration, `xios_close_context_definition` aborts with `CException: axis std_dens not found`. Discovered by the new `fesom2_xios.yml` workflow.