deep_obs is a Python library implementing the methods described in the paper "Deep-Learned Observation Operators for Artificial Intelligence Weather Forecasting Models." It provides a framework for replacing classic, physics-based observation operators, like the Community Radiative Transfer Model (CRTM), with deep-learned emulators or alternatives. Specifically, code in this repository utilizes data from NOAA's UFS replay historical reanalysis dataset to train observation operators for the Advanced Technology Microwave Sounder (ATMS) sensor.
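In numerical weather prediction (NWP) and data assimilation, the observation operator maps the model state into observation space so that model output can be compared directly with measurements.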
This library enables training three types of deep-learned models.
- Models that emulate physics-based observation operators: These models learn to map the background model states to the CRTM-simulated observations (i.e., $f(x^b)=y^b$).
- Models that replace physics-based observation operators: These models learn to map the background model states to the actual observations (i.e., $f(x^b)=y^o$).
- Models that predict the innovation directly: We show that we can more accurately emulate CRTM by predicting the innovations, which are used by the data assimilation model, directly from the model state and actual observations (i.e., $f(x^b, y^o)=y^i$).
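To make the three targets concrete, the sketch below builds the training target for each model type from one toy sample. It assumes the usual data-assimilation definition of the innovation, $y^i = y^o - y^b$ (observation minus background simulation), which the list above does not spell out. The function name `build_targets` and the brightness-temperature values are hypothetical and are not part of deep_obs.

```python
from typing import Dict, List


def build_targets(y_b: List[float], y_o: List[float]) -> Dict[str, List[float]]:
    """Return the regression target for each of the three model types.

    y_b: CRTM-simulated observations for one sample, one value per channel.
    y_o: actual observations for the same sample and channels.
    """
    if len(y_b) != len(y_o):
        raise ValueError("y_b and y_o must have one value per channel")
    # Assumption: innovation = actual observation minus simulated observation.
    innovation = [o - b for o, b in zip(y_o, y_b)]
    return {
        "emulate": list(y_b),       # f(x^b) = y^b
        "replace": list(y_o),       # f(x^b) = y^o
        "innovation": innovation,   # f(x^b, y^o) = y^i
    }


# Toy brightness temperatures in kelvin for three channels (made-up values).
targets = build_targets(y_b=[250.0, 260.0, 270.0], y_o=[251.0, 258.0, 270.0])
print(targets["innovation"])
```

Note that only the third model type receives $y^o$ as an input in addition to the model state; the first two see the model state alone.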
Below is an example of how to install an editable version of the deep_obs package.
1. Clone this repository.
git clone https://git.codev.mitre.org/scm/eida/deep-obs.git
2. In the directory of this repository, install an editable version of the deep_obs package using pyproject.toml.
cd deep-obs
python -m pip install -e .
- Download the data.
download_from_ufs_replay.py downloads specified data from UFS replay to your local system. This example command downloads the data from 2022 used in the paper.
python src/deep_obs/download_from_ufs_replay.py \
--dest_dir data/ufs_replay
- Preprocess the data.
preprocess_data.py extracts the input and output features needed for each observation and saves them in Parquet format for efficient training and testing. This example command preprocesses the data downloaded in the previous step.
python src/deep_obs/preprocess_data.py \
--original_dir data/ufs_replay \
--preprocessed_dir data/2022_preprocessed
- Generate configs.
generate_configs.py splits the data into train and test sets and calculates the mean and standard deviation of each feature in the train set. This example command generates the configurations for the clear-sky ocean scene from 2022.
python src/deep_obs/generate_configs.py \
--preprocessed_dir data/2022_preprocessed \
--data_config_dir data_configs/2022_valid_ocean_clear \
--constraints valid_obs ocean_only clear_sky
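The split-then-normalize step described above can be sketched as follows. This is a minimal illustration and not the logic in generate_configs.py: the 80/20 random split, the fixed seed, the population standard deviation, the feature names, and every function name are assumptions. What it shows is that the mean and standard deviation come from the train split only, so no test-set information leaks into the normalization.

```python
import random
import statistics
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, float]


def split_rows(rows: Sequence[Row], test_fraction: float = 0.2,
               seed: int = 0) -> Tuple[List[Row], List[Row]]:
    """Shuffle rows reproducibly and split them into train and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def feature_stats(train: Sequence[Row]) -> Dict[str, Dict[str, float]]:
    """Compute the per-feature mean and population std from the train split."""
    stats = {}
    for name in train[0]:
        values = [row[name] for row in train]
        stats[name] = {
            "mean": statistics.fmean(values),
            "std": statistics.pstdev(values),
        }
    return stats


def normalize(row: Row, stats: Dict[str, Dict[str, float]]) -> Row:
    """Standardize one row using train-split statistics."""
    return {
        # Fall back to 1.0 so a constant feature does not divide by zero.
        name: (value - stats[name]["mean"]) / (stats[name]["std"] or 1.0)
        for name, value in row.items()
    }


# Ten toy observations with two hypothetical features.
rows = [{"temperature": 270.0 + i, "humidity": 0.1 * i} for i in range(10)]
train, test = split_rows(rows)
stats = feature_stats(train)
normalized_test = [normalize(row, stats) for row in test]
print(len(train), len(test))
```

The same `stats` dictionary is applied to both splits; recomputing it on the test split would make test metrics incomparable with training.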
train.py trains models using specified configurations. This example command

- uses the config configs/innovations_127.yaml to train a model that predicts innovations using all 127 model levels available in the UFS replay model states and
- trains the model on the clear-sky ocean scene using the configs in data_configs/2022_valid_ocean_clear that were generated in the previous step.
python src/deep_obs/train.py \
--config configs/innovations_127.yaml \
--preprocessed_dir data/2022_preprocessed \
--data_config_dir data_configs/2022_valid_ocean_clear \
--run_name innovations_127_ocean_clear
test.py tests pre-trained models on specified datasets. This example command tests the model that we trained in the previous step on the test set.
python src/deep_obs/test.py \
--config configs/innovations_127.yaml \
--preprocessed_dir data/2022_preprocessed \
--data_config_dir data_configs/2022_valid_ocean_clear \
--run_name innovations_127_ocean_clear
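For convenience, the five example commands above can be chained from a small driver script. The sketch below is not part of deep_obs: the names `build_pipeline` and `run_pipeline` are hypothetical, and it assumes the script is run from the repository root and that each step exits with a non-zero code on failure. It uses only the script paths and flags shown in the examples above and defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess
from typing import List


def build_pipeline(python: str = "python") -> List[List[str]]:
    """Return the five example commands, in order, as argument lists."""
    raw_dir = "data/ufs_replay"
    preprocessed_dir = "data/2022_preprocessed"
    data_config_dir = "data_configs/2022_valid_ocean_clear"
    # train.py and test.py take the same arguments in the examples above.
    model_args = [
        "--config", "configs/innovations_127.yaml",
        "--preprocessed_dir", preprocessed_dir,
        "--data_config_dir", data_config_dir,
        "--run_name", "innovations_127_ocean_clear",
    ]
    return [
        [python, "src/deep_obs/download_from_ufs_replay.py", "--dest_dir", raw_dir],
        [python, "src/deep_obs/preprocess_data.py",
         "--original_dir", raw_dir, "--preprocessed_dir", preprocessed_dir],
        [python, "src/deep_obs/generate_configs.py",
         "--preprocessed_dir", preprocessed_dir,
         "--data_config_dir", data_config_dir,
         "--constraints", "valid_obs", "ocean_only", "clear_sky"],
        [python, "src/deep_obs/train.py", *model_args],
        [python, "src/deep_obs/test.py", *model_args],
    ]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later steps never run on the output of a failed one.
            subprocess.run(command, check=True)


commands = build_pipeline()
run_pipeline(commands)  # dry run: prints the commands without executing them
```

Calling `run_pipeline(commands, dry_run=False)` would execute the steps in order; since the download and training steps are long-running, checking the printed commands first is the safer default.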
If you use this code or these methods in your research, please cite:
@article{lieberman2026deep,
title={Deep-Learned Observation Operators for Artificial Intelligence Weather Forecasting Models},
author={Kelsey Lieberman and Laura Slivinski and Matt Bender and Chris Miller and Josh DaRosa and Nick Krall and Mohammad Ridhwaan Alam and Nick Silverman and Sergey Frolov},
journal={arXiv preprint arXiv:2604.00082},
url={https://arxiv.org/abs/2604.00082},
year={2026}
}
© 2026 THE MITRE CORPORATION. ALL RIGHTS RESERVED. Approved for public release. Distribution unlimited. Case 26-0098.
