
deep_obs: Deep-Learned Observation Operators

deep_obs is a Python library implementing the methods described in the paper "Deep-Learned Observation Operators for Artificial Intelligence Weather Forecasting Models." It provides a framework for replacing classic, physics-based observation operators, like the Community Radiative Transfer Model (CRTM), with deep-learned emulators or alternatives. Specifically, code in this repository utilizes data from NOAA's UFS replay historical reanalysis dataset to train observation operators for the Advanced Technology Microwave Sounder (ATMS) sensor.

💡 Key Concepts

In numerical weather prediction (NWP) and data assimilation, the observation operator $H$ maps the forecast (background) model state $x^b$ into observation space, giving the simulated observations $y^b = H(x^b)$. The difference between the actual observations $y^o$ and the simulated observations $y^b$, known as the innovation $y^i = y^o - y^b$, is then used by the data assimilation model to nudge the initial conditions used by the forecast model.
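As a minimal numerical sketch of these definitions, the snippet below computes $y^b = H(x^b)$ and the innovation $y^i = y^o - y^b$ for a toy case. The linear `toy_observation_operator`, the weights, and all values are made up for illustration; they stand in for CRTM and real ATMS data and are not part of deep_obs.

```python
from typing import List


def toy_observation_operator(x_b: List[float], weights: List[List[float]]) -> List[float]:
    """Hypothetical linear stand-in for H: each channel is a weighted sum of model levels."""
    return [sum(w * x for w, x in zip(row, x_b)) for row in weights]


def innovation(y_o: List[float], y_b: List[float]) -> List[float]:
    """Innovation y^i = y^o - y^b, computed per channel."""
    if len(y_o) != len(y_b):
        raise ValueError("observed and simulated vectors must have the same length")
    return [o - b for o, b in zip(y_o, y_b)]


# Toy background state on three model levels (illustrative values only).
x_b = [280.0, 250.0, 220.0]
# Toy weighting functions for two channels (each row sums to 1).
weights = [[0.5, 0.25, 0.25], [0.0, 0.5, 0.5]]

y_b = toy_observation_operator(x_b, weights)  # simulated observations
y_o = [258.5, 234.0]                          # made-up "actual" observations
y_i = innovation(y_o, y_b)
print("y_b =", y_b)
print("y_i =", y_i)
```

A positive innovation means the sensor reported a larger value than the background state implies; a negative one means the opposite.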

This library enables training three types of deep-learned models.

  1. Models that emulate physics-based observation operators: These models learn to map the background model states to the CRTM-simulated observations (i.e., $f(x^b)=y^b$).
  2. Models that replace physics-based observation operators: These models learn to map the background model states to the actual observations (i.e., $f(x^b)=y^o$).
  3. Models that predict the innovation directly: We show that CRTM can be emulated more accurately by predicting the innovations, which are what the data assimilation model uses, directly from the model state and actual observations (i.e., $f(x^b, y^o)=y^i$).
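The three model types differ only in what goes in and what the target is. The sketch below makes that explicit for a single observation. The mode labels (`"emulate"`, `"replace"`, `"innovation"`) and the helper `make_training_pair` are hypothetical names, and concatenating $x^b$ with $y^o$ for the third model type is just one simple way to pass both inputs; the library's actual input layout may differ.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def make_training_pair(mode: str, x_b: Vector, y_b: Vector, y_o: Vector) -> Tuple[Vector, Vector]:
    """Return (inputs, target) for one observation under the given model type."""
    if mode == "emulate":      # f(x^b) = y^b
        return list(x_b), list(y_b)
    if mode == "replace":      # f(x^b) = y^o
        return list(x_b), list(y_o)
    if mode == "innovation":   # f(x^b, y^o) = y^i = y^o - y^b
        target = [o - b for o, b in zip(y_o, y_b)]
        return list(x_b) + list(y_o), target
    raise ValueError(f"unknown mode: {mode!r}")


# Illustrative values only.
x_b = [280.0, 250.0, 220.0]   # background model state
y_b = [257.5, 235.0]          # CRTM-style simulated observations
y_o = [258.5, 234.0]          # actual observations

pairs: Dict[str, Tuple[Vector, Vector]] = {
    mode: make_training_pair(mode, x_b, y_b, y_o)
    for mode in ("emulate", "replace", "innovation")
}
for mode, (inputs, target) in pairs.items():
    print(f"{mode}: {len(inputs)} inputs -> target {target}")
```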

🚀 Installation

Below is an example of how to install an editable version of the deep_obs package.

1. Clone this repository.

git clone https://git.codev.mitre.org/scm/eida/deep-obs.git

2. In the directory of this repository, install an editable version of the deep_obs package using pyproject.toml.

cd deep-obs
python -m pip install -e .
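
To confirm that the editable install is visible to the interpreter, a quick check like the following can help. It only uses the standard library; `is_installed` is a hypothetical helper, and `deep_obs` is assumed to be the import name based on the `src/deep_obs` path used in the commands below.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the import system can locate a top-level package."""
    return importlib.util.find_spec(package_name) is not None


print("deep_obs importable:", is_installed("deep_obs"))
```

If this prints `False`, check that the `pip install -e .` command ran in the same environment as the interpreter you are using.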

🛠 Usage

Downloading and preprocessing the data

  1. Download the data. download_from_ufs_replay.py downloads the specified data from UFS replay to your local system. This example command downloads the 2022 data used in the paper.
python src/deep_obs/download_from_ufs_replay.py \
    --dest_dir data/ufs_replay
  2. Preprocess the data. preprocess_data.py extracts the input and output features needed for each observation and saves them in Parquet format for efficient training and testing. This example command preprocesses the data downloaded in the previous step.
python src/deep_obs/preprocess_data.py \
    --original_dir data/ufs_replay \
    --preprocessed_dir data/2022_preprocessed
  3. Generate configs. generate_configs.py splits the data into train and test splits and calculates the mean and standard deviation of each feature in the train split. This example command generates the configs for the clear-sky ocean scene from 2022.
python src/deep_obs/generate_configs.py \
    --preprocessed_dir data/2022_preprocessed \
    --data_config_dir data_configs/2022_valid_ocean_clear \
    --constraints valid_obs ocean_only clear_sky
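
The config-generation step (split, then per-feature statistics from the train split only) can be sketched in a few lines. This is an illustration of the idea, not the code in generate_configs.py: the random split, the 80/20 ratio, the feature names `t_surface` and `q_surface`, and the helpers below are all assumptions.

```python
import random
import statistics
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, float]
Stats = Dict[str, Dict[str, float]]


def split_rows(rows: Sequence[Row], test_fraction: float, seed: int) -> Tuple[List[Row], List[Row]]:
    """Shuffle reproducibly, then hold out the last `test_fraction` of rows for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    split_at = len(shuffled) - n_test
    return shuffled[:split_at], shuffled[split_at:]


def feature_stats(train: Sequence[Row]) -> Stats:
    """Per-feature mean and population standard deviation, from the train split only."""
    stats: Stats = {}
    for name in train[0]:
        values = [row[name] for row in train]
        stats[name] = {"mean": statistics.fmean(values), "std": statistics.pstdev(values)}
    return stats


def normalize(row: Row, stats: Stats) -> Row:
    """Standardize one row with the train statistics; zero-spread features map to 0.0."""
    out: Row = {}
    for name, value in row.items():
        std = stats[name]["std"]
        out[name] = (value - stats[name]["mean"]) / std if std > 0 else 0.0
    return out


# Ten made-up observations with two hypothetical features.
rows = [{"t_surface": 280.0 + i, "q_surface": 0.01 * i} for i in range(10)]
train, test = split_rows(rows, test_fraction=0.2, seed=0)
stats = feature_stats(train)
print(len(train), "train rows,", len(test), "test rows")
print(normalize(test[0], stats))
```

Computing the statistics on the train split alone keeps information from the test split out of the normalization applied at training time.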

Training a model

train.py trains models using the specified configurations. This example command:

  • uses the config configs/innovations_127.yaml to train a model that predicts innovations using all 127 model levels available in the UFS replay model states and
  • trains the model on the clear-sky ocean scene using the configs in data_configs/2022_valid_ocean_clear that were generated in the previous step.
python src/deep_obs/train.py \
    --config configs/innovations_127.yaml \
    --preprocessed_dir data/2022_preprocessed \
    --data_config_dir data_configs/2022_valid_ocean_clear \
    --run_name innovations_127_ocean_clear

Testing a model

test.py tests pre-trained models on specified datasets. This example command tests the model that we trained in the previous step on the test set.

python src/deep_obs/test.py \
    --config configs/innovations_127.yaml \
    --preprocessed_dir data/2022_preprocessed \
    --data_config_dir data_configs/2022_valid_ocean_clear \
    --run_name innovations_127_ocean_clear
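
The training and testing commands above take the same four flags, so when scripting several runs it can help to build both from one set of arguments. The sketch below does that with a hypothetical `build_command` helper; the script paths and flag names are the ones shown above, and nothing is executed unless you pass the resulting lists to `subprocess.run`.

```python
import shlex
import sys
from typing import List, Mapping, Sequence, Union

FlagValue = Union[str, Sequence[str]]


def build_command(script: str, flags: Mapping[str, FlagValue]) -> List[str]:
    """Build an argv list of the form: python <script> --flag value [value ...]."""
    argv = [sys.executable, script]
    for name, value in flags.items():
        argv.append(f"--{name}")
        if isinstance(value, str):
            argv.append(value)
        else:
            argv.extend(value)  # flags that take several values
    return argv


# Arguments shared by train.py and test.py in the examples above.
shared = {
    "config": "configs/innovations_127.yaml",
    "preprocessed_dir": "data/2022_preprocessed",
    "data_config_dir": "data_configs/2022_valid_ocean_clear",
    "run_name": "innovations_127_ocean_clear",
}
train_cmd = build_command("src/deep_obs/train.py", shared)
test_cmd = build_command("src/deep_obs/test.py", shared)

# Print the commands; to execute one, use subprocess.run(cmd, check=True).
for cmd in (train_cmd, test_cmd):
    print(shlex.join(cmd))
```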

📜 Citation

If you use this code or these methods in your research, please cite:

@article{lieberman2026deep,
  title={Deep-Learned Observation Operators for Artificial Intelligence Weather Forecasting Models},
  author={Kelsey Lieberman and Laura Slivinski and Matt Bender and Chris Miller and Josh DaRosa and Nick Krall and Mohammad Ridhwaan Alam and Nick Silverman and Sergey Frolov},
  journal={arXiv preprint arXiv:2604.00082},
  url={https://arxiv.org/abs/2604.00082},
  year={2026}
}

© 2026 THE MITRE CORPORATION. ALL RIGHTS RESERVED. Approved for public release. Distribution unlimited. Case 26-0098.
