This repository provides installation and usage scripts for TRACE (arXiv:2506.09114).
```shell
conda env create -f environment.yml
conda activate trace-rag
```

Configure runtime paths via `.env`.
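The variable names below are illustrative placeholders, not the repository's actual keys (check the repo's `.env.example` or config loading code for the real ones):

```shell
# Hypothetical .env — adjust keys and paths to match the repo's expectations
DATA_DIR=./dataset
RESULTS_DIR=./results
WANDB_PROJECT=trace
```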
Download the dataset from Google Drive and unzip it into the `dataset/` directory.
The dataset for the project follows this structure:

```
dataset/
├── pretrain/
│   ├── train_data/
│   ├── val_data/
│   └── test_data/
├── forecasting/
│   ├── train.json
│   ├── val.json
│   └── test.json
└── retrieval/
    ├── train.parquet
    └── test.parquet
```
```
├── pretrain.py            # Stage 1
├── context_align.py       # Stage 2
├── forecast_finetune.py   # Optional for task-specific finetuning
├── demo.ipynb             # Embedding + retrieval demo
├── configs/
│   ├── pretrain.yaml
│   ├── align.yaml
│   └── finetune.yaml
└── src/
    ├── data/      # Dataset + dataloader
    ├── models/    # TS encoder / multimodal encoder / retriever
    ├── tasks/     # Training loops
    └── utils/     # Config / metrics / helpers
```
TRACE uses a two-stage training pipeline; an optional third stage performs task-specific finetuning.
- Stage 1: time-series pretraining
- Stage 2: time-series/text context alignment (embedding + retrieval)
- Stage 3 (optional): forecasting finetuning (with or without RAG)
```shell
CUDA_VISIBLE_DEVICES=0,1 torchrun \
    --nproc_per_node=2 \
    --master-port=<MASTER_PORT_STAGE1> \
    pretrain.py \
    --config configs/pretrain.yaml
```

After pretraining, record the run name.
Important:
- This run name is the key link to Stage 2.
- `context_align.py` uses `--pretraining_run_name` to locate and override model settings from `results/wandb_configs/<PRETRAIN_RUN_NAME>.yaml`.
- Pretraining checkpoints are expected under `results/model_checkpoints/<PRETRAIN_RUN_NAME>/`.
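Conceptually, the override behaves like a merge in which the model settings recorded at pretraining time take precedence over the Stage-2 defaults. A rough sketch of that behavior, assuming a simple shallow merge (the actual script may differ):

```python
def apply_pretrain_overrides(align_cfg: dict, pretrain_cfg: dict) -> dict:
    """Shallow merge: keys recorded at pretraining time win over align defaults."""
    merged = dict(align_cfg)      # start from the Stage-2 config
    merged.update(pretrain_cfg)   # pretraining run settings override
    return merged

# e.g., apply_pretrain_overrides({"d_model": 128, "lr": 1e-4}, {"d_model": 256})
# → {"d_model": 256, "lr": 1e-4}
```

This keeps the Stage-2 model architecture consistent with the checkpoint being loaded.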
```shell
CUDA_VISIBLE_DEVICES=0,1 torchrun \
    --nproc_per_node=2 \
    --master-port=<MASTER_PORT_STAGE2> \
    context_align.py \
    --config configs/align.yaml \
    --pretraining_run_name "<PRETRAIN_RUN_NAME>" \
    --cross_attend
```

Use the same `<PRETRAIN_RUN_NAME>` from Stage 1.
Without RAG:

```shell
CUDA_VISIBLE_DEVICES=0,1 torchrun \
    --nproc_per_node=2 \
    --master-port=<MASTER_PORT_FT_WO_RAG> \
    forecast_finetune.py \
    --config configs/finetune.yaml \
    --pretraining_run_name "<PRETRAIN_RUN_NAME>" \
    --ts_only
```

With RAG:

```shell
CUDA_VISIBLE_DEVICES=0,1 torchrun \
    --nproc_per_node=2 \
    --master-port=<MASTER_PORT_FT_W_RAG> \
    forecast_finetune.py \
    --config configs/finetune.yaml \
    --pretraining_run_name "<PRETRAIN_RUN_NAME>" \
    --top_k 1
```

Refer to `demo.ipynb` for generating the embedding bank and for cross-modal retrieval.
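At its core, cross-modal retrieval over an embedding bank is a nearest-neighbor search. A minimal cosine-similarity sketch, independent of the repo's actual code (which likely uses batched tensor ops):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def retrieve_top_k(query, bank, k=1):
    """Indices of the k bank entries most similar to query (cf. --top_k)."""
    order = sorted(range(len(bank)), key=lambda i: cosine(query, bank[i]),
                   reverse=True)
    return order[:k]

# e.g., retrieve_top_k([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0]], k=1) → [1]
```

With `--top_k 1`, only the single most similar retrieved context would be passed to the forecaster.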
If you find this work useful, please consider citing our paper:
```bibtex
@article{chen2025trace,
  title={TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval},
  author={Chen, Jialin and Zhao, Ziyu and Nurbek, Gaukhar and Feng, Aosong and Maatouk, Ali and Tassiulas, Leandros and Gao, Yifeng and Ying, Rex},
  journal={arXiv preprint arXiv:2506.09114},
  year={2025}
}
```
