
groupmm/subsequenceSDTW


Accompanying code for: Subsequence SDTW: Differentiable Alignment with Flexible Boundary Conditions

Johannes Zeitler (johannes.zeitler@audiolabs-erlangen.de)
International Audio Laboratories Erlangen
February 2026

Overview

This repository contains code to reproduce all experiments in the paper. The main notebooks are:

  • train_strong.ipynb: training with strongly aligned targets
  • train_SDTW_noMismatch.ipynb: training with standard SDTW and no boundary mismatch
  • train_SDTW.ipynb: training with standard SDTW and boundary mismatch
  • train_subSDTW.ipynb: training with subSDTW and boundary mismatch
  • train_subSDTW-W.ipynb: training with weighted subSDTW and boundary mismatch
  • eval.ipynb: compute evaluation metrics

The repository also contains the following files and folders:

  • data/: open-domain subset of the BPSD. It is not sufficient to reproduce the paper results, but it lets the code run end to end. The audio is tuning-corrected to A4 = 440 Hz and resampled to 16 kHz FLAC
  • dataset_weakLabels.py: dataset class for weakly aligned score-audio pairs from the BPSD
  • midi.py: some helper functions for MIDI parsing
  • onsets_and_frames/: PyTorch onsets-and-frames implementation from https://github.com/jongwook/onsets-and-frames
  • prepare_weak_targets.ipynb: pre-computes weak target representations in musical and physical time from the BPSD annotations
  • pretrained_model.pt: a transcriber pretrained on the MAESTRO dataset
  • SDTW.py: standard SDTW
  • subSDTW.py: subsequence SDTW without weight penalty
  • subSDTW_W.py: subsequence SDTW with weight penalty
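
To illustrate the difference between the standard and the subsequence variant, here is a minimal NumPy sketch. It is not the code from SDTW.py or subSDTW.py: the function names `softmin` and `soft_dtw`, the `subsequence` flag, and the choice to relax the boundaries along the second axis are assumptions made for this example, and the weighted variant is not covered. The sketch uses the usual soft-minimum recursion for soft-DTW and, for the subsequence case, allows the alignment to start and end at any column of the cost matrix.

```python
import numpy as np


def softmin(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    values = np.asarray(values, dtype=float)
    m = values.min()
    if not np.isfinite(m):
        # All candidates are unreachable.
        return np.inf
    return m - gamma * np.log(np.sum(np.exp(-(values - m) / gamma)))


def soft_dtw(cost, gamma=1.0, subsequence=False):
    """Soft-DTW value for a cost matrix of shape (n, m).

    subsequence=False: the path must run from (0, 0) to (n-1, m-1).
    subsequence=True:  (illustrative assumption) the path may start and end
                       at any column, so only a part of the second sequence
                       has to be matched.
    """
    n, m = cost.shape
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    if subsequence:
        # Free start: entering at any column costs nothing.
        D[0, :] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            prev = [D[i - 1, j], D[i, j - 1], D[i - 1, j - 1]]
            D[i, j] = cost[i - 1, j - 1] + softmin(prev, gamma)

    if subsequence:
        # Free end: soft minimum over all end columns in the last row.
        return softmin(D[n, 1:], gamma)
    return D[n, m]


# Toy example: the short sequence x occurs in the middle of y.
x = np.array([1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 1.0, 2.0, 3.0, 0.0, 0.0])
cost = (x[:, None] - y[None, :]) ** 2

full_value = soft_dtw(cost, gamma=0.01)
sub_value = soft_dtw(cost, gamma=0.01, subsequence=True)
print(full_value, sub_value)
```

In this toy example, the standard variant has to pay for the unmatched zeros at both ends of `y`, whereas the subsequence variant returns a value close to zero. This is the kind of boundary mismatch that the training notebooks above compare.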

Notes

To keep the repository small, we do not include the full training datasets. The MAESTRO (https://magenta.withgoogle.com/datasets/maestro) and BPSD (https://doi.org/10.5281/zenodo.10847702) datasets must be obtained separately. For the BPSD, we use an audio version that was tuning-corrected to A4 = 440 Hz.

If you use this code...

please cite our paper:

Johannes Zeitler and Meinard Müller. Subsequence SDTW: Differentiable Alignment with Flexible Boundary Conditions. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, 2026.

