Clone the repository:
git clone https://github.com/penn-waves-lab/SmartDJ.git
Install the dependencies:
cd SmartDJ
pip install -r requirements.txt
Download the pretrained SmartDJ-Editor weights for interactive editing:
bash script/download_ckpts.sh
Launch the interactive audio editor with a web-based UI:
bash ./script/launch_gradio_editor.sh
Alternatively, use the interactive command-line demo for the SmartDJ-Editor:
bash ./script/interactive_edit_editor.sh
🛠️ Available Commands
We support the following editing commands.
Spatial locations: {left | left front | front | right front | right}
| Operation | Command |
|---|---|
| ❌ Remove Sound | remove the sound of [sound event] at the {spatial location} |
| ➕ Add Sound | add the sound of [sound event] at the {spatial location} with [xx] dB |
| 🎯 Extract Sound | extract the sound of [sound event] at the {spatial location} |
| 🔊 Change Volume | turn {up \| down} the volume of [sound event] at {spatial location} by [xx] dB |
| 🧭 Change Direction | change the sound of [sound event] at [original position] to {spatial location} |
| ⏱️ Shift Sound Timing | shift the sound of [sound event] at the {spatial location} by [xx] seconds |
| 🌊 Add Reverberation | reverb the sound of [sound event] at the {spatial location} with reverb level [xx] |
| 🎨 Change Timbre | change the timbre of the sound of [sound event] at the {spatial location} to be more {bright \| dark \| warm \| cold \| muffled} |
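The templates in the table above are plain text, so a small helper can assemble them and catch an invalid location before the prompt is typed into the editor. The sketch below is only an illustration: `build_command`, `TEMPLATES`, and the short operation keys (`"remove"`, `"volume"`, and so on) are hypothetical names that are not part of the SmartDJ codebase, and it assumes the editor accepts the strings exactly as written in the table.

```python
# Hypothetical helper, not part of the SmartDJ repository. It only formats
# text prompts that follow the command templates listed in the table.

SPATIAL_LOCATIONS = {"left", "left front", "front", "right front", "right"}
TIMBRES = {"bright", "dark", "warm", "cold", "muffled"}

# Operation keys are illustrative; the template strings mirror the table.
TEMPLATES = {
    "remove": "remove the sound of {event} at the {location}",
    "add": "add the sound of {event} at the {location} with {value} dB",
    "extract": "extract the sound of {event} at the {location}",
    "volume": "turn {direction} the volume of {event} at {location} by {value} dB",
    "move": "change the sound of {event} at {origin} to {location}",
    "shift": "shift the sound of {event} at the {location} by {value} seconds",
    "reverb": "reverb the sound of {event} at the {location} with reverb level {value}",
    "timbre": "change the timbre of the sound of {event} at the {location} to be more {timbre}",
}


def build_command(operation, event, location, **fields):
    """Return the text command for one editing operation.

    Raises ValueError for an unknown operation, an unsupported spatial
    location, an invalid option, or a missing template field.
    """
    if operation not in TEMPLATES:
        raise ValueError(f"unknown operation: {operation!r}")
    if location not in SPATIAL_LOCATIONS:
        raise ValueError(f"unsupported spatial location: {location!r}")
    if "direction" in fields and fields["direction"] not in {"up", "down"}:
        raise ValueError("direction must be 'up' or 'down'")
    if "timbre" in fields and fields["timbre"] not in TIMBRES:
        raise ValueError(f"unsupported timbre: {fields['timbre']!r}")
    try:
        return TEMPLATES[operation].format(event=event, location=location, **fields)
    except KeyError as missing:
        raise ValueError(f"operation {operation!r} needs field {missing}") from None


if __name__ == "__main__":
    print(build_command("add", "rain", "right front", value=6))
    # add the sound of rain at the right front with 6 dB
```

Keeping the templates in one dictionary makes it easy to update the wording in a single place if the supported command phrasing differs from the table.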
- Release inference code and weights for SmartDJ-Editor (diffusion editor)
- Release inference code for SmartDJ-Planner (ALM planner)
- Release dataset synthesis pipeline
If you find this work helpful, please consider citing our paper:
@article{lan2025guiding,
title={Guiding audio editing with audio language model},
author={Lan, Zitong and Hao, Yiduo and Zhao, Mingmin},
journal={arXiv preprint arXiv:2509.21625},
year={2025}
}