SmartDJ: Declarative Audio Editing with Audio Language Model

📦 Installation

Clone the repository:

git clone https://github.com/penn-waves-lab/SmartDJ.git

Install the dependencies:

cd SmartDJ
pip install -r requirements.txt

🤖 Pretrained Models

Download the pretrained SmartDJ-Editor checkpoints for interactive editing:

bash script/download_ckpts.sh

⚡ Inference

Gradio Interactive Demo

Launch the interactive audio editor with a web-based UI:

bash ./script/launch_gradio_editor.sh

Demo Usage

Demo video: smartdj_editor_gradio_demo.mp4

Command Line Interactive Demo for Audio Editor

Alternatively, use the command-line interactive demo for the SmartDJ-Editor:

bash ./script/interactive_edit_editor.sh

🛠️ Available Commands

We support the following editing commands.
Spatial locations: {left | left front | front | right front | right}

Operation	Command
❌ Remove Sound	`remove the sound of [sound event] at the {spatial location}`
➕ Add Sound	`add the sound of [sound event] at the {spatial location} with [xx] dB`
🎯 Extract Sound	`extract the sound of [sound event] at the {spatial location}`
🔊 Change Volume	`turn {up | down} the volume of [sound event] at {spatial location} by [xx] dB`
🧭 Change Direction	`change the sound of [sound event] at [original position] to {spatial location}`
⏱️ Shift Sound Timing	`shift the sound of [sound event] at the {spatial location} by [xx] seconds`
🌊 Add Reverberation	`reverb the sound of [sound event] at the {spatial location} with reverb level [xx]`
🎨 Change Timbre	`change the timbre of the sound of [sound event] at the {spatial location} to be more {bright | dark | warm | cold | muffled}`
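
The command templates above are fixed strings with a few slots, so they can be filled in programmatically before being typed into either demo. The following is a minimal sketch, not part of the SmartDJ codebase: `build_command`, `TEMPLATES`, and `CHOICES` are hypothetical names. It assumes that `[original position]` takes one of the five spatial locations and that every `[xx]` value is numeric.

```python
# Illustrative helper only; none of these names come from the SmartDJ codebase.
SPATIAL_LOCATIONS = {"left", "left front", "front", "right front", "right"}
TIMBRES = {"bright", "dark", "warm", "cold", "muffled"}

# One template per row of the command table, copied verbatim from it.
# Numeric slots use the "g" format so that 6 and 6.0 both render as "6".
TEMPLATES = {
    "remove": "remove the sound of {event} at the {location}",
    "add": "add the sound of {event} at the {location} with {db:g} dB",
    "extract": "extract the sound of {event} at the {location}",
    "volume": "turn {direction} the volume of {event} at {location} by {db:g} dB",
    "direction": "change the sound of {event} at {original} to {location}",
    "shift": "shift the sound of {event} at the {location} by {seconds:g} seconds",
    "reverb": "reverb the sound of {event} at the {location} with reverb level {level:g}",
    "timbre": "change the timbre of the sound of {event} at the {location} to be more {timbre}",
}

# Allowed values for the slots that the table restricts to a fixed set.
# Treating "original" as a spatial location is an assumption.
CHOICES = {
    "location": SPATIAL_LOCATIONS,
    "original": SPATIAL_LOCATIONS,
    "direction": {"up", "down"},
    "timbre": TIMBRES,
}


def build_command(operation: str, **fields) -> str:
    """Fill the template for `operation` after validating restricted slots.

    Raises ValueError for an unknown operation or a disallowed slot value,
    and KeyError if a slot required by the template is missing.
    """
    if operation not in TEMPLATES:
        raise ValueError(f"unknown operation: {operation!r}")
    for name, allowed in CHOICES.items():
        if name in fields and fields[name] not in allowed:
            raise ValueError(f"{name}={fields[name]!r} is not one of {sorted(allowed)}")
    return TEMPLATES[operation].format(**fields)


if __name__ == "__main__":
    print(build_command("remove", event="dog barking", location="left"))
    print(build_command("volume", event="piano", direction="up",
                        location="right front", db=6))
```

Keeping the templates in a single dictionary means a change to the supported wording only has to be made in one place, and the validation step catches a misspelled location before the command reaches the editor.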

Todo

Release inference code and weights for SmartDJ-Editor (diffusion editor)
Release inference code for SmartDJ-Planner (ALM planner)
Release dataset synthesis pipeline

📜 Citation

If you find this work helpful, please consider citing our paper:

@article{lan2025guiding,
  title={Guiding audio editing with audio language model},
  author={Lan, Zitong and Hao, Yiduo and Zhao, Mingmin},
  journal={arXiv preprint arXiv:2509.21625},
  year={2025}
}
