Lisa: Lightweight Yet Superb Neural Speech Coding

ICASSP 2026 Oral

📖 Introduction

Neural speech coding has recently achieved remarkable progress at low and ultra-low bitrates, but its efficiency is still limited by the ability to learn compact representations. To address this challenge, we introduce Lisa, a lightweight neural speech codec that improves both feature representation and quantization.

Lisa uses a causal frequency-domain encoder-decoder with Inception Residual Blocks (IRB) to better capture multi-scale correlations. It also introduces Regulated Residual Vector Quantization (R-RVQ), which modulates residuals into quantization-friendly forms for more compact multi-stage representation. Experiments show that Lisa achieves stronger coding efficiency than existing neural speech codecs while keeping low complexity for real-time speech communication and streaming.

🔧 Installation

Create the environment and install dependencies:

conda create -n lisa python=3.8 -y
conda activate lisa
pip install -r requirements.txt

Before running the code, update the hard-coded project path in runs/lisa/train.py, runs/lisa/test.py, and runs/lisa/model.py. Replace /root/hjk/lisa_release_root/ with the absolute path of this repository on your machine, for example:

/path/to/lisa-main/
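
The path update above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `rewrite_root` and `patch_repo` are hypothetical, and it assumes the placeholder string appears verbatim in the three listed files.

```python
from pathlib import Path

# Placeholder prefix named in the instructions above.
OLD_ROOT = "/root/hjk/lisa_release_root/"
# Files the instructions list as containing the hard-coded path.
FILES = ["runs/lisa/train.py", "runs/lisa/test.py", "runs/lisa/model.py"]


def rewrite_root(text: str, new_root: str, old_root: str = OLD_ROOT) -> str:
    """Return text with every occurrence of old_root replaced by new_root."""
    if not new_root.endswith("/"):
        new_root += "/"  # keep the trailing slash the original prefix has
    return text.replace(old_root, new_root)


def patch_repo(repo_dir: str) -> list:
    """Patch the listed files in place; return the relative paths that changed."""
    repo = Path(repo_dir).resolve()
    changed = []
    for rel in FILES:
        path = repo / rel
        original = path.read_text(encoding="utf-8")
        updated = rewrite_root(original, repo.as_posix())
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(rel)
    return changed
```

Calling `patch_repo(".")` from the repository root rewrites the files in place and returns the ones that were modified. An empty list means the placeholder was not found, in which case the paths should be checked by hand.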

📂 Datasets

Download the LibriTTS dataset and prepare the following subsets:

train-clean-100 and train-clean-360 for training
test-clean for evaluation

Resample all audio files to 16 kHz before training or evaluation.
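
One possible way to do the resampling is sketched below. It is not part of the repository: it assumes `ffmpeg` is installed and on the PATH, that the audio files are `.wav`, and the helper names (`resample_cmd`, `plan_outputs`, `resample_tree`) are hypothetical. LibriTTS is distributed at 24 kHz, so every file needs converting.

```python
import subprocess
from pathlib import Path

TARGET_RATE = 16000  # 16 kHz, as required above


def resample_cmd(src: Path, dst: Path, rate: int = TARGET_RATE) -> list:
    """Build an ffmpeg command that writes src to dst at the given sample rate."""
    # -y overwrites an existing output file, -ar sets the output sample rate.
    return ["ffmpeg", "-y", "-i", str(src), "-ar", str(rate), str(dst)]


def plan_outputs(in_dir: str, out_dir: str, ext: str = ".wav") -> list:
    """Pair every audio file under in_dir with a mirrored path under out_dir."""
    in_root, out_root = Path(in_dir), Path(out_dir)
    return [(src, out_root / src.relative_to(in_root))
            for src in sorted(in_root.rglob("*" + ext))]


def resample_tree(in_dir: str, out_dir: str) -> int:
    """Resample every file under in_dir; return the number of files processed."""
    pairs = plan_outputs(in_dir, out_dir)
    for src, dst in pairs:
        dst.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(resample_cmd(src, dst), check=True)
    return len(pairs)
```

For example, `resample_tree("/path/to/LibriTTS/test-clean", "/path/to/test_audio")` would write 16 kHz copies that keep the original folder layout, leaving the source files untouched.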

📦 Pretrained Models

The source code does not include pretrained model files. Download the released checkpoints from the NJU cloud drive.

Put the downloaded checkpoint folders under saves/.
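
To check that the checkpoints landed where the scripts expect them, a short helper such as the one below can list what is available. This is a sketch only: the function name `latest_checkpoints` is hypothetical, and it assumes every checkpoint folder follows the `saves/<name>/ckpt/iter_<N>.model` pattern seen in the `--pretrain` path of the inference command.

```python
import re
from pathlib import Path

# Assumed file naming, taken from the example path iter_1200000.model.
ITER_RE = re.compile(r"iter_(\d+)\.model$")


def latest_checkpoints(saves_dir: str = "saves") -> dict:
    """Map each checkpoint folder name to its highest-iteration .model file."""
    latest = {}
    for path in Path(saves_dir).glob("*/ckpt/iter_*.model"):
        match = ITER_RE.search(path.name)
        if match is None:
            continue  # skip files that do not end in iter_<digits>.model
        step = int(match.group(1))
        name = path.parent.parent.name  # the folder directly under saves/
        if name not in latest or step > latest[name][0]:
            latest[name] = (step, path)
    return {name: path for name, (step, path) in latest.items()}
```

An empty result from `latest_checkpoints()` suggests the folders were unpacked at the wrong level or use a different naming scheme than assumed here.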

💻 Inference

Run inference with a pretrained model:

python runs/lisa/test.py \
  --root_dir /path/to/test_audio \
  --bandwidth 1500 \
  --pretrain saves/lisa_1500/ckpt/iter_1200000.model \
  --test_from forward \
  --device cuda:0

Reconstructed audio will be saved under:

saves/lisa/output/

To run objective metric evaluation, pass model instead of forward to --test_from in the command above:

--test_from model

The evaluation code computes metrics including ViSQOL, STOI, and PESQ.

🚀 Training

Option A: train from scratch

python runs/lisa/train.py \
  --root_dir /path/to/train_audio \
  --bandwidth 1500 \
  --batch_size 16 \
  --learning_rate 1e-4

Option B: load a pretrained model

python runs/lisa/train.py \
  --root_dir /path/to/train_audio \
  --bandwidth 1500 \
  --pretrain saves/lisa_1500/ckpt/iter_1200000.model

📝 BibTeX

If you find this project useful, please cite:

@inproceedings{huang2026lisa,
  title={Lisa: Lightweight Yet Superb Neural Speech Coding},
  author={Huang, Jiankai and Zhang, Junteng and Lu, Ming and Cao, Xun and Ma, Zhan},
  booktitle={ICASSP 2026-2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={14457--14461},
  year={2026},
  organization={IEEE}
}

👥 Authors

These files are provided by the Nanjing University Vision Lab. If you have any questions, please contact us at jiankaihuang@smail.nju.edu.cn or zhangjunteng@smail.nju.edu.cn.

