Towards Transfer-Efficient Multi-modal Sequential Recommendation with State Space Duality
Hao Fan, Qingyang Liu, Hongjiu Liu, Yanrong Hu, Kai Fang
ArXiv Preprint: https://arxiv.org/abs/2506.02916
We propose MMM4Rec (Multi-Modal Mamba for Sequential Recommendation), a transfer-efficient multimodal sequential recommendation model. By establishing intrinsic algebraic constraints that align with sequential recommendation principles, MMM4Rec eliminates the complex optimization objectives required by other multimodal sequential recommendation models: during both pre-training and fine-tuning, it is optimized solely with the standard cross-entropy loss. This unified, simplified optimization objective lets MMM4Rec converge rapidly when transferred to new domains.
In the following, we will guide you through using this repository step by step. 🤗
The following are the main runtime environment dependencies for running the repository:
- linux (we use Ubuntu 22.04)
- cuda 11.8
- python 3.10.15
- pytorch 2.3.1
- numpy 1.26.4
- pandas 2.2.3
- jsonlines 4.0.0
- pytorch-lightning 2.4.0
- lightning 2.4.0
- transformers 4.47.0
- sentencepiece 0.2.0
- tabulate 0.9.0
- tensorboardX 2.6.2
- tensorboard 2.19.0
- causal-conv1d 1.4.0
- mamba-ssm 2.2.2
You can also view detailed environment information in the file environment.yaml.
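If you use conda, the environment can be created from the provided spec. This is a sketch: the environment name `mmm4rec` below is an assumption, so check the `name:` field in environment.yaml before activating.

```shell
# Create the conda environment from the repository's spec file.
# The env name "mmm4rec" is illustrative; use the name declared in environment.yaml.
if command -v conda >/dev/null 2>&1; then
  conda env create -f environment.yaml
  # then, in your shell: conda activate mmm4rec
else
  echo "conda not found; install the packages listed above with pip instead"
fi
```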
This study focuses on the 🛒 Amazon Review 2018 dataset.
You can download our preprocessed dataset directly via the link https://figshare.com/s/f7603ea556c23c2aef88 and extract the subfolders (e.g., Scientific) from the downloaded archive into the 📁 dataset/📁 amazon-2018/📁 processed folder.
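The extraction step above can be sketched as follows. The commands simulate the target layout in a temporary directory; the archive name is hypothetical, and `Scientific` follows the example above — in practice, unzip the downloaded archive and copy its subfolders over.

```shell
# Target layout: dataset/amazon-2018/processed/<Domain>
# Real step: unzip <downloaded_archive>.zip, then copy its subfolders over.
ROOT="$(mktemp -d)"
cd "$ROOT"
mkdir -p extracted/Scientific             # stands in for the unzipped archive
mkdir -p dataset/amazon-2018/processed
cp -r extracted/Scientific dataset/amazon-2018/processed/
ls dataset/amazon-2018/processed          # prints: Scientific
```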
For detailed dataset preprocessing steps and descriptions, please refer to 📁 dataset/Ⓜ️ README.md.
In this section, you can learn about our project structure.
You can click on the directory below to expand and view the project structure:
📁 MMM4Rec
- 📁 baseline | (The baseline models in the paper)
- 📁 BSARec
- 📜 config.yaml
- 🐍 run.py
- 📁 ...
- 📁 configs | (Configuration file for MMM4Rec)
- 📁 finetune
- 📜 config_mmm4rec_scientific.yaml
- 📜 ...
- 📜 config_mmm4rec_FHCKM.yaml
- 📁 data | (Dataset class pytorch implementation)
- 🐍 amazon_dataset.py
- 📁 dataset | (Store dataset files)
- 📁 amazon-2018
- 📁 preprocess
- 📁 processed
- 📁 raw
- 📁 misc | (Store readme related images)
- 📁 model | (Python implementation of the model)
- 📁 encoder | (Kernel Implementation)
- 🐍 ssd_kernel.py
- 🐍 ...
- 🐍 mmm4rec.py
- 🐍 ...
- 📁 pre_weights | (Pre-trained weight files)
- 📀 pretrained_weight.ckpt
- 📁 reference_log | (Reference log file)
- 📁 scientific
- 📁 with_id | (With ID Feature)
- 📁 without_id | (Without ID Feature)
- 📁 ...
- 📁 saved | (Store the training logs and weights)
- 📁 MMM4Rec
- 📁 {time}
- 📄 output.log
- 📀 best_epoch.ckpt
- 📁 ...
- 📁 ...
- 📁 script | (Model fine-tuning script)
- 📁 finetune
- 🚅 scientific.sh
- 🚅 ...
- 📁 trainer | (Python Implementation of Trainer)
- 🐍 pretrain_trainer.py
- 🐍 trainer.py
- 🐍 utils.py
- 📜 environment.yaml
- 🐍 callback.py
- 🐍 main.py
- 🐍 pretrain.py
- 🐍 test.py
Ok, congratulations 🎇, you have finished all the preparation 👍, let's start training the model! 😄
This section will introduce the training methods of the model.
After preparing our provided pretraining dataset, you can directly pretrain MMM4Rec using the following approach:
```shell
python pretrain.py
```

As described in our paper, our work primarily focuses on model transfer efficiency. To help you quickly verify our results, we have provided pre-trained model weights in the 📁 pre_weights folder, along with a quick-start shell script in 📁 script.
To fine-tune MMM4Rec in the Scientific domain, simply run the following command:
```shell
cd ./script/finetune
/bin/bash scientific.sh
cd ../../
```

You can also directly examine the training logs in the 📁 reference_log folder to verify our work's effectiveness.
For example, check the output.log file to see MMM4Rec's fine-tuning logs in the Scientific domain.
Alternatively, you can directly download our fine-tuned model weights and complete log files for downstream datasets via the link https://figshare.com/s/f7603ea556c23c2aef88.
Our implementation is built upon PyTorch and PyTorch Lightning; we gratefully acknowledge their excellent work.
For dataset processing, we referenced approaches from MMSRec, UniSRec, and MISSRec. Our trainer implementation draws inspiration from RecBole.
Notably, we implement an attention form of the SSD kernel (🐍 ssd_kernel.py) that is mathematically equivalent to Mamba's formulation.
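As a self-contained illustration (a NumPy sketch, not the repository's actual 🐍 ssd_kernel.py), state space duality for a scalar-decay SSM can be seen directly: the sequential recurrence and the causally masked, attention-like matrix form compute identical outputs.

```python
import numpy as np

def ssd_recurrent(x, a, B, C):
    """Sequential form: h_t = a_t * h_{t-1} + B_t * x_t,  y_t = <C_t, h_t>."""
    T, N = B.shape
    h = np.zeros(N)
    y = np.zeros(T)
    for t in range(T):
        h = a[t] * h + B[t] * x[t]
        y[t] = C[t] @ h
    return y

def ssd_attention_form(x, a, B, C):
    """Dual form: y = (L * (C @ B.T)) @ x, with decay mask
    L[t, s] = prod_{k=s+1..t} a_k for s <= t, else 0."""
    cum = np.cumsum(np.log(a))                        # cumulative log-decay
    L = np.tril(np.exp(cum[:, None] - cum[None, :]))  # causal decay mask
    return (L * (C @ B.T)) @ x

rng = np.random.default_rng(0)
T, N = 8, 4                        # sequence length, state size
x = rng.standard_normal(T)         # scalar input channel
a = rng.uniform(0.5, 1.0, size=T)  # per-step scalar decay (Mamba-2 style)
B = rng.standard_normal((T, N))    # input projections
C = rng.standard_normal((T, N))    # output projections

print(np.allclose(ssd_recurrent(x, a, B, C), ssd_attention_form(x, a, B, C)))  # True
```

The quadratic masked-matrix form is what makes the kernel expressible as attention, while the recurrence gives the linear-time inference path.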
If you find this code useful or use the toolkit in your work, please consider citing:
```bibtex
@misc{fan2025mmm4rec,
      title={Towards Transfer-Efficient Multi-modal Sequential Recommendation with State Space Duality},
      author={Hao Fan and Qingyang Liu and Hongjiu Liu and Yanrong Hu and Kai Fang},
      year={2025},
      eprint={2506.02916},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2506.02916},
}
```
