Enabling Efficient All-Dimension Top-k Sparsification for High-Performance Distributed DNN Training Systems

ADTopk is an all-dimension top-𝑘 sparsification scheme that selects the 𝑘 largest elements across all dimensions of the per-layer gradient tensor, ensuring that every dimension contributes at least one element and eliminating dimension missing. ADTopk also enables independent local sorting within each dimension, allowing parallel execution across dimensions to enhance GPU utilization. Furthermore, we enhance ADTopk with system-level optimizations: (i) interleaved sparsification to accelerate convergence, (ii) partial sparsification to reduce sparsification overhead, and (iii) hybrid collective communication to improve sparse communication efficiency. We implement a high-performance distributed DNN training framework with a modular sparsification compression library supporting ADTopk and state-of-the-art gradient sparsification baselines. We also develop collective communication libraries to support different sparsification communication primitives.

Introduction

This code repository covers:

ADTopk Framework

ADTopk avoids dimension missing via a matrix-based sparsification to enhance convergence accuracy, and increases GPU core parallelism via a multiple local sorting to improve sparsification efficiency.
ADTopk employs an interleaved sparsification scheme to combine ADTopk and the traditional Top-𝑘 to speed up the convergence.
ADTopk employs a partial sparsification scheme via minimizing communication idle periods to to reduce the sparsification overhead.
ADTopk employs a hybrid collective communication that combines All-Reduce and All-Gather to improve sparse communication efficiency.

State-of-the-art gradient sparsification methods.

Implementation

ADTopk System Architecture

We implement ADTopk, which mainly consists of five main modules, an all-dimension Top-𝑘 sparsification module, an interleaved sparsification module, a partial sparsification module, a hybrid collective communication module, and a decompression and averaging module. We also implement a proofs module to prove the stable convergence of ADTopk distributed SGD on multiple training tasks.

We use the PyTorch framework and implemented the prototype system of ADTopk based on the Horovod framework using NCCL as the communication library. Overview of our system is as follows.

ADTopk System Overview

The workflow of the ADTopk System：

Installation

Prerequisites

Install ADTopk

git clone https://github.com/zqming-cs/ADTopk.git
cd ADTopk
pip install -r requirements.txt
HOROVOD_GPU_OPERATIONS=NCCL pip install horovod==0.28.1
pip install -e .

Quick start

The primary benchmark is provided in examples. For example, we can use the following command to run the benchmark on 8 GPUs, with sparsification algorithm as adtopk, gaussiank, sidco, and dgc, communication primitives as All-Reduce and All-Gather, memory as residual.

To run BERT-large training job:

cd ADTopk/examples/nlp_examples/bert/scripts
bash run_squad_bert.sh

To run GPT2-large training job:

cd ADTopk/examples/nlp_examples/gpt
bash run_clm_no_trainer_hvd_103.sh

To run ViT-large training job:

cd ADTopk/examples/cv_examples/vit
bash run_imagenet_no_trainer.sh

To run ResNet-152 training job:

cd ADTopk/examples/cv_examples/resnet
bash run_imagenet_resnet152.sh

Papers

Enabling Efficient All-Dimension Top-k Sparsification for High-Performance Distributed DNN Training Systems

An earlier version of this paper appeared at the Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing (HPDC2024), June 2024. In this extended version, we enhanced ADTopk with new system-level optimizations, including partial sparsification and hybrid collective communication. We also include new evaluation results and show that our enhanced ADTopk increases training throughput by up to 268.0%.

Citation

If you are using this repository for your paper, please cite our previous work

@inproceedings{ming2024adtopk,
  title={ADTopk: All-Dimension Top-k Compression for High-Performance Data-Parallel DNN Training},
  author={Zhangqiang Ming and Yuchong Hu and Wenxiang Zhou and Xinjue Zheng and Dan Feng},
  booktitle={Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing},
  url={https://doi.org/10.1145/3625549.3658678}
  year={2024}
}

Referred Datasets

Wikitex-2/103: https://huggingface.co/datasets/wikitext
SQuAD: https://rajpurkar.github.io/SQuAD-explorer/
ImageNet: https://www.image-net.org/
CIFAR-100: https://www.cs.utoronto.ca/~kriz/cifar.html

License

See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 51 Commits
adtopk_lib.egg-info		adtopk_lib.egg-info
adtopklib		adtopklib
espresso		espresso
examples		examples
hipress		hipress
oktopk		oktopk
proofs		proofs
wfbps		wfbps
LICENSE		LICENSE
README.md		README.md
overview_2.jpg		overview_2.jpg
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Enabling Efficient All-Dimension Top-k Sparsification for High-Performance Distributed DNN Training Systems

Introduction

ADTopk Framework

State-of-the-art gradient sparsification methods.

Implementation

ADTopk System Architecture

ADTopk System Overview

Installation

Prerequisites

Install ADTopk

Quick start

Papers

Citation

Referred Datasets

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Enabling Efficient All-Dimension Top-k Sparsification for High-Performance Distributed DNN Training Systems

Introduction

ADTopk Framework

State-of-the-art gradient sparsification methods.

Implementation

ADTopk System Architecture

ADTopk System Overview

Installation

Prerequisites

Install ADTopk

Quick start

Papers

Citation

Referred Datasets

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages