Training Deep Learning Recommendation Model with Quantized Collective Communications

Conference on Knowledge Discovery and Data Mining (KDD)


The Deep Learning Recommendation Model (DLRM) captures our representative model architectures developed for click-through-rate (CTR) prediction based on high-dimensional sparse categorical data. Collective communications can account for a significant fraction of time in synchronous training of DLRM at scale. In this work, we explore using fine-grained integer quantization to reduce the communication volume of alltoall and allreduce collectives. We emulate quantized alltoall and allreduce, the latter using ring or recursive-doubling algorithms, each with optional carried-forward error compensation. We benchmark the accuracy loss of quantized alltoall and allreduce with a representative DLRM model and the Kaggle 7D dataset. We show that the alltoall forward and backward passes and the dense allreduce can be quantized to 4 bits without accuracy loss compared to full-precision training.
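To make the idea of fine-grained quantization with carried-forward error compensation concrete, the following is a minimal sketch, not the implementation from the paper. It assumes uniform min/max quantization applied independently to small groups of values, and it emulates the lossy communication by quantizing and immediately dequantizing. The function names, the `group_size` parameter, and the min/max scaling rule are illustrative assumptions; the abstract does not specify them.

```python
from typing import List, Tuple


def quantize_group(values: List[float], bits: int = 4) -> Tuple[List[int], float, float]:
    """Uniformly quantize one group of floats to unsigned `bits`-bit codes.

    Assumption: min/max (asymmetric) scaling per group.
    """
    levels = (1 << bits) - 1            # e.g. 15 for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes: List[int], scale: float, lo: float) -> List[float]:
    """Map integer codes back to approximate float values."""
    return [c * scale + lo for c in codes]


def quantize_with_error_feedback(
    values: List[float],
    residual: List[float],
    bits: int = 4,
    group_size: int = 4,                # hypothetical granularity
) -> Tuple[List[float], List[float]]:
    """Emulate one quantized transfer with carried-forward error compensation.

    The residual left over from the previous step is added before
    quantizing, and the new quantization error is returned so that the
    caller can carry it into the next step.
    """
    compensated = [v + r for v, r in zip(values, residual)]
    decoded: List[float] = []
    for start in range(0, len(compensated), group_size):
        group = compensated[start:start + group_size]
        codes, scale, lo = quantize_group(group, bits)
        decoded.extend(dequantize_group(codes, scale, lo))
    new_residual = [c - d for c, d in zip(compensated, decoded)]
    return decoded, new_residual


# Example: one emulated step on a small, made-up gradient vector.
grad = [0.12, -0.40, 0.33, 0.05, 1.50, -2.00, 0.75, 0.10]
residual = [0.0] * len(grad)
decoded, residual = quantize_with_error_feedback(grad, residual)
```

In this sketch, each group carries its own scale and offset, so a large value in one group does not degrade the resolution of another. That is the sense in which the quantization is fine-grained here. The returned residual would be passed back in on the next training iteration.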
