Paper
26 June 2023

Maximize multicast transmission throughput for multi-distributed model training tasks
Hanfeng Zhan, Hongli Xu
Abstract
Reducing the significant communication overhead of large-scale distributed model training is an active research topic. Existing studies reduce the communication cost from workers to parameter servers (upstream) by exploiting the in-network aggregation capability of programmable switches, or lower the overall communication cost by pruning models. However, none of them investigates how to accelerate downlink communication for multiple distributed model training tasks in data centers under the constraints of programmable switches and links. We propose an algorithm that achieves an approximation ratio of O(log κ/β + 1). Through simulation, we demonstrate that our algorithm outperforms previous multicast methods.
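The abstract does not spell out the algorithm, so the sketch below is only a hypothetical baseline for the problem it describes: each training task's downlink multicast is routed over one candidate tree, and rates are allocated greedily subject to per-link capacity. All names here (Task, greedy_throughput, the toy topology) are illustrative assumptions, not the paper's method.

```python
# A minimal, illustrative sketch -- NOT the paper's algorithm. It greedily
# gives each training task's downlink multicast tree the largest rate its
# links can still carry, under per-link capacity constraints.

from dataclasses import dataclass


@dataclass
class Task:
    name: str
    trees: list  # candidate multicast trees, each a set of directed links (u, v)


def greedy_throughput(tasks, capacity):
    """Allocate multicast rates greedily; return the total throughput."""
    residual = dict(capacity)  # remaining capacity per link
    total = 0.0
    for task in tasks:
        # Pick the candidate tree whose bottleneck link has the most headroom.
        best_tree, best_rate = None, 0.0
        for tree in task.trees:
            rate = min(residual[e] for e in tree)  # bottleneck rate on this tree
            if rate > best_rate:
                best_tree, best_rate = tree, rate
        if best_tree is None:
            continue  # no residual capacity left for this task
        for e in best_tree:  # reserve capacity along the chosen tree
            residual[e] -= best_rate
        total += best_rate
    return total


# Toy topology: parameter server S multicasts model updates to workers A and B
# through a single switch X.
capacity = {("S", "X"): 10.0, ("X", "A"): 6.0, ("X", "B"): 8.0}
tasks = [
    Task("job1", trees=[{("S", "X"), ("X", "A")}]),
    Task("job2", trees=[{("S", "X"), ("X", "B")}]),
]
print(greedy_throughput(tasks, capacity))  # 10.0: job1 gets 6, job2 gets 4
```

Picking the tree with the largest bottleneck headroom is a common greedy baseline for capacity-constrained multicast; the paper's algorithm with its O(log κ/β + 1) approximation ratio would replace this selection rule.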
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Hanfeng Zhan and Hongli Xu "Maximize multicast transmission throughput for multi-distributed model training tasks", Proc. SPIE 12721, Second International Symposium on Computer Applications and Information Systems (ISCAIS 2023), 127210F (26 June 2023); https://doi.org/10.1117/12.2683437
KEYWORDS
Switches, Packet switching, Data centers