Paper
8 June 2024
M-LAB: scheduling space exploration of multitasks on tiled deep learning accelerators
Bingya Zhang, Sheng Zhang
Proceedings Volume 13171, Third International Conference on Algorithms, Microchips, and Network Applications (AMNA 2024); 131711E (2024) https://doi.org/10.1117/12.3032039
Event: 3rd International Conference on Algorithms, Microchips and Network Applications (AMNA 2024), 2024, Jinan, China
Abstract
With the increasing commercialization of deep neural networks (DNNs), there is a growing need to run multiple neural networks simultaneously on a single accelerator. This opens a new space for exploring the allocation of computing resources and the order of computation. However, most current research on multi-DNN scheduling relies on newly developed accelerators or employs heuristic methods aimed primarily at reducing DRAM traffic, increasing throughput, and improving Service Level Agreement (SLA) satisfaction. These approaches often suffer from poor portability, incompatibility with other optimization methods, and markedly high energy consumption. In this paper, we introduce a novel scheduling framework, M-LAB, in which all data scheduling is performed at the layer level rather than the network level; this makes our framework compatible with inter-layer scheduling research while significantly improving energy consumption and speed. To facilitate layer-level scheduling, M-LAB eliminates the conventional network boundaries, transforming inter-network dependencies into a layer-to-layer format. M-LAB then explores the scheduling space by combining inter-layer and intra-layer scheduling, which allows a more nuanced and efficient scheduling strategy tailored to the specific needs of multiple neural networks. Compared with current works, M-LAB achieves a 2.06x-4.85x speed-up and a 2.27x-4.12x cost reduction.
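To illustrate the idea of eliminating network boundaries, the sketch below (not the authors' code; all names and the sequential-dependency assumption are illustrative) merges the layers of several DNNs into one layer-level dependency graph and derives a schedule order that may interleave layers from different networks:

```python
# Hypothetical sketch of layer-level scheduling: per-network layer
# lists are flattened into a single dependency graph over
# (network, layer) nodes, so a scheduler can interleave layers from
# different DNNs instead of running whole networks back to back.
from collections import deque

def build_layer_graph(networks):
    """networks: dict mapping network name -> layer count.
    Assumes each network's layers depend sequentially on the
    previous layer (a simplifying assumption for illustration)."""
    deps = {}  # node -> set of predecessor nodes
    for name, n_layers in networks.items():
        for i in range(n_layers):
            deps[(name, i)] = {(name, i - 1)} if i > 0 else set()
    return deps

def layer_level_schedule(deps):
    """Return one valid topological order over all layers of all
    networks; a real scheduler would pick among such orders to
    optimize energy and latency."""
    indeg = {node: len(preds) for node, preds in deps.items()}
    ready = deque(sorted(n for n, d in indeg.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for succ, preds in deps.items():
            if node in preds:
                indeg[succ] -= 1
                if indeg[succ] == 0:
                    ready.append(succ)
    return order

# Example: two networks scheduled together at layer granularity.
order = layer_level_schedule(build_layer_graph({"netA": 3, "netB": 2}))
```

The resulting order respects each network's internal layer dependencies while leaving the scheduler free to interleave the two networks, which is the scheduling space M-LAB explores.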
© 2024 Published by SPIE. Downloading of the abstract is permitted for personal use only.
Bingya Zhang and Sheng Zhang "M-LAB: scheduling space exploration of multitasks on tiled deep learning accelerators", Proc. SPIE 13171, Third International Conference on Algorithms, Microchips, and Network Applications (AMNA 2024), 131711E (8 June 2024); https://doi.org/10.1117/12.3032039
KEYWORDS
Neural networks
Deep learning
Artificial intelligence
Artificial neural networks
Energy efficiency