Paper
8 June 2024
M-LAB: scheduling space exploration of multitasks on tiled deep learning accelerators
Bingya Zhang, Sheng Zhang
Proceedings Volume 13171, Third International Conference on Algorithms, Microchips, and Network Applications (AMNA 2024); 131711E (2024) https://doi.org/10.1117/12.3032039
Event: 3rd International Conference on Algorithms, Microchips and Network Applications (AMNA 2024), 2024, Jinan, China
Abstract
With the increasing commercialization of deep neural networks (DNNs), there is a growing need to run multiple neural networks simultaneously on a single accelerator. This opens a new space for exploring the allocation of computing resources and the order of computation. However, most current research on multi-DNN scheduling relies on newly developed accelerators or employs heuristic methods aimed primarily at reducing DRAM traffic, increasing throughput, and improving Service Level Agreement (SLA) satisfaction. These approaches often suffer from poor portability, incompatibility with other optimization methods, and markedly high energy consumption. In this paper, we introduce a novel scheduling framework, M-LAB, in which all data scheduling is performed at the layer level rather than the network level; this makes our framework compatible with inter-layer scheduling research while significantly improving energy consumption and speed. To facilitate layer-level scheduling, M-LAB eliminates the conventional network boundaries, transforming inter-network dependencies into a layer-to-layer format. M-LAB then explores the scheduling space by combining inter-layer and intra-layer scheduling, which allows a more nuanced and efficient scheduling strategy tailored to the specific needs of multiple neural networks. Compared with current works, M-LAB achieves a 2.06x-4.85x speed-up and a 2.27x-4.12x cost reduction.
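To illustrate the idea of eliminating network boundaries, the sketch below (not the authors' code; all names and the sequential-dependency assumption are illustrative) merges the layers of several DNNs into one layer-level dependency graph and derives a schedule order that may interleave layers from different networks:

```python
# Hypothetical sketch of layer-level scheduling: per-network layer
# lists are flattened into a single dependency graph over
# (network, layer) nodes, so a scheduler can interleave layers from
# different DNNs instead of running whole networks back to back.
from collections import deque

def build_layer_graph(networks):
    """networks: dict mapping network name -> layer count.
    Assumes each network's layers depend sequentially on the
    previous layer (a simplifying assumption for illustration)."""
    deps = {}  # node -> set of predecessor nodes
    for name, n_layers in networks.items():
        for i in range(n_layers):
            deps[(name, i)] = {(name, i - 1)} if i > 0 else set()
    return deps

def layer_level_schedule(deps):
    """Return one valid topological order over all layers of all
    networks; a real scheduler would pick among such orders to
    optimize energy and latency."""
    indeg = {node: len(preds) for node, preds in deps.items()}
    ready = deque(sorted(n for n, d in indeg.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for succ, preds in deps.items():
            if node in preds:
                indeg[succ] -= 1
                if indeg[succ] == 0:
                    ready.append(succ)
    return order

# Example: two networks scheduled together at layer granularity.
order = layer_level_schedule(build_layer_graph({"netA": 3, "netB": 2}))
```

The resulting order respects each network's internal layer dependencies while leaving the scheduler free to interleave the two networks, which is the scheduling space M-LAB explores.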
© 2024 Published by SPIE. Downloading of the abstract is permitted for personal use only.
Bingya Zhang and Sheng Zhang "M-LAB: scheduling space exploration of multitasks on tiled deep learning accelerators", Proc. SPIE 13171, Third International Conference on Algorithms, Microchips, and Network Applications (AMNA 2024), 131711E (8 June 2024); https://doi.org/10.1117/12.3032039
KEYWORDS
Neural networks
Deep learning
Artificial intelligence
Artificial neural networks
Energy efficiency