The best of many worlds: scheduling machine learning inference on CPU-GPU integrated architectures

Vasiliadis Giorgos, Tsirbas Rafail, Ioannidis Sotirios

URI	http://purl.tuc.gr/dl/dias/30E0BFB7-D2EE-4943-B7A2-0F2D1C5DA2EC	-
Identifier	https://doi.org/10.1109/IPDPSW55747.2022.00017	-
Identifier	https://ieeexplore.ieee.org/document/9835207	-
Language	en	-
Extent	10 pages	en
Title	The best of many worlds: scheduling machine learning inference on CPU-GPU integrated architectures	en
Creator	Vasiliadis Giorgos	en
Creator	Tsirbas Rafail	en
Creator	Ioannidis Sotirios	en
Creator	Ιωαννιδης Σωτηριος	el
Publisher	Institute of Electrical and Electronics Engineers	en
Description	This work was supported by the projects CONCORDIA, C4IIoT, COLLABS, and MARVEL funded by the European Commission under Grant Agreements No. 830927, No. 833828, No. 871518, and No. 957337.	en
Content Summary	A plethora of applications use machine learning, the operations of which are becoming more complex and require additional computing power. At the same time, typical commodity system setups (including desktops, servers, and embedded devices) now offer several processing devices, the most common of which are multi-core CPUs, integrated GPUs, and discrete GPUs. In this paper, we follow a data-driven approach, where we first show the performance of different processing devices when executing a diversified set of inference engines; different processing devices perform better on different performance metrics (e.g., throughput, latency, and power consumption), and these metrics may also vary significantly across applications. Based on these findings, we propose an adaptive scheduling approach, tailored for machine learning inference operations, that enables the use of the most efficient processing device available. Our scheduler is device-agnostic and can respond quickly to dynamic fluctuations that occur in real time, such as data bursts, application overloads, and system changes. The experimental results show that it is able to match the peak throughput by correctly predicting the optimal processing device with an accuracy of 92.5%, with energy savings of up to 10%.	en
Type of Item	Conference Full Paper	en
License	http://creativecommons.org/licenses/by-nc-nd/4.0/	en
Date of Item	2024-07-31	-
Date of Publication	2022	-
Subject	Machine learning algorithms	en
Subject	Computer architecture	en
Subject	Performance evaluation	en
Bibliographic Citation	G. Vasiliadis, R. Tsirbas and S. Ioannidis, "The best of many worlds: scheduling machine learning inference on CPU-GPU integrated architectures," in Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2022), Lyon, France, 2022, pp. 55-64, doi: 10.1109/IPDPSW55747.2022.00017.	en
