05.02.2021 | Research
Current Research at DSC: Picking the Right AI Hardware
Towards increased operational efficiency: using neural networks to select the right hardware while the software is still being written.
With the paper “Pick the Right Edge Device: Towards Power and Performance Estimation of CUDA-based CNNs on GPGPUs”,
our collaborator Christopher Metz, in cooperation with Dr.-Ing. Mehran Goli (DFKI), presents a new approach for estimating the power consumption and performance of Convolutional Neural Networks (CNNs) on different graphics card models (GPGPUs).
This will make it possible to obtain such estimates for new CNNs without first running them on real hardware. This is especially helpful for systems in the edge or embedded computing domain, since fewer prototypes need to be tested. The estimates can also help to choose a cheaper GPGPU model and thus have a direct impact on the final price of a product.
The estimation itself is performed by a neural network. As input it receives an instruction profile of the CNN and a description of the hardware components of a GPGPU. The instruction profile is generated from the source code of the CNN and contains the compute instructions for the graphics card, divided into different classes. From this information the neural network produces its estimate. The network has to be executed once for each GPGPU under consideration, and the best GPGPU must then still be selected manually.
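The flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation from the paper: the instruction class names, the fields of `GpuSpec`, the device names `gpu_a` and `gpu_b`, and all numbers are hypothetical, and `placeholder_estimator` merely stands in for the trained neural network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical instruction classes; the article does not name the classes used.
INSTRUCTION_CLASSES = ("float", "integer", "memory", "control")


@dataclass(frozen=True)
class GpuSpec:
    """Hardware description of one candidate GPGPU (fields are assumptions)."""
    name: str
    cuda_cores: int
    clock_mhz: float
    memory_bandwidth_gbs: float


def build_input(profile: Dict[str, int], gpu: GpuSpec) -> List[float]:
    """Concatenate instruction counts and hardware features into one vector."""
    counts = [float(profile.get(cls, 0)) for cls in INSTRUCTION_CLASSES]
    hardware = [float(gpu.cuda_cores), gpu.clock_mhz, gpu.memory_bandwidth_gbs]
    return counts + hardware


def placeholder_estimator(features: Sequence[float]) -> Tuple[float, float]:
    """Stand-in for the trained network; returns (power, runtime).

    The formulas are arbitrary and only make the sketch runnable.
    """
    n = len(INSTRUCTION_CLASSES)
    total_instructions = sum(features[:n])
    cores, clock_mhz, _bandwidth = features[n:]
    throughput = cores * clock_mhz
    return throughput * 1e-4, total_instructions / throughput


def estimate_all(
    profile: Dict[str, int],
    gpus: Sequence[GpuSpec],
    estimator: Callable[[Sequence[float]], Tuple[float, float]],
) -> Dict[str, Tuple[float, float]]:
    """Run the estimator once per candidate GPGPU."""
    return {gpu.name: estimator(build_input(profile, gpu)) for gpu in gpus}


# Made-up profile and made-up devices, for illustration only.
profile = {"float": 120_000, "integer": 40_000, "memory": 30_000, "control": 10_000}
gpus = [GpuSpec("gpu_a", 128, 900.0, 25.6), GpuSpec("gpu_b", 512, 1300.0, 59.7)]

estimates = estimate_all(profile, gpus, placeholder_estimator)
for name, (power, runtime) in estimates.items():
    print(f"{name}: power={power:.2f}, runtime={runtime:.4f} (arbitrary units)")
```

The sketch deliberately stops at listing one estimate per device: as noted above, choosing the best GPGPU from these estimates remains a manual step.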
Previous approaches rely on so-called performance counters for estimation and prediction. The disadvantage of performance counters is that they are only available at execution time, so a new model must first be executed on real hardware in order to obtain them. The existing approaches can produce estimates as well, but only after this execution step. With the new approach, the machine learning models do not need to be executed even once, so the appropriate hardware can be determined much earlier, while the software is still being written.
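To illustrate why an instruction profile needs no execution, the following sketch counts instruction classes directly from program text. It assumes that the compiled intermediate code is available as PTX-like text. The class mapping, the helper names `classify` and `instruction_profile`, and the sample snippet are hypothetical and not taken from the paper.

```python
from collections import Counter

# Hypothetical mapping from opcodes to instruction classes.
MEMORY_OPS = {"ld", "st"}
CONTROL_OPS = {"bra", "ret", "call", "exit"}
ARITHMETIC_OPS = {"add", "sub", "mul", "mad", "fma", "div"}
FLOAT_TYPES = {"f16", "f32", "f64"}


def classify(instruction: str) -> str:
    """Map one instruction such as 'fma.rn.f32' to a class name."""
    opcode, *modifiers = instruction.split(".")
    if opcode in MEMORY_OPS:
        return "memory"
    if opcode in CONTROL_OPS:
        return "control"
    if opcode in ARITHMETIC_OPS:
        return "float" if FLOAT_TYPES.intersection(modifiers) else "integer"
    return "other"


def instruction_profile(ptx_text: str) -> Counter:
    """Count instruction classes by reading the text; nothing is executed."""
    profile = Counter()
    for raw_line in ptx_text.splitlines():
        line = raw_line.split("//")[0].strip()  # drop comments
        # Skip blank lines, directives, braces, and labels.
        if not line or line.startswith((".", "{", "}")) or line.endswith(":"):
            continue
        tokens = line.split()
        if tokens[0].startswith("@"):  # drop a guard predicate such as @%p1
            tokens = tokens[1:]
        if tokens:
            profile[classify(tokens[0].rstrip(";"))] += 1
    return profile


# Small hand-written sample in PTX-like syntax, for illustration only.
SAMPLE_PTX = """
LOOP:
    ld.global.f32 %f1, [%rd1];
    ld.global.f32 %f2, [%rd2];
    fma.rn.f32 %f3, %f1, %f2, %f3;
    add.s32 %r1, %r1, 1;
    setp.lt.s32 %p1, %r1, %r2;
    @%p1 bra LOOP;
    st.global.f32 [%rd3], %f3;
    ret;
"""

sample_profile = instruction_profile(SAMPLE_PTX)
print(dict(sample_profile))
```

Note that this is a purely static count: an instruction inside a loop is counted once, regardless of how often the loop would run. Performance counters, by contrast, only exist after the program has run on a device.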
The work was done in cooperation with the Cyber-Physical Systems group of the DFKI in Bremen and was submitted to and presented at the System-level Design Methods for Deep Learning on Heterogeneous Architectures (SLOHA) workshop. The co-authors are Dr.-Ing. Mehran Goli (DFKI/University of Bremen) and Prof. Rolf Drechsler (University of Bremen/DFKI). The training data set on which the neural network will be trained is currently being generated and collected. The results of the experiments will be published in a future paper.
Author: Christopher Metz
Please contact us if you have any questions:
+49 (421) 218 - 63942