Efficient Memory Organization for DNN Hardware Accelerator Implementation on PSoC

The use of deep learning solutions in different disciplines is increasing and their algorithms are computationally expensive in most cases. For this reason, numerous hardware accelerators have appeared to compute their operations efficiently in parallel, achieving higher performance and lower latenc...


Bibliographic Details
Main Authors: Antonio Rios-Navarro, Daniel Gutierrez-Galan, Juan Pedro Dominguez-Morales, Enrique Piñero-Fuentes, Lourdes Duran-Lopez, Ricardo Tapiador-Morales, Manuel Jesús Dominguez-Morales
Format: Article
Language: English
Published: MDPI AG 2021-01-01
Series: Electronics
Subjects: deep learning; embedded systems; PSoC; memory organization; FPGA; hardware accelerator
Online Access: https://www.mdpi.com/2079-9292/10/1/94
author Antonio Rios-Navarro
Daniel Gutierrez-Galan
Juan Pedro Dominguez-Morales
Enrique Piñero-Fuentes
Lourdes Duran-Lopez
Ricardo Tapiador-Morales
Manuel Jesús Dominguez-Morales
collection DOAJ
description The use of deep learning solutions in different disciplines is increasing, and their algorithms are computationally expensive in most cases. For this reason, numerous hardware accelerators have appeared that compute these operations efficiently in parallel, achieving higher performance and lower latency. These algorithms need large amounts of data to feed each of their computing layers, so the data transfers that feed the accelerators and collect their results must be handled efficiently. Hybrid devices are widely used to implement these accelerators; they combine an embedded computer, which can run an operating system, with a field-programmable gate array (FPGA), where the accelerator can be deployed. In this work, we present a software API that organizes the memory efficiently and avoids moving data from one memory area to another. It achieves an 85% speed-up over the native Linux driver and reduces the frame computing time by 28% in a real application.
first_indexed 2024-03-10T13:29:13Z
format Article
id doaj.art-9b7c67ef574046b0a1d6b1b8d9f31c97
institution Directory of Open Access Journals
issn 2079-9292
language English
last_indexed 2024-03-10T13:29:13Z
publishDate 2021-01-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-9b7c67ef574046b0a1d6b1b8d9f31c97 2023-11-21T08:26:51Z eng MDPI AG, Electronics, ISSN 2079-9292, 2021-01-01, vol. 10, no. 1, art. 94, doi:10.3390/electronics10010094
Author affiliation (all seven authors): Robotics and Technology of Computers Lab, Universidad de Sevilla, Escuela Tecnica Superior de Ingenieria Informatica / Escuela Politecnica Superior, 41012 Sevilla, Spain
title Efficient Memory Organization for DNN Hardware Accelerator Implementation on PSoC
topic deep learning
embedded systems
PSoC
memory organization
FPGA
hardware accelerator
url https://www.mdpi.com/2079-9292/10/1/94