Efficient Memory Organization for DNN Hardware Accelerator Implementation on PSoC

The use of deep learning solutions in different disciplines is increasing and their algorithms are computationally expensive in most cases. For this reason, numerous hardware accelerators have appeared to compute their operations efficiently in parallel, achieving higher performance and lower latenc...

Bibliographic Details
Main Authors:	Antonio Rios-Navarro, Daniel Gutierrez-Galan, Juan Pedro Dominguez-Morales, Enrique Piñero-Fuentes, Lourdes Duran-Lopez, Ricardo Tapiador-Morales, Manuel Jesús Dominguez-Morales
Format:	Article
Language:	English
Published:	MDPI AG 2021-01-01
Series:	Electronics
Subjects:	deep learning embedded systems PSoC memory organization FPGA hardware accelerator
Online Access:	https://www.mdpi.com/2079-9292/10/1/94

_version_	1797542340280188928
author	Antonio Rios-Navarro Daniel Gutierrez-Galan Juan Pedro Dominguez-Morales Enrique Piñero-Fuentes Lourdes Duran-Lopez Ricardo Tapiador-Morales Manuel Jesús Dominguez-Morales
collection	DOAJ
description	The use of deep learning solutions in different disciplines is increasing, and their algorithms are computationally expensive in most cases. For this reason, numerous hardware accelerators have appeared to compute their operations efficiently in parallel, achieving higher performance and lower latency. These algorithms need large amounts of data to feed each of their computing layers, which makes it necessary to efficiently handle the data transfers that feed and collect the information to and from the accelerators. For the implementation of these accelerators, hybrid devices are widely used, which have an embedded computer, where an operating system can be run, and a field-programmable gate array (FPGA), where the accelerator can be deployed. In this work, we present a software API that efficiently organizes the memory, avoiding the reallocation of data from one memory area to another, which improves on the native Linux driver with an 85% speed-up and reduces the frame computing time by 28% in a real application.
first_indexed	2024-03-10T13:29:13Z
format	Article
id	doaj.art-9b7c67ef574046b0a1d6b1b8d9f31c97
institution	Directory Open Access Journal
issn	2079-9292
language	English
last_indexed	2024-03-10T13:29:13Z
publishDate	2021-01-01
publisher	MDPI AG
record_format	Article
series	Electronics
spelling	doaj.art-9b7c67ef574046b0a1d6b1b8d9f31c97 | 2023-11-21T08:26:51Z | eng | MDPI AG | Electronics | 2079-9292 | 2021-01-01 | vol. 10, no. 1, art. 94 | 10.3390/electronics10010094 | Efficient Memory Organization for DNN Hardware Accelerator Implementation on PSoC | Antonio Rios-Navarro, Daniel Gutierrez-Galan, Juan Pedro Dominguez-Morales, Enrique Piñero-Fuentes, Lourdes Duran-Lopez, Ricardo Tapiador-Morales, Manuel Jesús Dominguez-Morales | Affiliation (all authors): Robotics and Technology of Computers Lab, Universidad de Sevilla, Escuela Tecnica Superior de Ingenieria Informatica, Escuela Politectica Superior, 41012 Sevilla, Spain | https://www.mdpi.com/2079-9292/10/1/94 | deep learning; embedded systems; PSoC; memory organization; FPGA; hardware accelerator
title	Efficient Memory Organization for DNN Hardware Accelerator Implementation on PSoC
topic	deep learning embedded systems PSoC memory organization FPGA hardware accelerator
url	https://www.mdpi.com/2079-9292/10/1/94
