
Acceleration of Deep Neural Network Training Using Field Programmable Gate Arrays

Guta Tesema Tufa (57948167600), Faculty of Electrical and Computer Engineering, Arba Minch Institute of Technology, Arba Minch, Ethiopia; Anchit (56905383800); Andargie, School of Electrical and Computer Engineering, Addis Ababa Institute of Technology, Ethiopia

Computational Intelligence and Neuroscience, 2022 (Vol. 2022, pp. -)

ISSN: 1687-5265


DOI:

Document category:

Article

English

Keywords: Acceleration; Algorithms; Neural Networks, Computer; Convolutional neural networks; Deep neural networks; Energy efficiency; Gradient methods; Large datasets; System-on-chip; Computational resources; Energy efficient; Network inference; Network training; Neural network training; Performance; Speed up; Field programmable gate arrays (FPGA)
English abstract
Convolutional neural network (CNN) training often requires a considerable amount of computational resources. In recent years, several studies have proposed accelerators for CNN inference and training, in which FPGAs have demonstrated good performance and energy efficiency. To speed up processing, CNN training demands additional computational resources such as memory bandwidth, FPGA platform resources, time, power, and large training datasets, and it is constrained by the need for improved hardware acceleration to scale beyond existing data and model sizes. This paper proposes a procedure for energy-efficient CNN training in collaboration with an FPGA-based accelerator. We employ optimizations such as quantization, a common model compression technique, to speed up the CNN training process. Additionally, a gradient accumulation buffer is used to ensure maximum operating efficiency while preserving the gradient descent of the learning algorithm. To validate the design, we implemented the AlexNet and VGG-16 models on an FPGA board and on a laptop CPU alongside a GPU. The design achieves 203.75 GOPS with the AlexNet model and 196.50 GOPS with the VGG-16 model on the Terasic DE1-SoC. Our results also show that the FPGA accelerators are more energy efficient than the other platforms. © 2022 Guta Tesema Tufa et al.
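The abstract names two optimizations: quantization of values and a gradient accumulation buffer that sums gradients over several micro-batches before applying one weight update. The sketch below is an illustrative CPU-side model of those two ideas only, not the paper's FPGA design; the 1-D least-squares problem, the function names, and all parameter values are assumptions chosen for clarity.

```python
def quantize(v, bits=8, scale=4.0):
    """Uniform symmetric quantization, a common model compression
    technique: clamp v to [-scale, scale] on a (2^bits - 1)-level grid."""
    qmax = 2 ** (bits - 1) - 1
    q = max(-qmax, min(qmax, round(v / scale * qmax)))
    return q * scale / qmax

def grad(w, x, y):
    """Gradient of the per-sample loss 0.5 * (w*x - y)^2 w.r.t. w."""
    return (w * x - y) * x

def train_accumulated(data, w=0.0, lr=0.1, accum_steps=4, epochs=50):
    """SGD with a gradient accumulation buffer: gradients are summed
    for accum_steps micro-batches, then one averaged update is applied,
    matching full-batch gradient descent when the window spans the data."""
    buf = 0.0      # gradient accumulation buffer
    count = 0
    for _ in range(epochs):
        for x, y in data:
            buf += grad(w, x, y)
            count += 1
            if count == accum_steps:
                w -= lr * buf / accum_steps   # one update per window
                buf, count = 0.0, 0
    return w

# Fit y = 2*x from four samples; with accum_steps equal to the dataset
# size, each update is exactly a full-batch gradient-descent step.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_fit = train_accumulated(samples)
```

Here `w_fit` converges to the true slope 2.0, and `quantize` introduces at most half a grid step of error per value, which is the trade-off the paper exploits to shrink arithmetic on the FPGA.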
