
A Variable Precision Approach for Deep Neural Networks

Bui D.-H. | Nguyen D.-A. | Tran X.-T.
SISLAB, VNU University of Engineering and Technology, 144 Xuan Thuy Road, Cau Giay, Hanoi, Viet Nam

International Conference on Advanced Technologies for Communications, 2019 (Volume 2019-October, pages 313-318)

DOI: 10.1109/ATC.2019.8924549

Indexed in: Scopus

Language: English

Keywords: Deep learning; Electric power utilization; Integer programming; Network architecture; Neural networks; Product design; Throughput; Computing capability; Embedded application; Hardware implementations; Higher frequencies; Lower-power consumption; Variable precision; Variable weight; Verification results; Deep neural networks
Abstract (English)
Deep Neural Network (DNN) architectures have recently been considered a major breakthrough for a variety of applications. Because of the high computing capabilities required, DNNs have been unsuitable for many embedded applications. Many works have tried to optimize the key operation, multiply-and-add, in hardware for a smaller area, higher throughput, and lower power consumption. One way to optimize these factors is to use reduced bit accuracy; for example, Google's TPU uses only 8-bit integer operations for DNN inference. Based on the characteristics of the different layers in a DNN, the bit accuracy can be further adjusted to conserve hardware area, power consumption, and throughput. In this work, we investigate a hardware implementation of multiply-and-add with variable bit precision that can be adjusted at computation time. The proposed design can calculate the sum of several products with bit precision ranging from 1 to 16 bits. Hardware implementation results on the Xilinx Virtex-7 VC707 FPGA development kit show that our design occupies less hardware and can run at a higher frequency of 310 MHz, while the same functionality implemented with and without DSP48 blocks can only run at a frequency of 102 MHz. In addition, to demonstrate that the proposed design is effectively applicable to deep neural network architectures, the new design was also integrated into the MNIST network. The simulation and verification results show that the proposed system can achieve an accuracy of up to 88%. © 2019 IEEE.
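To make the idea concrete, the following is a minimal software sketch of a multiply-and-accumulate whose operand precision is selected at computation time, in the spirit of the 1- to 16-bit variable precision described in the abstract. It is not the authors' FPGA design; the function names and the simple round-and-clamp quantization scheme are illustrative assumptions.

# Illustrative sketch only: a software model of a variable-precision
# multiply-and-accumulate (MAC). Real hardware would share one wide
# datapath across precisions; here we only model the numerical effect.

def quantize(x, bits):
    """Round a real value and clamp it to a signed `bits`-bit integer range."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, int(round(x))))

def variable_precision_mac(activations, weights, bits):
    """Sum of products with both operands reduced to `bits`-bit signed integers."""
    acc = 0
    for a, w in zip(activations, weights):
        acc += quantize(a, bits) * quantize(w, bits)
    return acc

# Example: the same dot product evaluated at 8-bit and 4-bit precision.
acts = [12.7, -3.2, 45.0, 7.9]
wts = [0.9, -15.4, 2.2, 6.1]
print(variable_precision_mac(acts, wts, bits=8))
print(variable_precision_mac(acts, wts, bits=4))

The second call clamps values that exceed the 4-bit signed range, illustrating the accuracy/precision trade-off that motivates choosing the bit width per layer.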
