Accelerating Deep Convolutional Neural Networks Using Specialized Hardware
- Kalin Ovtcharov,
- Olatunji Ruwase,
- Joo-Young Kim,
- Jeremy Fowers,
- Karin Strauss,
- Eric Chung
We describe the design of a convolutional neural network (CNN) accelerator running on a Stratix V FPGA. The design achieves three times the throughput of previous FPGA-based CNN accelerator designs. We show that its throughput per watt is significantly higher than that of a GPU, and we project the design's performance when ported to an Arria 10 FPGA.