Abstract

This paper presents the Visual Attention Engine (VAE), which is a digital cellular neural network (CNN) that executes the VA algorithm to speed up object-recognition. The proposed time-multiplexed processing element (TMPE) CNN topology achieves high performance and small area by integrating 4800 (80 × 60) cells and 120 PEs. Pipelined operation of the PEs and single-cycle global shift capability of the cells result in a high PE utilization ratio of 93%. The cells are implemented by 6T static random access memory-based register files and dynamic shift registers to enable a small area of 4.5 mm2. The bus connections between PEs and cells are optimized to minimize power consumption. The VAE is integrated within an object-recognition system-on-chip (SoC) fabricated in the 0.13-µm complementary metal–oxide–semiconductor process. It achieves 24 GOPS peak performance and 22 GOPS sustained performance at 200 MHz enabling one CNN iteration on an 80 × 60 pixel image to be completed in just 4.3 µs. With VA enabled using the VAE, the workload of the object-recognition SoC is significantly reduced, resulting in 83% higher frame rate while consuming 45% less energy per frame without degradation of recognition accuracy.