Compiling KB-Sized Machine Learning Models to Constrained Hardware

Sridhar Gopinath, Nikhil Ghanathe, Vivek Seshadri, Rahul Sharma

MSR-TR-2018-35

Recent breakthroughs in machine learning (ML) have produced models small enough to run directly on constrained IoT devices. This approach avoids expensive communication between IoT devices and the cloud, enabling energy-efficient real-time analytics. However, ML models are typically expressed in floating-point arithmetic, and IoT hardware typically lacks floating-point support. Running these models on IoT devices therefore requires emulating IEEE-754 floating-point in software, which is highly inefficient.

This paper presents SeeDot, a domain-specific language for expressing ML inference algorithms, and an associated compiler that translates SeeDot programs into fixed-point code that runs efficiently on constrained IoT devices. We propose (1) a novel compilation strategy that reduces the search space for key parameters of the fixed-point code, and (2) new, efficient implementations of expensive operations. For microcontrollers, our evaluation shows that SeeDot-generated programs have comparable accuracy, are 2.4x–11.9x faster than floating-point implementations, and are up to two orders of magnitude faster than code generated by a commercial float-to-fixed converter. SeeDot-based FPGA implementations are 18.7x–211.3x faster than microcontroller-based implementations and 5.2x–9.8x faster than FPGA implementations generated by commercial high-level synthesis tools.