DeepCache: Principled Cache for Mobile Deep Vision

  • Mengwei Xu,
  • Mengze Zhu,
  • Yunxin Liu,
  • Felix Xiaozhu Lin,
  • Xuanzhe Liu

MobiCom 2018

Published by ACM – Association for Computing Machinery

We present DeepCache, a principled cache design for deep learning inference in continuous mobile vision. DeepCache improves model execution efficiency by exploiting temporal locality in input video streams. It addresses a key challenge raised by mobile vision: the cache must operate under video scene variation while trading off among cacheability, overhead, and loss in model accuracy. At the input of a model, DeepCache discovers video temporal locality by exploiting the video's internal structure, for which it borrows proven heuristics from video compression; inside the model, DeepCache propagates regions of reusable results by exploiting the model's internal structure. Notably, DeepCache eschews applying video heuristics to model internals, which are not pixels but high-dimensional, difficult-to-interpret data.
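To make the two-stage idea concrete, the sketch below illustrates it in Python/NumPy. The input-side stage is shown with an exhaustive window search standing in for the diamond-search heuristic the paper borrows from video compression, and `propagate_interval` shows how a reusable region shrinks when mapped through one unpadded convolution layer. All function names, thresholds, and parameters here are illustrative assumptions, not the paper's implementation.

```python
import math
import numpy as np

def match_block(block, prev_frame, y, x, search_radius=8, threshold=10.0):
    """Scan a window of prev_frame around (y, x) for a block similar to
    `block` under mean absolute difference. Returns the matched position
    or None. (The paper uses a diamond-search heuristic from video
    codecs; this exhaustive scan is a simplification.)"""
    bh, bw = block.shape[:2]
    H, W = prev_frame.shape[:2]
    best, best_pos = threshold, None
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy and yy + bh <= H and 0 <= xx and xx + bw <= W:
                cand = prev_frame[yy:yy + bh, xx:xx + bw]
                mad = np.abs(block.astype(np.float32)
                             - cand.astype(np.float32)).mean()
                if mad < best:
                    best, best_pos = mad, (yy, xx)
    return best_pos

def reusable_regions(frame, prev_frame, block=16):
    """Divide `frame` into blocks; for each block that closely matches a
    block of the cached frame, record cur_pos -> cached_pos so the
    corresponding cached activations can be reused."""
    mapping = {}
    H, W = frame.shape[:2]
    for y in range(0, H - block + 1, block):
        for x in range(0, W - block + 1, block):
            pos = match_block(frame[y:y + block, x:x + block],
                              prev_frame, y, x)
            if pos is not None:
                mapping[(y, x)] = pos
    return mapping

def propagate_interval(y0, y1, kernel=3, stride=1):
    """Map a reusable input interval [y0, y1) through one unpadded
    convolution: keep only output positions whose receptive field lies
    entirely inside the interval, so cached values remain valid. The
    interval shrinks by the kernel overlap at every layer."""
    out0 = math.ceil(y0 / stride)
    out1 = (y1 - kernel) // stride + 1
    return (out0, out1) if out1 > out0 else None

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prev = rng.integers(0, 256, (64, 64), dtype=np.uint8)
    cur = prev.copy()
    # Disturb one corner: only that region should miss the cache.
    cur[:16, :16] = rng.integers(0, 256, (16, 16), dtype=np.uint8)
    print(len(reusable_regions(cur, prev)))  # most blocks hit the cache
    print(propagate_interval(16, 64))        # rows [16, 64) -> (16, 62)
```

Applying `propagate_interval` per layer captures why reuse degrades with depth: each convolution erodes the reusable region by its kernel overlap, which is one reason the paper confines video heuristics to the input rather than applying them to intermediate feature maps.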
Our implementation of DeepCache works with unmodified deep learning models, requires zero developer effort, and is therefore immediately deployable on off-the-shelf mobile devices. Our experiments show that DeepCache saves inference execution time by 18% on average and up to 47%. DeepCache reduces system energy consumption by 20% on average.