Researchers develop new visual intelligence techniques to boost smart home security

Published June 16, 2017


By Kangping Liu, Senior Research Program Manager, Microsoft Research Asia

Imagine that when you leave your house or apartment, a smart home security system automatically “looks after” your home, sending you real-time notices about events there or providing a short video of all the events of interest that happened while you were gone. With such a system, parents could keep track of their older kids’ activities and get real-time alarms about potential dangers facing their kids or elderly family members. It sounds great, but these functions place heavy demands on smart home systems because they require both accurate event understanding and real-time processing.

In response to Microsoft Research Asia’s “Big Video Data Analytics” collaborative research call for proposals, Dr. Weiyao Lin, an associate professor in the electronic engineering department of Shanghai Jiao Tong University in China, has tackled these challenges with deep learning techniques. Collaborating with Dr. Tao Mei, a senior researcher at Microsoft Research Asia, Lin and his students have designed a system that can detect abnormal events in real time and adaptively create “online” summarization videos for user-selected events of interest. The system also supports remote interaction and control through a smartphone.


Lin proposed a real-time event detection method based on deep learning that integrates visual object detection, tracking, and event parsing into a single convolutional network-based framework. The method can reliably detect abnormal events, such as a person falling down, in real time across different scenarios. Lin also developed an event-based video summarization method. Unlike most existing summarization approaches, this method performs online summarization: the summarization step is embedded in the video capturing process. This largely avoids the extra computation that traditional offline summarization methods require. Lin’s method also introduces an event-based scheme that automatically identifies event types and adaptively creates different summarization videos according to user-selected events of interest.
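To make the idea of online, event-based summarization concrete, here is a minimal Python sketch. It is not Lin’s method: it assumes that some upstream detector (not shown) has already labeled each captured frame with an event type or with nothing, and it simply decides, frame by frame as the stream arrives, whether a frame belongs in the summary. The function name `summarize_online`, the `(index, label)` frame format, and the `trailing_context` parameter are all hypothetical names chosen for illustration.

```python
from typing import Iterable, Iterator, Optional, Set, Tuple

# Hypothetical frame record: (frame index, event label from an assumed
# upstream detector, or None when no event was detected in that frame).
LabeledFrame = Tuple[int, Optional[str]]


def summarize_online(
    stream: Iterable[LabeledFrame],
    events_of_interest: Set[str],
    trailing_context: int = 1,
) -> Iterator[int]:
    """Yield indices of frames to keep, deciding as each frame arrives.

    A frame is kept if its label is one of the user-selected events, or if
    it falls within `trailing_context` frames after such an event. No second
    pass over stored video is needed, which is the point of doing the
    summarization online rather than offline.
    """
    remaining = 0  # trailing frames still to keep after the last event frame
    for index, label in stream:
        if label in events_of_interest:
            remaining = trailing_context
            yield index
        elif remaining > 0:
            remaining -= 1
            yield index


if __name__ == "__main__":
    captured = [
        (0, None), (1, "fall"), (2, "fall"), (3, None), (4, None),
        (5, "visitor"), (6, None), (7, "fall"), (8, None), (9, None),
    ]
    # Only "fall" events are selected, so the "visitor" frame is skipped.
    print(list(summarize_online(captured, {"fall"})))
```

Changing the set passed as `events_of_interest` produces a different summary from the same stream, which loosely mirrors the user-selected, adaptive behavior described above.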

“This takes us one step further to realizing a fully automatic and highly intelligent home security system,” said Lin.

Beyond home security, this system could also be applied in other settings, including shopping malls, schools, and streets. For example, it could be deployed in classrooms to create “personalized” summary videos of a pupil’s daily school activities. It could also be used to automatically gather statistics about traffic violations at an intersection (e.g., how often vehicles run a red light) or about teaching activities in a class (e.g., how often Q&A activities occur).

Lin’s work was partially inspired by research on video analysis conducted by Mei’s team, as well as by the MSR Video to Text (MSR-VTT) dataset, a new large-scale benchmark for video understanding. The dataset comprises 10,000 web video clips totaling 41.2 hours, with 200,000 clip-sentence pairs covering diverse visual content and categories. Working with Mei, Lin constructed the initial learning models for event detection using the MSR-VTT dataset.

Among other publications from this collaborative research, Lin and Mei co-authored “A diffusion and clustering-based approach for finding coherent motions and understanding crowd scenes,” which was published in IEEE Transactions on Image Processing, vol. 25, 2016.

“Dr. Lin’s work is unique in that it can create a personalized event summary from a live video stream in real time,” said Mei. “This is very useful for a wide variety of public and home security applications.” As video data grows at an unprecedented rate, intelligent video analysis has become an important emerging area of study within Microsoft Research. Mei hopes to collaborate further with Lin’s team as both continue their research in the video space.

This past May, Lin was invited to present this project at the Microsoft Research Asia Symposium on Collaborative Research. The live demo was well received by symposium attendees and Microsoft researchers and won the “Best Demo of The Year” award.

Left to right: Prof. Weiyao Lin, Shanghai Jiao Tong University; Dr. Tim Pan, senior director, Microsoft Research Asia


Related publications

MSR-VTT: A Large Video Description Dataset for Bridging Video and Language [Supplementary Material]
