Live Video Analytics




Cameras are now everywhere. With video streaming in from factory floors, traffic intersections, police vehicles, and retail shops, large-scale video processing is a grand challenge and an important frontier for analytics. This is a golden era for computer vision, AI, and machine learning; there has never been a better time to extract value from videos to benefit science, society, and business!

Project Rocket’s goal is to democratize video analytics: build a system for real-time, low-cost, accurate analysis of live videos. This system will work across a geo-distributed hierarchy of intelligent edges and large clouds, with the ultimate goal of making it easy and affordable for anyone with a camera stream to benefit from video analytics.

For more information about this work, please see the list of publications below. Check out our IEEE Computer 2017 paper for a concise summary.


Awards

  • “Safer Cities, Safer People” US Department of Transportation Award
  • Institute of Transportation Engineers 2017 Achievements Award – “Video Analytics for Vision Zero”
  • ACM MobiSys 2017 Best Demo
  • Microsoft 2017 Hackathon Grand Prize Winner

Rocket: Video Analytics Stack

Rocket is an extensible software stack for democratizing video analytics: making it easy and affordable for anyone with a camera stream to benefit from computer vision and machine learning algorithms. Rocket allows programmers to plug in their favorite vision algorithms while scaling across a hierarchy of intelligent edges and the cloud.
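The plug-in idea can be sketched as a pipeline of interchangeable stages applied to each frame. This is a minimal illustration in Python; the class and method names (`Stage`, `Pipeline`, `process`) are ours, not Rocket’s actual API.

```python
class Stage:
    """Base class for a pluggable vision processor."""
    def process(self, frame):
        raise NotImplementedError

class Decoder(Stage):
    def process(self, frame):
        # Stand-in for video decoding: tag the frame as decoded.
        return dict(frame, decoded=True)

class Detector(Stage):
    def process(self, frame):
        # Stand-in for whatever object detector the programmer plugs in.
        frame = dict(frame)
        frame["objects"] = ["car"] if frame["id"] % 2 == 0 else []
        return frame

class Pipeline:
    def __init__(self, stages):
        self.stages = stages
    def run(self, frames):
        # Each frame flows through every plugged-in stage in order.
        for frame in frames:
            for stage in self.stages:
                frame = stage.process(frame)
            yield frame

pipeline = Pipeline([Decoder(), Detector()])
results = list(pipeline.run({"id": i} for i in range(4)))
print(sum(1 for r in results if r["objects"]))  # frames with detections
```

Because stages share only the `process` interface, swapping in a different detector (or splitting stages between an edge node and the cloud) does not change the rest of the pipeline.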


Video Analytics for Vision Zero

One of the verticals this project focuses on is video streams from cameras at traffic intersections. Traffic-related accidents are among the top ten causes of fatalities worldwide. This project partners with jurisdictions to identify traffic details (vehicles, pedestrians, bikes) that impact traffic planning and safety.

We have an ongoing pilot in Bellevue, Washington for active monitoring of traffic intersections, live 24x7.

Check out the traffic dashboard powered by Rocket’s video analytics, live at Bellevue’s Traffic Management Center! The dashboard alerts traffic authorities to abnormal traffic volumes.

Continuous Crowdsourcing

Participate in the nationwide Video Analytics Traffic Safety Initiative.

Resource-Accuracy Tradeoff for Video Queries

VideoStorm is a video analytics system that processes thousands of queries on live video streams over large clusters. Given the high cost of vision processing, resource management is crucial. We consider two key characteristics of video analytics: the resource-quality tradeoff with multi-dimensional configurations, and the variety in quality and lag goals. VideoStorm’s offline profiler generates each query’s resource-quality profile, while its online scheduler allocates resources to queries to maximize performance on quality and lag, in contrast to the fair sharing of resources commonly used in clusters. More details can be found in our NSDI 2017 paper.
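The scheduler’s core idea can be sketched as utility-based allocation: given each query’s offline-measured resource-quality profile, hand out resource units to whichever query gains the most quality per unit, rather than splitting resources equally. This is an illustrative simplification, not VideoStorm’s actual algorithm; the profiles and query names are made up.

```python
profiles = {
    # query -> quality achieved at 1, 2, 3, 4 CPU units (diminishing returns),
    # as an offline profiler might measure it
    "license_plates": [0.50, 0.80, 0.90, 0.92],
    "crowd_count":    [0.60, 0.70, 0.75, 0.78],
}

def allocate(profiles, total_units):
    alloc = {q: 0 for q in profiles}
    for _ in range(total_units):
        def gain(q):
            # Marginal quality improvement from one more unit for query q.
            used, prof = alloc[q], profiles[q]
            if used >= len(prof):
                return 0.0
            prev = prof[used - 1] if used > 0 else 0.0
            return prof[used] - prev
        # Greedily give the next unit to the query that benefits most.
        best = max(profiles, key=gain)
        alloc[best] += 1
    return alloc

alloc = allocate(profiles, 4)
print(alloc)
```

With 4 units, the greedy pass concentrates resources on the query whose profile keeps improving, instead of the 2/2 split fair sharing would produce.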


While it is promising to balance resource usage and quality (or accuracy) by selecting a suitable configuration (e.g., the resolution and frame rate of the input video), one must also address the fact that a configuration’s impact on video analytics accuracy changes significantly over time. Chameleon periodically and dynamically picks the best configuration for video analytics pipelines, while efficiently searching the large space of configurations. Chameleon relies on the fact that the underlying characteristics (e.g., the velocity and sizes of objects) that determine the best configuration have enough temporal and spatial correlation to allow the search cost to be amortized over time and across multiple video feeds. Details can be found in our SIGCOMM 2018 paper.
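The amortization idea can be sketched as follows: exhaustively profile the configuration space only once every few time windows, then reuse the winning configuration while scene characteristics stay correlated. This is a toy model, not Chameleon’s implementation; the accuracy and cost functions below are invented stand-ins.

```python
import itertools

resolutions = [480, 720, 1080]
frame_rates = [5, 15, 30]
configs = list(itertools.product(resolutions, frame_rates))

def accuracy(config, window):
    # Invented accuracy model: higher resolution/frame rate help more
    # when objects move fast, and object speed varies over time.
    res, fps = config
    speed = 1.0 if window % 4 == 0 else 0.3
    return min(1.0, 0.4 + res / 2160 + speed * fps / 60)

def cost(config):
    res, fps = config
    return res * fps          # rough proxy for processing cost

def profile(window, budget=15000):
    # Exhaustive search over configurations under the cost budget.
    cheap = [c for c in configs if cost(c) <= budget]
    return max(cheap, key=lambda c: accuracy(c, window))

period = 4                    # re-profile only every `period` windows
best, profiled = None, 0
for window in range(8):
    if window % period == 0:
        best = profile(window)
        profiled += 1
    # ...process `window` with configuration `best`...
print(profiled, best)
```

Over 8 windows, the expensive search runs only twice; the rest of the time the pipeline reuses the last-known-best configuration.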

Querying Large Video Datasets with Low Latency and Low Cost

Large volumes of videos are continuously recorded from cameras deployed for traffic control and surveillance, with the goal of answering “after the fact” queries: identify video frames with objects of certain classes (cars, bags) from many days of recorded video. While advancements in convolutional neural networks (CNNs) have enabled answering such queries with high accuracy, running them naively is too expensive and slow. We built Focus, a system for low-latency and low-cost querying of large video datasets. Focus uses cheap ingestion techniques to index the videos by the objects occurring in them. At ingest time, it uses compression and video-specific specialization of CNNs. Focus handles the lower accuracy of the cheap CNNs by judiciously leveraging expensive CNNs at query time. To reduce query-time latency, we cluster similar objects and hence avoid redundant processing. In experiments on video streams from traffic, surveillance, and news channels, Focus uses 58X fewer GPU cycles than running expensive processors at ingest and is 37X faster than processing all the video at query time. More details can be found in our OSDI 2018 paper.
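The ingest/query split can be sketched as a two-tier classifier scheme: a cheap, over-inclusive classifier indexes frames by candidate classes at ingest, and the expensive classifier verifies only the indexed candidates at query time. This is our own simplification; `cheap_cnn` and `expensive_cnn` are invented stand-ins, not Focus’s code.

```python
from collections import defaultdict

def cheap_cnn(frame):
    # Stand-in for a compressed, video-specialized CNN at ingest time.
    # Deliberately over-inclusive (here it always also guesses "car")
    # so that recall is preserved for later verification.
    return frame["truth"] | {"car"}

def expensive_cnn(frame, cls):
    # Stand-in for the expensive, accurate CNN used sparingly at query time.
    return cls in frame["truth"]

frames = [
    {"id": 0, "truth": {"car"}},
    {"id": 1, "truth": {"bag"}},
    {"id": 2, "truth": {"car", "bag"}},
    {"id": 3, "truth": set()},
]

# Ingest time: build the approximate class -> frame-ids index.
index = defaultdict(set)
for f in frames:
    for cls in cheap_cnn(f):
        index[cls].add(f["id"])

# Query time: run the expensive CNN only on the indexed candidates.
# (Focus additionally clusters similar objects to avoid redundant work.)
def query(cls):
    return sorted(fid for fid in index[cls]
                  if expensive_cnn(frames[fid], cls))

print(query("bag"))   # only the 2 indexed frames need the expensive CNN
```

For the “bag” query, the expensive CNN touches 2 of 4 frames; at the scale of days of video, that gap is where the latency and cost savings come from.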

Virtualizing Steerable Cameras

Cameras are often electronically steerable (pan, tilt, zoom, or “PTZ”) and have to support multiple applications simultaneously, such as amber-alert scanning based on license plate recognition and traffic volume monitoring. The primary challenge in supporting multiple such applications concurrently is that their view and image requirements differ, so allowing applications to directly steer the cameras inevitably leads to conflicts. Our solution virtualizes the camera hardware. With virtualization, we break the one-to-one binding between the camera and the application: each application binds itself to a virtual instance of the camera and specifies its view requirements, e.g., orientation, resolution, zoom. Our system does its best to provide the most recent view that meets each application’s requirements. More details are in our IPSN 2017 paper.
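The binding model can be sketched as a controller that time-shares one physical PTZ camera across per-application virtual instances, serving each application the latest frame captured with a matching view. All names here (`VirtualCameraController`, `bind`, `tick`) are illustrative, not the system’s actual interface.

```python
class PhysicalPTZ:
    def capture(self, pan, zoom):
        # Stand-in for steering the camera and grabbing a real frame.
        return {"pan": pan, "zoom": zoom}

class VirtualCameraController:
    def __init__(self, camera):
        self.camera = camera
        self.bindings = {}      # app -> requested view (pan, zoom)
        self.latest = {}        # app -> most recent frame matching its view

    def bind(self, app, pan, zoom):
        # The app never steers the camera directly; it only states
        # its view requirements against a virtual instance.
        self.bindings[app] = (pan, zoom)

    def tick(self):
        # One scheduling round: steer through each requested view once.
        for app, (pan, zoom) in self.bindings.items():
            self.latest[app] = self.camera.capture(pan, zoom)

    def read(self, app):
        # Best-effort: the most recent frame meeting the app's requirements.
        return self.latest.get(app)

ctrl = VirtualCameraController(PhysicalPTZ())
ctrl.bind("plate_reader", pan=30, zoom=4)
ctrl.bind("volume_monitor", pan=90, zoom=1)
ctrl.tick()
print(ctrl.read("plate_reader"), ctrl.read("volume_monitor"))
```

Because conflicting view requests are resolved inside the controller, adding a third application is just another `bind` call, not a fight over the pan-tilt motors.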

Check out this video.

In-Vehicle Video Analytics

We developed ParkMaster, which uses users’ smartphones, mounted on the car’s dashboard, to sample the presence of cars at road-side parking spots from the driver’s vehicle itself. It includes two main components: a smartphone app that runs on the driver’s smartphone (the edge) and performs real-time visual analytics, and an Azure cloud service that maintains a real-time database of the number of available parking spaces and provides client support for location services. While the user is driving, with the smartphone placed on the windshield, ParkMaster captures video with the phone’s camera and, by processing frames locally in real time, estimates the availability of roadside parking spaces. On-road experiments in two major cities in the US and Europe (Los Angeles and Paris) and a small European village show that ParkMaster achieves an overall end-to-end accuracy close to 90% with negligible overhead (less than one megabyte per hour) in mobile cellular data consumption.
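The edge/cloud split explains the tiny cellular footprint: frames are analyzed on the phone, and only a few bytes per street segment cross the network. This is our own toy simplification of that split; the field names and the stand-in `analyze_frame` logic are invented.

```python
import json

def analyze_frame(frame):
    # Stand-in for the on-phone vision pipeline: detect parked cars in
    # view and report how many spots on this segment are free.
    return frame["spots_total"] - frame["cars_seen"]

def drive(frames):
    # All heavy processing stays on the edge; only compact per-segment
    # summaries are produced for upload.
    updates = []
    for frame in frames:
        updates.append({"segment": frame["segment"],
                        "free": analyze_frame(frame)})
    return updates

frames = [
    {"segment": "elm-100", "spots_total": 6, "cars_seen": 4},
    {"segment": "elm-200", "spots_total": 8, "cars_seen": 8},
]
updates = drive(frames)
payload = json.dumps(updates)   # what actually crosses the cellular link
print(len(payload), "bytes uploaded;", updates)
```

Uploading tens of bytes per segment instead of the video itself is what keeps the data cost under a megabyte per hour of driving.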

Video Analytics for Wireless Cameras

We built Vigil, a video surveillance system that leverages wireless cameras with edge computing capability to support real-time scene surveillance in enterprise campuses, retail stores, and across smart cities. Vigil intelligently partitions video processing between intelligent edges and the cloud to save wireless capacity, which can then be used to support additional cameras, thereby increasing the spatial coverage of the surveilled region. It incorporates video analytics with novel video frame prioritization and video stream scheduling algorithms to optimize bandwidth utilization. We have tested Vigil across three sites using both White-Space and Wi-Fi networks. Depending on the level of activity in the scene, experimental results show that Vigil increases geographical coverage anywhere from 5 to 200 times compared to state-of-the-art systems that simply upload video streams. For a fixed region of coverage and bandwidth, Vigil outperforms Wi-Fi’s default equal-throughput allocation by delivering up to 25% more objects relevant to a user’s query.
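The frame prioritization idea can be sketched as a budgeted scheduler: under a fixed uplink budget, edge nodes rank frames by how many query-relevant objects they contain and upload the best ones first. This is an illustrative sketch, not Vigil’s algorithm; the relevance scores and sizes below are made up.

```python
def schedule(frames, budget_kb):
    # Frames containing the most query-relevant objects go first;
    # frames that don't fit in the remaining budget are skipped.
    ranked = sorted(frames, key=lambda f: f["relevant_objects"],
                    reverse=True)
    uploaded, used = [], 0
    for f in ranked:
        if used + f["size_kb"] <= budget_kb:
            uploaded.append(f["id"])
            used += f["size_kb"]
    return uploaded

frames = [
    {"id": "a", "relevant_objects": 0, "size_kb": 40},
    {"id": "b", "relevant_objects": 3, "size_kb": 40},
    {"id": "c", "relevant_objects": 1, "size_kb": 40},
    {"id": "d", "relevant_objects": 2, "size_kb": 40},
]
print(schedule(frames, budget_kb=100))   # room for only two frames
```

With bandwidth for only two of the four frames, the scheduler spends it on the frames that actually advance the user’s query, which is the intuition behind Vigil’s gain over equal-throughput allocation.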


Public Talks

Keynotes, Seminars, Conferences

Keynote Talks

  • IEEE 14th International Conference on Mobile Ad Hoc and Sensor Systems (October 23rd, 2017)
    Victor Bahl, “Live Video Analytics”
  • 3rd IEEE International Conference on Collaboration and Internet Computing (October 15th, 2017)
    Victor Bahl, “Democratizing Video Analytics”
  • Emerging Topics in Computing Symposium, University at Buffalo Computer Systems Engineering Dept. 50th Anniversary (September 29th, 2017)
    Victor Bahl, “Live Video Analytics: the Perfect Edge Computing Application”
  • 35th IEEE International Performance Computing and Communications Conference (December 10th, 2016)
    Victor Bahl, “Distributed Video Analytics”

University Department Seminars

  • ETH Zurich (Aug 2017)
    Ganesh Ananthanarayanan, “Taming the Video Star! Real-time Video Analytics at Scale”
  • University of California at Berkeley (May 2017)
    Ganesh Ananthanarayanan, “Taming the Video Star! Real-time Video Analytics at Scale”
  • Washington University of St. Louis (April 28, 2017)
    Victor Bahl, “Live Video Analytics: the Perfect Edge Computing Application”
  • Cornell University (April 2017)
    Ganesh Ananthanarayanan, “Taming the Video Star! Real-time Video Analytics at Scale”

Miscellaneous Invited Talks

  • Ganesh Ananthanarayanan, “Video Analytics for Vision Zero”, Microsoft Office of the CTO Summit (February 2017)
  • Victor Bahl, “Distributed Video Analytics”, The First IEEE/ACM Symposium on Edge Computing, Washington DC, USA (October 28th 2016)
  • Peter Bodik, “Cameras everywhere! Video Analytics at Scale”, Microsoft Research Faculty Summit, Redmond, WA (July 13th, 2016)

Conference Talks


  • Haoyu Zhang, “Live Video Analytics at Scale with Approximation and Delay-Tolerance”, USENIX NSDI, Boston, MA, 2017.
  • Aakanksha Chowdhery, “The Design and Implementation of a Wireless Video Surveillance System”, ACM MobiCom, Paris, France, 2015.



Researchers & Interns


  • Haoyu Zhang (summer 2015), Princeton University
  • Shubham Jain (summer 2015 and summer 2016), Rutgers University
  • Yao Lu (summer 2015 and summer 2016), University of Washington
  • Michael Hung (summer 2016), University of Southern California
  • Giulio Grassi (summer 2016), LIP6
  • Kevin Hsieh (summer 2017), Carnegie Mellon University
  • Enrique Apuy Saurez (summer 2017), Georgia Tech