Indexing Audio-Video Content, with a Bit of Research Assistance

Posted by Rob Knies

Search, categorization, accessibility: these are what customers gain from the Microsoft Azure Media Services Indexer, launched Sept. 10.

The Indexer, formerly known as the Microsoft Audio Video Indexing Service (MAVIS), is being announced just before IBC2014, held in Amsterdam Sept. 11-16. The IBC conference, which examines the future of electronic media and of entertainment technology and content, also will be the site for the public preview of the Indexer.

For Behrooz Chitsaz, Microsoft Research director of IP Strategy, the launch represents a watershed moment.

“It’s very exciting!” says Chitsaz, who has been shepherding the technology now in the Indexer for much of the last seven years. “Seeing the potential … for me, that’s the most important thing. We’re now giving customers and people insight into audio-video.”

You might recall earlier references to the MAVIS project from Microsoft Research. It has been used to index the digital archives for the U.S. state of Washington and also has been deployed by the U.S. Department of Energy, the British Library, NASA's Jet Propulsion Laboratory, and The Washington Post. In 2012, the service was updated with new algorithms to capitalize on the advances made by deep-learning experts.

But the inclusion in the Indexer represents the biggest leap yet for the technology, and the Microsoft Azure adoption seems certain to gain the technology plenty of fans, given its increasingly relevant appeal in this era of near-ubiquitous multimedia.

Chitsaz explains the service’s ability to provide enhanced access to the ever-expanding amount of audio-video currently available.

"We provide the ability for customers to build a search experience that would allow a deep search for spoken words inside audio and video archives, an effective way of browsing archives," he says.

The Microsoft Azure Media Services Indexer extracts keywords from a video file as metadata that a custom application can use to categorize audio-video, for example by whether the content pertains to entertainment, politics, or neural-network research.
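As a rough illustration of how a custom application might put such keyword metadata to work, the Python sketch below assigns a clip to whichever category shares the most terms with its keywords. Everything in it is an assumption made for illustration: the `CATEGORY_TERMS` vocabularies, the `categorize` function, and the sample keyword lists are hypothetical and are not part of the Indexer or its output format.

```python
from collections import Counter

# Hypothetical category vocabularies chosen for this sketch;
# they are not supplied by the Indexer.
CATEGORY_TERMS = {
    "entertainment": {"film", "music", "concert", "actor"},
    "politics": {"election", "senate", "policy", "vote"},
    "neural-network research": {"neural", "network", "training", "deep"},
}


def categorize(keywords):
    """Return the category sharing the most terms with `keywords`, or None."""
    scores = Counter()
    for word in keywords:
        lowered = word.lower()
        for category, terms in CATEGORY_TERMS.items():
            if lowered in terms:
                scores[category] += 1
    if not scores:
        return None  # no keyword matched any category vocabulary
    return scores.most_common(1)[0][0]


# Keyword lists standing in for metadata extracted from two clips.
print(categorize(["Election", "vote", "music"]))
print(categorize(["weather", "forecast"]))
```

A real application would likely draw its vocabularies from its own archive and weight keywords by how often they are spoken, but simple term overlap is enough to show the idea.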

And that's not all the Indexer can do. It also can deliver a machine-generated draft description for video and audio content, sufficient for a basic understanding of the clip in question.

As Chitsaz explains it, the MAVIS service and Azure Media Services were a natural fit.

"We started to talk about this 1½, 2 years ago," he reports. "They saw the value in this service, given the early success we had, and we’ve been working with each other to transfer this. It’s always been run on Azure, which helped the transition from research into their service in that we didn't have to move it to Azure."

The product contribution, though, does not signal the end of the collaboration.

"We will continue to drive innovation into Azure Media Services," Chitsaz states. "There are a lot of technologies that we’re going to deliver to Azure Media Services."

And that, he concludes, promises to fuel the ongoing energy surrounding the Indexer.

"There's more and more audio-video," Chitsaz says, "whether it's social audio-video, whether it's broadcast, or in the enterprise. We're giving people the ability to unlock the information in their audio and video content, which is what gives me a lot of joy. It gets very exciting."