Microsoft Research Blog

Cloud computing changes the way we practice public speaking

June 16, 2016 | By Microsoft blog editor

Overcoming the fear of public speaking using cloud-based technology

By Vani Mandava, Senior Program Manager, Microsoft Research

People often rank public speaking as the number one fear that they face. New cloud-based technology from researchers at the University of Rochester lets speakers practice and polish their delivery at home in front of their computer camera, while automated analysis provides instant feedback to help them improve.

Leading this effort, known as ROC Speak, is M. Ehsan Hoque, an assistant professor of computer science and electrical and computer engineering at the University of Rochester, where he codirects the Rochester Human-Computer Interaction (ROC HCI) Lab.

Hoque has more than one motivation for helping people improve their communication. He has a brother with a severe social deficit who isn’t able to communicate well. Hoque also has heard from the public in more than 2,000 emails that some wish they could practice public speaking on computers in the privacy of their homes. Social speaking ability is valued by everyone, from academics who lecture to business leaders and students. With the donation of Microsoft Azure for Research tools, Hoque has explored and designed a new platform to help speakers.

At a conference a few years ago, a man walked up to Hoque and said he feared social stigma because his speaking style was monotonous and he had difficulty making eye contact with people. Hoque was inspired by that encounter to further develop what is now ROC Speak. It’s possible that in the future, ROC Speak will help people overcome speaking issues and might smooth social difficulties for people with Asperger’s Syndrome.

Among those testing the tool has been Valentina Kutyifa, a research cardiologist and former president of the Toastmasters Club at the University of Rochester. She has helped build a collaboration between ROC Speak and the nonprofit Toastmasters International, which also helps people practice speaking. “I have used the ROC Speak product for some of my speeches, and I felt it’s very useful and helpful for preparing speeches and providing instant feedback,” she said.

Hoque explains that communication is much more complex than we realize, especially the nonverbal elements. The ROC Speak platform works by measuring many forms of nonverbal behavior simultaneously. Using the video camera and audio recording on the user’s laptop, the program measures eye gaze, word use, voice level, and hand gestures. ROC Speak uses techniques that automatically analyze these subtle human behaviors. In addition, the system provides feedback, which allows users to explore the nuances of their behavior during practice of a speech.

“The human face has 43 muscles. Using 43 muscles, we can create 10,000 unique facial expressions. To model nonverbal behavior, we need to get a lot of data and collect it in a naturalistic environment,” Hoque said. To facilitate capturing and handling that data, Microsoft Azure for Research lent the project access to cloud resources with advanced tools in the Cortana Intelligence suite, tools such as Azure Machine Learning and Microsoft Cognitive Services. This enabled the ROC Speak team to make the platform broadly available, capture and store participant data, and synthesize it.

Hoque and his lab used Azure-based tools to analyze user videos by automatically scoring visual features such as smile intensity and movement, and audio features such as pitch and loudness. After users record a 2-minute video, they receive immediate feedback from the machine-learning-based analysis. The feedback is presented in visually appealing graphs that show, for example, voice level for every few seconds of the speech. Word use, both the speed of talking and the sophistication of language, is analyzed. Gestures are tracked as well. The user can choose to share the video with other users and receive ratings on such elements as friendliness and gestures, as well as an overall rating.
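To make two of these measurements concrete, here is a minimal Python sketch of how voice level per few-second window and speaking speed might be computed. It is an illustration only and not ROC Speak's actual implementation: the function names, the 3-second default window, the dBFS loudness scale, and the assumption that a transcript and normalized audio samples are already available are all hypothetical choices made for this example.

```python
import math


def loudness_per_window(samples, sample_rate, window_seconds=3.0):
    """Return RMS loudness in dBFS for each consecutive window of audio.

    Assumes `samples` are floats normalized to [-1.0, 1.0].
    Near-silent windows are floored at -100 dBFS to avoid log(0).
    """
    window = max(1, int(sample_rate * window_seconds))
    levels = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        # Root-mean-square amplitude of this window.
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        levels.append(20 * math.log10(rms) if rms > 1e-5 else -100.0)
    return levels


def words_per_minute(transcript, duration_seconds):
    """Return speaking rate, counting whitespace-separated words."""
    return len(transcript.split()) * 60.0 / duration_seconds
```

Plotting the list returned by `loudness_per_window` against time would give a voice-level graph of the kind described above, one point per window. Measuring the sophistication of language, smile intensity, or gestures would require speech and vision models that are outside the scope of this sketch.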

Azure has helped his team, Hoque said, because it has an intuitive user interface and has allowed his students to use it without prior experience in cloud computing. One of his students is Vivian Li, who is an undergraduate research assistant. Li was surprised by the sense of community that developed among ROC Speak users who were rating other users on their videos. She and Kutyifa also were inspired by how dramatically people improved as they practiced.

Hoque sees further developments for the ROC Speak project, especially as his team gathers more data from participants. “It is one of the largest datasets on nonverbal communication captured from people practicing public speaking in front of a computer,” he said. Because the platform is cloud based, there is enormous potential to grow even further and collect even more data.

One of the great advantages he sees in cloud computing is that he can tweak and improve his algorithms as people keep using the program. His next step is to deploy ROC Speak widely around the world. “Social skills are fundamental to who you are … So I think if there is a platform out there that helps you to be better with communication, it can change the way we communicate.”
