It began with a couple of folding tables from Costco. In November, Greg Baribault, a product leader for Microsoft Teams, and his colleague Shiraz Cupala commandeered two rooms at the Hive, a prototyping space on the company’s Redmond, Washington, campus. Their plan: to design the conference room of the future, a new kind of software-infused meeting space where in-person and remote participants could collaborate on a level playing field. The stakes: If the Hive team succeeded, its model would have the potential to change not just business meetings, but the way all kinds of groups connect.

As the project intensified, the pair jettisoned the cheap Costco tables in favor of eight-foot-tall sheets of corrugated cardboard, which they used to build movable walls and makeshift furniture of various shapes and sizes. They tested PowerPoint presentations so they’d be able to tell what each participant in the room could see, or couldn’t. Soon enough, others were drawn to the Hive from across the company—from researchers, engineers, and marketers to Microsoft Teams product developers and next-gen hardware designers. Everybody pitched in. Some built the furniture. Others acted as meeting participants. Still others tried out different audio setups.

“It was a very collaborative and open discussion about what we could do,” says Baribault. “We had software engineers cutting up cardboard.” Microsoft’s real estate and digital teams also worked hand in hand with the Hive crew, doing the actual A/V build-out and keeping everyone grounded in the realities of workplace facilities to ensure that the end result was something Microsoft—and its customers—could practically deploy at scale.

The team came together during the pandemic in anticipation of a future that’s now arriving: a hybrid era where, as location becomes more flexible than ever, everyone will need to be on equal ground no matter where they are. Companies are responding to the clear need to rethink and reorient their workspaces: People are spending 148 percent more time in Microsoft Teams meetings each week than before the pandemic, and 66 percent of business decision makers are considering redesigning physical spaces to better accommodate hybrid work environments.

But for hybrid work to be truly equitable, it would have to be viable in practice, not just in theory. Meetings would need to become more inclusive. There have always been remote workers, of course, but communicating with others can feel like peering through a straw into the meeting room at pixelated and poorly framed webcam images of your colleagues. In the coming age of all hybrid, all the time, no one’s meeting experience, remote or in person, should be second class. To fulfill this promise, the Microsoft teams didn’t simply need to reimagine physical meeting spaces—they needed to reimagine the entire digital experience of online communication and collaboration.

“The risk of hybrid meetings is that in-person attendees become anonymous faces in a room, while remote attendees are left speaking into a void, not knowing if they are seen or heard, or how to jump in and take a turn,” says Jaime Teevan, chief scientist at Microsoft. “If things return to the old normal, we will have missed a once-in-a-lifetime opportunity to create a new and better future of work.”

Turning research into products

The Hive team envisioned meetings of the future where everyone would be interacting and cocreating, no matter where they were, sharing content and ideas in real time. Everyone could see one another at eye level. They would know who was talking and where they were in the room. Everyone could use the same virtual whiteboards. Notes and chats would be visible to all.

More and more colleagues joined in as they tried to think through what new room configurations, new products, and new Microsoft Teams software features could help them realize this vision. Members of Project Malta, a group of Microsoft Research scientists looking at hybrid meeting configurations, drew from decades of research conducted by Microsoft and others. Many of the findings from that research—studies about everything from telepresence to eye gaze—offered important insights, but they remained as prototypes or in the theoretical realm until the pandemic created new urgency to rethink the future of work. “A lot of what we’re doing now is revisiting previous insights we had in a new world that has changed completely,” says Kori Inkpen, a principal researcher with Microsoft and the leader of Project Malta.

People sit at a curved makeshift desk constructed from cardboard and examine a projected meeting display

Using sheets of cardboard, teams within Microsoft prototyped a curved desk that affords a good view of a meeting for each participant. They also developed a prototype to enable rapid iteration of potential on-screen layouts. Remote workers appear at the bottom of the projection screen, at eye level with in-person participants.

Microsoft

A project called IllumiShare in 2012—a simple video demo with two players engaged in a remote game of tic-tac-toe—foreshadowed the immersive collaboration enabled by today’s virtual whiteboards. There were studies about the cognitive fatigue caused by listening to multiple voices coming out of a single speaker and papers on the importance of eye gaze in fostering feelings of human connection.

In fact, researchers at Microsoft have been studying remote meetings since the early ’90s. “There’s this long trajectory that plays into what we’re doing now,” says Abigail Sellen, a cognitive scientist and deputy director at Microsoft Research Cambridge UK. In 1991, Sellen and Microsoft researcher Bill Buxton, pioneers in the field of human-computer interaction, worked on a videoconferencing system that replicated a roundtable conversation among three other people, using low-tech speakers and tiny black-and-white video screens not much larger than Post-its. Today, the product teams at Microsoft are working to create that same feeling of having people to the left and right of you, this time by bringing video feeds of remote participants to eye level at the bottom of the screen in a horizontal view (looking left to right is more natural and helps with eye gaze). They’re also using intelligent speakers and spatial audio.

Project Malta researcher John Tang knows a thing or two about the challenges of remote meetings. Based in Mountain View, California, Tang for years was the sole team member working remotely with Kori Inkpen’s Microsoft Research team in Redmond—the lone voice coming out of the speaker puck at group meetings. In 2020, Tang researched the telework experiences of people with disabilities, crucial work that foreshadowed the design of many of the latest Teams improvements. Making it possible to read the lips of participants, it turns out, helps not just the hearing impaired but everyone to better understand the speaker, cutting down on videoconference fatigue that comes from having to get all of your information through audio. “That’s one of the reasons why video meetings are so much more tiring than in-person meetings, where we can turn to who’s talking and do all the lip-reading work that we do naturally,” he says. Now Tang was working to bring those insights into the hybrid meeting experience of the future.

‘Who’s looking at whom?’

Back at the Hive, construction was ramping up. As table sizes and shapes and camera placements were being worked out, prototypes were replacing cardboard mockups. In came intelligent speakers that used voice recognition to help remote participants figure out who was saying what, as well as cameras that delivered separate views of every individual in the room.

In April, Inkpen was at the Hive testing out a prototype. Six researchers were in the conference room playing the in-room participants. Standing in as remote workers were Cupala, who was in his car following along on his cellphone, and Tang, watching the action on his laptop from his home in Mountain View. “Imagine a whole bunch of geeks in the room,” says Inkpen. “We get in there and it’s like, ‘OK, I’m pointing at John. Who does it look like I’m pointing to? Can you tell who’s looking at whom?’”

The prototype worked well for the people in the room at the Hive, but for the people participating remotely, not so much. There weren’t enough cameras set up, and remote workers were invisible to some in-person attendees. “I spoke up for the remote person,” says Tang. “Everyone was saying it’s great, but we were trying to say, it’s not so great out here, and here’s what we still need to work on to make sure that everyone is included.”

In the UK, members of Microsoft Research Cambridge, also part of the Project Malta team, were conducting their own studies and creating their own prototypes. In ongoing experiments, researchers are using different kinds of hardware as “stand-ins” for the remote participants. Lightweight, off-the-shelf Double robots (picture a video screen atop a wheeled broomstick) serve as remote-controlled video surrogates, replacing static images of remote people on front-of-room displays. Researchers are also testing 55-inch Surface Hubs, which can be wheeled to different places in a meeting room so that remote people can beam in through them.

Taking turns

One of the knottiest issues of videoconferencing is turn taking—knowing when you can speak up—because what we see and hear in a videoconference is different from meeting in person. Sean Rintel, a principal researcher at Microsoft Research Cambridge, led a large-scale study of Microsoft employees’ meeting experiences during the pandemic. The study found that managing who gets a turn to talk was the most common interaction challenge. Microsoft researchers have discovered multiple factors at work, including poor audio, the difficulty of knowing how you are being seen and heard, and the “cue-lessness” of people in video calls (the fact that gestures and eye gaze are distorted or watered down). Technological solutions that researchers are exploring include integrating spatial audio to enable a more naturalistic sound stage and using AI to annotate parallel chat conversations so it’s easier to follow spoken and written messages simultaneously.

Early in the pandemic, Microsoft saw parallel chat messages in Teams meetings globally jump by a factor of 10 in four months. Parallel chat has a plethora of uses, from asking questions to sharing links, files, agreement, praise, and discussion. Of course, there are also jokes and plain old chatting. “We’re finding that parallel chat is used in particular by people who normally don’t take the floor in meetings,” says Teevan. As part of the large-scale meetings study, Advait Sarkar, a senior researcher at Microsoft Research Cambridge, found that women are twice as likely as men to report using parallel chat for questions and answers during meetings.

But parallel chat is a double-edged sword. It can be a distraction from the meeting. There are differing expectations around its use and formality. It challenges those with reading and vision difficulties, as well as those who struggle with interpreting sentiment in written form. Parallel chat may enable more contributions, but the research team says work remains to be done to ensure that it doesn’t distract from the “main meeting” while at the same time not marginalizing contributions from those who choose to use it.

In-Meeting Chat Use Surged During the Pandemic, Especially Among Some Groups

An infographic demonstrates that 20 percent of men ages 25 to 34 and 58 percent of women ages 25 to 34 said their chat use has increased during the pandemic

A Microsoft study (see chart) found that women ages 25 to 34 are far more likely than men of the same age to say their chat use during meetings has increased during the pandemic (58 percent versus 20 percent). The same study showed women are also twice as likely to report using parallel chat for questions and answers during a meeting (16 percent of women versus 8 percent of men).

Infographic by Valerio Pellegrini

Finally, the workers at the Hive felt ready to do a run-through of their prototype. Two projectors beamed a 20-foot video wall showing an agenda, shared notes, a presentation, and a group of remote attendees all on a common stage. “Each time someone walked into the room and experienced their life-size remote colleagues separated out from the background looking back at them, you could see them have a ‘Wow!’ moment,” says Cupala. “The typical videoconference fatigue melted away. It felt like we were actually together.” There was still more to work through. Cupala continues, “Normalizing the sizes of people’s remote video feeds still needed work, and a good remote view of the in-room participants would require new technology on the horizon.”

This prototyping exercise, combined with decades of research, continues to inform changes to Microsoft Teams, changes intended to make hybrid work, well, actually work. Some of these changes have already been rolled out or will be released to customers over the next few months. There are innovations that improve eye contact and the sense of presence among all participants, remote and in person. There are tools that capture body language and allow remote workers to fluidly collaborate and annotate everything from virtual whiteboards to PowerPoint presentations.

Several people sit at a desk in a redesigned Teams meeting room

In Microsoft’s reimagined hybrid meeting experience, every participant has a clear view of every other participant—life-size, and at eye level.

“There are good things about remote meetings that we want to bring forward and not lose as we embrace the possibilities of hybrid,” says Teevan, from transcriptions to parallel chat to simply being able to see the names of everyone in the meeting. “Moving forward, it’s not just about doing a better job of including remote people, but also about hanging on to these benefits.”

Acknowledgments: We would like to thank everyone who contributed to this research including Sam Albert, Greg Baribault, John Bradley, Bill Buxton, Shiraz Cupala, Jason Faulkner, Matt Hempey, Eric Hull, Kori Inkpen, Sasa Junuzovic, Mike Messer, Sean Rintel, Advait Sarkar, Abigail Sellen, John Tang, Jaime Teevan, Chad Voss, and Andy Wilson.