Guidelines for human-AI interaction design

Published February 1, 2019

By Saleema Amershi, Partner Research Manager; Mihaela Vorvoreanu, Director, UX Research and Education, Aether; Eric Horvitz, Chief Scientific Officer

The increasing availability and accuracy of AI has stimulated uses of AI technologies in mainstream user-facing applications and services. Along with opportunities for infusing valuable AI services in a wide range of products come challenges and questions about best practices and guidelines for human-centered design. A dedicated team of Microsoft researchers addressed this need by synthesizing and validating a set of guidelines for human-AI interaction. This work marks an important step toward much-needed best practices for the complexities AI designers face.

The integration of AI services such as prediction, recognition, and natural language understanding brings multiple new considerations to the fore for designers. For example, interaction designers have to grapple with rates of failure and success of AI inference, the changes in system behavior that may come with ongoing machine learning, and the understandability and controllability of AI functions.

The variability of current AI designs, together with high-profile reports of failures, highlights opportunities for creating more intuitive and effective user experiences with AI. These failures range from the humorous, embarrassing, or disruptive (for example, benign autocorrect errors) to the more serious, when users cannot effectively understand or control an AI system (for example, accidents in semi-autonomous vehicles). The ongoing conversation on human-centered design for AI systems shows that designers are hungry for trustworthy AI-centric design heuristics or guidelines.

Over the last 20 years, research scientists and engineers have proposed guidelines and recommendations for designing effective interaction with AI-infused systems. Ideas span recommendations for managing user expectations, moderating the level of autonomy, supporting the resolution of ambiguity, and providing awareness about changes that may occur as the system learns about users. Unfortunately, many of these design suggestions are scattered through different publications and are rarely presented explicitly as guidelines. The Microsoft research team identified more than 150 such design recommendations, many of which captured similar ideas. By distilling them into one unified, validated set of guidelines, this work empowers the community to move forward and build on existing knowledge.

“The design community didn’t have a unified set of guidelines for creating intuitive interactions between humans and AI systems. We set out to create and validate one,” said Saleema Amershi, lead researcher on the development of the Guidelines for Human-AI Interaction.

The Guidelines for Human-AI Interaction—as well as the process for developing and validating them—will be presented at the 2019 CHI Conference on Human Factors in Computing Systems in Glasgow, Scotland. The team—Saleema Amershi, Dan Weld, Mihaela Vorvoreanu, Adam Fourney, Besmira Nushi, Penny Collisson, Jina Suh, Shamsi Iqbal, Paul Bennett, Kori Inkpen, Jaime Teevan, Ruth Kikin-Gil, and Eric Horvitz—synthesized more than 20 years of knowledge and thinking in AI design spanning academia and industry into a compact set of generally applicable design guidelines for human-AI interaction.

In its quest to synthesize broad and specific guidance coming from a variety of sources into a unified set of guidelines that could be universally embraced, the CHI 2019 paper, titled Guidelines for Human-AI Interaction, is also a 20th anniversary celebration of Eric Horvitz’s formative CHI 1999 paper. That paper proposed principles for smoothly weaving together human and AI capabilities and harnessing a mix of AI and human initiatives.

Following a rigorous process, Microsoft researchers began by collecting more than 150 AI-related design recommendations (potential guidelines) from respected sources that ranged from scholarly research papers to blog posts and internal documents. Grouping recommendations by theme, the team was able to condense them into a manageable number. They then embarked on multiple rounds of evaluation with user experience (UX) and human-computer interaction (HCI) experts, seeking to ensure that the guidelines were easy to understand as well as applicable to a wide range of popular AI products.

“We wanted to ensure the guidelines are specific and observable at the UI level. So, we eliminated overarching principles like ‘set expectations,’ or ‘build trust’ and instead translated them into specific, actionable guidelines,” said Mihaela Vorvoreanu, senior program manager.

The resulting 18 guidelines for Human-AI Interaction are grouped into four sections that prescribe how an AI system should behave upon initial interaction, as the user interacts with the system, when the system is wrong, and over time. While the Guidelines for Human-AI Interaction are provided to support design decisions, they are not intended to be used as a simple checklist. The recommended guidelines are intended to support and stimulate conversations about design decisions between user experience and engineering practitioners, and to foster further research in this evolving space. The authors recognize that there will be numerous situations where AI designers must consider tradeoffs among guidelines and weigh the importance of one or more over others. Rising capabilities and use cases may suggest a need for additional guidelines.

Guideline 10: Scope services when in doubt. When AutoReplace in Word is uncertain of a correction, it engages in disambiguation by displaying multiple options the user can select from.

The guidelines were developed and tested on products with graphical user interfaces. There are opportunities to develop specific extensions or modifications of the guidelines for voice interaction, and for specialized, high-stakes uses such as semi-autonomous vehicles.

“We’re still in the early days of harnessing AI technologies to extend human capabilities,” said Eric Horvitz, director of Microsoft Research Labs. “There is so much opportunity ahead—and also many intriguing challenges. An important direction is sharing and refining sets of principles and designs about how to best integrate AI capabilities into human-computer interaction experiences.”

Guideline 15: Encourage granular feedback.
Ideas in Excel empowers users to understand their data through high-level visual summaries, trends, and patterns. It encourages feedback on each suggestion by asking, “Is this helpful?”

Guidelines for human-AI interaction can extend beyond design into the realm of responsible and trustworthy AI. For example, in November 2018, a Microsoft advisory committee focused on the responsible development and application of AI technologies published a set of guidelines on the design of conversational interfaces such as chatbots and virtual assistants, Responsible Bots: 10 Guidelines for Developers of Conversational AI. That work and ongoing efforts on guidelines for human-AI interaction are being hosted and coordinated across Microsoft by the Aether Committee, a company-wide advisory committee on responsible AI. CEO Satya Nadella announced the committee as part of the initiative to ensure the company’s AI-related efforts are deeply grounded in Microsoft’s core values and principles and benefit society at large. Aether hosts a set of topically focused working groups. Amershi serves as co-chair of the Aether working group on Human-AI Interaction and Collaboration.

Human-AI Interaction Design Guidelines

INITIALLY

01 Make clear what the system can do.

Help the user understand what the AI system is capable of doing.

02 Make clear how well the system can do what it can do.

Help the user understand how often the AI system may make mistakes.

DURING INTERACTION

03 Time services based on context.

Time when to act or interrupt based on the user’s current task and environment.

04 Show contextually relevant information.

Display information relevant to the user’s current task and environment.

05 Match relevant social norms.

Ensure the experience is delivered in a way that users would expect, given their social and cultural context.

06 Mitigate social biases.

Ensure the AI system’s language and behaviors do not reinforce undesirable and unfair stereotypes and biases.

WHEN WRONG

07 Support efficient invocation.

Make it easy to invoke or request the AI system’s services when needed.

08 Support efficient dismissal.

Make it easy to dismiss or ignore undesired AI system services.

09 Support efficient correction.

Make it easy to edit, refine, or recover when the AI system is wrong.

10 Scope services when in doubt.

Engage in disambiguation or gracefully degrade the AI system’s services when uncertain about a user’s goals.

11 Make clear why the system did what it did.

Enable the user to access an explanation of why the AI system behaved as it did.

OVER TIME

12 Remember recent interactions.

Maintain short-term memory and allow the user to make efficient references to that memory.

13 Learn from user behavior.

Personalize the user’s experience by learning from their actions over time.

14 Update and adapt cautiously.

Limit disruptive changes when updating and adapting the AI system’s behaviors.

15 Encourage granular feedback.

Enable the user to provide feedback indicating their preferences during regular interaction with the AI system.

16 Convey the consequences of user actions.

Immediately update or convey how user actions will impact future behaviors of the AI system.

17 Provide global controls.

Allow the user to globally customize what the AI system monitors and how it behaves.

18 Notify users about changes.

Inform the user when the AI system adds or updates its capabilities.
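
As one illustration of how a guideline might translate into system behavior, the sketch below shows a possible reading of Guideline 10 (scope services when in doubt) for a text-correction feature. It is not taken from the paper or from any Microsoft product. The `Candidate` type, the `scope_service` function, and both threshold values are hypothetical, and the confidence scores are assumed to lie between 0 and 1.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values would come from product tuning.
AUTO_APPLY_THRESHOLD = 0.90
SUGGEST_THRESHOLD = 0.40


@dataclass
class Candidate:
    text: str
    confidence: float  # assumed to be a score in [0, 1]


def scope_service(candidates: list[Candidate], max_options: int = 3) -> dict:
    """Decide how much to do based on how confident the top candidate is."""
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    if not ranked or ranked[0].confidence < SUGGEST_THRESHOLD:
        # Too uncertain: degrade gracefully and leave the user's input alone.
        return {"action": "none", "options": []}
    if ranked[0].confidence >= AUTO_APPLY_THRESHOLD:
        # Confident: apply the single best candidate.
        return {"action": "apply", "options": [ranked[0].text]}
    # In between: disambiguate by offering a short list to choose from.
    plausible = [c.text for c in ranked if c.confidence >= SUGGEST_THRESHOLD]
    return {"action": "ask", "options": plausible[:max_options]}


if __name__ == "__main__":
    guesses = [
        Candidate("their", 0.55),
        Candidate("there", 0.45),
        Candidate("they're", 0.10),
    ]
    print(scope_service(guesses))
```

With these made-up scores, no candidate is confident enough to apply automatically, so the function returns the two plausible options for the user to pick from. This resembles the behavior described for AutoReplace in Word, which displays multiple options when it is uncertain of a correction.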

For more details and examples of each guideline, read the paper, Guidelines for Human-AI Interaction.

Related publications

Principles of Mixed-Initiative User Interfaces