Happy and productive at work: Predicting opportune moments to switch tasks and take breaks


By , Principal Applied and Data Science Manager , Principal Researcher , Partner Researcher and Research Manager , Chief Scientist & Technical Fellow

A collage of information workers displaying a range of emotions while at work.

Consider the last time you struggled to complete a task at work. Perhaps you got stuck trying to solve a tricky problem, unable to break out of that unproductive state, or maybe you were so engrossed in an activity unrelated to work that you ignored the need to return to pressing matters in a timely manner. You can probably think of countless times when you felt too emotionally drained to tackle something you were usually excited to be a part of. All these scenarios, and many others, are examples of situations where getting things done comes at significant cognitive and emotional cost. Now, think how helpful it could be if an AI assistant guided you in transitioning to a different activity or taking a break, saving you the time and energy spent forging unproductively ahead, continuing to fall behind, or getting further distracted by negative feelings surrounding your progress, or lack thereof.

At Microsoft Research, our team of productivity, affective computing, and attention researchers has been investigating the challenge of balancing productivity and well-being for many years, examining how we can design experiences and applications that help people get work done efficiently while also preserving their emotional well-being. One thread of this research focuses on how to guide people through transitions between tasks and work breaks. It builds upon a large body of prior work, including investigations into the effects of interruption, understanding and supporting long-term emotional well-being, and the design and development of tools to support productivity and well-being.

In our ACM CHI Conference on Human Factors in Computing Systems (CHI 2020) paper “Optimizing for Happiness and Productivity: Modeling Opportune Moments for Transitions and Breaks at Work,” we—along with PhD students Harmanpreet Kaur and Alex Williams, both Microsoft interns at the time—investigated how to optimize happiness (as quantified via facial expressions) and productivity by modeling information workers’ affect and work patterns. The result is a group of models designed to capture when a person should consider switching to a new task or stepping away for a break as a means of enhancing positive emotions while still ensuring work gets done.

The fundamental questions

There are a few questions fundamental to developing AI-powered tools for helping people navigate what they need to accomplish while maintaining their well-being, namely enabling task switches that are beneficial for productivity, incorporating considerations of physical and emotional state in task-switching support, and measuring the value of these efforts. Additionally, these solutions often need to be personalized, as each person is different in the way they perform tasks and balance their well-being needs. More broadly, we’re interested in exploring the following issues:

  1. Work-related tasks can be complex, comprising multiple activities. Take, for example, the act of looking up references on Stack Overflow while performing a coding task on Microsoft Visual Studio. How do we support effective task switching in these scenarios, addressing the need to switch tasks while reducing unproductive and distracting switches? How can we help people transition to more productive states when they’re unable to break out of one that isn’t conducive to being productive or pleasurable?
  2. Recent research into the effects of distraction-blocking software shows the notion of productivity is strongly tied to the emotional and physical well-being of the worker. As a company committed to advancing better productivity practices, how can we start to integrate measures of emotional well-being into our productivity tools?
  3. How can we better understand the value of emotional well-being in the context of productivity and the positive impact of taking breaks and transitioning to other tasks? How are the outcomes of incorporating breaks and task switches more strategically better than simply focusing on “getting things done”?

We break the problem of optimizing productivity and happiness at work into multiple steps, first identifying when a person should transition to a different task or take a break; then determining what they should switch to (that is, what kind of task or a break); and finally developing AI assistants that can use those moments to effectively guide people in switching their attention to something more fruitful. In the paper, we investigate the first step: modeling when a person should transition to a different task or take a break given they already have a set of tasks they want to accomplish for the day.

A linear process diagram shows the method of collecting the participant data used to create each data sample that maps the input values to the expected value of each of the three possible actions. The diagram shows collection across the course of a day, beginning at the far left of the diagram with 9 a.m. and continuing right with every hour on the hour marked until 5 p.m. A color-coded list of the categories of data collected via this method appears under the linear process with a large arrow indicating the combination of the data mapping to the expected value. In the linear process portion of the diagram, a webcam icon enclosed in a green circle is labeled as the “Emotion and Activity Logging Software” component of the collection process. It corresponds to the listed data it’s collecting throughout the day (green): emotion (anger, contempt, disgust, fear, happiness, neutral, sadness, surprise); heart rate; physical movement (distance from screen, eye movement); interaction data; and action (transition, break, continue). A calendar/clock icon enclosed in a pink circle represents the component that tracks time and day data. A checklist icon enclosed in a blue circle right above 9 a.m. in the linear process is labeled as the “Daily Task List” component. It corresponds to the listed data it’s collecting at the start of each day, including the type of tasks an individual wants to accomplish (reading, writing, coding, email, Excel, paper-based, brainstorming, online search) and their urgency. A sliding scale icon enclosed in an orange circle is labeled as the “Hourly Self-Reports” component and corresponds to vertical orange bars at each hour mark.

Figure 1: The data collection setup for building models that predict when people should continue with a task, transition to another, or take a break. The different components are: 1.) logging software that tracks perceived emotion via a webcam and workstation activity data; 2.) time and day tracking; 3.) a daily task interface through which people enter the tasks they want to get done for the day and, for each task, information about the urgency, difficulty, and anticipated completion time; and 4.) hourly self-reports of task progress, overall emotions, and feelings of productivity. Eight categories of data are collected through this setup: emotion, heart rate, physical movement, interaction data, time, day, task information, and the action the participant had taken. The data is used as input; the output is an expected value for each of the three recommended actions.

Finding the right moment

Our goal is to identify at any given moment whether a person should continue with their current task, switch to a different task, or take a break based on the status of their current activities and perceived emotional state. To do this, we leverage several pieces of information: people’s digital activities, defined as their interactions with their computers, including applications they access, whether they’re typing or moving their mouse, and tab switches; a list of tasks they want to get done for the day; and a continuous stream of emotion data based on facial expressions. This information is used as input for predictive models, which provide as output the recommended action. If the model output is to continue with an ongoing task, no recommendation is provided to the individual. When models recommend switching to a new task, they only make the suggestion to switch, not the suggestion of which task a person should switch to, though we envision future systems being able to also make such a recommendation.

To gather the data to build the models, we used a multimodal AI sensing platform to measure perceived emotion and collect contextual data about digital activities, as well as self-reports of progress on a predefined list of tasks, for 25 people for three weeks (Figure 1 above). Our models output a separate expected value—calculated as a cumulative sum weighing task progress, affect, and productivity-given-stress equally—for each of the actions; productivity-given-stress is a combined value of productivity and stress from the hourly self-reports. The action with the highest expected value is recommended to the individual.
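As a rough illustration of the selection rule described above, the sketch below scores each action with an equally weighted sum of its three components and suggests the highest-scoring one, staying silent when that action is to continue. The `Outcome` fields, the 0-to-1 scaling, and the example numbers are assumptions made for illustration; in the study, the expected values are predicted by the models rather than computed from known components.

```python
from dataclasses import dataclass
from typing import Dict, Optional

ACTIONS = ("continue", "transition", "break")


@dataclass(frozen=True)
class Outcome:
    """Hypothetical per-action components, each assumed to be scaled to [0, 1]."""
    task_progress: float
    affect: float
    productivity_given_stress: float


def expected_value(outcome: Outcome) -> float:
    # Equal weighting of the three components, as described in the text.
    return outcome.task_progress + outcome.affect + outcome.productivity_given_stress


def recommend(outcomes: Dict[str, Outcome]) -> Optional[str]:
    """Return the action to suggest, or None when the best action is to continue."""
    best = max(ACTIONS, key=lambda action: expected_value(outcomes[action]))
    # No recommendation is surfaced when the person should simply keep going.
    return None if best == "continue" else best


# Made-up values for a single moment in the day.
example = {
    "continue": Outcome(0.25, 0.25, 0.25),
    "transition": Outcome(0.5, 0.5, 0.25),
    "break": Outcome(0.25, 0.75, 0.5),
}
print(recommend(example))
```

With these made-up numbers, the break scores highest, so a break would be suggested.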

Building the models

Since people are fundamentally different in how they engage in tasks and their affective states, we developed customized models for each person in our study using only their respective data. But as a point of comparison, we built combined models based on job role, hypothesizing there might be some commonalities in predictive features for people whose job responsibilities are similar. These models incorporate data from all participants belonging to a specific job role. We also built a general model in which data from all the participants are combined.
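The three levels of pooling described above can be sketched as a simple grouping step. This is only an illustration of how the training sets differ; the `Sample` structure, its field names, and the toy data are hypothetical and not taken from the study.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Sample(NamedTuple):
    participant: str       # hypothetical identifier, e.g. "P1"
    job_role: str          # hypothetical role label
    features: List[float]  # input values for one moment in time
    target: float          # expected value to be predicted


def build_training_sets(samples: List[Sample]) -> Dict[str, object]:
    """Split one pool of samples into personalized, job-role, and general sets."""
    personal: Dict[str, List[Sample]] = defaultdict(list)
    by_role: Dict[str, List[Sample]] = defaultdict(list)
    for sample in samples:
        personal[sample.participant].append(sample)  # one model per person
        by_role[sample.job_role].append(sample)      # one model per job role
    return {
        "personal": dict(personal),
        "job_role": dict(by_role),
        "general": list(samples),                    # one model for everyone
    }


toy_samples = [
    Sample("P1", "software engineer", [0.1, 0.2], 1.0),
    Sample("P2", "software engineer", [0.3, 0.4], 0.5),
    Sample("P3", "financial manager", [0.5, 0.6], 0.8),
]
training_sets = build_training_sets(toy_samples)
print(sorted(training_sets["personal"]))
print({role: len(items) for role, items in training_sets["job_role"].items()})
```

A personalized model sees only one participant's samples, a job-role model sees everyone who shares that role, and the general model sees all of the data.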

Because our data effectively represented a per-person time series and our output variables are continuous, we first modeled the problem with a classic time-series approach: an autoregressive integrated moving average model with exogenous variables (ARIMAX). We found these models yield two or three significant features per participant that best predict the outcome. In these models, the action participants took at the moment the data was being collected (whether they continued with their ongoing task, took a break, or transitioned to another task) was the most prominent feature, followed by task information, comprising the type of task, urgency, difficulty, and anticipated completion time; interaction data; and emotion. However, while the ARIMAX models are informative, helping us understand which features matter from a practical point of view, they’re less suitable for a real-time recommendation system because they’re computationally expensive, requiring around 15 minutes of processing time per participant. We therefore explored other regression models, comparing their real-time performance and how closely they aligned with the ARIMAX models. We found that random forest regression (RFR) models were the most suitable, as their results were most similar to those of the ARIMAX models.

A heat map, using shades of blue to represent different values, shows the importance of each of the eight categories of features used in predicting whether an individual should continue with a task, switch to a new task, or take a break for each of the study’s 25 participants, for five different job clusters, and for an aggregate of all participants (P1–P25, C1–C5, and “All” on the horizontal axis, respectively). On the vertical axis are the eight feature categories: Emotion, Heart Rate, Physical Movement, Interaction Data, Time of Day, Day of Week, Task Information, and Potential Actions. Along the far right of the heat map is a vertical color scale showing the numerical Feature Importance value corresponding to each of six different shades of blue, beginning with the lightest shade and lowest values (0.0–0.1) at the bottom and moving up to the darkest shade and highest values (0.5–0.6), with the different shades defined in increasing one-tenth increments. The heat map shows that, across participants, interaction data was the most important feature category followed by task information, emotion, physical movement, time, heart rate, day, and potential actions but that importance varied significantly by individual.

Figure 2: The importance of specific features in producing the best predictions, as determined by the random forest regression models. Darker squares indicate stronger feature importance. Data is shown for each study participant, clusters based on job roles (C1–C5), and all participants combined.

For the RFR models, we found interaction data to be the most important feature on average, followed by task information and emotion. However, at an individual level, the important features varied quite a bit, highlighting that the same model wouldn’t necessarily be successful for different people. Figure 2 (above) shows the relative importance of features for each participant. For example, emotion was most important for Participant 13, while task information was most important for Participants 2 and 20.
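To make the idea of per-person feature importance concrete, here is a minimal sketch that fits a random forest regressor on synthetic data and reads off its importances. It assumes scikit-learn and NumPy are available; the one-column-per-category layout, the synthetic data, and the hyperparameters are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# One column per feature category from Figure 2 (a simplifying assumption;
# real data would have several raw columns per category).
FEATURES = [
    "emotion", "heart_rate", "physical_movement", "interaction_data",
    "time_of_day", "day_of_week", "task_information", "potential_actions",
]

rng = np.random.default_rng(0)
X = rng.random((300, len(FEATURES)))

# Synthetic targets: expected values for continue, transition, and break,
# deliberately driven mostly by the "interaction_data" column.
signal = X[:, FEATURES.index("interaction_data")]
task = X[:, FEATURES.index("task_information")]
y = np.column_stack([2.0 * signal, 1.0 - signal + 0.3 * task, 0.5 * signal])
y += 0.01 * rng.standard_normal(y.shape)

# One multi-output model standing in for a single participant's model.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

importances = dict(zip(FEATURES, model.feature_importances_))
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {value:.3f}")
```

Repeating this fit once per participant and stacking the resulting importance vectors side by side would give a grid similar in spirit to the heat map in Figure 2.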

We also found that once we started pooling data according to job clusters or into a combined general model, the important features changed. On average, the importance of features per job cluster was not necessarily aligned with its members’ personalized models and instead displayed patterns related to the particular job roles. For example, for software engineers, task information and interaction data were the most important features, whereas for the cluster of financial managers, the time of day was most crucial, though, again, not for every individual in those respective groups. The fact that feature importance based on a job role doesn’t always extend to the person occupying that role suggests why personalized models are likely to perform better than the broader job cluster models. The general model’s feature importance was more spread out across all feature categories, as one would expect with an aggregate model.

Our models put to the test

To evaluate how well the models would work in practice, we built and deployed a tool that uses these models to recommend in real time whether individuals should take a break or transition to a different task, and we had participants rate how useful the recommendations were. In our evaluation, we found that people generally agreed with the recommendations our tool provided. They agreed with the timing of the recommendation 85.7 percent of the time for transitions and 77 percent of the time for breaks. Interestingly, while people agreed with the recommendations, compliance was a different matter, as it was generally difficult for them to break out of what they were doing. (Given what you have on your plate, for example, you may know that after reading this post, you should stop browsing the web, but find that hard to do, and you wouldn’t be alone!) This opens up opportunities for AI assistants to interact with people and help them act on recommendations, which we explore in other research, including the paper “Design and Evaluation of Intelligent Agent Prototypes for Assistance with Focus and Productivity at Work.”

Showing how user log activities and perceived emotion data can be combined to identify moments to take a break or transition to different tasks is a step toward making AI-based systems aimed at helping people be productive and happy a reality for information workers. We envision AI assistants using similar models to support people in adopting behavior that can have longer-term benefits to their productivity, for example, helping them find value in the immediate action of leaving an ongoing task, even if it may seem counterproductive, or better preparing them to leave a task so it can be easily resumed later. This is an emerging area of research at the intersection of human-computer interaction and AI with many interesting research questions that directly impact the individual.
