Microsoft uses machine learning to develop smart energy solutions

Jan 11, 2019

At Microsoft, we’re using Azure Machine Learning to improve the effectiveness of the operation schedules for our buildings’ heating, ventilation, and air conditioning (HVAC) systems to reduce costs and increase employee comfort. We used Azure Databricks and Azure Machine Learning Studio to examine data from our HVAC systems and buildings, combined with weather forecast information, to predict building occupancy and HVAC behavior. Using machine learning is helping us optimize operations and drive digital transformation.

Microsoft Real Estate and Security (RE&S) is responsible for heating and cooling 115 buildings in the Puget Sound area. Microsoft Digital partnered with RE&S to improve the effectiveness of the schedules for their heating, ventilation, and air conditioning (HVAC) systems to reduce costs and increase employee comfort. Microsoft Digital implemented machine learning to predict when employees will arrive at Microsoft buildings each morning and how long it will take for a building to reach its optimal comfort temperature. As a result, we were able to generate a dynamic HVAC schedule that resulted in significant cost savings and increased employee comfort for RE&S. We’re continuing to implement machine learning in our buildings throughout the Puget Sound region and we’re encouraging the rest of Microsoft to use machine learning to optimize operations and drive digital transformation.

Examining facilities at Microsoft

At Microsoft RE&S, digital-transformation efforts center on the buildings in which Microsoft employees do their work. In the Puget Sound area, RE&S operates and maintains 115 buildings that house more than 59,000 employees. Although most of the buildings operate during standard business hours, Microsoft encourages employees to manage their schedule to best fit their workstyle. For example, buildings that house sales employees are often intermittently occupied, while employees in other buildings might get an earlier (or later) start on their workday. As a result, the primary hours of operation for buildings fluctuate from building to building and season to season. The systems that control HVAC vary between buildings, and each system needs time to bring a building, or sections of a building, to optimal temperature at the start of the day. We call this ramp-up time.

RE&S spends a significant portion of its budget heating and cooling buildings, and tries to be as efficient as possible in maintaining optimal temperatures. Despite best efforts, RE&S still hears concerns about employees being either too hot or too cold in their work environments. When RE&S realized that the typical systems that schedule and manage HVAC operations had room for improvement, they partnered with Microsoft Digital to find a solution.

Identifying an opportunity for increased intelligence

After examining the cooling and heating patterns for Microsoft buildings and how our employees were affected, we discovered that morning temperature was the most significant concern. When employees arrived at work in the morning, buildings were often too cold or too hot, depending on the season. All our HVAC systems are configured to observe an energy-saving temperature range when a building is unoccupied. These temperature ranges are designed for energy conservation, but not for employee comfort. As a result, each HVAC system has a ramp-up time to bring the building temperature into a range that is comfortable for our employees. Ramp-up time is primarily determined by the HVAC system’s capacity for heating and cooling, but it also varies from day to day due to outside weather conditions.

Our findings indicated that the static schedules set for our HVAC systems didn’t account for variance in ramp-up time from building to building or the different schedules that employees worked in each building. Our team recognized the opportunity to create a more intelligent and efficient method for managing our HVAC systems’ scheduling to address two primary goals:

  • Increase employee satisfaction by more accurately controlling when our buildings achieved optimal temperature during the day.
  • Decrease overall costs associated with energy waste for the operation of our HVAC systems.

Integrating machine learning into facilities

The two primary goals that our team established translated into several critical tasks that we needed to perform to create a solution that would fulfill both goals as effectively as possible. Machine-learning models were chosen as a primary component. Our engineering team identified opportunities within machine-learning technology to increase the intelligence and effectiveness with which the HVAC systems were scheduled by using existing HVAC data and controls. The team’s primary design tasks were:

  • Capture important weather information and telemetry data from HVAC systems.
  • Understand the occupancy trends for each building.
  • Use machine learning to determine the accurate ramp-up time required for HVAC systems to bring a building or floor to optimal temperature.
  • Use the above data to send control information to HVAC systems to enable more intelligent environment control.
The model for improving HVAC scheduling at Microsoft. The model begins with visualizations for weather, people, and data, each connected by arrows to a beaker that represents machine learning. The beaker is connected by an arrow to a calendar representing date and time information. The workflow finishes with the calendar connected by an arrow to a building.
Figure 1. Using weather, occupancy, and system data with machine learning to improve HVAC scheduling

Identifying Azure big data tools

Our engineering team selected Microsoft Azure as the default platform for our solution. Azure provides advanced big-data management tools and has the capability to host a flexible, cost-effective solution in the cloud. We identified the following as our primary solution components:

  • Azure Machine Learning Studio. This fully managed cloud service enables you to easily build, deploy, and share predictive analytics solutions. It’s designed for applied machine learning and it provides simple and easy deployment of machine learning algorithms.
  • Azure Databricks. This Apache Spark-based analytics service accelerates big data analytics and AI solutions. Databricks allows you to quickly set up a Spark environment and build on familiar deep-learning frameworks and libraries.
  • Azure Data Lake. A data storage service, Azure Data Lake enables you to store data of any size or shape and integrate it directly into data analytics services. By using Data Lake, you can simplify data ingestion and storage and take advantage of batch, streaming, and interactive analytics to deploy data-storage solutions quickly.
  • Azure Data Factory. This service provides data-integration management and scheduling across the Azure big data management tool set. You can use Data Factory to build and manage data pipelines that turn raw data into prepared data that’s ready for applications to consume.
  • Azure HDInsight. An open-source analytics service, Azure HDInsight provides a platform for developing big-data solutions on open-source frameworks such as Apache Hadoop, Spark, and Kafka. It integrates directly with Azure Data Lake and Azure Data Factory to build comprehensive analytics pipelines.
  • Microsoft Power BI. A business analytics solution, Power BI enables data visualization and presentation, to help you further explore and analyze your data. Power BI uses a wide range of visuals and reports to deliver business intelligence insights and share them on a cloud-based platform.

Proof-of-concept planning

We decided to prepare and apply a proof of concept (POC) in a small number of buildings in the Puget Sound area. The engineering team’s main objective was to obtain as much useful telemetry data as possible from the existing systems and then reuse that data to create a more accurate HVAC scheduling system.

First, the engineering team assessed our facilities for buildings that would function well in a pilot project and identified a target set of three buildings. These buildings were equipped with HVAC systems that could provide data for our machine-learning algorithms, were used daily, and were subject to occupancy fluctuations depending on the time of day or season.

Examining machine-learning methods

Using the existing data sources, we achieved a greater degree of insight into how the HVAC systems in our buildings should perform for optimal efficiency and comfort. We established some standards for the predictive models to provide sufficient usable data in those models and then began assembling and testing our models. We used Azure Machine Learning Studio and custom scripts built on the open-source R platform for machine learning to create two models:

  • Ramp-up time prediction. This model calculates and predicts how long ramp-up would take for a building. We used it to provide a more accurate prediction of how long it would take for the HVAC systems to get the building to optimal temperature. We used HVAC system telemetry to assemble a historical ramp-up time data set. We used the telemetry to define ramp-up time as a dependent variable and incorporated additional HVAC system metrics as inputs into the model along with weather data. We use a gradient-boosted regression machine to predict ramp-up time for the upcoming day. This model retrains and predicts ramp-up time on a daily basis.
  • Occupancy prediction. This model calculates and predicts when occupancy would reach 25 percent in a building. We used 25 percent as our threshold value to indicate that a building should be considered occupied. At the point in time that this threshold was reached, we wanted the building’s temperature to be in the optimal comfort range. To determine occupancy levels, we use two primary data sources:
    • Building entry systems. These provided data for when and where our employees used their employee identification cards to enter buildings.
    • HVAC telemetry. We used data from HVAC temperature control boxes that scheduled when the HVAC system was active within a building.

    We chose the autoregressive integrated moving average (ARIMA) model for time-series forecasting to predict occupancy, combined with a regressor variable containing the day of the week. This model retrains and predicts occupancy on a daily basis.
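
To illustrate the occupancy threshold, the following minimal Python sketch derives the time at which a building reaches 25 percent occupancy on a single day from badge-swipe records. It is an illustration only, not the production logic: the function name, the (employee ID, entry time) record shape, and the expected_headcount parameter are all assumptions. It produces only the historical value that a forecasting model such as ARIMA could then be trained on.

```python
from datetime import datetime
from math import ceil
from typing import Iterable, Optional, Tuple


def occupancy_threshold_time(
    swipes: Iterable[Tuple[str, datetime]],
    expected_headcount: int,
    threshold: float = 0.25,
) -> Optional[datetime]:
    """Return the time at which distinct badge entries reach the threshold.

    swipes: (employee_id, entry_time) pairs for one building on one day.
    expected_headcount: assumed number of employees assigned to the building.
    Returns None if the threshold is never reached that day.
    """
    if expected_headcount <= 0:
        raise ValueError("expected_headcount must be positive")
    needed = ceil(expected_headcount * threshold)
    seen = set()
    for employee_id, entry_time in sorted(swipes, key=lambda s: s[1]):
        seen.add(employee_id)  # count each employee once, even on re-entry
        if len(seen) >= needed:
            return entry_time
    return None


# Made-up swipe records, for illustration only.
swipes = [
    ("e1", datetime(2019, 1, 7, 7, 40)),
    ("e2", datetime(2019, 1, 7, 8, 5)),
    ("e1", datetime(2019, 1, 7, 8, 10)),  # re-entry, not counted again
    ("e3", datetime(2019, 1, 7, 8, 30)),
    ("e4", datetime(2019, 1, 7, 9, 0)),
]
reached = occupancy_threshold_time(swipes, expected_headcount=12)
print(reached)  # 2019-01-07 08:30:00
```

With an assumed headcount of 12, the threshold is three distinct employees, which the third distinct badge entry reaches at 8:30 AM.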

The combination of ramp-up time prediction and occupancy threshold prediction allowed us to make a simple calculation for each building: Take the predicted occupancy threshold time for a day (8:30 AM, for example), subtract the predicted HVAC ramp-up time (such as 45 minutes), and use the resulting date/time stamp as the start of the ramp-up process for the HVAC system (such as 7:45 AM). If the model’s prediction was correct, the building would reach optimal temperature at the same time the building reached its occupancy threshold.
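
The calculation above can be sketched in a few lines of Python. This is a minimal illustration that uses the example values from the text; the function name and the calendar date are hypothetical, and it does not represent the production scheduler.

```python
from datetime import datetime, timedelta


def hvac_start_time(occupancy_time: datetime, ramp_up: timedelta) -> datetime:
    """Start ramp-up early enough to finish when the occupancy threshold is reached."""
    return occupancy_time - ramp_up


# Example values from the text; the date itself is made up.
predicted_occupancy = datetime(2019, 1, 14, 8, 30)  # predicted 25 percent occupancy
predicted_ramp_up = timedelta(minutes=45)           # predicted ramp-up time
start = hvac_start_time(predicted_occupancy, predicted_ramp_up)
print(start.isoformat())  # 2019-01-14T07:45:00
```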

Implementing machine learning for HVAC scheduling

In the POC design, we trained our machine learning models for three separate buildings in the Puget Sound area. The output from the machine learning process was sent to the HVAC scheduler for each building in the form of a simple date/time stamp that started the HVAC system at the correct time for optimal temperature to coincide with 25 percent occupancy of the building. In more detail, the process looked like this:

  1. Capture historical data, including ID badge swipe information, from the building entry systems, HVAC telemetry, and weather information into Azure Data Lake.
  2. Transform and combine the data in preparation for machine-learning model consumption. We used Azure Databricks to perform data transformation, Azure Data Factory to create data pipelines for consumption by each machine learning model, and Azure HDInsight to deploy our Apache Spark scripts.
  3. Create data sets for modeling of occupancy prediction and ramp-up prediction.
  4. Push the data sets into two Azure Machine Learning Studio experiments to forecast future behavior. We used one instance to forecast building occupancy and one instance to predict HVAC ramp-up time.
  5. Train and forecast the occupancy model to predict occupancy for the upcoming day.
  6. Train the ramp-up machine-learning model and predict ramp-up time for the upcoming day.
  7. Combine the current ramp-up prediction with historic data to evaluate prediction accuracy.
  8. Output the final date/time stamp for HVAC control-system consumption.
Figure 2. The production architecture for HVAC scheduling in Microsoft buildings using Azure Machine Learning: Data sources – Data preparation – Machine learning prediction models – Recommendation consumption
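
Step 7 in the process compares ramp-up predictions with historic data. The article does not say which accuracy metric was used, so the following Python sketch assumes mean absolute error in minutes as one plausible choice. The function name and the sample values are made up for illustration.

```python
from statistics import mean
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and observed values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(p - a) for p, a in zip(predicted, actual))


# Made-up ramp-up times in minutes, for illustration only.
predicted_minutes = [45.0, 50.0, 40.0, 60.0]
actual_minutes = [42.0, 55.0, 40.0, 52.0]
mae = mean_absolute_error(predicted_minutes, actual_minutes)
print(f"{mae:.1f} minutes")  # 4.0 minutes
```

A metric in minutes is easy to read against the schedule: an error of 4 minutes means the building would reach its comfort range about 4 minutes early or late on average.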

Reviewing implementation

We implemented machine learning in the HVAC systems for three of our Puget Sound buildings for our POC. We worked closely with our HVAC engineering team to ensure that we were getting the most accurate inputs and preparing our outputs correctly for the HVAC systems. Data preparation was an important part of the groundwork, and we put significant effort into our data cleanup and aggregation tasks by using Azure Data Lake, Azure Data Factory, Azure Databricks, and Azure HDInsight.

After running our models and leveraging the outputs, we began to see results. Based on how our buildings have operated under the new models, we expect to gain cost savings of more than $15,000 annually and a decrease of 60 hours annually in the time that our employees experience discomfort due to building temperatures. For the three buildings in our POC, that results in a projection of almost 52,000 individual person-hours of discomfort saved. We’re in the process of evaluating HVAC systems in the Puget Sound area and implementing these models for a total of 43 buildings.

Improving data quality

We also found room for improvement. Data is paramount to accurate machine learning prediction, and as we used machine learning for occupancy and ramp-up time prediction, we learned more about the data we were using. For example, there were certain building-use patterns that led to inaccurate predictions. Employees who were working as part of a high-priority development effort or problem resolution might put in extra hours, work through the night, or use buildings in a nontypical way. In these cases, the HVAC systems in the buildings would remain at the optimal temperature setting for occupancy, which provided no value in predicting ramp-up time. We accounted for situations such as this one to ensure that our data sets contained reliable and consistent information. With cleaner and more consistent data, we received more accurate predictions from our models.
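
One way to handle days like these is to drop them when the historical ramp-up data set is assembled. The Python sketch below is a hedged illustration of that idea, not the method the team used. The function name, the (timestamp, temperature) reading shape, and the comfort range in degrees Celsius are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Reading = Tuple[datetime, float]  # (timestamp, zone temperature in assumed deg C)


def observed_ramp_up(
    readings: List[Reading],
    hvac_start: datetime,
    comfort_low: float,
    comfort_high: float,
) -> Optional[timedelta]:
    """Return the time from HVAC start until the comfort range is reached.

    Returns None when the day carries no ramp-up signal: there are no readings
    after the start, the building was already in the comfort range at the start
    (for example, it was occupied overnight), or the range was never reached.
    """
    after_start = sorted(r for r in readings if r[0] >= hvac_start)
    if not after_start:
        return None
    first_temp = after_start[0][1]
    if comfort_low <= first_temp <= comfort_high:
        return None  # already comfortable: exclude from the training set
    for timestamp, temp in after_start:
        if comfort_low <= temp <= comfort_high:
            return timestamp - hvac_start
    return None


# Made-up telemetry, for illustration only.
start = datetime(2019, 1, 8, 6, 0)
normal_day = [
    (start, 17.0),
    (start + timedelta(minutes=15), 18.5),
    (start + timedelta(minutes=30), 20.0),
    (start + timedelta(minutes=45), 21.0),
]
overnight_day = [(start, 21.5), (start + timedelta(minutes=15), 21.6)]

normal = observed_ramp_up(normal_day, start, 20.5, 23.5)
overnight = observed_ramp_up(overnight_day, start, 20.5, 23.5)
print(normal, overnight)  # 0:45:00 None
```

Returning None instead of a zero-length ramp-up keeps uninformative days out of the training data rather than teaching the model that the building can reach its comfort range instantly.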

Benefits and business outcomes

We’ve realized several benefits from applying machine learning to our HVAC operations, including:

  • Cost savings. Running machine learning for our three POC buildings has resulted in changes to our HVAC scheduling that are projected to save more than $15,000 per year. We estimate that when the technology is applied to the 43 buildings targeted for HVAC scheduling prediction in the Puget Sound area, the cost savings will exceed $500,000 annually.
  • Increased employee comfort. Our employees’ comfort is important to us, and we want their experience working in our buildings to be the best possible. We’ll be saving more than 52,000 person-hours of reported discomfort due to low or high temperatures this year, and we expect that number to increase into the millions after we have implemented the machine learning-driven predictive models in the remaining buildings.

Best practices

We learned a lot from our implementation of machine learning for our HVAC systems and have created some best practices based on our findings. These best practices include:

  • Explore machine learning model methods. We explored several options for machine learning models before choosing the ones that we used. It’s important to investigate the models available, the important qualities of each, and how those models can best be applied to your data.
  • Provide the best data for your models. This was a major consideration for us. Machine learning models consume a large amount of data and the analytic results of the models are only as useful as the data you put into them. We performed extensive data examination including cleanup and aggregation before implementing machine learning. We wanted to understand the data and ensure that it was accurate and relevant before we pushed that data through the modeling process.
  • Use and reuse your most recent data. Historical data factored heavily in the predictive models that we used, but we found that using recently generated data from the current week or the previous day vastly improved our models’ predictive accuracy. We had to refine our processes and ensure a smooth data pipeline so that the previous day’s data was ready to be consumed by the current day’s model.

Looking ahead

We’re excited about the possibilities that machine learning brings for digital transformation within RE&S. We’re currently in the process of refining our models and implementing machine learning in the rest of our Puget Sound buildings. We’re also exploring ways that we can use modeling results from our systems to anticipate and schedule predictive maintenance for our HVAC infrastructure, to help prevent outages and ensure an efficient and effective environment. In addition, we’re planning to make our HVAC implementation more granular, by eventually leveraging our HVAC systems to run our machine-learning models on a floor-by-floor basis, not simply building by building.

Conclusion

Machine learning has helped Microsoft Digital accelerate digital transformation at Microsoft RE&S. Our facilities are saving money on operational costs; our employees are more comfortable; and we’re sharing our results broadly to extend the benefits of machine learning and AI across our entire organization.