Hundreds of millions of people are in desperate need of new or more effective treatments for everything from cancer to autoimmune diseases to rare conditions. But pharmaceutical research takes time: according to the vfa, the association of research-based pharmaceutical companies in Germany, it takes 13 long years for a new drug to reach the German market. Analyzing data collected during clinical trials plays a key role in this. To get help to people in need sooner, the life science company Bayer collaborated with Microsoft to develop ALYCE, a data engineering and data science platform. Using artificial intelligence and machine learning, ALYCE helps Bayer evaluate data from clinical trials faster and more efficiently. Complying with all relevant regulations, this high-performance, scalable data analysis is based on Microsoft Azure.
The challenge: Evaluating data from clinical trials more efficiently
“Our goal is to provide patients with new treatment options as quickly as possible,” says Benedikt Egersdoerfer, Vice President at Bayer. He is also the company’s Head of Clinical Data Sciences and Analytics, overseeing just under 1,000 internal and external employees based all over the world. His team looks after the entire value chain for the clinical data used in Bayer’s drug research. But speed is not the only factor in play: “Efficiency and productivity are also absolutely crucial for us, because drug research has to be affordable to society,” Egersdoerfer says. “Anything else wouldn’t match Bayer’s identity. After all, our vision is ‘Health for all, hunger for none’.”
One obstacle on the way to getting drugs ready for the market is that research already generates a vast amount of data—and that amount is growing massively. In today’s clinical trials, traditional methods of collecting data have been joined by many new data sources such as sensors, devices, and third-party providers. A great way to process this abundance of data is to use artificial intelligence (AI) and machine learning (ML). Five years ago, Bayer had already recognized the importance of developing better ways to process clinical data. “We didn’t know where to store the growing amount of heterogeneous data because we didn’t know how big the datasets would get. At the same time, we were asking ourselves where to source the scalable computing power required to handle increasingly complex data processing,” Egersdoerfer says.
Bayer also had to consider GxP requirements (“good x practice,” where “x” stands for the specialized field, such as clinical). “When we process patient or other data gathered during clinical trials with a view to getting the drug approved, we always have to ensure that the data is consistently traceable and that our systems meet regulatory requirements,” Egersdoerfer explains. “However, none of our ML tools offered an adequate level of structure and validation for use in a GxP environment capable of handling sensitive data on people participating in clinical trials.”
This is because, as is common in large companies, Bayer operates a heavily fragmented IT infrastructure. “Considering the new requirements for device data and computing power, it was important that we consolidate everything on a scalable platform where we can store and analyze our data,” explains Abi Velurethu, Vice President Clinical Data Science & Digital Solutions at Bayer, who is responsible for Technology & Application Management.
Bayer defined three use cases for the new platform:
1. Remote data collection from numerous data sources;
2. Pattern recognition based on AI;
3. Medical insights review & analytics (MIRA), which involves using automated processes and interactive dashboards to efficiently evaluate trial data.
The solution: Data analysis underpinned by machine learning and based on Microsoft Azure
To meet Bayer’s exacting requirements for collecting, storing, and analyzing data, the project team selected a solution architecture built on Microsoft’s Intelligent Data Platform, with data stored in Azure Data Lake. Here, the Azure tools for SecDevOps (the dovetailing of security, development, and IT operations) make it possible to automate and industrialize processes. In addition to core infrastructure services and Data Lake, ALYCE’s core components are Azure Synapse Analytics, Azure Databricks and Data Factory, Azure Active Directory, Purview, Azure Kubernetes Service, and Power BI for visualization. Bayer also uses solutions from third-party providers such as SAS and TIBCO. “Azure lets us import all different kinds of data,” Velurethu says. “The platform’s programming interfaces were also appealing. They make it possible for us to tie in external analysis applications, dashboards, and programming tools.”
“Azure accelerates our qualification process and the final validation activities, which is a key advantage.”
Holger Dach, Data Science & Analytics Technology and Application Manager, Bayer
Using the new Azure technology, Bayer has applied pattern recognition in its clinical trials since November 2021 and remote data collection since March 2022. The first MIRA data visualizations were completed in summer 2022: “We’ve developed interactive dashboards for our oncology unit. They make checking tumor reactions as part of our clinical trials extremely interactive, which didn’t use to be possible,” Egersdoerfer says. “ALYCE has sped up this process significantly.”
There are also other ways in which ALYCE helps ensure speedy results: “The platform flags unwanted events, missing data, and information that might have been incorrectly reported,” Egersdoerfer explains. Dach offers an example: “Up to now, it’s been very difficult to recognize time series anomalies for clinical trials. But now, if the analysis software reveals that a patient was recorded as having suddenly lost 30 kg and then gained the weight back again shortly thereafter, then it’s clear that something’s not right.” Egersdoerfer also points out that, in addition to data cleansing, the ML technology helps identify risks: “We can now uncover any instances of fraud and misconduct in our trials, such as a patient breaking the rules by participating in two trials.” Bayer can then commission specialists to investigate on-site.
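Dach's weight example can be illustrated with a very small rule-based check. This is only a sketch, not ALYCE's actual ML approach: the `WeightReading` record, the `flag_transient_jumps` function, the 20 kg threshold, and the sample values are all hypothetical. It flags a reading when the weight changes sharply and then reverses sharply at the next visit.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class WeightReading:
    """One recorded body weight for a single trial participant (hypothetical schema)."""
    visit_date: date
    weight_kg: float


def flag_transient_jumps(readings, threshold_kg=20.0):
    """Return readings that jump by more than threshold_kg and then reverse.

    Assumes all readings belong to the same participant. The threshold is
    an illustrative value, not a clinically validated one.
    """
    ordered = sorted(readings, key=lambda r: r.visit_date)
    flagged = []
    # Slide a window of three consecutive visits over the series.
    for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
        change_in = cur.weight_kg - prev.weight_kg
        change_out = nxt.weight_kg - cur.weight_kg
        large = abs(change_in) > threshold_kg and abs(change_out) > threshold_kg
        reverses = change_in * change_out < 0  # opposite directions
        if large and reverses:
            flagged.append(cur)
    return flagged


# Made-up sample series: a sudden 30 kg drop that is undone one visit later.
readings = [
    WeightReading(date(2022, 1, 10), 82.0),
    WeightReading(date(2022, 2, 10), 81.5),
    WeightReading(date(2022, 3, 10), 51.5),
    WeightReading(date(2022, 4, 10), 82.0),
    WeightReading(date(2022, 5, 10), 81.0),
]
flagged = flag_transient_jumps(readings)
for r in flagged:
    print(f"Check reading on {r.visit_date}: {r.weight_kg} kg")
```

A fixed threshold like this would only catch gross recording errors; the value of a flag is that it points a human reviewer at the suspicious entry rather than correcting it automatically.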
The benefits: Accelerated insights for more efficient research
“ALYCE helps us speed up our processes, which ultimately makes us faster at creating innovations,” Egersdoerfer says. “We’ve achieved all of our original use case goals, and even surpassed some.” What he is referring to is that ingesting external data was not originally part of the project. “Bayer was using a different solution, but ingesting external data proved easier in Azure. Personally, I felt that was the icing on the cake,” he says.
All this makes the project team confident about the future: “Now that we have ALYCE, I can imagine that our fragmented infrastructure might organically merge into a cohesive technology platform,” Egersdoerfer says. “This means we can focus on complex data rather than on complex technology.” Velurethu adds: “The journey we’ve embarked on with ALYCE and our partnership with Microsoft is providing us with new skills with which to meet future requirements. That’s a really big payoff.”