June 21, 2022

A move to Windows containers helps Relativity boost reliability and security while lowering costs

Technical Story

Relativity, a global market leader in legal e-discovery and compliance software, helps organizations find the truth buried in today’s ever-growing data sources. Its flagship product, RelativityOne, provides expert software as a service (SaaS) capable of sifting through torrents of data. With multiple awards for technological innovation and cybersecurity excellence, RelativityOne is backed by the security, reliability, and global reach of Azure. Yet, as cloud practices evolved, the company knew it could do more to modernize its highly complex Windows code base, streamline development, and improve scalability.

Moving its focus away from virtual machines (VMs), Relativity has embraced Windows containers hosted on Azure Kubernetes Service (AKS), a fully managed Kubernetes environment. Without having to rewrite millions of lines of code, the company benefits from a faster, more cost-effective platform for its products and services—one that’s easier to scale and operate. And the platform gives Relativity’s engineering teams the speed and flexibility they need to build more innovative services.

“Before, it would take our application development teams six months to deliver new features. Now that teams are running Windows containers on Azure Kubernetes Service, they're able to deploy within the same day, which is a huge win.”

Shelby Lewin, Senior Product Manager, Relativity

Platform complexity drives new, more scalable approach

Over the past year, the number of requests to collect data in RelativityOne has more than tripled, while the types and number of platforms used to communicate and store data continue to expand. For Relativity, the rising global demand for legal and compliance technology requires a commensurate increase in innovation. Relativity has more than 300,000 users in approximately 40 countries and regions. The company serves thousands of organizations globally, primarily in the legal, financial services, and government sectors, including the U.S. Department of Justice and 198 of the firms listed in the Am Law 200 (the American Lawyer's annual ranking of U.S. law firms).

RelativityOne uses advanced search algorithms, pinpoint analytics, and the latest in machine learning, AI, and visualizations to deliver an end-to-end solution for organizing data, discovering the truth, and acting on it. The software was built primarily on the Microsoft .NET Framework and Microsoft SQL Server and runs on Azure. The on-demand computing, storage, and networking resources have given Relativity a flexible, pay-as-you-go cloud platform that the company can tailor to suit individual customer requirements.

However, that flexibility came at a price, as Relativity Principal Software Engineer Julian Portillo points out. “Our software was a standard monolith based on Windows and the .NET Framework, but we had a lot of divergent compute platforms that made it hard to configure and operate at scale.”

For example, a customer with a large volume of data to process might have required larger VM types for a project, while another customer might have needed something else. Over time, the range of customized configurations became increasingly cumbersome and expensive for Relativity to update and maintain.

“Scaling was a problem,” adds Relativity Senior Product Manager Shelby Lewin. “We started trying to understand how we can move all of our compute workloads away from the bespoke solutions to something more cost effective and efficient.”

Windows containers shave months off deployment cycles

Relativity began considering the best way to streamline the architecture for RelativityOne and other products—without entirely rewriting a massive code base that had taken years to perfect. But the application business logic had to be decoupled from the infrastructure, a step that led to a containerized microservices architecture. Containers provide a lightweight, isolated environment that makes apps easier to develop, deploy, and manage. Engineers can encapsulate specific business capabilities within microservices that run in containers with all the required libraries and dependencies. That makes containers highly scalable and portable across diverse environments.

“Windows containers made it easy for us, because we already are a .NET shop, and it really shortened our time to delivery,” Lewin explains. “Before, it would take our application development teams six months to deliver new features. Now that teams are running Windows containers on Azure Kubernetes Service, they’re able to deploy within the same day, which is a huge win.”

Yet deconstructing an existing architecture into microservices is no small task. With a choice of open-source and managed Kubernetes implementations, Relativity chose to deploy Windows containers on AKS, which can deploy containers to VMs running a Windows Server image and can manage them at scale—whether that’s dozens or even thousands of containers.

In the earlier cloud architecture that stored the application and related files on VMs, Relativity was responsible for updating and upgrading a VM’s operating system on a case-by-case basis. Now engineers can update containers and redeploy whenever needed. AKS can distribute multiple containers across clusters and scale them automatically without the need for an engineering team to set up and configure additional servers. In addition, Relativity uses the Azure Hybrid Benefit program to run its Windows VMs on Azure at a reduced cost.

The effort also paid off in better application performance. After the move to Windows containers on AKS, RelativityOne’s response time was 12 times faster—a huge time savings for customers and their large batch jobs.

“Working with Microsoft was a smooth, clear process that worked well for us and allowed us to accomplish our goals.”

Julian Portillo, Principal Software Engineer, Relativity

Architecture of a complex multitenant SaaS solution

The Relativity platform is a batching and imaging machine deployed in roughly 25 clusters across 20 global regions and Azure sovereign cloud instances. A batch job starts with the Collect service, which ingests raw data from emails, messages, documents, files, and other sources, and either sends it directly to Azure for immediate processing or stores it for later review.

Collect and other Relativity services are deployed and managed using a combination of Helm templating and an internally built orchestrator. “We can deploy and test clusters as part of our automated GitHub action integration tests, which is awesome,” Portillo says. “In general, we try to use open source wherever we can. When open-source tools don't exist, we build what we need.”

Relativity makes heavy use of Azure Container Registry, a fully managed, geo-replicated Docker image repository that supports automated container building and patching. Multiple container images are pulled from the registry and deployed together as a pod in a cluster managed by AKS. This step is a key to the platform’s dynamic scaling and to some dramatic savings, compared to the earlier architecture.

Relativity uses elastic node pools based on virtual machine scale sets in AKS. Nodes of the same configuration are grouped together into the node pools containing the underlying VMs. In addition, Relativity deploys custom-built Kubernetes operators that run inside each AKS cluster to monitor workloads and schedule pods to run on certain nodes. Recently, a whopping 21.1 million pods were scheduled across its clusters in a 30-day window.
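To make the idea of steering pods to certain nodes concrete, here is a minimal Python sketch of label-based node selection, the general mechanism behind a Kubernetes nodeSelector. It is an illustration only: the node names, the "pool" label, and the eligible_nodes helper are hypothetical, and the article does not describe how Relativity's custom operators actually choose nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    labels: dict = field(default_factory=dict)


def eligible_nodes(nodes, node_selector):
    """Return the nodes whose labels contain every key/value pair in node_selector."""
    return [
        n for n in nodes
        if all(n.labels.get(k) == v for k, v in node_selector.items())
    ]


# Hypothetical node pools; names and the "pool" label are illustrative only.
nodes = [
    Node("win-large-0", {"kubernetes.io/os": "windows", "pool": "large"}),
    Node("win-small-0", {"kubernetes.io/os": "windows", "pool": "small"}),
    Node("linux-sys-0", {"kubernetes.io/os": "linux", "pool": "system"}),
]

# A pod that asks for a large Windows node matches only the first entry.
selected = eligible_nodes(nodes, {"kubernetes.io/os": "windows", "pool": "large"})
print([n.name for n in selected])
```

For a sense of scale, 21.1 million pods in 30 days works out to roughly 8 pods scheduled per second on average.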

“In moving to AKS, our infrastructure teams set up the AKS clusters and make sure they all work together in an automated way, safely and effectively,” Portillo explains. The Relativity platform team can adjust the number of pods in a deployment depending on CPU utilization or other select metrics—a Kubernetes feature called horizontal pod autoscaling. The platform also incorporates advanced Kubernetes scheduler features to isolate teams and workloads for greater security in this multitenant architecture.
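Horizontal pod autoscaling can be summarized with the scaling rule given in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The Python sketch below applies that rule. The replica bounds and metric values are made-up examples, not Relativity's settings, and the 10 percent tolerance mirrors the Kubernetes default rather than anything stated in the article.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Apply the documented HPA rule: ceil(current * observed / target)."""
    ratio = current_metric / target_metric
    # Skip scaling when the observed metric is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Keep the result inside the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# Illustrative values only: 4 pods averaging 90% CPU against a 60% target.
print(desired_replicas(4, current_metric=90, target_metric=60))
```

In this example the ratio is 1.5, so the deployment would grow from 4 pods to 6. When utilization drops well below the target, the same rule shrinks the deployment.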

For added business continuity and disaster recovery, Azure Traffic Manager checks the health of Azure resources and routes traffic away from unhealthy clusters to healthy ones. “We use Traffic Manager to facilitate automatic failover for some of our services. If a cluster goes down or a service goes out in a particular cluster, we'll do an automatic failover to a working cluster,” Portillo reports. The teams use several other managed Azure services to streamline deployment and operations. For example, Azure Cosmos DB, a managed NoSQL database, supports configuration management for the platform.
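The failover behavior Portillo describes can be pictured as priority-based routing: traffic goes to the most preferred endpoint that currently passes its health check. The Python sketch below models that decision. It is a simplification under stated assumptions: the article does not say which Traffic Manager routing method Relativity uses, and the Endpoint type, cluster names, and pick_endpoint helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    priority: int   # lower number means more preferred
    healthy: bool   # result of the most recent health probe


def pick_endpoint(endpoints):
    """Return the healthy endpoint with the lowest priority number, or None."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda e: e.priority)


# Hypothetical clusters: the primary has failed its health probe.
clusters = [
    Endpoint("cluster-primary", priority=1, healthy=False),
    Endpoint("cluster-secondary", priority=2, healthy=True),
    Endpoint("cluster-tertiary", priority=3, healthy=True),
]

target = pick_endpoint(clusters)
print(target.name if target else "no healthy cluster")
```

Once the primary cluster passes its probes again, the same selection returns it and traffic shifts back without manual intervention.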

All the engineering teams now use Relativity’s new AKS platform to update services and build new capabilities. As Portillo puts it, “The internal developer experience is unified, but our external customers have their choice of products.”

Automating security with SecDevOps

Relativity’s transition to Windows containers and AKS coincided with an organizational change to SecDevOps, a security-first approach to development and operations that helps speed up iterations while improving quality and compliance. Relativity designated infrastructure teams to set up the AKS clusters and handle the container operations, leaving application developers free to focus on creating and releasing new features.

According to Portillo, “The real story here is that our engineering teams—the ones who deliver products to our end customers—didn't have to change their processes much. They didn't have to learn about Kubernetes, and they could just deploy easily and quickly.”

AKS automation also helps to strengthen the company’s tough security stance. The number of software vulnerabilities reported by the tech industry has exploded in the years since Relativity started business. “This is why it's important to automate. You need to have automated methods to correct vulnerabilities, or else you'll be buried,” Portillo states. Relativity built a proactive service for detecting critical vulnerabilities and exploits. The service runs in AKS, scanning all the containers running in a cluster. If an issue is discovered, the service attempts to patch the vulnerability on the spot, triggering the continuous deployment (CD) pipeline to deploy the fix automatically across the production clusters with the same level of testing as any other pull request. “We can now deploy vulnerability fixes to production faster than we can get a pizza delivered,” he adds.
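The detect-and-patch flow described above can be sketched as a comparison between what is running and what is known to be fixed. The Python example below is a rough illustration, not Relativity's service: the image names, package names, advisory data, and the plan_patches helper are all hypothetical, and a real scanner would draw on a vulnerability database rather than a hard-coded dictionary.

```python
def parse_version(version):
    """Turn a dotted version string such as '2.3.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def plan_patches(running_images, advisories):
    """Return (image, package, fixed_version) for each vulnerable package found."""
    plan = []
    for image, packages in running_images.items():
        for package, installed in packages.items():
            fixed = advisories.get(package)
            if fixed and parse_version(installed) < parse_version(fixed):
                plan.append((image, package, fixed))
    return plan


# Hypothetical inventory of containers running in one cluster.
running_images = {
    "collect:1.4.0": {"libfoo": "2.3.1", "libbar": "1.0.0"},
    "imaging:2.0.1": {"libfoo": "2.4.0"},
}

# Hypothetical advisory feed: package name -> first fixed version.
advisories = {"libfoo": "2.4.0"}

for image, package, fixed in plan_patches(running_images, advisories):
    # In the flow the article describes, each entry would become a pull
    # request that passes through the normal CD pipeline and its tests.
    print(f"{image}: upgrade {package} to {fixed}")
```

Routing each fix through the same pipeline as any other pull request is what lets the patch reach production with the usual level of testing.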

Relativity also checks its system integrity regularly using chaos testing, something it couldn’t do with the old architecture. Back then, rolling out new SSL certificates across VMs could take multiple engineering teams several months of work. “We can now roll SSL certificates and other secrets globally, with zero downtime, automatically. It’s just a pull request,” Portillo explains. “We can roll out secrets and update all of our running systems at the push of a button.”

“We can now roll SSL certificates and other secrets globally, with zero downtime, automatically… Our security posture has drastically increased.”

Julian Portillo, Principal Software Engineer, Relativity

New architecture, new possibilities

Many organizations adopt IaaS as a first step in their cloud journey. Relativity took the next step for its SaaS product line, modernizing its applications, optimizing operations, and significantly reducing costs using Windows containers hosted on AKS.

“Without Windows containers, I don't think we could have moved Relativity to Kubernetes,” Portillo suggests. The sheer size of Relativity’s undertaking caught the attention of the Microsoft product team, and Relativity offered valuable input that resulted in several key AKS enhancements. According to Portillo, “Working with Microsoft was a smooth, clear process that worked well for us and allowed us to accomplish our goals. Without our work together, we wouldn’t have been able to switch to Windows containers.”

Meanwhile, Relativity engineers are making the most of the new platform. Currently, they’re moving all computing resources from the old VM-based clusters onto the faster, more scalable AKS clusters. Portillo concludes, “Without Windows containers, this project simply could not have happened.”

“We would not have been able to move our large monolithic enterprise into Kubernetes without Windows containers and Azure Kubernetes Service.”

Julian Portillo, Principal Software Engineer, Relativity
