August 13, 2021

LinkedIn simplifies log management with Azure Data Explorer

Millions of people around the world turn to LinkedIn to build their professional network, explore new career opportunities, and develop their reputation within their industries. Serving such a large customer base, however, was proving to be a challenge for the company. To overcome operational issues, centralize logs, and scale more easily, LinkedIn made the switch to Azure Data Explorer. Today, the professional networking site is reaping the benefits: faster onboarding of new services, a lighter operational burden, and lower costs.

LinkedIn

With more than 740 million users from 150 countries, LinkedIn helps people build their professional networks, stay in touch, and find new career opportunities using effective search engine technology and data algorithms.

“Our journey with Azure Data Explorer began in 2018, when we started thinking about ways to improve the infrastructure at LinkedIn,” explains Nick Brown, Senior Site Reliability Engineer. “Not only did we need to become GDPR compliant, we also had to overcome certain operational difficulties, to centralize logs, bring down costs, and have a more scalable solution that was easier to deploy.”

Enhancing analytics capabilities

“Our primary target was to improve efficiency,” stresses Brown. “Our prior log aggregation solution spun up a considerable number of superfluous clusters. We needed more efficient clustering of data logs to develop stronger algorithms for open text search and analytics capabilities. We enabled centralized log aggregation to make all the logs at LinkedIn accessible to anyone at the company, simplifying log analysis and correlation tasks.”

However, this kind of scaling up usually requires more resources and, inevitably, incurs higher costs, a second issue the company was seeking to resolve. 

“With Elasticsearch, our previous solution, infrastructure costs were pretty high,” shares Benson Wu, Site Reliability Engineering Leader at LinkedIn. “We were running multiple individual clusters. The services required the efforts of many, but didn't scale well, and would go down all the time.”

“In our search for alternatives, we asked ourselves how many people we would need in order to be able to run the solution centrally,” reveals Brown. “We wanted a consolidated, centrally managed system. We knew that it would all have to be managed by a small team. But we also needed to know how far we could scale the solution before we’d require complex automations to enable further scaling.” 

“With ADX, we can now onboard a new service and query it within a half hour. With Elasticsearch, it used to take us roughly a couple of weeks to get a new service up. We’re doing it at about a quarter of the cost. That, for us, is a pretty big win.”

Benson Wu, Site Reliability Engineering Leader, LinkedIn

Scaling up while sizing down

After assessing a working proof of concept, the LinkedIn team chose Azure Data Explorer (ADX). “Compared to our old solution, we’ve nearly doubled our market penetration for certain types of services,” shares Wu. “And we’re doing it at about a quarter of the cost, year to year, which is amazing.”

The centralized log system also made troubleshooting easier. “One of the biggest things lacking in our previous solution was the ability to trace an issue throughout all our services,” continues Wu. “Before, if you were debugging, you might come across an issue, but the solution would also point you downstream or to a different team’s service and thus a separate cluster. Today, you can search the entire ecosystem in one place. This has dramatically increased productivity.”

The LinkedIn team has also benefitted from fewer system failures. “We used to have clusters constantly going into disarray and resolving the problem would sap time and effort. ADX has encountered far, far fewer issues, which has of course been very beneficial,” notes Wu. “We’re more confident and can do everything we need to in smaller teams. We used to have a whole virtual team that packaged certain types of services. Now, we’re running everything with just a three-person team.”

Accelerating service onboarding and user adoption

Onboarding new services has also become much faster. “With ADX, we can now onboard a new service and query it within a half hour. With Elasticsearch, it used to take us roughly a couple of weeks to get a new service up. We’re doing it at about a quarter of the cost. That, for us, is a pretty big win,” enthuses Wu.

Overall utilization of ADX has been impressively high. “We had 330 active users in the past week, running over 78,000 queries. And we have about 75 percent of our production services onboarded,” says Brown. Users’ experience of ADX’s query language, KQL, has been very positive. “The feedback we get is great,” shares Brown. “Even among people who were not especially eager to learn new technologies, every single one of them has said how great KQL is after trying it. It’s a language that is user friendly for both operations people, who are used to working with command lines, as well as data systems engineers, who are used to working with SQL.”

Looking to the future, LinkedIn is excited about its continued collaboration with Microsoft. “We keep tabs on new ADX features being released,” says Brown. “We’re also in touch with Microsoft regarding new use cases that can further benefit our customer base.”

At a broader level, LinkedIn is looking forward to a future that is fully in the cloud. “At LinkedIn, we're in the process of taking all of our on-premises data centers and moving them up into the cloud. Soon, all of LinkedIn will be running on Azure,” concludes Brown.

“The feedback we get is great. Even among people who were not especially eager to learn new technologies, every single one of them has said how great it is after having tried it.”

Nick Brown, Senior Site Reliability Engineer, LinkedIn
