Klout wanted to give consumers, brands, and partners faster, more detailed insight into hundreds of terabytes of social-network data. It also wanted to boost efficiency. To do so, Klout deployed a business intelligence solution based on Microsoft
SQL Server 2012 Enterprise and Apache Hadoop. As a result, Klout processes data queries in near real time, minimizes costs, boosts efficiency, increases insight, and facilitates innovation.
Klout helps clients make sense of the hundreds of terabytes of data generated each day by more than 1 billion signals on 15 leading social networks, including Facebook and LinkedIn. “We take in raw data and make it into something that is actionable for our
consumers, brands, and partners,” says David Mariani, Vice President of Engineering at Klout.
The data that Klout analyzes is generated by the more than 100 million people whom the firm indexes. This includes Klout members and the people they interact with on social sites. Individuals join Klout to understand their influence on the web, which
is rated on a scale from 1 to 100. They also sign up to participate in campaigns in which they can receive gifts and free services. More than 3,500 data partners also join Klout to better understand consumers and network trends, including changes in demand and
how people’s influence might affect word-of-mouth advertising.
To deliver the level of insight that customers seek and yet meet the budget constraints of a startup firm, Klout maintained a custom infrastructure based on the open-source Apache Hadoop framework, which provides distributed processing of large data sets.
The solution included a separate silo for the data from each social network. To manage queries, Klout used custom web services, each with distinct business logic, to extract data from the silos and deliver it as a data mashup.
Maintaining Hadoop and the custom web services to support business intelligence (BI) was complex and time-consuming for the team. The solution also limited insight into the data. For example, accessing detailed information from Hadoop required extra development,
so mashups often lacked the level of detail that users sought. In addition, people often waited minutes, or sometimes hours, for queries to process, and they could obtain information only through predetermined templates.
Klout wanted to update its infrastructure to improve efficiency and support custom BI. Engineers sought technologies that could deliver mission-critical availability and still scale to meet big-data growth and performance requirements.
In 2011, Klout decided to implement a BI solution based on Microsoft SQL Server 2012 Enterprise data management software and the open-source Hive data warehouse system. “When it comes to BI and analytics, open-source tool sets are just ineffective and there’s
really not a good choice,” Mariani says. “Instead, Klout chose the best of both worlds by marrying the Microsoft BI platform with Hadoop and Hive.” Based on employees’ previous experience with the Microsoft BI platform, Klout also knew that SQL Server offers
excellent compatibility with third-party software and can deliver the scale and query performance needed to manage big-data sets.
In August 2011, engineers implemented a data warehouse with Hive, which consolidates data from all of the network silos hosted by Hadoop. In addition, Klout deployed SQL Server 2012 on a system that runs the Windows Server 2008 R2 Enterprise operating system
to take advantage of Microsoft SQL Server 2012 Analysis Services. Engineers use Analysis Services to manage all of the business logic required for multidimensional online analytical processing (MOLAP). Data is stored in multidimensional cubes, which helps preserve detail
and speed analysis. To provide high availability, Klout replicates the database to a secondary system using SQL Server 2012 AlwaysOn.
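The idea behind storing data in multidimensional cubes can be illustrated with a minimal Python sketch. It is not Klout's implementation or the Analysis Services API: the fact rows, the dimension names, and the helpers build_cube and query are all hypothetical. The sketch only shows why pre-aggregating a measure for every combination of dimensions lets a summary question be answered with a single lookup while the detailed rows remain addressable.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (network, day, signal_type, signal_count).
# These values are made up for illustration; they are not Klout data.
FACTS = [
    ("facebook", "2011-08-01", "like", 120),
    ("facebook", "2011-08-01", "comment", 30),
    ("linkedin", "2011-08-01", "comment", 12),
    ("facebook", "2011-08-02", "like", 90),
]


def build_cube(facts):
    """Pre-aggregate the measure for every combination of dimensions."""
    cube = defaultdict(int)
    for *dims, measure in facts:
        positions = range(len(dims))
        for size in range(len(dims) + 1):
            for kept in combinations(positions, size):
                # "*" marks a dimension that has been rolled up (all values).
                key = tuple(dims[i] if i in kept else "*" for i in positions)
                cube[key] += measure
    return dict(cube)


def query(cube, network="*", day="*", signal_type="*"):
    """Answer an aggregate question with one lookup instead of a scan."""
    return cube.get((network, day, signal_type), 0)


cube = build_cube(FACTS)
print(query(cube))                                      # grand total
print(query(cube, network="facebook"))                  # one network
print(query(cube, network="facebook", day="2011-08-01"))
```

A real MOLAP engine stores and compresses these aggregations far more selectively; the sketch materializes every roll-up only to keep the example short.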
At the time that Klout was initially deploying its solution, SQL Server 2012 and Hive could not communicate directly. To work around this issue, engineers set up a temporary relational database that runs MySQL 5.5 software. It holds data from the previous
30 days and serves as a staging area for data exchange and analysis. Klout engineers are currently working to implement the new Open Database Connectivity (ODBC) driver in SQL Server 2012 to connect Hive directly with SQL Server 2012 Analysis Services. In addition,
to enhance insight, Klout plans to work with Microsoft to incorporate other Microsoft BI tools into its solution, such as Microsoft SQL Server PowerPivot for Microsoft Excel.
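A staging area that keeps only the previous 30 days of data can be sketched as a load step followed by a purge step. The example below is an assumption-laden illustration, not Klout's code: it uses Python's built-in sqlite3 module as a stand-in for the MySQL 5.5 database, and the staging_signals table, its columns, and the refresh_staging helper are hypothetical names.

```python
import sqlite3
from datetime import date, timedelta

RETENTION_DAYS = 30  # window size taken from the case study


def refresh_staging(conn, new_rows, today):
    """Load new rows, then drop anything older than the retention window."""
    cutoff = (today - timedelta(days=RETENTION_DAYS)).isoformat()
    with conn:  # commit both statements together, or roll both back
        conn.executemany(
            "INSERT INTO staging_signals (day, network, signal_count) "
            "VALUES (?, ?, ?)",
            new_rows,
        )
        # ISO dates (YYYY-MM-DD) sort correctly as plain text.
        conn.execute("DELETE FROM staging_signals WHERE day < ?", (cutoff,))


# In-memory SQLite stands in for the MySQL staging database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staging_signals (day TEXT, network TEXT, signal_count INTEGER)"
)
refresh_staging(
    conn,
    [("2011-07-01", "facebook", 10), ("2011-08-20", "linkedin", 5)],
    today=date(2011, 8, 31),
)
remaining = [
    row[0] for row in conn.execute("SELECT day FROM staging_signals ORDER BY day")
]
print(remaining)
```

Running the purge in the same transaction as the load keeps the staging table from briefly holding more than the intended window, which matters if the downstream cube is processed on a schedule.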
With its new solution, Klout expects to boost efficiency, reduce expenses, expand insight, and support innovation.
Speeds Efficiency and Cuts Costs
Because the solution takes advantage of the Microsoft platform for BI, users will be able to get the data they seek in near real time. Mariani says, “By using SQL Server 2012 Analysis Services to create cubes, every day our MOLAP model can load 350 million rows of new Hive
data, analyze 35 billion rows of information, and achieve an average query response time of less than 10 seconds.” He further explains how Klout can minimize costs and complexity with its new solution. “By using open-source and commercial software, we don’t
have to build everything from the ground up and we get a great ecosystem of tools and support.” For example, Klout will spend less time managing business logic and data connections. Mariani says, “By creating a single warehouse on Hive, and putting all of
our business logic in SQL Server 2012 Analysis Services, we can easily expose all of the data that was previously tucked away in Hadoop.”
Increases Insight and Advantage
Once the solution is fully implemented, customers and employees will be able to use it to see the big picture and the details behind it, so that they can gain competitive advantage and better manage their social presence. Users will also have more control over data analysis. “SQL
Server 2012 and Microsoft BI tools work well with big data,” says Mariani. “We can use SQL Server 2012 Analysis Services and Microsoft BI tools to perform ad-hoc queries of big data on Hadoop at sub-second response times.”
Klout is implementing the flexible and scalable infrastructure it needs to continue to push the limits of data analysis. “When it comes to business intelligence, Microsoft SQL Server 2012 demonstrates that the platform has continued to advance and keep up
with the innovations that are happening in big data,” says Mariani. “We’re very excited about working with Microsoft to develop solutions based on technologies like SQL Server PowerPivot for Excel and Hadoop so that we can continue to deliver unique services
that transform what is possible with big data.”
This case study is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.