In the battle of the buzzwords, “big data” is about to render “guesstimation” obsolete.
This is big.
“Big data absolutely has the potential to change the way governments, organizations, and academic institutions conduct business and make discoveries, and it’s likely to change how everyone lives their day-to-day lives,” said Susan Hauser, corporate vice president of Microsoft’s Enterprise and Partner Group.
The world now holds twice as many bytes of data as there are liters of water in all its oceans, Hauser said. By learning to surf this wave of big data, it is possible to replace hunches with insight; to spot trends before they pass quickly by; and to take action while others are still deliberating.
Big data is the term increasingly used to describe the process of applying serious computing power – the latest in machine learning and artificial intelligence – to seriously massive and often highly complex sets of information.
What kind of information? You name it. Big data can be comparing utility costs with meteorological data to spot trends and inefficiencies. Big data can be comparing ambulance GPS information with hospital records on patient outcomes to determine the correlation between response time and survival. But big data can also be the tiny device you wear to track your movement, calories and sleep so you can monitor your own personal health and fitness.
“Our daily lives generate an enormous collection of data,” said Dan Vesset, program vice president of IDC's Business Analytics research.
Whether you’re surfing the Web, shopping at the store, driving your smart car around town, boarding an airplane, visiting a doctor or attending class at university, each day you are generating a variety of data, he said.
“The benefit of the data depends on where and to whom you’re talking,” Vesset said. “A lot of the ultimate potential is in the ability to discover potential connections, and to predict potential outcomes in a way that wasn’t really possible before. Before, you only looked at these things in hindsight.”
With more data than ever available in digital form, progressively inexpensive data storage, and more advanced computers at the ready to help process and analyze it all, the field of big data has truly reached a watershed moment, Vesset said.
It’s a moment for which Microsoft is prepared, and has been preparing – virtually since the company’s origin, Hauser said.
“Microsoft believes that big data has the power to drive practical insights that just weren’t possible before,” Hauser said. “It’s about managing all that data and providing tools that enable everyone to answer questions – questions they might not have even known they had. That’s the vision we have.”
A ‘Tipping Point’
The big data explosion – including its cross-over from the high-tech industry to a variety of more widespread, mainstream uses – can be traced to several factors, said Dave Campbell, a technical fellow at Microsoft.
First, there’s the growing ocean of data. Pre-computers, a database was little more than the tall, gray filing cabinet in the corner. But now, more and more information is being digitized – or just “born digital” in the first place.
Then, advancements in machine intelligence have made for increasingly clever algorithms that can be used to process, compare and visualize this rising tide of structured and unstructured data.
And housing those vast stores of data is now more affordable than ever – three decades ago a terabyte of data storage could cost millions, Campbell said. Today, it’s about US$30 at Office Depot.
“It’s a tipping point. There’s no reason to throw anything away any more,” Campbell said. “We are at an amazing inflection in which so much is already born digital today, even inherently analog data such as voicemail and photographs.”
Another reason big data has turned a corner is that – well, there just is more data. Sensors, GPS devices, mobile phones, social media, smart cars, roads, bridges, buildings – all produce a steady stream of data just waiting to be examined and cross-examined.
“In the next five years, we’ll generate more data as humankind than we generated in the previous 5,000 years,” said Eron Kelly, general manager of product marketing for Microsoft SQL Server.
“It’s an inevitable reality of our new world that more and more data is being generated,” Kelly said. “Those able to derive insights from that data will make better decisions and will be more efficient, and they’ll move whatever agenda it is that they have forward much faster than those that don’t.”
Data, Data Everywhere
There may be oceans of data out there, but making it into something you can use is another matter entirely.
“Big data is a big problem, and it’s an incredible opportunity,” Kelly said. “What we’re providing is the tool that allows you to scoop the water out of the ocean, pour it into a filter, and make it drinkable rather than having to do on your own each of those potable steps you vaguely remember from high school chemistry.”
One challenge of big data can simply be managing its sheer size. Storing, searching, analyzing, comparing, refining, combining, visualizing – massive sets of data can be a challenge to traditional database software. That’s where database and business intelligence tools such as Microsoft SQL Server, Windows Server, PowerPivot, Microsoft Office and SharePoint come in handy, Hauser said.
“Organizations that are partnering with Microsoft are seeing results pretty quickly,” she said. “The impact – that’s what’s most exciting.”
And what’s more, you don’t have to be an information technology (IT) specialist or a data scientist with a Ph.D. in analytics to get results, she said.
Another challenge in making big data useful is getting your hands on the right big data. Microsoft is working with Hadoop, an open-source data platform that helps manage unstructured data, to help customers work with all types of data, both structured and unstructured.
Structured data, most commonly found in databases that use Structured Query Language (SQL), is organized in a way that lets users select exact pieces, rows or columns of that database – perhaps you’ll select all of the rows with a certain zip code, or the columns with a specific date. Unstructured data, however, has no such architecture and often consists of free-form content, such as the text and images in emails.
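To make the idea of structured data concrete, here is a minimal sketch using Python’s built-in sqlite3 module. It is only an illustration, not a depiction of any Microsoft product: the “customers” table, its column names and the three sample rows are all hypothetical, invented for this example. It shows the two kinds of selection described above – pulling every row with a certain zip code, and pulling a single column for rows with a specific date.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'customers' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (name TEXT, zip_code TEXT, signup_date TEXT)"
    )
    # Made-up sample rows, used purely for illustration.
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            ("Ada", "98052", "2013-01-15"),
            ("Ben", "10001", "2013-02-01"),
            ("Cleo", "98052", "2013-02-01"),
        ],
    )
    return conn


def rows_in_zip(conn, zip_code):
    """Return every full row whose zip_code matches, sorted by name."""
    cur = conn.execute(
        "SELECT name, zip_code, signup_date FROM customers "
        "WHERE zip_code = ? ORDER BY name",
        (zip_code,),
    )
    return cur.fetchall()


def names_signed_up_on(conn, date):
    """Return only the name column for rows with the given signup date."""
    cur = conn.execute(
        "SELECT name FROM customers WHERE signup_date = ? ORDER BY name",
        (date,),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = build_demo_db()
    print(rows_in_zip(conn, "98052"))
    print(names_signed_up_on(conn, "2013-02-01"))
```

Because every record has the same named fields, a query can ask for exactly the slice it needs. Free-form content such as the body of an email offers no equivalent of a “zip_code” column to filter on, which is why unstructured data calls for different tools.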
Microsoft is also working to integrate Hadoop with SQL Server and Windows Azure to ensure customers can combine all their data sources.
“What we’re trying to do is allow a broad set of skills, driving simplicity and ease of use into the area of big data,” Kelly said. “Taking very complex technical problems and simplifying them with easy-to-use tools – that’s been the Microsoft strategy over the last 30 years.”
Vision for the Future
A hospital uses rapid gene sequencing to stop an outbreak of antibiotic-resistant bacteria, saving lives. A railroad company gets an alert from a train’s sensor that a preventative fix is needed, saving the cost and time of removing the train from the tracks later. A university notices a student’s activity level has started to drop to a level consistent with dropouts, and reaches out to assist.
The data may be big, but in its essence, big data is quite personal.
“Big data is really a bit of a misnomer,” Campbell said. “It really doesn’t have anything to do with size.”
Rather, it’s about the insight it provides. Big data may hold the key to smarter cities, faster medical breakthroughs, greater academic learning, more efficient use of resources, and more profitable companies. Not to mention jobs – big jobs.
“Big data is important, yet the real gap is going to be in skills and ability,” Kelly said.
In the next few years millions of big data-related IT jobs will be created worldwide and yet, according to the McKinsey Global Institute, there is a major shortage of the “analytical and managerial talent necessary to make the most of big data.” The United States alone faces a shortage of more than 140,000 workers with big data skills as well as up to 1.5 million managers and analysts needed to analyze and make decisions based on big data findings.
Kelly said that in the years to come, businesses that successfully harness the power of big data will outperform and outcompete their rivals.
According to the MIT Center for Digital Business, companies that adopt data-driven practices, and use big data to guide decision making, will have output and productivity that are 5 to 6 percent higher than what would be expected given their other investments and information technology uses.
“That’s not only about making more money in the near term, that’s survival. In an increasingly competitive global marketplace, you have to do everything you can to stay ahead of the competition,” Kelly said. “If you don’t harness the data and information around you to make better decisions and become more efficient, you fall behind. That’s true of companies, governments, healthcare, and pretty much any industry. That’s why it’s so critical.”