The link to the source code is here.
The What Part
Deep Learning is one of today’s hottest buzzwords. The recent results and applications are incredibly promising, spanning areas such as speech recognition, language understanding and computer vision. Indeed, Deep Learning is now changing the very customer experience around many of Microsoft’s products, including HoloLens, Skype, Cortana, Office 365, Bing and more. Deep Learning is also a core part of Microsoft’s development platform offerings with an extensive toolset that includes: the Microsoft Cognitive Toolkit, the Cortana Intelligence Suite, Microsoft Cognitive Services APIs, Azure Machine Learning, the Bot Framework, and the Azure Bot Service. Our Deep Learning based language translation in Skype was recently named one of the 7 greatest software innovations of the year by Popular Science, and this technology has now helped machines achieve human-level parity in conversational speech recognition. To learn more about our Deep Learning journey, I encourage you to read the recent blog post From “A PC on every desktop” to “Deep Learning in every software”.
The applications of Deep Learning technology are so far reaching that the new mantra of Deep Learning in Every Software may well become a reality within this decade. The venerable SQL Server DBMS is no exception. Can SQL Server do Deep Learning? The answer is an enthusiastic “yes!” With the public preview of the next release of SQL Server, we’ve made significant improvements to R Services inside SQL Server, including a very powerful set of machine learning functions that are used by our own product teams across Microsoft. This brings new machine learning and deep neural network functionality with increased speed, performance and scale to database applications built on SQL Server. We have just recently showcased SQL Server running more than one million R predictions per second, using SQL Server as a Machine Learning Model Management System. I encourage you all to try out the R examples and machine learning templates for SQL Server on GitHub.
In this blog, I want to address the finer points of the matter – the what, the why and the how of Deep Learning in SQL Server. With this new clarity, it will be easier to see the road forward for data-driven machine intelligence using a data platform as powerful as SQL Server.
The Why Part
Today, every company is a data company, and every app is a data app.
When you put intelligence (AI, ML, AA, etc.) close to where the data lives, then every app becomes an intelligent app. SQL Server can help developers and customers everywhere realize the holy grail of deep learning in their applications with just a few lines of code. It enables data developers to deploy mission critical operational systems that embed deep learning models. So here are the 10 whys for deep learning in SQL Server.
The 10 Whys of Deep Learning inside SQL Server
- By pushing intelligence close to where your data lives (i.e., SQL Server), you get security, compliance, privacy, encryption, master data services, availability groups, advanced BI, in-memory, virtualization, geo-spatial, temporal, graph capabilities and other world-class features.
- You can do both near “real-time intelligence” and “batch intelligence” (similar in spirit to OLTP and OLAP, but applied to Deep Learning and intelligence).
- Your apps built on top of SQL Server don’t need to change to take advantage of Deep Learning, and a multitude of apps (web, mobile, IoT) can share the same deep learning models without duplicating code.
- You can exploit the functionality that comes in machine learning libraries (e.g., MicrosoftML), which will drive the productivity of your data scientists, developers and DBAs, and of the business overall. This might be faster and far more efficient than building it in-house.
- You can develop predictable solutions that can evolve/scale up as you need. With the latest service pack of SQL Server, many features that were only available in the Enterprise Edition are now available in the Standard/Express/Web Editions of SQL Server. That means you can do Deep Learning using a standard SQL Server without high costs.
- You can use heterogeneous external data sources (via Polybase) for training and inference of deep models.
- You can create versatile data simulations and what-if scenarios inside SQL Server and then train a variety of rich Deep Learning models in those simulated worlds to enable intelligence even with limited training data.
- You can operationalize Deep Learning models in a very easy and fast way using stored procedures and triggers.
- You get all the tools, monitoring, debugging and ecosystem around SQL Server applicable to intelligence. SQL Server can literally become your Machine Learning Management System and handle the entire life cycle of DNN models along with data.
- You can generate new knowledge and insights from the data you are already storing, without any impact on your transactional workload (via the HTAP pattern).
Let’s be honest, nobody buys a DBMS for the sake of a DBMS. People buy it for what it enables them to do. By putting deep learning capabilities inside SQL Server, we can scale artificial intelligence and machine learning both in the traditional sense (scale of data, throughput, latency) and in terms of productivity (low barrier to adoption and a gentler learning curve). The value that it brings comes in many shapes and forms – time, better experience, productivity, lower $ cost, higher $ revenue, opportunity, higher business aspirations, thought-leadership in an industry, etc.
Real-life applications of Deep Learning running inside SQL Server span banking, healthcare, finance, manufacturing, retail, e-commerce and IoT. With applications like fraud detection, disease prediction, power consumption prediction, personal analytics, you have the ability to transform existing industries and apps. That also means whatever workloads you are running using SQL Server, be it CRM, ERP, DW, OLTP, BD… you can add Deep Learning to them almost seamlessly. Furthermore, it’s not just about doing deep learning standalone, but it’s rather about combining it with all kinds of data and analytics that SQL Server is so great at (e.g., processing structured data, JSON data, geo-spatial data, graph data, external data, temporal data). All that is really left to be added to this mix is… your creativity.
The How Part
Here is a great scenario to show all of this in reality. I am going to use an example of predicting galaxy classes from image data – using the power of Microsoft R and its new MicrosoftML package for machine learning (which has been built by our Algorithms and Data Science team). And I am going to do all this in SQL Server with R Services on a readily available Azure NC VM. I am going to classify the images of galaxies and other celestial objects into 13 different classes based on the taxonomy created by astronomers – mainly ellipticals and spirals, and then various sub-categories within them. The shape and other visual features of galaxies change as they evolve. Studying the shapes of galaxies and classifying them appropriately helps scientists learn how the universe is evolving. It is very easy for us humans to look at these images and put them in the right buckets based on the visual features. But in order to scale it to the estimated 2 trillion galaxies in the observable universe, I need help from machine learning and techniques like deep neural networks – so that is exactly what I am going to use. It’s not a big leap to imagine that instead of astronomy data, we have healthcare data or financial data or IoT data and we are trying to make predictions on that data.
Imagine a simple web app that loads images from a folder and then classifies them into different categories – spiral or elliptical, and then sub-types within those categories (e.g., is it a regular spiral or does it have a handlebar structure in the center).
The classification can be done incredibly fast on vast amounts of images. Here is an example output:
So how does this simple app do this amazingly complex computation?
The code for such an app actually isn’t doing much – it just writes the paths to the new files to classify into a database table (the rest of the app code is plumbing and page layout, etc).
SqlCommand Cmd = new SqlCommand("INSERT INTO [dbo].[GalaxiesToScore] ([path] ,[PredictedLabel]) "
What is happening in the database?
Prediction and operationalization part:
Let’s look at the table where the app writes the image paths. It contains a column with paths to the galaxy images, and a column to store the predicted classes of galaxies. As soon as a new row of data gets entered into this table, a trigger gets executed.
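The insert-then-trigger pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python with the standard-library sqlite3 module as a stand-in for SQL Server, the `classify_stub` function is a hypothetical placeholder for the real model call, and the image paths are made up. Only the table and column names come from the text.

```python
import sqlite3


def classify_stub(path):
    """Hypothetical placeholder for the real model call; returns a fake label."""
    return "spiral" if "spiral" in path.lower() else "elliptical"


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL so that the trigger can call it.
conn.create_function("classify_stub", 1, classify_stub)

conn.executescript("""
CREATE TABLE GalaxiesToScore (
    path           TEXT NOT NULL,
    PredictedLabel TEXT            -- stays NULL until the row has been scored
);

-- Fires once per inserted row and fills in the prediction for that row.
CREATE TRIGGER ScoreNewGalaxy AFTER INSERT ON GalaxiesToScore
BEGIN
    UPDATE GalaxiesToScore
       SET PredictedLabel = classify_stub(NEW.path)
     WHERE rowid = NEW.rowid;
END;
""")

# The app only inserts paths (made-up values here); it never calls the model.
for image_path in ("images/spiral_001.jpg", "images/object_002.jpg"):
    conn.execute("INSERT INTO GalaxiesToScore (path) VALUES (?)", (image_path,))

rows = conn.execute(
    "SELECT path, PredictedLabel FROM GalaxiesToScore ORDER BY rowid"
).fetchall()
for path, label in rows:
    print(path, "->", label)
```

The point of the design is that the app stays ignorant of the model: any client that can run an INSERT gets predictions, and the scoring logic can change without touching app code.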
The trigger in turn invokes a stored procedure – PredictGalaxiesNN as shown below (with R script portion embedded inside the stored proc):
This is where the magic happens – in these few lines of R code. This R script takes two inputs – the new rows of data (that have not been scored yet) and the model that is stored in a table as varbinary(max). I will talk about how the model got there in a minute. Inside the script, the model gets de-serialized and is used by the familiar scoring function (rxPredict) in this line:
scores <- rxPredict(modelObject = model_un, data = InputDataSet, extraVarsToWrite="path")
to score the new rows and then write the scored output out. This is a new variant of rxPredict that understands the ML algorithms included in the new MicrosoftML package. This line
library("MicrosoftML")
loads the new package that contains the new ML algorithms. In addition to neural nets (DNNs being the focus of this blog), there are five other powerful ML algorithms in this package – fast linear learner, fast tree, fast forest, one-class SVM for anomaly detection, and regularized logistic regression (with L1 and L2). So, with just 6-7 lines of R code, you can enable any app to get the intelligence from a DNN based model. All that the app needs to do is connect to SQL Server. By the way, you can now very easily generate a stored procedure for R code using the sqlrutils package.
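The store-as-bytes, de-serialize, then score flow described above can be illustrated with a small stand-in. The sketch below is not the R script from the stored procedure: it assumes Python, sqlite3 in place of SQL Server, `pickle` in place of R serialization, and a toy nearest-centroid "model" with made-up numbers instead of a DNN. The table name `Models`, the model name `galaxy_toy`, and the `predict` helper are all hypothetical.

```python
import pickle
import sqlite3

# Toy "model": one brightness centroid per class (made-up values, not a DNN).
model = {"elliptical": 0.30, "spiral": 0.70}


def predict(model_obj, rows):
    """Label each (path, brightness) row with the class of the nearest centroid."""
    scored = []
    for path, brightness in rows:
        label = min(model_obj, key=lambda cls: abs(model_obj[cls] - brightness))
        scored.append((path, label))
    return scored


conn = sqlite3.connect(":memory:")
# A BLOB column plays the role that varbinary(max) plays in the text.
conn.execute("CREATE TABLE Models (name TEXT PRIMARY KEY, model BLOB)")

# Training side: serialize the model and store the bytes in the table.
conn.execute("INSERT INTO Models VALUES (?, ?)", ("galaxy_toy", pickle.dumps(model)))

# Scoring side: read the bytes back, de-serialize, and score the unscored rows.
(blob,) = conn.execute(
    "SELECT model FROM Models WHERE name = ?", ("galaxy_toy",)
).fetchone()
model_un = pickle.loads(blob)

new_rows = [("images/a.jpg", 0.82), ("images/b.jpg", 0.25)]
scores = predict(model_un, new_rows)
for path, label in scores:
    print(path, "->", label)
```

Keeping the model in a table means it is versioned, backed up and secured together with the data it scores.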
What about training the model?
Where was the model trained? Well, the model was trained in SQL Server as well. However, it does not have to be trained on SQL Server – it could have been trained on a separate machine with a standalone R Server running on it, on-prem or in the cloud. Today we have these new ML algorithms on the Windows version of R Server, and support for other platforms is coming soon. I just chose to do the training on the SQL Server box here, but I could have done it outside as well. Let’s look at the stored proc with the training code.
The model training is done in these lines of code.
The training uses a new function, rxNeuralNet from the MicrosoftML package, for training a DNN. The code looks similar to other R and rx functions – there is a formula, an input dataset, and some other parameters. One of the parameters here is this line “netDefinition = netDefinition”. This is where the neural network is being defined.
Here is the DNN definition in this portion of the code:
Here, a deep neural net is defined using the Net# specification language that was created for this purpose. It has 1 input, 1 output and 8 hidden layers. It starts with an input layer of 50×50 pixels and 3 colors (RGB) of image data. The first hidden layer is a convolution layer where we specify the kernel (small sub-part of the image) size and how many times we want the kernel to map to other kernels (convolute). There are some other layers for more convolutions, and for normalization and pooling that help stabilize the network. Finally, the output layer maps the result to one of the 13 classes. In about 50 lines of Net# specification, I have defined a complex neural network. Net# is documented on MSDN.
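To get a feel for how kernel size and stride shape each layer, the standard output-size formula for a convolution or pooling layer, floor((input + 2·padding − kernel) / stride) + 1, can be worked through in a few lines. This is a sketch in Python, not Net#; only the 50×50 input comes from the text, while the 5×5 kernel, the stride values and the 32 feature maps are hypothetical values chosen for illustration.

```python
def conv_output_size(input_size, kernel, stride=1, padding=0):
    """Output length along one dimension of a convolution or pooling layer."""
    return (input_size + 2 * padding - kernel) // stride + 1


def conv_layer_shape(height, width, kernel, stride, padding, feature_maps):
    """Return (feature_maps, out_height, out_width) for a square kernel."""
    return (
        feature_maps,
        conv_output_size(height, kernel, stride, padding),
        conv_output_size(width, kernel, stride, padding),
    )


# 50x50 input as in the text; kernel=5, stride=1 and 32 maps are hypothetical.
conv_shape = conv_layer_shape(50, 50, kernel=5, stride=1, padding=0, feature_maps=32)
print(conv_shape)  # (32, 46, 46)

# A hypothetical 2x2 pooling layer with stride 2 then halves each dimension.
pooled = conv_output_size(conv_shape[1], kernel=2, stride=2)
print(pooled)  # 23
```

Running the numbers like this before training is a cheap way to check that successive layers in a network definition actually fit together.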
Training data size/GPU:
Here is the R code to do the training.
Another line to note here is ‘training_rows = 238000’. This model was trained on 238K images that we got from the Sloan Digital Sky Survey dataset. We then created two variants of each image with 45° and 90° rotations. In all there were about 700K images to train on. That’s a lot of image data to train on – so, how long did it take to train it? Well, we were able to train this model in under 4 hours. The machine is decent sized – 6 cores and 56 GB of RAM – and it also has a powerful Nvidia Tesla K80 GPU. In fact, it is an Azure VM – the new NC series GPU VM, readily available to anyone with an Azure subscription. We were able to leverage the GPU computation by specifying one simple parameter: acceleration = “gpu”. Without the GPU, the training takes roughly 10X more time.
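The rotation-based augmentation mentioned above can be sketched as follows. This is an assumption-laden illustration, not the code used for the galaxy model: it is written in Python, treats an image as a plain 2-D grid of pixel values, and uses simple nearest-neighbor sampling. The `rotate` and `augment` helpers are hypothetical names.

```python
import math


def rotate(image, degrees, fill=0):
    """Rotate a 2-D grid about its center using nearest-neighbor sampling.

    Positive angles turn the image clockwise when row 0 is drawn at the top.
    Pixels whose source falls outside the image are set to `fill`.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = y - cy, x - cx
            # Inverse mapping: find the source pixel that lands on (y, x).
            sx = round(cos_t * dx + sin_t * dy + cx)
            sy = round(-sin_t * dx + cos_t * dy + cy)
            if 0 <= sy < height and 0 <= sx < width:
                out[y][x] = image[sy][sx]
    return out


def augment(image, angles=(45, 90)):
    """Return the original image plus one rotated variant per angle."""
    return [image] + [rotate(image, angle) for angle in angles]


tiny = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
variants = augment(tiny)
print(variants[2])  # the 90-degree variant: [[7, 4, 1], [8, 5, 2], [9, 6, 3]]

# 238K originals plus two rotated variants each gives the "about 700K" total.
total_images = 238_000 * len(variants)
print(total_images)  # 714000
```

Rotations suit this kind of data because a galaxy has no preferred orientation on the sky, so each rotated copy is an equally valid training example.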
The What Are You Waiting For Part
So with just a few lines of R code using algorithms from the MicrosoftML package, I was able to train a DNN on tons of image data and operationalize the trained model in SQL Server using R Services, such that any app connected to SQL Server can get this type of intelligence easily. That’s the power of Microsoft R and the MicrosoftML package in it, combined with SQL Server. This is just the beginning, and we are working on adding more algorithms in our quest to democratize the power of AI and machine learning. You can download the MicrosoftML: Algorithm Cheat Sheet here to help you choose the right machine learning algorithm for a predictive analytics model.
Don’t wait, go ahead and give it a try.