Delivering AI with data: the next generation of the Microsoft data platform

This post was authored by Joseph Sirosh, Corporate Vice President, Microsoft Data Group

Extracting intelligence from ever-increasing amounts of data can make the difference between being the next market disruptor and being relegated to the pages of history. Today at the Microsoft Data Amp online event, we will make several product announcements that can help empower every organization on the planet with data-driven intelligence. We are delivering a comprehensive data platform for developers and businesses to create the next generation of intelligent applications that drive new efficiencies, help create better products, and improve customer experiences.

I encourage you to attend the live broadcast of the Data Amp event, starting at 8 AM Pacific, where Scott Guthrie, executive VP of Cloud and Enterprise, and I will describe product innovations that integrate data and artificial intelligence (AI) to transform your applications and your business. You can stream the keynotes and access additional on-demand technical content to learn more about the announcements of the day.

Today, you’ll see three key innovation themes in our product announcements. The first is the close integration of AI functions into databases, data lakes, and the cloud to simplify the deployment of intelligent applications. The second is the use of AI within our services to enhance performance and data security. The third is flexibility—the flexibility for developers to compose multiple cloud services into various design patterns for AI, and the flexibility to leverage Windows, Linux, Python, R, Spark, Hadoop, and other open source tools in building such systems.

Hosting AI where the data lives

A novel thread of innovation you’ll see in our products is the deep integration of AI with data. In the past, a common application pattern was to create statistical and analytical models outside the database in the application layer or in specialty statistical tools, and deploy these models in custom-built production systems. That results in a lot of developer heavy lifting, and the development and deployment lifecycle can take months. Our approach dramatically simplifies the deployment of AI by bringing intelligence into existing well-engineered data platforms through a new computing model: GPU deep learning. We have taken that approach with the upcoming release of SQL Server, and deeply integrated deep learning and machine learning capabilities to support the next generation of enterprise-grade AI applications.

So today it’s my pleasure to announce the first RDBMS with built-in AI: a production-quality Community Technology Preview (CTP 2.0) of SQL Server 2017. In this preview release, we are introducing in-database support for a rich library of machine learning functions and, for the first time, Python support (in addition to R). SQL Server can also leverage NVIDIA GPU-accelerated computing through the Python/R interface to power even the most intensive deep-learning jobs on images, text, and other unstructured data. Developers can implement NVIDIA GPU-accelerated analytics and very sophisticated AI directly in the database server as stored procedures and gain orders of magnitude higher throughput. In addition, developers can use all the rich features of the database management system for concurrency, high-availability, encryption, security, and compliance to build and deploy robust enterprise-grade AI applications.


We have also released Microsoft R Server 9.1, which takes the concept of bringing intelligence to where your data lives to Hadoop and Spark, as well as SQL Server. In addition to several advanced machine learning algorithms from Microsoft, R Server 9.1 introduces pretrained neural network models for sentiment analysis and image featurization, adds support for SparklyR, SparkETL, and SparkSQL, and supports GPUs for deep neural networks. We are also making model management easier with many enhancements to production deployment and operationalization. R Tools for Visual Studio provides a state-of-the-art IDE for developers to work with Microsoft R Server. An Azure Microsoft R Server VM image is also available, enabling developers to rapidly provision the server on the cloud.


In the cloud, Microsoft Cognitive Services enable you to infuse your apps with cognitive intelligence. Today I am excited to announce that the Face API, Computer Vision API, and Content Moderator are now generally available in the Azure Portal. Here are some of the different types of intelligence that cognitive services can bring to your application:

  • Face API helps detect and compare human faces, organize faces into groups according to visual similarity, and identify previously tagged people in images.
  • Computer Vision API gives you the tools to understand the contents of any image: it creates tags that identify objects, beings such as celebrities, and actions in an image, and crafts coherent sentences to describe it. You can now detect landmarks and handwriting in images. Handwriting detection remains in preview.
  • Content Moderator provides machine-assisted moderation of text and images, augmented with human review tools.
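
As a rough illustration of how an application might call one of these services, here is a minimal Python sketch that builds (but does not send) a Face API detection request. The endpoint shape, the `westus` region, the `returnFaceAttributes` values, and the placeholder key are assumptions for illustration rather than details from this announcement; check the current Cognitive Services documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_face_detect_request(image_url, subscription_key, region="westus"):
    """Build a POST request for face detection without sending it.

    The region, query parameters, and key are illustrative assumptions.
    """
    query = urlencode({
        "returnFaceId": "true",
        "returnFaceAttributes": "age,gender",
    })
    endpoint = (
        f"https://{region}.api.cognitive.microsoft.com/face/v1.0/detect?{query}"
    )
    # The image is passed by URL in a small JSON body.
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Header that carries the subscription key for Cognitive Services.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return Request(endpoint, data=body, headers=headers, method="POST")


# Placeholder values; a real call needs a valid key and image URL.
request = build_face_detect_request("https://example.com/photo.jpg", "YOUR_KEY")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the call; the sketch stops short of that so it can run without credentials.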

Azure Data Lake Analytics (ADLA) is a breakthrough serverless analytics job service where you can easily develop and run massively parallel petabyte-scale data transformation programs that compose U-SQL, R, Python, and .NET. With no infrastructure to manage, you can process data on demand, scale instantly, and pay per job only. Furthermore, we’ve incorporated the technology that sits behind the Cognitive Services inside U-SQL directly as functions. Now you can process massive unstructured data, such as text and images, extract sentiment, age, and other cognitive features using Azure Data Lake, and query and analyze these by content. This enables what I call “Big Cognition.” It’s not just about extracting one piece of cognitive information at a time, or about understanding an emotion or whether there’s an object in an individual image. Rather, it’s about integrating all the extracted cognitive data with other types of data, so you can perform powerful joins, analytics, and integrated AI.
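
To make the idea of joining extracted cognitive data with other data concrete, here is a toy Python sketch. It is not U-SQL and does not use Azure Data Lake; the records, field names (`store_id`, `emotion`, `age`), and the `join_and_aggregate` helper are all hypothetical and stand in for features that a cognitive extraction step might produce.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical output of a cognitive extraction step: one row per image.
image_features = [
    {"image_id": "img-1", "store_id": "s1", "emotion": "happy", "age": 31},
    {"image_id": "img-2", "store_id": "s1", "emotion": "neutral", "age": 45},
    {"image_id": "img-3", "store_id": "s2", "emotion": "happy", "age": 28},
]

# Hypothetical business data keyed by store.
store_sales = {"s1": 1200.0, "s2": 800.0}


def join_and_aggregate(features, sales):
    """Inner-join image features to sales on store_id, then summarize per store."""
    grouped = defaultdict(list)
    for row in features:
        if row["store_id"] in sales:  # keep only rows with a matching store
            grouped[row["store_id"]].append(row)

    summary = {}
    for store_id, rows in grouped.items():
        summary[store_id] = {
            "sales": sales[store_id],
            "images": len(rows),
            "mean_age": mean(r["age"] for r in rows),
            "happy_share": sum(r["emotion"] == "happy" for r in rows) / len(rows),
        }
    return summary


summary = join_and_aggregate(image_features, store_sales)
print(summary)
```

The point of the sketch is the shape of the computation: once cognitive features are ordinary columns, they can be joined and aggregated like any other data.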

Azure Data Lake Store (ADLS) is a no-limit cloud HDFS storage system that works with ADLA and other big data services for petabyte-scale data. We are announcing the general availability of Azure Data Lake Analytics and Azure Data Lake Store in the Azure North Europe region.

Yet another powerful integration of data and AI is the seamless integration of DocumentDB with Spark to enable machine learning and advanced analytics on top of globally distributed data. To recap, DocumentDB is a unique, globally distributed, limitless NoSQL database service in Azure designed for mission-critical applications. Designed as such from the ground up, it allows customers to distribute their data across any number of Azure regions worldwide, guarantees low read and write latencies, and offers comprehensive SLAs for data-loss, latency, availability, consistency, and throughput. You can use it as either your primary operational database or as an automatically indexed, virtually infinite data lake. The Spark connector understands the physical structure of DocumentDB store (indexing and partitioning) and enables computation pushdown for efficient processing. This service can significantly simplify the process of building distributed and intelligent applications at global scale.


I’m also excited to announce the general availability of Azure Analysis Services. Built on the proven business intelligence (BI) engine in Microsoft SQL Server Analysis Services, it delivers enterprise-grade BI semantic modeling capabilities with the scale, flexibility, and management benefits of the cloud. Azure Analysis Services helps you integrate data from a variety of sources—for example, Azure Data Lake, Azure SQL DW, and a variety of databases on-premises and in the cloud—and transform them into actionable insights. It speeds time to delivery of your BI projects by removing the barrier of procuring and managing infrastructure. And by leveraging the BI skills, tools, and data your team has today, you can get more from the investments you’ve already made.

Stepping up performance and security

Performance and security are central to databases. SQL Server continues to lead in database performance benchmarks, and in every release we make significant improvements. SQL Server 2016 on Windows Server 2016 holds a number of records on the Transaction Processing Performance Council (TPC) benchmarks for operational and analytical workload performance, and SQL Server 2017 does even better. I’m also proud to announce that the upcoming version of SQL Server will run just as fast on Linux as on Windows, as you’ll see in the newly published 1TB TPC-H benchmark world record nonclustered data warehouse performance achieved with SQL Server 2017 on Red Hat Enterprise Linux and HPE ProLiant hardware.

SQL Server 2017 will also bring breakthrough performance, scale, and security features to data warehousing. With up to 100x faster analytical queries using in-memory Columnstores, PolyBase for single T-SQL querying across relational and Hadoop systems, capability to scale to hundreds of terabytes of data, modern reporting, plus mobile BI and more, it provides a powerful integrated data platform for all your enterprise analytics needs.

In the cloud, Azure SQL Database is bringing intelligence to securing your data and increasing database performance. Threat Detection in Azure SQL Database works around the clock, using machine learning to detect anomalous database activities indicating unusual and potentially harmful attempts to access or exploit databases. Simply turning on Threat Detection helps customers make databases resilient to the possibility of intrusion. Other features of Azure SQL Database, such as auto-performance tuning, automatically implement, tune, and validate performance to guarantee optimal query performance. Together, our intelligent database management features help make your database more secure and faster automatically, freeing up scarce DBA capacity for more strategic work.

Simple, flexible multiservice AI solutions in the cloud

We are very committed to simplifying the development of AI systems. Cortana Intelligence is a collection of fully managed big data and analytics services that can be composed together to build sophisticated enterprise-grade AI and analytics applications on Azure. Today we are announcing Cortana Intelligence solution templates that make it easy to compose services and implement common design patterns. These solution templates have been built on best practice designs motivated by real-world customer implementations done by our engineering team, and include Personalized Offers (for example, for retail applications), Quality Assurance (for example, for manufacturing applications), and Demand Forecasting. These templates accelerate your time to value for an intelligent solution, allowing you to deploy a complex architecture within minutes, instead of days. The templates are flexible and scalable by design. You can customize them for your specific needs, and they’re backed by a rich partner ecosystem trained on the architecture and data models. Get started today by going to the Azure gallery for Cortana Intelligence solutions.


Also, AppSource is a single destination to discover and seamlessly try business apps built by partners and verified by Microsoft. Partners like KenSci have already begun to showcase their intelligent solutions targeting business decision-makers in AppSource. Now partners can submit Cortana Intelligence apps on the AppSource “List an app” page.

Cross-platform and open source flexibility

Whether on-premises or in the cloud, cross-platform compatibility is increasingly important in our customers’ diverse and rapidly changing data estates. SQL Server 2017 will be the first version of SQL Server compatible with Windows, Linux, and Linux-based container images for Docker. In addition to running on Windows Server, the new version will also run on Red Hat Enterprise Linux, SUSE Linux Enterprise Server, and Ubuntu. It can also run inside Docker containers on Linux or Mac, which can help your developers spend more time developing and less on DevOps.

Getting started

It has never been easier to get started with the latest advances in the intelligent data platform. We invite you to join us to learn more about SQL Server 2017 on Windows, Linux, and in Linux-based container images for Docker; Cognitive Services for smart, flexible APIs for AI; scalable data transformation and intelligence from Azure Data Lake Store and Azure Data Lake Analytics; the Azure SQL Database approach to proactive threat detection and intelligent database tuning; new solution templates from Cortana Intelligence; and precalibrated models for Linux, Hadoop, Spark, and Teradata in R Server 9.1.

Join our Data Amp event to learn more! You can go now to the Microsoft Data Amp online event for live coverage starting at 8 AM Pacific on April 19. You’ll also be able to stream the keynotes and watch additional on-demand technical content after the event ends. I look forward to your participation in this exciting journey of infusing intelligence and AI into every software application.