Announcing the Next Generation of Databases and Data Lakes from Microsoft

This post was authored by Joseph Sirosh, Corporate Vice President of the Microsoft Data Group.

Microsoft Connect() 2016

For the past two years, we’ve unveiled several of our cutting-edge technologies and innovative solutions at the Connect(); event, which will be livestreamed globally from New York City starting November 16. This year, I am thrilled to announce the next generation of SQL Server and Azure Data Lake, along with several new capabilities to help developers build intelligent applications.

1. Next release of SQL Server with Support for Linux and Docker (Preview)

I am excited to announce the public preview of the next release of SQL Server, which brings the power of SQL Server to both Windows and, for the first time ever, Linux. Now you can also develop applications with SQL Server on Linux, Docker, or macOS (via Docker) and then deploy to Linux, Windows, Docker, on-premises, or in the cloud. This represents a major step in our journey to making SQL Server the platform of choice across operating systems, development languages, data types, on-premises and the cloud. All major features of the relational database engine, including advanced features such as in-memory OLTP, in-memory columnstores, Transparent Data Encryption, Always Encrypted, and Row-Level Security, now come to Linux.

Getting started is easier than ever. You’ll find native Linux installations (more info here) with familiar RPM and APT packages for Red Hat Enterprise Linux, Ubuntu Linux, and SUSE Linux Enterprise Server. The public preview on Windows and Linux will be available on Azure Virtual Machines and as images on Docker Hub, offering a quick and easy installation within minutes. The Windows download is available on the TechNet Evaluation Center.

We have also made significant improvements to R Services inside SQL Server, including a very powerful set of machine learning functions that are used by our own product teams across Microsoft. This brings new machine learning and deep neural network functionality with increased speed, performance and scale, especially for handling a large corpus of text data and high-dimensional categorical data. We recently showcased SQL Server running more than one million R predictions per second, and we encourage you to try out the R examples and machine learning templates for SQL Server on GitHub.

The choice of application development stacks with the next release of SQL Server is absolutely amazing: it includes .NET, Java, PHP, Node.js and more on Windows, Linux and Mac (via Docker). A native application development experience for Linux and Mac developers has been a key focus for this release. Get started with the next release of SQL Server on Linux, macOS (via Docker) and Windows with our developer tutorials, which show you how to install and use it on macOS, Docker, Windows, RHEL and Ubuntu and quickly build an app in a programming language of your choice.
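As a rough illustration of the Docker route described above, here is a minimal Python sketch that assembles a `docker run` command for a SQL Server on Linux container. The image name `microsoft/mssql-server-linux` and the `ACCEPT_EULA` and `SA_PASSWORD` environment variables are assumptions based on the preview-era Docker Hub image and are not stated in this post; check the Docker Hub page and the developer tutorials for the current values. The helper name is hypothetical.

```python
import shlex


def sql_server_docker_command(sa_password,
                              image="microsoft/mssql-server-linux",  # assumed image name
                              host_port=1433):
    """Build the argv list for starting a SQL Server on Linux container.

    Nothing is executed here; the list can be passed to subprocess.run()
    on a machine that has Docker installed.
    """
    # SQL Server's password policy rejects short passwords, so fail early.
    if len(sa_password) < 8:
        raise ValueError("SA password must be at least 8 characters long")
    return [
        "docker", "run", "-d",
        "-e", "ACCEPT_EULA=Y",                # assumed variable: accept the license terms
        "-e", f"SA_PASSWORD={sa_password}",   # assumed variable: system administrator password
        "-p", f"{host_port}:1433",            # map the host port to SQL Server's default port
        image,
    ]


if __name__ == "__main__":
    # Print a shell-quoted version of the command for copy and paste.
    print(shlex.join(sql_server_docker_command("Str0ng!Passw0rd")))
```

Building the command as a list rather than a single string avoids shell-quoting problems when the password contains special characters.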


2. SQL Server 2016 SP1

We are announcing SQL Server 2016 SP1, which is a unique service pack: for the first time, we are introducing a consistent programming model across SQL Server editions. With this model, programs written to exploit powerful SQL features such as in-memory OLTP, in-memory columnstore analytics, and partitioning will work across Enterprise, Standard and Express editions. Developers will find it easier than ever to take advantage of innovations such as in-memory databases and advanced analytics. You can use these advanced features in the Standard edition and then step up to Enterprise for mission-critical performance, scale and availability, without having to rewrite your application.

Our software partners are excited about the flexibility that this change gives them to adopt advanced features while supporting multiple editions of SQL Server.

“With SQL Server 2016 SP1, we can run the same code entirely on both platforms and customers who need Enterprise scale buy Enterprise, and customers who don’t need that can buy Standard and run just fine. From a programming point of view, it’s easier for us and easier for them,” said Nick Craver, Architecture Lead at Stack Overflow.

To be even more productive with SQL Server, you can now take advantage of improved developer experiences on Windows, Mac and Linux for Node.js, Java, PHP, Python, Ruby, .NET Core and C/C++. Our JDBC Connector is now published as 100% open source, which gives developers more access to information and more flexibility in how they contribute to and work with the JDBC driver. Additionally, we’ve updated the ODBC-based driver for PHP and launched a new ODBC for Linux connector, making it much easier for developers to work with Microsoft SQL-based technologies. To make the experience more seamless for all developers, Microsoft VS Code users can now also connect to SQL Server, including SQL Server on Linux, Azure SQL Database and Azure SQL Data Warehouse. In addition, we’ve released updates to SQL Server Management Studio, SQL Server Data Tools, and the command-line tools, which now support SQL Server on Linux.
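To make the driver story a little more concrete, the following Python sketch builds an ODBC connection string of the kind these connectors consume. The driver name `ODBC Driver 13 for SQL Server`, the helper name, and the example values are assumptions for illustration and do not come from this post; substitute the driver actually installed on your machine.

```python
def odbc_connection_string(server, database, user, password,
                           driver="ODBC Driver 13 for SQL Server",  # assumed driver name
                           port=1433):
    """Return an ODBC connection string for SQL Server.

    The same string works whether the server runs on Windows, on Linux,
    or in a container; only the host name changes.
    """
    def brace(value):
        # ODBC rule: wrap a value in braces and double any closing brace
        # so that characters such as ';' inside it are taken literally.
        return "{" + value.replace("}", "}}") + "}"

    parts = [
        ("DRIVER", brace(driver)),
        ("SERVER", f"{server},{port}"),  # SQL Server uses "host,port"
        ("DATABASE", database),
        ("UID", user),
        ("PWD", brace(password)),
    ]
    return ";".join(f"{key}={value}" for key, value in parts) + ";"
```

A string produced this way could then be handed to an ODBC client library such as pyodbc, for example `pyodbc.connect(odbc_connection_string("localhost", "master", "sa", "..."))`, assuming that library and the driver are installed.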


3. Azure Data Lake Analytics and Store GA

Today, I am excited to announce the general availability of Azure Data Lake Analytics and Azure Data Lake Store.

Azure Data Lake Analytics is a cloud analytics service that allows you to develop and run massively parallel data transformation and processing programs in U-SQL, R, Python and .NET over petabytes of data with just a few lines of code. There is no infrastructure to manage, and you can process data on demand, scale in seconds, and pay only for the resources used. U-SQL is a simple, expressive, and super-extensible language that combines the power of C# with the simplicity of SQL. Developers can write their code in either Visual Studio or Visual Studio Code, and the execution environment gives you debugging and optimization recommendations to improve performance and reduce cost.

Azure Data Lake Store is a cloud analytics data lake for enterprises that is secure, massively scalable and built to the open HDFS standard. You can store trillions of files, and single files can be greater than a petabyte in size. It provides massive throughput optimized to run big analytic jobs. It has data encryption in motion and at rest; single sign-on (SSO), multi-factor authentication and identity management built in through Azure Active Directory; and fine-grained POSIX-based ACLs for role-based access control.


Furthermore, we’ve incorporated the technology that sits behind Microsoft Cognitive Services directly into U-SQL. Now you can process any amount of unstructured data, such as text and images, extract emotions, age, and all sorts of other cognitive features using Azure Data Lake, and perform query by content. You can join emotions from image content with any other type of data you have and do incredibly powerful analytics and intelligence over it. This is what I call Big Cognition. It’s not just about extracting one piece of cognitive information at a time, or understanding an emotion or whether there’s an object in an image; rather, it’s about joining all the extracted cognitive data with other types of data, so you can do some really powerful analytics with it. We demonstrated this capability at Microsoft Ignite and PASS Summit with a Big Cognition demo in which we used U-SQL inside Azure Data Lake Analytics to process a million images and understand what’s inside them. You can watch this demo (starting at minute 38) and try it yourself using a sample project on GitHub.

4. DocumentDB Emulator

We live on a Planet of the Apps, and the best back-end system for building modern intelligent mobile or web apps is Azure DocumentDB: a planet-scale, globally distributed, managed NoSQL service with 99.99% availability and guarantees for low latency and consistency, all backed by enterprise-grade security and an SLA.

Today I am happy to announce a public preview of the DocumentDB Emulator, which provides a local development experience for Azure DocumentDB. Using the DocumentDB Emulator, you can develop and test your application locally without an internet connection, without creating an Azure subscription, and without incurring any costs. This has long been the most requested feature on the UserVoice site, so we are thrilled to roll it out to everyone.
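One way to take advantage of a local emulator is to let configuration decide whether an application talks to the emulator or to the cloud service. The Python sketch below shows that pattern only; it does not call any DocumentDB SDK. The emulator endpoint `https://localhost:8081` is an assumption taken from the emulator's documented default rather than from this post, and the function name and the `USE_DOCDB_EMULATOR`, `DOCDB_ENDPOINT` and `DOCDB_KEY` variable names are hypothetical.

```python
import os

# Assumed default address of the local emulator (not stated in this post).
EMULATOR_ENDPOINT = "https://localhost:8081"


def documentdb_settings(env=None):
    """Pick DocumentDB connection settings from environment-style config.

    When USE_DOCDB_EMULATOR is truthy the local emulator endpoint is used;
    otherwise the endpoint is read from DOCDB_ENDPOINT. The key is always
    read from DOCDB_KEY so that no secret is hard-coded in the source.
    """
    env = os.environ if env is None else env
    flag = env.get("USE_DOCDB_EMULATOR", "").strip().lower()
    use_emulator = flag in ("1", "true", "yes")
    endpoint = EMULATOR_ENDPOINT if use_emulator else env["DOCDB_ENDPOINT"]
    return {
        "endpoint": endpoint,
        "key": env["DOCDB_KEY"],
        "emulator": use_emulator,
    }
```

With this shape, the same application code can run against the emulator on a developer laptop and against the cloud service in production by changing only the environment.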

Furthermore, we’ve added .NET Core support in DocumentDB. .NET Core is a lightweight and modular platform for creating applications and services that run on Linux, Mac and Windows. With DocumentDB support for .NET Core, developers can now use .NET Core to build cross-platform applications and services that use the DocumentDB API.


5. Other Announcements

  • Today we are also announcing the general availability of R Server for Azure HDInsight. HDInsight is the only fully managed cloud Hadoop offering that provides optimized open source analytic clusters for Spark, Hive, MapReduce, HBase, Storm, and R Server, backed by a 99.9% SLA. Running Microsoft R Server as a service on top of Apache Spark, customers can achieve unprecedented scale and performance by combining enterprise-scale analytics in R with the power of Spark. With transparently parallelized analytic functions, it’s now possible to handle up to 1000x more data at up to 50x faster speeds than open source R, helping you train more accurate models for better predictions than previously possible. Plus, because R Server is built to work with the open source R language, all of your R scripts can run without significant changes.
  • We are also announcing the public preview of Kafka for HDInsight, an enterprise-grade, open-source streaming ingestion service that is cost-effective and easy to provision, manage and use. This service enables you to build real-time solutions for scenarios such as IoT, fraud detection, click-stream analysis, financial alerts, and social analytics. Using out-of-the-box integration with Storm for HDInsight or Spark Streaming for HDInsight, you can architect powerful streaming pipelines to drive intelligent real-time actions.
  • Another exciting piece of news is the availability of Operational Analytics for Azure SQL Database. It’s the first fully managed Hybrid Transactional and Analytical Processing (HTAP) database service in the cloud. The ability to run both analytics (OLAP) and OLTP workloads on the same database tables at the same time allows developers to build a new level of analytical sophistication into their applications. In some cases, developers can eliminate the need for ETL and a data warehouse (using one system for OLAP and OLTP instead of creating two separate systems), helping to reduce complexity, cost, and data latency. The in-memory technologies in Azure SQL DB help achieve phenomenal performance: for example, 75,000 transactions per second for order processing (an 11X performance gain) and query execution time reduced from 15 seconds to 0.26 seconds (a 57X performance gain). This capability is now a standard feature of Azure SQL DB at no additional cost.
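To illustrate the HTAP idea from the last bullet in miniature, the Python sketch below runs transactional inserts and an analytical aggregate against the same table, with no ETL step in between. It uses SQLite from the Python standard library purely as a stand-in: SQLite has no in-memory OLTP engine or columnstore indexes, so this shows only the shape of the pattern, not Azure SQL Database's implementation or performance. The table, column and function names are hypothetical.

```python
import sqlite3


def run_htap_demo():
    """Write orders transactionally, then aggregate them from the same table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " region TEXT NOT NULL,"
        " amount REAL NOT NULL)"
    )

    # OLTP-style work: small writes committed as one transaction.
    with conn:
        conn.executemany(
            "INSERT INTO orders (region, amount) VALUES (?, ?)",
            [("east", 10.0), ("west", 25.0), ("east", 5.0)],
        )

    # OLAP-style work: an aggregate over the live operational table,
    # with no copy into a separate warehouse first.
    rows = conn.execute(
        "SELECT region, COUNT(*), SUM(amount)"
        " FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return rows
```

In a real HTAP deployment the analytical query would be served by a columnstore index over the transactional table, which is what keeps reporting fast without slowing down order processing.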

We are making our products and innovations more accessible to all developers, on any platform, on-premises and in the cloud. We are building for a future where our data platform is dwarfed by the aggregate value of the solutions built on top of it. This is the true measure of a platform’s success: when the number of apps built on top of it, and the value they create, far exceed the platform itself.

The live broadcast of Connect(); begins on November 16 at 9:45 a.m. EST and continues with interactive Q&A and immersive on-demand content. Join us to learn more about these amazing innovations.

@josephsirosh