Skip to main content Microsoft Intelligent Data Platform Azure Arc Azure databases Power BI SQL Server 2025 SQL Server BI SQL Server 2022 SQL Server 2019 SQL Server 2017 SQL Server 2016 SQL Server 2005 - 2014 Downloads Community SQL End of Support Data Security - SQL Server Encryption SQL Server blog SQL Server and Azure SQL workshops Browse Microsoft Solutions Hub SQL Server Tech Community Azure Databases Tech Community Azure Synapse Analytics Tech Community Developer Find a partner Become a partner Partner resources Try SQL Server 2025 Microsoft Security Azure Dynamics 365 Microsoft 365 Microsoft Teams Windows 365 Microsoft AI Azure Space Mixed reality Microsoft HoloLens Microsoft Viva Quantum computing Sustainability Education Automotive Financial services Government Healthcare Manufacturing Retail Find a partner Become a partner Partner Network Microsoft Marketplace Marketplace Rewards Software development companies Blog Microsoft Advertising Developer Center Documentation Events Licensing Microsoft Learn Microsoft Research View Sitemap

Data Virtualization with PolyBase for SQL Server 2022

Part of the SQL Server 2022 blog series.

Microsoft SQL Server 2022 introduces the newest version of PolyBase, and with it the capability to query data where it lives, virtualize data, and use REST APIs. REST APIs enable SQL Server to be both more flexible and lightweight while expanding its range of supported connectors and file formats.

SQL Server 2022 now supports CSV, Parquet, and Deltafiles stored on Azure Storage Account v2, Azure Data Lake Storage Gen2, or any simple storage service (S3)–compliant object storage—the last as an on-premises offering or in the cloud. Finally, SQL Server 2022 can now use Create External Table as Select (CETAS), together with commands like OPENROWSET, Create External Table (CET), and all the new T-SQL enhancements. SQL Server 2022 is a powerful data hub.

How does it work?

Data virtualization for SQL Server 2022 is both more flexible and easier to use now that its capabilities are inside the engine itself. The figure below gives a better understanding of the architecture:

Examples of S3-compliant object storage providers compatible with SQL Server 2022.
This figure provides a non-exhaustive list of S3-compliant object storage providers. SQL Server 2022 is compatible with any object storage provider compatible with S3-REST APIs.

OPENROWSET: Lightweight command that allows SQL engine to access data outside SQL Server, either a file or another database. Recommended for loading data or data exploration.

CREATE EXTERNAL TABLE (CET): Creates a table where the data stays in its original location outside of SQL Server, and when selected, the SQL engine will provide the requested data to the user. External table benefits from reusability and can leverage the use of statistics for better performance.

CREATE EXTERNAL TABLE as SELECT (CETAS): It performs a combination of operations in a single command. First, it allows SQL Server to transform and convert a given data stored inside or outside the database. Second, it then exports the data to a different location, either a network location or Azure. Finally, it creates an external table targeting the newly exported data.

These operations are secured by a combination of database master key and external credentials for simplified management.

If the data is stored on Azure Storage Account v2 (abs), Azure Data Lake Gen2 (ADLs), or an S3-compliant Object Storage, SQL Server 2022 will use the REST API implementation. If not, SQL Server 2022 will use PolyBase services—PolyBase services installation is required for both cases.

For a complete list of data sources, please refer to our CREATE EXTERNAL DATA SOURCE documentation.

External file support has also increased, and SQL Server 2022 now supports CSV, Parquet, and Delta type. Please find a complete list of supported external file formats in our CREATE EXTERNAL FILE FORMAT (TRANSACT-SQL) documentation.

Benefits

Major benefits of Data Virtualization with PolyBase on SQL Server 2022:

  • No data movement: Access the data where it is.
  • T-SQL language: Ability to leverage all the benefits of the T-SQL language, its commands, enhancements, and familiarity.
  • One source for all your data: Users and applications can use SQL Server 2022 as its single data source for all of the required data, while database administrators and data engineers have a single environment to maintain.
  • Security: Leverage SQL Server security features for granular permissions, credential management, and control.
  • Cost: PolyBase is available in all SQL Server 2022 editions.

Next steps

Download the latest release of SQL Server 2022 if you haven’t already done so and check out the SQL Server 2022 Overview and What’s New references.

Learn more

See the following SQL Server resources on Microsoft Learn:

English (United States)
Your Privacy Choices Opt-Out Icon Your Privacy Choices
Consumer Health Privacy Sitemap Contact Microsoft Privacy Manage cookies Terms of use Trademarks Safety & eco Recycling About our ads