Elastic Queries with Azure SQL Database and Synapse Analytics Serverless SQL Pools - Serverless SQL (2023)

I was recently exploring an option to use Azure SQL Database for a small data store and was wondering if I could create an Elastic Query connection to a Serverless SQL Pools database and query a Delta folder in Azure Data Lake Gen2. This can be done by creating an external table in a Serverless SQL Pools database and then creating an external table with the same schema in Azure SQL Database. Data in Azure SQL Database can then be queried alongside data in the data lake, with the Serverless SQL Pools engine doing the work against the lake.

The goal is to query internal and external data through a single interface: the Azure SQL database.

In this blog scenario there is an Azure SQL Database containing AdventureWorks fact and dimension tables with less than 10 GB of data in total (not enough data to justify Dedicated SQL Pools!). There is also an Azure Data Lake Gen2 account that contains around 1.2 billion web telemetry events about users' browsing and purchasing behavior on the (fictitious) website. This data is saved in Delta format (Parquet files). The goal here is to query this data through a single interface without moving the web telemetry, and to use the Serverless SQL Pools engine to do the "heavy lifting" of querying the data in the data lake.

The SQL queries are available on GitHub here.

What is an elastic query?

Elastic Query lets you query remote tables across multiple databases using T-SQL commands. A connection can be made from one Azure SQL Database to another Azure SQL Database, and T-SQL queries can then be run across the databases. You can read more here.

Elastic Query functionality is included in the cost of Azure SQL Database and is supported at all tiers.

Basic solution

Here, in its simplest form, is the architecture: Power BI (or any SQL query tool) connects to the Azure SQL Database. This SQL database contains both the imported data and external tables that reference the tables in the Serverless SQL Pools database. Serverless SQL Pools connects to the underlying Delta-format data in Azure Data Lake Gen2.

[Image: solution architecture diagram]

Solution steps

The following steps cover the tutorial. First we need to create objects in the Serverless SQL Pools service and then switch to Azure SQL Database to finish setting up Elastic Query.

  • Serverless SQL Pools
    • Create a Serverless SQL Pools database
    • Create an external table over the Delta data in Azure Data Lake Gen2
    • Create a SQL login and user in Serverless SQL Pools
    • Grant the SQL user SELECT on the external table
    • Grant the SQL user REFERENCES on the database scoped credential used by the external table
  • Azure SQL Database
    • Create a master key
    • Create a database scoped credential using the SQL login created in Serverless SQL Pools
    • Create an external data source (Serverless SQL Pools endpoint, credential)
    • Create an external table in the SQL database

Solution step by step

Now let's walk through the process to enable Azure SQL Database to send queries to Serverless SQL Pools and retrieve the results. This tutorial assumes you already have an Azure Synapse Analytics workspace set up; if not, here is an Introduction to Serverless SQL Pools article on setting up a new Synapse workspace. You also need an Azure SQL Database. In this scenario, the Synapse workspace and the SQL database are in the same region.


Object dependencies

Below is a SQL object dependency diagram showing the relationship between each object created.

[Image: SQL object dependency diagram]

Serverless SQL Pools configuration

Let's create a new Serverless SQL Pools database. This step involves creating a data source for the Azure Data Lake Gen2 account, a credential based on Managed Identity that allows Serverless SQL Pools to access the data, and then the external table over the Delta data.

Create SQL objects

The following T-SQL can be run in Synapse Studio against the Built-in Serverless SQL Pools service.

--create login
USE master;
GO
CREATE LOGIN elasticuser WITH PASSWORD = '<strong_password>';
GO

--create new database
CREATE DATABASE SQLElasticQuery;
GO
USE SQLElasticQuery;
GO

--create objects to connect to the data lake
--create a schema to hold our objects
CREATE SCHEMA LDW AUTHORIZATION dbo;
GO

--encryption to allow authentication
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong_password>';

--create a credential using Managed Identity
CREATE DATABASE SCOPED CREDENTIAL DataLakeManagedIdentity
WITH IDENTITY = 'Managed Identity';

--create a data source for use in queries
CREATE EXTERNAL DATA SOURCE ExternalDataSourceDataLakeUKMI
WITH (
    LOCATION = 'https://<storage account>.dfs.core.windows.net/<container>',
    CREDENTIAL = DataLakeManagedIdentity
);

--create Delta file format
CREATE EXTERNAL FILE FORMAT SynapseDeltaFormat
WITH (FORMAT_TYPE = DELTA);

--enable support for UTF8
ALTER DATABASE SQLElasticQuery COLLATE Latin1_General_100_BIN2_UTF8;

Create external table over Delta

We can now create an external table over the delta data in the data lake using the data source and file format created in the previous step. The location specifies the root folder for the delta data.

--create an external table over Delta
CREATE EXTERNAL TABLE LDW.WebTelemetryDelta
(
    UserID varchar(20),
    EventType varchar(100),
    ProductID varchar(100),
    URL varchar(100),
    Device varchar(50),
    SessionViewSeconds int,
    MessageKey varchar(100),
    EventYear int,
    EventMonth int,
    EventDate date
)
WITH (
    LOCATION = 'cleansed/webtelemetrydeltakekey',
    DATA_SOURCE = ExternalDataSourceDataLakeUKMI,
    FILE_FORMAT = SynapseDeltaFormat
);
GO
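Before moving on, it can be worth confirming that the external table actually returns data. The following is a minimal sketch, assuming the database, schema, and table names used above; it is run against the Serverless SQL Pools database, not Azure SQL Database. Note that the second query scans the whole Delta folder, so it counts toward the data processed by Serverless SQL Pools.

```sql
-- Run in the Serverless SQL Pools database created earlier.
USE SQLElasticQuery;
GO

-- Quick sanity check: can the external table be read at all?
SELECT TOP 10 *
FROM LDW.WebTelemetryDelta;

-- Rough row counts per year and month, to confirm the whole folder is visible.
SELECT
    EventYear,
    EventMonth,
    COUNT_BIG(*) AS TotalEventCount
FROM LDW.WebTelemetryDelta
GROUP BY EventYear, EventMonth
ORDER BY EventYear, EventMonth;
```

If these queries fail here, the Elastic Query set up later will fail too, so this is the cheapest place to catch credential or path problems.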

Create SQL objects required by Elastic Query

Once the external table is created, we can create a user in the database and grant SELECT permission on the table. We also need to grant access to the database scoped credential that Serverless SQL Pools uses to connect to the Data Lake account (I have an ongoing discussion with Microsoft about the process of securing external tables... I'll keep you up to date).

--switch to the Serverless SQL Pools database
USE SQLElasticQuery;
GO

--create a user from the login
CREATE USER elasticuser FROM LOGIN elasticuser;

--grant select on the Delta table
GRANT SELECT ON LDW.WebTelemetryDelta TO elasticuser;

--grant references on the credential used by the external table
GRANT REFERENCES ON DATABASE SCOPED CREDENTIAL::[DataLakeManagedIdentity] TO [elasticuser];

Once the objects are in place, we can now go to Azure SQL Database and create the objects we need there.


Azure SQL Database configuration

With the Serverless SQL Pools objects created, we can now create the required objects in Azure SQL Database. In this scenario, there is an Azure SQL Database with existing fact and dimension tables. We create a master key, a database scoped credential that matches the login created in Serverless SQL Pools, and then a data source that points to the Serverless SQL Pools endpoint.

--execute in Azure SQL Database if no master key exists
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong_password>';

--create a credential with the same details as the login created above
--in Serverless SQL Pools
CREATE DATABASE SCOPED CREDENTIAL ElasticDBQueryCredV2
WITH IDENTITY = 'elasticuser',
     SECRET = '<strong_password>';

--now create a data source that points to the Serverless SQL Pools endpoint and database
--use the credential created in the previous step
--LOCATION is the serverless endpoint, in the form <workspace name>-ondemand.sql.azuresynapse.net
CREATE EXTERNAL DATA SOURCE ExternalDataSourceDataLakeUKMIV2
WITH (
    TYPE = RDBMS,
    LOCATION = 'synapse-ondemand.sql.azuresynapse.net',
    DATABASE_NAME = 'SQLElasticQuery',
    CREDENTIAL = ElasticDBQueryCredV2
);

Once the security mechanism is in place, we can create an external table with the same schema as the table in the Serverless SQL Pools database.

--create schema to contain tables
CREATE SCHEMA LDW AUTHORIZATION dbo;
GO

--create external table referencing the data source created above
CREATE EXTERNAL TABLE LDW.WebTelemetryDelta
(
    UserID varchar(20),
    EventType varchar(100),
    ProductID varchar(100),
    URL varchar(100),
    Device varchar(50),
    SessionViewSeconds int,
    MessageKey varchar(100),
    EventYear int,
    EventMonth int,
    EventDate date
)
WITH (DATA_SOURCE = ExternalDataSourceDataLakeUKMIV2);
GO
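As a quick check that the objects were registered as expected, the catalog views in Azure SQL Database can be queried. This is a small sketch rather than part of the original walkthrough; it only reads metadata and does not send anything to Serverless SQL Pools.

```sql
-- Run in Azure SQL Database.
-- List external data sources with their endpoint and remote database.
SELECT name, type_desc, location, database_name
FROM sys.external_data_sources;

-- List external tables together with the data source each one uses.
SELECT
    s.name  AS schema_name,
    t.name  AS table_name,
    ds.name AS data_source_name
FROM sys.external_tables AS t
INNER JOIN sys.schemas AS s
    ON s.schema_id = t.schema_id
INNER JOIN sys.external_data_sources AS ds
    ON ds.data_source_id = t.data_source_id;
```

The LDW.WebTelemetryDelta table should appear against ExternalDataSourceDataLakeUKMIV2, with the location and database name pointing at the Serverless SQL Pools endpoint and the SQLElasticQuery database.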

Now that we have all the required objects in the Serverless SQL Pools database and the Azure SQL Database, we can start querying.

Querying

We can run a SELECT statement on the external table in Azure SQL Database and observe the results. We will also check Monitor > SQL requests in Synapse Studio to view the SQL requests sent to the Serverless SQL Pools service. The following SQL queries were run using SSMS connected to Azure SQL Database using an Active Directory login.

--show 10 rows of source data
SELECT TOP 10 *
FROM [LDW].[WebTelemetryDelta]
WHERE EventDate = '2022-03-03';

Let's run an aggregate query and check the Monitor area in Synapse Studio.

--aggregate the web telemetry by event type and device
SELECT
    EventType,
    Device,
    SUM(CAST(SessionViewSeconds AS BIGINT)) AS TotalSessionViewSeconds,
    COUNT(*) AS TotalEventCount
FROM LDW.WebTelemetryDelta
GROUP BY
    EventType,
    Device;

The SQL shown in the Request content field confirms that the aggregate query was passed through to Serverless SQL Pools.


--SQL generated when an elastic query is executed from Azure SQL Database
SELECT [Tbl1002].[EventType] [Col1036],
       [Tbl1002].[Device] [Col1037],
       SUM(CONVERT(bigint,[Tbl1002].[SessionViewSeconds],0)) [Expr1003],
       COUNT(*) [Expr1004]
FROM [LDW].[WebTelemetryDelta] [Tbl1002]
GROUP BY [Tbl1002].[EventType],[Tbl1002].[Device]

If we JOIN an internal table and an external table in a GROUP BY query, we see that the query sent to Serverless SQL Pools requests every row. This is extremely inefficient, as Serverless SQL Pools is used only to extract the data row by row, so Azure SQL Database has to perform the GROUP BY itself!

--run aggregate query
SELECT
    DP.EnglishProductCategoryName AS ProductCategory,
    COUNT(WT.MessageKey) AS TotalEventCount
FROM DW.vwProductHierarchy DP
INNER JOIN LDW.WebTelemetryDelta WT ON DP.ProductKey + 400 = WT.ProductID
GROUP BY DP.EnglishProductCategoryName

As we can see, the SQL query sent to Serverless SQL Pools requests each row of data.

--SQL generated
SELECT [Tbl1007].[MessageKey] [Col1018],
       CONVERT(int,[Tbl1007].[ProductID],0) [Expr1012]
FROM [LDW].[WebTelemetryDelta] [Tbl1007]

Instead, we can SELECT and aggregate the data from the external table in a CTE or temp table, and then JOIN the internal table to that result. In this example we use a CTE to hold the results from the Serverless SQL Pools database.

--add the category and product name
;WITH webtelemetry
AS
(
    SELECT
        ProductID,
        COUNT(*) AS TotalEventCount
    FROM LDW.WebTelemetryDelta
    GROUP BY ProductID
)
SELECT
    DP.EnglishProductCategoryName AS ProductCategory,
    DP.EnglishProductName AS ProductName,
    SUM(WT.TotalEventCount) AS TotalEventCount
FROM DW.vwProductHierarchy DP
INNER JOIN webtelemetry WT ON DP.ProductKey + 400 = WT.ProductID
GROUP BY
    DP.EnglishProductCategoryName,
    DP.EnglishProductName

--aggregate by product category
;WITH webtelemetry
AS
(
    SELECT
        ProductID,
        COUNT(*) AS TotalEventCount
    FROM LDW.WebTelemetryDelta
    GROUP BY ProductID
)
SELECT
    DP.EnglishProductCategoryName AS ProductCategory,
    SUM(WT.TotalEventCount) AS TotalEventCount
FROM DW.vwProductHierarchy DP
INNER JOIN webtelemetry WT ON DP.ProductKey + 400 = WT.ProductID
GROUP BY DP.EnglishProductCategoryName
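The temp table option mentioned above can be sketched in the same way. This is an illustrative variant of the CTE query, not taken from the original scripts: the temp table name #webtelemetry is an assumption, and it is worth checking Monitor > SQL requests in Synapse Studio to confirm that the aggregate is still the statement sent to Serverless SQL Pools.

```sql
-- Run in Azure SQL Database.
-- Step 1: materialise the pre-aggregated external data in a temp table.
-- #webtelemetry is a hypothetical name chosen for this sketch.
SELECT
    ProductID,
    COUNT(*) AS TotalEventCount
INTO #webtelemetry
FROM LDW.WebTelemetryDelta
GROUP BY ProductID;

-- Step 2: join the small local result to the internal dimension view.
SELECT
    DP.EnglishProductCategoryName AS ProductCategory,
    SUM(WT.TotalEventCount) AS TotalEventCount
FROM DW.vwProductHierarchy DP
INNER JOIN #webtelemetry WT ON DP.ProductKey + 400 = WT.ProductID
GROUP BY DP.EnglishProductCategoryName;

-- Step 3: clean up.
DROP TABLE #webtelemetry;
```

A temp table can be useful when the same aggregated result is joined several times in one session, since the external data is then only requested once.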

If we look at the SQL generated and executed against the Serverless SQL Pools database, we can see that an aggregated query is executed, which is the desired result.

--SQL executed on Serverless SQL Pools
SELECT [Expr1009],
       CONVERT(int,[Col1032],0) [Expr1015]
FROM (
    SELECT [Tbl1008].[ProductID] [Col1032],
           COUNT(*) [Expr1009]
    FROM [LDW].[WebTelemetryDelta] [Tbl1008]
    GROUP BY [Tbl1008].[ProductID]
) Qry1033

Using Elastic Query to load Azure SQL Database

We can also use the Elastic Query connection with INSERT INTO...SELECT to load data from the external Serverless SQL Pools table into an internal table in Azure SQL Database. Aggregated web telemetry data, for example, could be useful when stored alongside the already imported facts and dimensions.
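A minimal sketch of such a load is shown here. The target table DW.FactWebTelemetrySummary and its columns are hypothetical and chosen only for illustration; the source table and the date filter reuse the examples from earlier in this post.

```sql
-- Run in Azure SQL Database.
-- Hypothetical internal table to hold the aggregated telemetry.
CREATE TABLE DW.FactWebTelemetrySummary
(
    EventDate date NULL,
    EventType varchar(100) NULL,
    Device varchar(50) NULL,
    TotalSessionViewSeconds bigint NULL,
    TotalEventCount bigint NULL
);
GO

-- Load one day of aggregated data from the external table.
INSERT INTO DW.FactWebTelemetrySummary
    (EventDate, EventType, Device, TotalSessionViewSeconds, TotalEventCount)
SELECT
    EventDate,
    EventType,
    Device,
    SUM(CAST(SessionViewSeconds AS BIGINT)) AS TotalSessionViewSeconds,
    COUNT(*) AS TotalEventCount
FROM LDW.WebTelemetryDelta
WHERE EventDate = '2022-03-03'
GROUP BY EventDate, EventType, Device;
```

Keeping the SELECT to a single external table with its own GROUP BY follows the pattern that was passed through to Serverless SQL Pools in the earlier examples; as before, Monitor > SQL requests shows what was actually sent.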

Conclusion

In this blog post, we explained how to configure an elastic query between an Azure SQL Database and a serverless SQL pools database to enable querying of data in an Azure Data Lake Gen2 account. This gives us the ability to use Azure SQL Database to store data internally and also use the serverless SQL pools engine to process data in a data lake.




Article information

Author: Jamar Nader

Last Updated: 29/03/2023
