Friday, January 3, 2020

T-SQL Basics: using OUTPUT

You are working on a database development project, and you need to obtain the records before/after an INSERT, DELETE, UPDATE or MERGE statement to present them to the user or do further work with that data.

A common approach that I see a lot, and that involves a lot of code to obtain the before/after information, is the following (maybe with some IF and TRY...CATCH blocks as well):

DECLARE @id INT = 1;

SELECT PostalCode
    , ModifiedDate
FROM Person.Address a
WHERE a.AddressID = @id;

UPDATE Person.Address
SET PostalCode = 95012, 
    ModifiedDate = 'Jan 02 2020'
WHERE AddressID = @id;

SELECT PostalCode
    , ModifiedDate
FROM Person.Address a
WHERE a.AddressID = @id;

The results will be something like this (before and after):

And the execution plan will look like this; each extra SELECT statement adds more workload to our server (independently of the resources used):

Even though the code is easy to read, if you use this pattern all over your codebase, maintaining it can become difficult when you have to change object names or implement it on another system.

The T-SQL language provides the OUTPUT clause, which allows you to retrieve information from a DML statement in the same batch.

NOTE: The OUTPUT clause will return rows even when the statement encounters an error and is rolled back, so you should not use the results if any error is encountered.
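A minimal sketch of this precaution (using hypothetical placeholder names such as <MyTable> and <myField>): capture the OUTPUT rows into a table variable inside a TRY...CATCH block, and only consume them if the statement completes without error:

```sql
-- Placeholder table/column names; the pattern is what matters here
DECLARE @changes TABLE (ValueBefore INT, ValueAfter INT);

BEGIN TRY
    UPDATE <MyTable>
    SET <myField> = @value
    OUTPUT DELETED.<myField>, INSERTED.<myField> INTO @changes
    WHERE id = @key;

    -- Safe to consume: the statement completed successfully
    SELECT * FROM @changes;
END TRY
BEGIN CATCH
    -- The statement was rolled back; discard the contents of @changes
    THROW;
END CATCH;
```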

Sample usage is as follows:

-- Values after INSERT ... VALUES --
INSERT INTO <MyTable> (<fieldA>, <fieldB>)
OUTPUT INSERTED.*
VALUES (@a, @b);

-- Values after INSERT ... SELECT --
INSERT INTO <MyTable>
OUTPUT INSERTED.*
SELECT * FROM <MyOtherTable>;

-- Values before/after UPDATE --
UPDATE <MyTable>
SET <myField> = @value
OUTPUT
 DELETED.<myField> as [ValueBefore],
 INSERTED.<myField> as [ValueAfter]
WHERE id = @key;

-- Deleted records after a DELETE --
DELETE <MyTable>
OUTPUT DELETED.*
WHERE id = @key;

If we refactor the first example to use OUTPUT we obtain this code:


DECLARE @id INT = 1;

UPDATE Person.Address
SET PostalCode = 95012, 
    ModifiedDate = 'Jan 02 2020'
OUTPUT
    DELETED.PostalCode AS [PostalCode Before],
    DELETED.ModifiedDate AS [ModifiedDate Before],
    INSERTED.PostalCode AS [PostalCode After],
    INSERTED.ModifiedDate AS [ModifiedDate After]
WHERE AddressID = @id;

Less code, and also easier to follow and maintain; the results will be something like this:

What if we take a look at the execution plan? We can see that this time just one statement is executed (as expected):

Of course, performance can vary depending on how many rows you are retrieving back to the user, the current workload, and the index and table design of your own environment, but the more you know, the better you can adapt to each situation to achieve optimal performance.

Speaking of which, what if we have a lot of fields or records to work with? Is it possible to store the output results in another table? Of course you can, as we can see in this other example:

-- To review that the correct data has been deleted
-- we can output to another table
DELETE
FROM Person1
OUTPUT DELETED.*
INTO PersonBackupInfo
WHERE EmailPromotion = 0;

-- we can review/recover data if needed
SELECT *
FROM PersonBackupInfo;

We can implement deleted-data review/rollback logic in our application to protect it from user errors (the typical "Ooops" queries) and protect our databases from unnecessary point-in-time restores.
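A sketch of what the recovery step could look like, assuming PersonBackupInfo has the same column layout as Person1 (identity or computed columns would need extra handling):

```sql
-- Put the accidentally deleted rows back into the source table
INSERT INTO Person1
SELECT *
FROM PersonBackupInfo;
```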

Like everything in life, nothing is perfect, and there are some remarks you should consider before implementing it in your applications, so be sure to read them before using it.

Thursday, December 26, 2019

Xmas tree in T-SQL

This is a very simple version of a Xmas tree using T-SQL; it is also a good example to learn how recursive CTEs work.

So, if you are in front of the computer with little work to do, take a look at the code, maybe you can improve it further!

/* Draw a simple xmas tree using recursion 
   🎄 Happy holidays!              */
WITH XmasTree
AS (
 SELECT
  CAST(REPLICATE(' ', 16) + '^' AS VARCHAR(50)) AS t,
  0 AS lvl
 UNION ALL
 SELECT
  CAST(REPLICATE(' ', 15 - n.lvl) + '/' + REPLICATE('*', 2 * n.lvl + 1) + '\' AS VARCHAR(50))
  ,n.lvl + 1
 FROM XmasTree n
 WHERE n.lvl < 16
)
SELECT t as [Happy Holidays]
FROM XmasTree;

And this is the final result:


Monday, July 15, 2019

Refreshing dependent views after a DDL change

When you are in charge of databases with tons of objects, a small change to one object can be difficult to replicate across all the objects that depend on it.
For this post, we will focus on how to update column changes on dependent views.

For this type of task, there are 2 scenarios:

  • You explicitly defined column names on the dependent views, or defined them WITH SCHEMABINDING: for this case, you need to perform a find/replace task. This can be difficult to achieve since there are no native tools to do it (as far as I know). You can use free third-party tools like RedGate SQL Search or ApexSQL SQL Search, each having its pros and cons, so take a look at both tools and find out which suits your needs best.

  • You implicitly defined column names on dependent views: the famous SELECT *. Even though this is not a best practice, it makes replicating changes easy; for this case, we will show you how to do it.

Setting up our example

We will use the WideWorldImporters test database for this example. We create two simple views (one depending on the other), as follows:

USE WideWorldImporters;

CREATE VIEW dbo.[vi_invoices_received_by]
AS
SELECT 
    ConfirmedReceivedBy,
    COUNT(InvoiceID) as [Num],
    CustomerID
FROM Sales.Invoices
GROUP BY ConfirmedReceivedBy, CustomerID;

CREATE VIEW dbo.[vi_additional_fields_received_by]
AS
SELECT 
    RB.*,
    C.CustomerName
FROM dbo.vi_invoices_received_by RB
INNER JOIN Sales.Customers C
    ON C.CustomerID = RB.CustomerID;

We proceed to execute the second view; you can see that the column labels coming from the first view are not very descriptive and can be improved:

So we proceed to execute an alter view over the first view:

ALTER VIEW dbo.[vi_invoices_received_by]
AS
SELECT 
    ConfirmedReceivedBy as [Received by], 
    COUNT(InvoiceID) as [# of Invoices],
    CustomerID
FROM Sales.Invoices
GROUP BY ConfirmedReceivedBy, CustomerID;

So we should now be able to see the updated column names on the second view, right?
But if you execute the view again, you will obtain the same results as before:

What happened?

Refreshing your dependent views

SQL Server provides the sp_refreshview procedure to update the metadata of views, so you can update the schema without having to recreate the objects.

For our case, you just have to execute the following code against the second view:

EXEC sp_refreshview 'dbo.[vi_additional_fields_received_by]'; 

If we execute the second view again, we can see that the labels have been updated now:

And that is all, you just have to execute the procedure for each dependent view.
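If there are many dependent views, one way to automate this step (a sketch using sys.sql_expression_dependencies; adjust the object name to your own case) is:

```sql
-- Refresh every view that references a given object
DECLARE @view sysname;
DECLARE refresh_cursor CURSOR FOR
    SELECT DISTINCT QUOTENAME(OBJECT_SCHEMA_NAME(d.referencing_id)) + '.'
                  + QUOTENAME(OBJECT_NAME(d.referencing_id))
    FROM sys.sql_expression_dependencies d
    INNER JOIN sys.objects o ON o.object_id = d.referencing_id
    WHERE d.referenced_id = OBJECT_ID('dbo.vi_invoices_received_by')
      AND o.type = 'V'; -- views only

OPEN refresh_cursor;
FETCH NEXT FROM refresh_cursor INTO @view;
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC sp_refreshview @view;
    FETCH NEXT FROM refresh_cursor INTO @view;
END;
CLOSE refresh_cursor;
DEALLOCATE refresh_cursor;
```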

What if I don't know the dependent objects?

There are several ways to determine what objects depend on the one you are modifying, I will show you two of them.

  • Via T-SQL: Execute the sp_depends procedure; it returns two result sets: the first lists the objects on which the current object depends, and the second lists the objects that depend on the current object. Usage is as follows:

    EXEC sp_depends @objname = N'dbo.vi_invoices_received_by';

  • Via SSMS: Just right-click on the object and select View Dependencies

Thursday, July 4, 2019

SQL Engine basics: Caching data

The SQL Server engine does an excellent job of optimizing query performance, and it has different methods to achieve this goal; one of these methods is data caching.

Data caching consists of putting 8KB pages from storage into memory as they are needed, so the database engine can access the required data again in a much faster way.

At a glance, the way SQL Engine caches the data is the following:

When a request is made, the engine checks if the data is already cached in memory (the buffer pool); if the data is already available, it is returned. If not, the data is retrieved from storage and put into memory until the buffer pool is full.

How long the data stays in the buffer pool depends on various aspects: the memory allocated to SQL Server, the size of your database, the workload, the type of queries executed, In-Memory OLTP usage, etc.
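A quick way to get a feel for this (a sketch, not a tuning guide) is the Page Life Expectancy counter, which roughly indicates how long pages survive in the buffer pool:

```sql
-- Higher values mean pages stay cached longer (value is in seconds)
SELECT object_name, counter_name, cntr_value AS seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy';
```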

When the buffer pool does not have enough space to allocate the desired memory, a data spill occurs: additional data is stored on tempdb, usually caused by inaccurate memory grants. Fortunately, SQL Server 2019 introduces memory grant feedback, which can help to reduce this problem; you can read more about it here.
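If you suspect inaccurate grants on your instance, a quick sketch for inspecting the memory grants of currently executing queries (large gaps between requested and used memory are a hint):

```sql
-- Current memory grants per session
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       used_memory_kb
FROM sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;
```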

Since SQL Server 2014, you can also configure buffer pool extensions; as the term suggests, this consists of an extension of the memory using faster storage (usually SSD or flash memory). This is an advanced feature, and you can learn more about it in the Microsoft documentation.
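For reference, enabling the extension is a single command; the file path and size below are only examples, so adjust them to your environment:

```sql
-- Enable a buffer pool extension file on fast storage
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
    (FILENAME = 'F:\SSDCACHE\BufferPoolExtension.BPE', SIZE = 16 GB);

-- And to turn it off:
-- ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION OFF;
```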

Related T-SQL Commands

Now that you have a concept of what data caching is and how it works, let us see it in action.

We will use the WideWorldImporters test database for this example. First we run some queries; it does not matter what you execute for this example, it can be any SELECT statement:

DECLARE  @RawValue decimal(10,2) = 25000

SELECT SUM(UnitPrice * Quantity) as RawValue
FROM Sales.InvoiceLines
HAVING SUM(UnitPrice * Quantity) > @RawValue;

SET @RawValue = RAND()*1500

SELECT InvoiceID, InvoiceLineID, UnitPrice,Quantity
FROM Sales.InvoiceLines 
WHERE TaxAmount > @RawValue

After that, we use the sys.dm_os_buffer_descriptors DMO to check what data is in memory; after making the query more readable, we can use it like this:

SELECT
    IIF(database_id = 32767,'ResourceDB',DB_NAME(database_id)) AS [Database],
    file_id, -- This file id is at the database level; it depends on the database you are executing the query on
    allocation_unit_id, -- can be joined with sys.allocation_units (database level DMO)
    row_count
FROM sys.dm_os_buffer_descriptors BD
ORDER BY [Database], row_count DESC;

Depending on the workload, we obtain something like this:

If you want something more compact, you can execute the query grouping it by database:

SELECT
    IIF(database_id = 32767,'ResourceDB',DB_NAME(database_id)) AS [Database],
    COUNT(row_count) AS cached_pages
FROM sys.dm_os_buffer_descriptors 
GROUP BY database_id
ORDER BY cached_pages DESC;

We obtain the results grouped by database:

In order to release the data from the cache, you run the DBCC DROPCLEANBUFFERS command:

DBCC DROPCLEANBUFFERS;

NOTE: Avoid running this command in production environments, since all the data will be released from memory and must be pulled from storage again, incurring additional IO usage.

If you have the buffer pool extension enabled and want to check its options, or if you want to determine whether the instance has it, use the sys.dm_os_buffer_pool_extension_configuration DMO as follows:

-- Check if you have a buffer pool extension enabled:
SELECT * FROM sys.dm_os_buffer_pool_extension_configuration;

The results are something like this (I don't have a buffer pool extension enabled on my machine):

If you want the script for the commands we discussed, you can download it from my GitHub repository.

Tuesday, May 7, 2019

Embed Azure Data Studio Notebooks in your website

Notebooks are a functionality available in Azure Data Studio that allows you to create and share documents that may contain text, code, images, and query results. These documents are helpful for sharing database insights and creating runbooks that you can distribute easily.

Are you new to notebooks? Don't know what they are used for? Want to know how to create your first notebook? Then you can get started with ADS notebooks by checking my article here.

Once you have created your first notebooks and shared them with your team, maybe you want to publish them on your website or blog for public access.
Even though you can share the file for download, you can also embed it in the HTML code.

In this post, I will show you how to do it.

What do you need?

We will use an online HTML converter, nbviewer, provided by the Jupyter website; on its homepage, you just have to provide the link to your .ipynb file (my GitHub Notebook repository for this example).

It looks something like this:

After clicking the Go! button, the next window will show you the notebook rendered:

At this point, you could share this link on your site and the user can click on it to see your notebook contents, but what if you want to show the results directly on your website?

Embedding it into your website

You can use the IFrame HTML tag (reference here); with this tag, you can embed an external URL in your HTML code (just be aware of the security risks of embedding external code in your application).
The code should look like this:

<iframe 
 width="600px" height="800px" 
 src="your nbviewer URL" >
</iframe>

The final result is this:

Not the best possible way ¯\_(ツ)_/¯, but it is something.

Going further

If you want to fine-tune the above results or host them on your own website, you can check the nbviewer GitHub repository and use the code locally.

Thursday, April 11, 2019

Creating logins and users in Azure Database

Azure SQL Database is the PaaS solution for SQL Server databases; in a previous post we discussed how to create one.

In this post, I want to show you how you can secure your Azure SQL Database by creating users and segregating their permissions.

When you connect to your Azure Database using SSMS (or another tool), you can see the management options are very limited compared to an On-Premises instance.

If you want to create a login and database user, you must create them via T-SQL, on this post I will show you how to do it.

Types of logins

Azure SQL Database supports two types of logins: SQL Server logins and Azure Active Directory logins.

In order to create Azure AD logins, you must first set up an AD administrator using the Azure portal; you configure it on the server dashboard, then accessing the Active Directory admin option, as follows:

Once you set up your AD Admin, you can connect to the Azure database using this account and you can then assign proper access to other AD accounts.

Creating logins and users

As mentioned before, you must create users and assign permissions using T-SQL; the basic script for creating users is as follows (this is for SQL logins):

Note that as a PaaS, you connect only to one database, so the USE <database> command is not supported on Azure databases; you must run the T-SQL scripts connected to the required database manually:

Script #1 - For creating the login:

/******* Run this code on the MASTER database *******/

-- Create login
CREATE LOGIN az_read
 WITH PASSWORD = 'YourStrongP@ssW0rd';

-- Create user in master database (so the user can connect using SSMS or ADS)
CREATE USER az_read
 FOR LOGIN az_read;

Script #2 - For the user database you want to provide access to:

/******* Run this code on the database you want to give the access ******/

-- Create the user in the database where you want to give the access
CREATE USER sqlreadusr
 FOR LOGIN az_read;

-- Add the user to the database roles you want
EXEC sp_addrolemember N'db_datareader', N'sqlreadusr';

Explaining it:

You first need to create the login and set up its password, following the Azure strong password requirements.
Then, if the user plans to connect to the instance using SSMS, ADS, or another tool where a default database is not specified, you must create the user in the master database (without roles, unless specific access is required).
The next step is to create the user on the database you want to provide access to.
Finally, you assign the roles you want for that particular user.

After that, you can connect with the user and verify the respective access:

For creating logins from Azure Active Directory, the script changes a little: you must create the login while connected to the database with another AD account (the administrator we configured earlier, or another AD user with enough privileges), then you specify the AD account followed by FROM EXTERNAL PROVIDER.

Once you are connected you must only change the first script, as follows:

/******* Run this code on the MASTER database *******/

-- Create login (replace with your own AD account)
CREATE LOGIN [your_user@yourdomain.com] FROM EXTERNAL PROVIDER;

-- Create user in master database (so the user can connect using SSMS or ADS)
CREATE USER [your_user@yourdomain.com] FOR LOGIN [your_user@yourdomain.com];
There is no change in script 2 to provide access to the user on a specific database.

Contained User

For more secure environments, you can create contained database users. This approach gives you a more portable database, with no need to worry about logins, and it is the recommended way to grant users access to Azure databases.

In order to create a contained user, just connect to the database where you want to provide the access and run the CREATE USER script as follows:

/**** Run the script on the Azure database you want to grant access *****/

CREATE USER contained_user 
WITH PASSWORD = 'YourStrongP@ssW0rd';


After creating the contained user, you can use it by specifying the database you want to connect in the connection options (for this example using Azure Data Studio):

You can see in the object explorer, we only have access to the database we connected, improving security and portability:

You can read more about this in the Microsoft official documentation here.

Wednesday, March 13, 2019

Quick tip: Zoom in Azure Data Studio

If you have finally given Azure Data Studio a try, and if you use it on a regular basis, maybe you want to customize it to suit your needs.

Among the huge number of customization options it has, you can control the text size in the form of zoom. To change it, just use the following keyboard combinations:

  • (Ctrl + = ) For Zoom in.
  • (Ctrl + - ) For Zoom out.
  • (Ctrl + 0) For Zoom reset.
You can see it in action:

Also, as with everything on this tool, you can access this functionality from command palette (Ctrl + Shift + P), and type "zoom" (you can fine-control the editor zoom from here):

If you haven't tried it yet, you can download Azure Data Studio here.