Tuesday, July 17, 2018

PowerBI: Implement a What If parameter

Our Guinea Pig Dashboard
PowerBI is focused on creating interactive, analytical reports from many different data sources, with the ability to access them from mobile devices.

One of the cool features that make your reports interactive is What If parameters: they let you visualize how your data could or should behave under different scenarios.
In this post, I will show you a super simple way to use a What If parameter.

Note: This test was done using PowerBI desktop and the Import option in the data source selection.



We will use a simple dashboard representing sales by region in a specific year, like this one:



Let's say we want to plan next year's sales by adding a line representing the next year's target by region; for this example, the target is simply a value between 0 and 200% of this year's sales.

Adding the What If Parameter


Click the Modeling tab and select the New Parameter option.



A new window will open where you configure the values for your parameter.
For this example, we configure a decimal value between 0 and 2.
Leave the Add slicer option checked to add the value selection box to your dashboard.


The parameter visualization is added, but as a multiple select option.



We change the type of slicer to Single value (you can select another option if you want).




Adding a new measure


We will demonstrate how the newly created parameter works using a measure that multiplies the year's sales by the value we select in the slicer.
Of course, you can use the parameter in more complex formulas that modify data values more deeply and less obviously than our example, but for now we will keep it simple.

We proceed to create a new measure on our data table: select the header of the table and choose the New Measure option.



Then enter the formula you want to use; for our example, the sum of the year's sales multiplied by our parameter, to simulate next year's sales.
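As a reference, the measure could look something like the DAX below. This is only a sketch: the table, column, and parameter names (Sales, SalesAmount, Next Year Target) are assumptions, so adjust them to match your own model. SELECTEDVALUE reads the value currently chosen in the parameter slicer (defaulting to 1 when nothing is selected).

Next Year Sales Target =
SUM ( Sales[SalesAmount] )
    * SELECTEDVALUE ( 'Next Year Target'[Next Year Target], 1 )  // hypothetical names; use your own table, column, and parameter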



You can see that the measure is displayed like any other column in our table.



Modify your visualizations


Now just modify the dashboard by adding the new measure to any visualization you want, and save it.
Then you can change the slicer and the value will update in real time.

A simple analytical report with a few clicks :)


Sources:

https://docs.microsoft.com/en-us/power-bi/desktop-what-if

https://docs.microsoft.com/en-us/power-bi/desktop-use-directquery

Tuesday, July 10, 2018

Display SELECT query or table as HTML

https://bertwagner.com/2018/07/03/code-youd-hate-to-live-without-t-sql-tuesday-104-invitation/
This post is part of the #tsql2sday event :)
In my current job I support a lot of SQL Servers; they come in all flavors and sizes (old, new, patched, unpatched, big, small, dedicated, shared, built by our team, built by others... etc.). In this ocean of possible configurations, having a centralized monitoring solution is sometimes not possible for 100% of the servers, for different reasons (unsupported features on older versions, servers located in a different domain or datacenter, some weird security restrictions).

Even though we already have a custom inventory and monitoring solution in place (BTW, built with C#, ASP.NET, SQL Server and PowerShell), a couple of rogue servers need their own special treatment; in most cases this has to do with some kind of monitoring.

So, for these servers, the solution is to implement local jobs that monitor different kinds of stuff and send the monitoring and usage results via email.
We could perfectly well send the info in plain text, but reading that in an email, especially when your boss is on the distribution list, is just awful, so we had to format the data first to make it look better...

That's why I created sp_TabletoHTML, a stored procedure that converts any table or SELECT query into an HTML table, so you just have to paste the output into your email body.

Before this stored procedure, we had to hand-craft the HTML conversion for each table's columns, and after adapting the same code 3 or 4 times, I got tired and ended up writing this stored procedure.

How does it work?


  • Create the stored procedure and remember where you left it (it can be the master database; it really doesn't matter as long as you can access it properly).
  • Then you pass in @stTable (yes, a Unicode varchar, in case you have weird characters in your object names) the name of the table or view containing the data, using the schema.name convention; you can also pass a SELECT statement, which lets you format or clean up your data before converting it to HTML.
  • You can specify whether to include column names (included by default) and which predefined style you want, or no style at all (or just leave the parameters at their defaults).
  • Execute the stored procedure and use the output string as your HTML <table> result.
  • Enjoy!

Sample execution:

DECLARE @SQLSentence NVARCHAR(MAX)  -- query to convert
DECLARE @st NVARCHAR(MAX)           -- will hold the generated HTML

SET @SQLSentence = 'SELECT 
name,
state_desc,
create_date,
collation_name 
FROM sys.databases'

EXEC sp_TabletoHTML @stTable = @SQLSentence,
 @TableStyle = 3,
 @RawTableStyle = @st OUTPUT

And you will obtain a beautiful looking HTML table like this one:

Formats can vary...

What is the logic involved?


Just a little tweaking of the FOR XML functionality, plus a lot of column reformatting.
This is the core of the formatting: every column is aliased as TD, which translates into <td></td> elements, and each row becomes an element named <tr></tr>:


SELECT [COL1] as TD, [COL2] as TD...
FROM ##rowstablePreHTML 
FOR XML RAW('TR'), ELEMENTS
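If you want to see the trick in isolation, here is a minimal, self-contained sketch (it queries sys.databases instead of the procedure's internal temp table):

-- Columns aliased as TD become <TD> elements inside each <TR> row element
SELECT [name]       AS TD,
       [state_desc] AS TD
FROM sys.databases
FOR XML RAW('TR'), ELEMENTS
-- Output looks like: <TR><TD>master</TD><TD>ONLINE</TD></TR><TR>...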


Since some of the queries are built dynamically, I use sp_executesql, which could be a security risk with untrusted sources.

For internal use we did not check for malicious code, because... I trust my own code, but before releasing it on GitHub I added a very basic malicious-code check (it just looks for keywords that have nothing to do with table results) that raises an error for any occurrence found. As a recommendation, always validate your queries first.

Null values are not converted to an empty string (because of the particular way XML handles nulls), so remove null values from your data before using it.

Some special data types like geography, timestamp, XML, and image are not supported; if you try to use them, an error will be raised, so remove these columns before using it.

Where is the code?

Fortunately for you, it is free!

I offer this stored procedure as open source on GitHub; you can check it here.
The link is also in the Free Tools section of this blog.

The full documentation is also available on the Github page, so you can check the parameter usage and some other examples there!

Also, if any of you can improve this SP, please do; you are more than welcome!

Tuesday, July 3, 2018

Saving disk space by choosing the correct data type

"Database is almost 4 TB in size, user is complaining on long times for DEV data refresh and high storage costs for database and backups. Full backups are taking a long time to complete, even with compression enabled. 
After a quick check, we discover tables that stores data from 15 years ago, without any purge or historical archive process implemented, most of the data stored is only with logging purposes, using incorrect data types and storing a lot of nulls. After changing data types and implementing data purge processes, database size was reduced to almost 90 GB, drastically improving their application performance and reducing storage costs and refresh times."
- true story... or not -


Designing our databases is something that must be done carefully, with a clear idea of how our application will work. Choosing the right data types, especially for databases that will hold high volumes of data, can drastically improve application response times and reduce the storage required.

We will compare the storage usage of the most common data types. To display the table sizes I'm using my database dashboard report.

Dates


Always analyze whether you really need to store the time portion at all and, if you do, what precision your application actually needs.

We create a different table for each data type we want to test: datetime, smalldatetime, datetime2, datetime2(0), and date.


CREATE TABLE dbo.testDatetime
(
    colDate datetime NOT NULL
);
----------------------------------------
CREATE TABLE dbo.testSmallDatetime
(
    colDate smalldatetime NOT NULL
);
----------------------------------------
CREATE TABLE dbo.testDatetime2
(
    colDate datetime2 NOT NULL
);
----------------------------------------
CREATE TABLE dbo.testDatetime20
(
    colDate datetime2(0) NOT NULL
);
----------------------------------------
CREATE TABLE dbo.testDate
(
    colDate date NOT NULL
);




Then we proceed to insert the same info into each table (500,000 rows, to have a good sample space):

INSERT INTO testDatetime VALUES(getdate())
INSERT INTO testSmallDatetime VALUES(getdate())
INSERT INTO testDatetime2 VALUES(getdate())
INSERT INTO testDatetime20 VALUES(getdate())
INSERT INTO testDate VALUES(getdate())

GO 500000

After the inserts finish, we can see the different storage usage, as well as the data stored in each table.

Datetime needs around 20% more space than Date
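If you don't have a dashboard report handy, a quick way to compare the space used by the test tables is a query like the sketch below (sys.dm_db_partition_stats is just one option; sp_spaceused per table works too):

-- Rough space comparison for the test tables (run it in the test database)
SELECT OBJECT_NAME(ps.[object_id]) AS TableName,
       SUM(ps.used_page_count) * 8 AS UsedKB
FROM sys.dm_db_partition_stats AS ps
WHERE OBJECT_NAME(ps.[object_id]) LIKE 'test%'
GROUP BY OBJECT_NAME(ps.[object_id])
ORDER BY UsedKB;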



Strings


We need to check the maximum possible length of the data to be stored, whether we need to store Unicode characters, and whether we need a fixed or variable length.

We create different tables with each datatype we want to test: Char, VarChar, NChar, NVarChar.


CREATE TABLE dbo.testChar
(
    colString char(20) NOT NULL
);
-------------------------------
CREATE TABLE dbo.testVarChar
(
    colString varchar(20) NOT NULL
);
-------------------------------
CREATE TABLE dbo.testNChar
(
    colString nchar(20) NOT NULL
);
-------------------------------
CREATE TABLE dbo.testNVarChar
(
    colString nvarchar(20) NOT NULL
);
 

Again we insert the same info into each table (500,000 rows as well).

DECLARE @i int
SET @i =RAND()*20

INSERT INTO testChar VALUES(REPLICATE ('$' ,@i ) )
INSERT INTO testVarChar VALUES(REPLICATE ('$' ,@i ) )
INSERT INTO testNChar VALUES(REPLICATE ('$' ,@i ) )
INSERT INTO testNVarChar VALUES(REPLICATE ('$' ,@i ) )

GO 500000

We can see which data types use more storage space; storing Unicode vs. non-Unicode characters makes a big difference, so this can be one of the first things to check at the design level.

A good datatype for strings could make a big difference!

'X' is too mainstream; besides, everybody loves '$'

Numbers


For these data types the requirements are more straightforward, so I will simply show you the different storage usages.

Again we create one table for each data type to check (since for numbers there are a lot of parameters and various subtypes, we will only check the most common with the defaults): Int, Decimal, Numeric, Money (exact types) and Real, Float (approximate types).


CREATE TABLE dbo.testInt
(
    colNumber int NOT NULL
);
-------------------------------
CREATE TABLE dbo.testDecimal
(
    colNumber decimal NOT NULL
);
-------------------------------
CREATE TABLE dbo.testNumeric
(
    colNumber Numeric NOT NULL
);
-------------------------------
CREATE TABLE dbo.testMoney
(
    colNumber money NOT NULL
);
-------------------------------
CREATE TABLE dbo.testFloat
(
    colNumber float NOT NULL
);
-------------------------------
CREATE TABLE dbo.testReal
(
    colNumber real NOT NULL
);
 
 

We insert the same info into each table (500,000 rows as well).

DECLARE @i int
SET @i =RAND()*1000

INSERT INTO testInt VALUES(@i/3.141592)
INSERT INTO testDecimal VALUES(@i/3.141592)
INSERT INTO testNumeric VALUES(@i/3.141592)
INSERT INTO testMoney VALUES(@i/3.141592)
INSERT INTO testFloat VALUES(@i/3.141592)
INSERT INTO testReal VALUES(@i/3.141592)

GO 500000

And again, with the default parameters for each data type, we can see the storage used by each one. It is always important to know your data and investigate the numbers that will be stored in your database so you can optimize space usage.

By default, the numeric and decimal data types use the most space; this can change based on the precision and scale specified.

Note that the decimal and numeric defaults have no decimal digits,
so the inserted values were rounded.
Make your own adjustments for specific tests.
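As an illustration of how much the parameters matter, here is a small sketch. Per the documented storage for decimal, precision 1-9 takes 5 bytes per value and 10-19 takes 9 bytes, so the default decimal(18,0) uses 9 bytes while a narrower definition uses 5:

-- decimal with explicit precision/scale: 5 bytes per value instead of 9
CREATE TABLE dbo.testDecimal92
(
    colNumber decimal(9,2) NOT NULL
);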

You can check the different data types SQL Server offers, with their storage requirements and limitations, in the link below. Having a better understanding of how each one works will help you design more efficient applications and reduce your future administration time, or maybe optimize databases already in place just by adjusting column types :)

Please note that these are only general guidelines and that I used test data; always make sure to perform your own tests at the development stage and make your own comparisons.

Source:

https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-types-transact-sql?view=sql-server-2017


Thursday, June 28, 2018

Check instant file initialization and lock pages in memory options

Let your server reports do the work for you!

Instant file initialization and lock pages in memory are two Windows operating system options that can help us improve overall SQL Server performance.
These options are already checked in the Server Dashboard report available in my free tools.

We will talk about both options and how to check them.

Note that these options apply to SQL Server running on Windows; for Linux systems you can ask Carlos Robles via his webpage or Twitter. He has a lot of experience working with databases (SQL Server, Oracle, MySQL, Sybase) running on UNIX environments.


Instant file Initialization


When you create any file on the operating system, any data left on the disk is overwritten by filling the whole file with zeros, which can take some time depending on the file size.
For SQL Server this happens when you restore a backup, when a file auto-grows, or when you create a data/log file (if you set an initial size of several GB, this zeroing can take a long time to complete), so it can impact SQL Server performance.
To avoid that, the zeroing operation can be skipped, leaving the "dirty" data on disk and speeding up file operations (note this applies to data files only; log files are always zero-initialized).

To check if instant file initialization is enabled


Just check the error log and search for the "instant file initialization" entry:


exec master..xp_readerrorlog 0, 1, N'instant file initialization'

This will tell you whether it is enabled or disabled for your system.
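On more recent builds (SQL Server 2016 SP1 and later, if I remember the version correctly), you can also check it directly from a DMV instead of reading the error log; a quick sketch:

-- instant_file_initialization_enabled returns Y or N per service
SELECT servicename,
       instant_file_initialization_enabled
FROM sys.dm_server_services
WHERE servicename LIKE 'SQL Server (%';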



To enable Instant File Initialization


Starting with SQL Server 2016, this option can be configured during setup by checking the "Grant Perform Volume Maintenance Task privilege to SQL Server Database Engine Service" option.


If the option was not enabled during setup, or for older versions, you have to add the SQL Server service account to the Perform volume maintenance tasks local policy on your system.
You can check how to do it here.


Lock Pages in Memory


This option allows a given OS process to keep data in memory instead of paging it to disk (virtual memory).
By enabling it we can improve performance, because the SQL Server process reads pages from physical memory (provided, of course, that the page is available in memory).

To check if Lock Pages in Memory is enabled


Run the following query:


/*Lock pages in memory*/
SELECT
MN.locked_page_allocations_kb
FROM sys.dm_os_memory_nodes MN
INNER JOIN sys.dm_os_nodes N 
ON MN.memory_node_id = N.memory_node_id
WHERE N.node_state_desc <> 'ONLINE DAC'

If lock pages in memory is enabled, the value will be greater than 0.

To enable Lock Pages in Memory


You have to add the SQL Server service account to the Lock pages in memory policy.
You can check how to do it here.


Also remember, these options are already checked in the Server Dashboard report available in my free tools.

 

 

Sources:

 

https://docs.microsoft.com/en-us/sql/relational-databases/databases/database-instant-file-initialization?view=sql-server-2017

https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/enable-the-lock-pages-in-memory-option-windows?view=sql-server-2017


Thursday, June 21, 2018

What is Page Life Expectancy (PLE)?

Which cup would you prefer if you want to make fewer trips to the coffee maker?
Let us imagine this: every morning when you get to the office, you must drink coffee to be fully awake; if you are a coffee lover, maybe you even have your own personalized 24 oz cup for it.
Let's say one day you lose your cup and have to grab one of the tiny, ugly cups available in the office. Now you have to refill it several times to drink the same 24 oz, which means leaving your desk several times.

Something similar happens with Page Life Expectancy (PLE). This is a measure that indicates how long a data page stays in memory (in seconds) so it can be reused by other SQL processes without having to read the same page from disk again (which translates into less I/O and less overhead). With this in mind, the higher the value the better, but how high is high enough for our system?


Right PLE value for my SQL Server


A long time ago, when servers were not so powerful, the recommendation was to keep PLE higher than 300. Times have changed, and servers too, so the current recommendation is a PLE higher than 300 for every 4 GB of RAM assigned to SQL Server (based on max server memory if it is set, or on the total server memory if it is not).

Keep in mind that you have to measure this value over a period of time and then average it to have a more consistent value or to identify sudden drops and correlate them with queries executed at that time.
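One simple way to do that is to capture the counter on a schedule into your own table, for example with an Agent job. A minimal sketch (the table name and schedule are up to you):

-- Run once to create the history table
CREATE TABLE dbo.PLE_History
(
    CollectedAt datetime2(0) NOT NULL DEFAULT (SYSDATETIME()),
    PLE_Seconds bigint NOT NULL
);

-- Schedule this INSERT (e.g. every 5 minutes) and average/graph the results later
INSERT INTO dbo.PLE_History (PLE_Seconds)
SELECT cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Manager%'
 AND [counter_name] = 'Page life expectancy';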

To Calculate it


I take it from the performance counters, more specifically from the Buffer Manager object.
To calculate it I use the following T-SQL (please note that this only gives you the current value):


DECLARE @SQLmemory NUMERIC(9, 2)
DECLARE @Maxmemory INT

SET @SQLmemory = (
  SELECT physical_memory_kb / 1024.0 / 1024.0
  FROM sys.dm_os_sys_info
  )
SET @Maxmemory = (
  SELECT convert(INT, value_in_use)
  FROM sys.configurations
  WHERE [name] = 'max server memory (MB)'
  )

IF (@Maxmemory) <> 2147483647
BEGIN
 SET @SQLmemory = @Maxmemory / 1024.0
END

SELECT [counter_name]
 ,@SQLmemory AS [Server Memory_GB]
 ,cast([cntr_value] AS INT) AS [PLE_Current]
 ,(@SQLmemory / 4) * 300 AS [PLE_Expected]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Manager%'
 AND [counter_name] = 'Page life expectancy'

What if the value is too low?


This means SQL Server has to flush memory very often. It could be a symptom that your SQL Server is struggling with the currently allocated memory, so you need to tune your database by checking (but not limited to):
  • Check your indexes (missing or unused)
  • Select statements that return data you don't need
  • Tune your queries
  • Check your statistics
  • Check for ad-hoc queries
If none of the above works, then you should consider adding more memory to the server.


There are two posts I want to recommend for further reading on this topic:

NUMA considerations for PLE from Paul Randal
Fixing PLE from Steve Hood

Monday, June 11, 2018

Collation considerations to create robust queries

Is your code "This" robust?
What is collation? In short: it is the set of sorting rules and the case and accent sensitivity properties for your data (if you want to read more, click here).


If you work in database development or you support a lot of user databases, then you must be aware of the database collation at the moment you write your queries, to avoid execution failures at the deployment stage.


In this post, I will show you some of the most common collation-related errors and how to avoid them, so you can write solid T-SQL queries.


1. Your DEV machine (all defaults)


If you just install SQL Server on your dev machine and use all the defaults, including collation, it is very likely you won't have any issues of this kind, so let us take the following T-SQL as an example:

USE School;

--Current database and master 
SELECT name, collation_name FROM sys.databases
where database_id = DB_ID() or database_id =1; 


SELECT DEP.* 
FROM Department dep

It will execute without any issues:


So now, let us change the collation of the School database:


USE [master];

ALTER DATABASE [School] COLLATE SQL_Latin1_General_CP1_CS_AS;


Collations are different now

After this change, the fun begins...

2. The column prefix '<something>' does not match with a table name or alias name used in the query


Executing the same T-SQL code, now with a different collation, will raise an error:



To solve this, verify that the case of the alias, object, and column names is consistent, and make it the same across your whole query:
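For reference, a sketch of the corrected query from the example above (under a case-sensitive collation, DEP and dep are treated as different identifiers, so the alias must be written with the same case everywhere):

-- Alias case now matches the reference in the SELECT list
SELECT DEP.*
FROM Department AS DEP;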




3. The multi-part identifier "<something>" could not be bound


For our second error: if we add a WHERE clause to the query, again with a different case, an error appears:



To solve this, as with the previous error, verify that the case of the alias, object, and column names is consistent, and make it the same across your whole query:
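A sketch of what the corrected query could look like (the column used in the WHERE clause, Name, is only an assumption based on the School sample schema; the point is that the DEP alias is written with the same case everywhere):

-- Consistent alias case in both the SELECT list and the WHERE clause
SELECT DEP.*
FROM Department AS DEP
WHERE DEP.Name = 'Engineering';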



4. Cannot resolve the collation conflict between "<Collation 1>" and "<Collation 2>" in the equal to operation


For our third error, we will use a different query that joins tables from different databases (and maybe different collations):


SELECT QUOTENAME(SU.[name]) AS UserName 
          ,QUOTENAME(SP.[name]) AS PrincipalName 
FROM sys.sysusers AS SU -- Table located on the database
LEFT JOIN sys.server_principals AS SP -- Table located on [master]
ON SU.[name]  = SP.[name] 
AND SP.[type] = 'S';

Even when we have applied the case rules explained before, we get another kind of error when the databases use different collations:



To solve this, you must apply the same collation to both sides of the join, using COLLATE DATABASE_DEFAULT, as you can see here:
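A sketch of the same join with the collation forced on both sides of the comparison:

SELECT QUOTENAME(SU.[name]) AS UserName
          ,QUOTENAME(SP.[name]) AS PrincipalName
FROM sys.sysusers AS SU -- Table located on the database
LEFT JOIN sys.server_principals AS SP -- Table located on [master]
ON SU.[name] COLLATE DATABASE_DEFAULT = SP.[name] COLLATE DATABASE_DEFAULT
AND SP.[type] = 'S';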



One important thing: you only need this change where you join across databases; if your joins stay within the same database, it is not necessary.


Now you have the knowledge to build more robust T-SQL queries that can be deployed to a wide range of environments and configurations and are less likely to fail.

Friday, May 25, 2018

Add a custom report to SSMS

Maybe some of you don't know this, but to view Report Definition Language (RDL) reports, you aren't forced to use SQL Server Reporting Services or have a report server running on the machine.

With the latest version of SSMS you are able to view any report; just keep two things in mind when creating your report so it works in SSMS:
  1. Use an embedded dataset for your report and point the server name to (local).
  2. Add the database name to your objects if you want a specific database; otherwise, SSMS will use the current database (or master by default). See the sketch after this list.
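As an illustration of point 2, a hypothetical embedded dataset query using three-part names (msdb and sysjobs are just an example target; use your own database and objects):

-- Three-part names keep the report pointed at a specific database,
-- regardless of where the report is launched from
SELECT j.[name] AS JobName,
       j.[enabled]
FROM msdb.dbo.sysjobs AS j;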
You can create RDL reports (among other projects) with the free tool SQL Server Data Tools (SSDT), which is the replacement for the old Business Intelligence Development Studio (BIDS).

Once you have created your report, it is just a matter of importing it into SSMS to use it. I will show you how to do it.

Add the report to SSMS


For this example I am using one of my open source reports for server monitoring.

First, connect to your server using SSMS and right-click the server or a database, then select Reports > Custom Reports...


Locate the report file in Windows Explorer. Note: if you want a different name for the report, rename the file at this point, then select it and click Open.


A warning will appear for any custom report you add, and again I tell you: always review the source of any custom code before using it, to avoid malicious executions against your servers, and test it on a non-prod server first. As advice: trust no one.
If you trust the source, then ignore the warning and click Run.


After that, the custom report will appear; note that the connection has been changed to the current server where the report is being executed.
BE CAREFUL: the report runs with the credentials of the user connected to SSMS at that point, so keep this in mind for sensitive information and object permissions.



After that, the report will be available for you to run at any time.



To Delete it


Reports are not "imported" to SSMS, so if you want to delete it, just delete the rdl file on the windows explorer, after that if you select it again in SSMS it will prompt to delete it from recent list.