Monday, March 30, 2020

Administration Basics: Point in time recovery

Nowadays, data is a precious asset for companies today. If you are a database administrator (by decision or by mistake) or simply you are the "IT guy," you have the mission of guarantee all the data is backed up and accessible for recovery.

Trust me, even when you could think you have the more reliable hardware on the planet, or you have multiple database replicas around the globe, anything can happen (a user deleting an entire schema by mistake, an application updating the wrong records, some process crashing, a lot of things can happen).
So trust me and don't question me, just backup all your databases regularly.

The Backup/restore strategy is a broad topic, so for this post, we will focus on the restore activity, so if you don't know how to back up a database yet, you can start here.


What do we need to know?

I like to think of the SQL backups like a pyramid, like this:

Pyramid of backups

For restoring your database to a particular time, you must perform a series of restores:
  • Full backup BEFORE the date you want. 
  • If available, the closest differential backup (this can be omitted if your diff backup is after the date you want or you don't have it).
  • Each T-log backup in order until the required date (we will see an example later).
It is essential to know that only the FULL database recovery model supports point-in-time recovery; with Bulk-logged, you are only allowed to restore the entire transaction log backup.

Set up our example

For this example, I have created a database named DB_RestoreDemo in SQL 2019. On it, I have a table called dates_demo with no relevant information, just a date and a varchar field for this example.

USE [DB_RestoreDemo]
GO

CREATE TABLE [dbo].[dates_demo](
 [id] [bigint] IDENTITY(1,1) NOT NULL,
 [date] [datetime2](7) NOT NULL,
 [text] [nvarchar](100) NOT NULL,
 CONSTRAINT [PK_dates_demo] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]

GO

I populate the table with some records (save it as we will use the same insert later):

USE DB_RestoreDemo
GO

INSERT INTO dbo.dates_demo
VALUES(GETDATE(),replicate('|',rand()*100))

GO 50

Then take a full backup:

BACKUP DATABASE [DB_RestoreDemo] TO  
DISK = N'C:\SQLServer\DB_RestoreDemo_full.bak' 
WITH COMPRESSION
GO

After that, execute the INSERT statement above, but replace the number after the GO with a significant number, so it takes a lot of time to complete, in my case I will execute it 10 million times:



While it is running, open a new query window and perform some Differential and Transaction log backups as you prefer (in any order and as many times you want), just remember to change the filenames each time to have them quickly identified:


-- Differential backup
-- Remember to change the filename each time

BACKUP DATABASE [DB_RestoreDemo] 
TO  DISK = N'C:\SQLServer\DB_RestoreDemo_diff_1.bak' 
WITH  DIFFERENTIAL, COMPRESSION
GO

-- Transaction Log backup
-- Remember to change the filename each time

BACKUP LOG [DB_RestoreDemo] 
TO  DISK = N'C:\SQLServer\DB_RestoreDemo_1.trn' 
WITH COMPRESSION
GO

Once you are happy with the backups taken, you can stop the INSERT query or let it run until it finishes... depends on you.

We have what we need for our example now.


Let's do it!


Open the table and select a random record, we will try to restore the database as close as possible to the date of that record.

SELECT * 
FROM dbo.dates_demo
ORDER BY [date];

I have selected this random record; we must guarantee that the last record of the table is this one (or very close in case the millisecond resolution include other records):
Id: 613834, date: 2020-03-30 01:49:50.5366667



Then, go to the folder where your backups are located, and check the modified date of the files, you have three options depending on which record you have selected:


  • Option 1: The time you want to restore is before the full backup taken: Congratulations! You know what data loss is now, good thing this is a demo. You cannot go to that point in time if you don't have the required backups.
  • Option 2: The date you want to restore is after any backup was taken: You can make another transaction log backup.
  • Option 3: The date you want to restore is in range, just locate the file with the closest creation date after the date you want.

In my case, is this file:

 

Based on the pyramid we saw before, this must be my restore strategy:


  1. Restore 1 FULL backup - DB_RestoreDemo_full.bak
  2. Restore 1 DIFF backup - DB_RestoreDemo_diff_1.bak
  3. Restore 1 Transaction Log backup - DB_RestoreDemo_2.trn

We will restore these backups to a new database DB_RestoreDemo_NEW, so we don't overwrite existing.

Please note that except for the last one, all restore operations must be done with the NORECOVERY option.

For the last transaction log file restored, we will use the STOPAT option; this option allows us to restore a database to a specific date (our example), a particular LSN, or a transaction mark (we will see that on another post).

NOTE: for this example, we used only INSERT statements without explicit transactions, but in a real-life scenario, the STOPAT option with a date parameter will restore the database to the latest committed transaction to that point in time.

 Once we have our restore strategy in place, we create the restore scripts:

USE [master]
GO

RESTORE DATABASE [DB_RestoreDemo_new] 
FROM  DISK = N'C:\SQLServer\DB_RestoreDemo_full.bak' 
WITH  FILE = 1,  
MOVE N'DB_RestoreDemo' 
TO N'C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019\MSSQL\DATA\DB_RestoreDemo_new.mdf',  
MOVE N'DB_RestoreDemo_log' 
TO N'C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019\MSSQL\DATA\DB_RestoreDemo_new_log.ldf',  
NORECOVERY;

GO

RESTORE DATABASE [DB_RestoreDemo_new] 
FILE = N'DB_RestoreDemo' 
FROM  DISK = N'C:\SQLServer\DB_RestoreDemo_diff_1.bak' 
WITH  FILE = 1, 
NORECOVERY;

GO

-- Because of the resolution on the decimals in datetime type
-- the increments are .003, .007 and .000, so the closest one
-- to the date we want .537 is .540
RESTORE LOG [DB_RestoreDemo_new] 
FROM  DISK = N'C:\SQLServer\DB_RestoreDemo_2.trn' 
WITH  FILE = 1, STOPAT = N'2020-03-30T01:49:50.540';

GO

 We navigate to the restored database, and if everything is ok, our selected record should be the last one (or very close to):



We have successfully restored a new database to the required point in time. A few things to have in mind:

  • You can specify the STOPAT option with a LSN or a transaction mark; this is useful if you have a critical process you can write your application to "mark" it for easy restore.
  • You just need the latest FULL and DIFF backup to the required date, but need all the TLog backups after that DIFF (or full if you don't have it).
  • In case the date you want to restore is included on a DIFF backup, you will need to ignore that DIFF backup and restore all the TLOG backups since the last FULL, including the first one after the DIFF backup.
  • For Bulk-logged recovery mode, you cannot perform point-in-time recovery, but you can perfectly restore log backups, but the entire content is restored.

And remember to always test your backups by restoring them regularly, with this you can achieve three things:
Confidence in the restore process, this will help you to remain calm under a stressful situation.
You can prepare the scripts you need to do it before anything bad happens.
You can estimate the time you need to do it, and check if that aligns with your RPO and RTO.