Archive System-Versioned Temporal Table History Data

I am a big fan of SYSTEM-VERSIONED TEMPORAL TABLES and the way they automatically store a historic row when data changes in a table. But many times, heavily updated tables can make the underlying historic table grow to an unmanageable size. In my organization, we typically want to keep that data for a long period of time, but really only need it locally in the database for thirty (30) days. To accomplish this goal, I’ve created a second database where rows older than 30 days can be off loaded. In this post I’ll demonstrate how I’ve made the process automatic.

What is a System-Versioned Temporal Table?

A Temporal Table is a system-versioned type of user table designed to keep a full history of data changes, allowing easy point-in-time analysis. Every temporal table has two explicitly defined columns, each with a datetime2 data type. These columns are referred to as period columns. These period columns are used exclusively by the system to record the period of validity for each row, whenever a row is modified. The main table that stores current data is referred to as the current table, or simply as the temporal table. It also contains a reference to another table with a mirrored schema, called the history table

Here is what the layout for a System Versioned Temporal Table could look like.

Temporal Table Example

https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver16

The Flow

My project is going to copy HISTORY table rows older than 30 days off to a second database and then delete the rows in the main database table. To this I will follow these steps:

  • Turn VERSIONING OFF on Temporal Tables
  • COPY rows from MAIN DB to HISTORIC DB
  • DELETE rows from MAIN DB
  • Turn VERSIONING ON

Automation

I will create a STORED PROCEDURE to handle the task, catch errors, and log the process. Then I will create a SQL AGENT job and set it to run once a week during a maintenance window.

Stored Procedure

I’ve named my STORRED PROC adm_HistoricTablesArchive and placed it in the MAIN database. Since this operates at the database level, I will need to copy the stored procedure to other databases where I wish to use the solution. In my HISTORIC DB I have already made a duplicate empty matching table for each _HISTORY table in my main database.

My PROCEDURE will:

  • Log steps to an audit table.
  • Find all SYSTEM-VERSIONED TEMPORAL TABLES
  • Ignore Lookup Tables
  • Move the data over 30 days old to a HISTORIC database

My PROCEDURE will NOT:

  • Will not create the required matching Historic tables
  • Will not update schema in matching Historic database

In the first part of my STORED PROC I will set up the variables I need and initiate a log entry.

/****** Object:  StoredProcedure [dbo].[adm_HistoricTablesArchive]    Script Date: 5/17/2024 8:23:04 AM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:		David Speight
-- Create date: 20240221
-- Version date: 20240507
-- Description:	Archive historic records
-- =============================================
CREATE PROCEDURE [dbo].[adm_HistoricTablesArchive]
@eRecipients varchar(1000), -- My email address
@eProfile_name varchar(50), -- My SQL Agent Email Profile Sender
@emailFailuresOnly bit, -- 0=Always email, 1=Only email if errors
@DaysToKeep INT, -- 30 used as an example
@ArchiveDBname VARCHAR(50) -- Your Historic Database Name, ie: MAIN_Historic
AS
BEGIN
	SET NOCOUNT ON;

	DECLARE @iLoop INT=0;
	DECLARE @currID INT=0;
	DECLARE @tSql NVARCHAR(MAX);
	DECLARE @SysTimeEnd datetime2(7);
	DECLARE @ErrorMsg varchar(max)=''; 	
	DECLARE @ParentAuditKey INT=1;
	DECLARE @PackageGUID UNIQUEIDENTIFIER;
	DECLARE @ExecStartDT DATETIME;
	DECLARE @InsertRowCnt INT=0;
	DECLARE @TableInitialRowCnt INT=0;
	DECLARE @TableFinalRowCnt INT=0;
	DECLARE @ErrorCount INT=0;
	DECLARE @eSubject VARCHAR(50)= 'adm_HistoricTablesArchive Error ' + CONVERT(VARCHAR(50),@@SERVERNAME);
	DECLARE @eBody VARCHAR(2500)= @eSubject;
	DECLARE @TableName varchar(50);

	SELECT	@SysTimeEnd = DATEADD(d,(0-@DaysToKeep), getutcdate()),
			@PackageGUID = NEWID();;

	BEGIN TRY
		-- Start an audit trail for this process.
		INSERT INTO Maintenance.dbo.DimAudit
		(ParentAuditKey,TableName,PkgName,PkgGUID,ExecStartDT,ExecStopDT,InsertRowCnt,TableInitialRowCnt,TableFinalRowCnt,SuccessfulProcessingInd)
		VALUES(@ParentAuditKey,'PARENT','History Table Archiving',@PackageGUID,getutcdate(),null,null,null,null,'N');
	END TRY
	BEGIN CATCH
		PRINT 'Maintenance.dbo.DimAudit FAILED, it may be unaccessible.';
	END CATCH;

	SELECT @ParentAuditKey = SCOPE_IDENTITY();

Next, I will create a table variable to hold a list of all the SYSTEM-VERSIONED tables in my database, excluding any with the word “lookup” in the table name.

-- Create Temp Table with all Table Names, Column Names
	DECLARE @tTables as TABLE (ID INT IDENTITY(1,1) NOT NULL, TableName VARCHAR(250), ColumnList VARCHAR(MAX))
	INSERT INTO @tTables (TableName, ColumnList)
	SELECT CONCAT(schema_name(h.schema_id) ,'.',h.name),   
	STRING_AGG(a.name,',')
	FROM sys.columns a
	INNER JOIN sys.tables h on a.object_id=h.object_id
	INNER JOIN sys.tables b on b.history_table_id = h.object_id
	WHERE h.name NOT LIKE '%Lookup%'
	GROUP BY CONCAT(schema_name(h.schema_id) ,'.',h.name);

Next, I will LOOP through the rows in the table variable and copy data over 30 days old from the MAIN DB to the HISTORIC DB for each item/table. I will also grab a count of the number of rows moved so I can log that and determine if I need to go back and delete those rows too.

SELECT @iLoop = MAX(ID) FROM @tTables;
	WHILE @iLoop > 0
	BEGIN
		SELECT	@currID = @iLoop,
			@ExecStartDT = GETUTCDATE();

		-- COPY ROWS to DW-Historic DB
		SELECT	@tSql = CONCAT('INSERT INTO ',@ArchiveDBname,'.',TableName,' (',ColumnList,') SELECT ',ColumnList,' FROM ',TableName,' WHERE SysTimeEnd < ''',@SysTimeEnd, ''' ORDER BY SysTimeEnd;'),
@TableName = TableName
		FROM @tTables WHERE ID = @currID;
		BEGIN TRY
		EXEC(@tSql);
		SELECT @InsertRowCnt = @@ROWCOUNT;
		END TRY
		BEGIN CATCH
		SELECT @ErrorMsg = @ErrorMsg + 'FAILED: ' + @tSql + ' | ';
		SELECT @ErrorCount = @ErrorCount+1;
		END CATCH;

Now I am ready to delete the rows from the source history table in the MAIN DB. In this section of the LOOP, I will TURN OFF SYSTEM-VERSIONING, delete the rows over 30 days in age, then TURN ON SYSTEM-VERSIONING.


		-- SAFEGUARD:  If NO rows were copied to DW, then no need to delete anything.
		IF @InsertRowCnt > 0 
		BEGIN
		--ALTER TABLE dbo.TestTemporal SET (SYSTEM_VERSIONING = OFF);
		SELECT @tSql = CONCAT('ALTER TABLE ',REPLACE(TableName,'_History',''),' SET (SYSTEM_VERSIONING = OFF);')
		FROM @tTables WHERE ID = @currID;
		BEGIN TRY
		EXEC(@tSql);
		END TRY
		BEGIN CATCH
		SELECT @ErrorMsg = @ErrorMsg + 'FAILED: ' + @tSql + ' | ';
		SELECT @ErrorCount = @ErrorCount+1;
		END CATCH;

		-- DELETE ROWS FROM PROD
		SELECT @tSql = CONCAT('SELECT @x = COUNT(*) FROM ',TableName,';')
		FROM @tTables WHERE ID = @currID;
		BEGIN TRY
		exec sp_executesql @tSql, N'@x int out', @TableInitialRowCnt out;
		END TRY
		BEGIN CATCH
		SELECT @ErrorMsg = @ErrorMsg + 'FAILED: ' + @tSql + ' | ';
		SELECT @ErrorCount = @ErrorCount+1;
		END CATCH;

		SELECT @tSql = CONCAT('DELETE FROM ',TableName,' WHERE SysTimeEnd < ''',@SysTimeEnd, ''';')
		FROM @tTables WHERE ID = @currID;
		BEGIN TRY
		EXEC(@tSql);
		END TRY
		BEGIN CATCH
		SELECT @ErrorMsg = @ErrorMsg + 'FAILED: ' + @tSql + ' | ';
		SELECT @ErrorCount = @ErrorCount+1;
		END CATCH;

		SELECT @tSql = CONCAT('SELECT @x = COUNT(*) FROM ',TableName,';')
		FROM @tTables WHERE ID = @currID;
		BEGIN TRY
		exec sp_executesql @tSql, N'@x int out', @TableFinalRowCnt out;
		END TRY
		BEGIN CATCH
		SELECT @ErrorMsg = @ErrorMsg + 'FAILED: ' + @tSql + ' | ';
		SELECT @ErrorCount = @ErrorCount+1;
		END CATCH;

		--ALTER TABLE dbo.TestTemporal SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE=dbo.TestTemporal_History,DATA_CONSISTENCY_CHECK=ON));
			SELECT @tSql = CONCAT('ALTER TABLE ',REPLACE(TableName,'_History',''),' SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE=',TableName,',DATA_CONSISTENCY_CHECK=ON));')
			FROM @tTables WHERE ID = @currID;
		BEGIN TRY
		EXEC(@tSql);
		END TRY
		BEGIN CATCH
		SELECT @ErrorMsg = @ErrorMsg + 'FAILED: ' + @tSql + ' | ';
		SELECT @ErrorCount = @ErrorCount+1;
		END CATCH;
		END;
	
		SELECT @ErrorCount = (CASE WHEN (@ErrorCount = 0 AND LEN(@ErrorMsg) > 0) THEN 1 ELSE @ErrorCount END);

Finally, we will log our work done in each iteration of the LOOP.

BEGIN TRY
	INSERT INTO Maintenance.dbo.DimAudit (ParentAuditKey,TableName,PkgName,PkgGUID,ExecStartDT,ExecStopDT,InsertRowCnt,ErrorRowCnt,TableInitialRowCnt,TableFinalRowCnt,SuccessfulProcessingInd)
	VALUES(@ParentAuditKey,@TableName,'History Table Archiving',@PackageGUID,@ExecStartDT,getutcdate(),@InsertRowCnt,@ErrorCount,@TableInitialRowCnt,@TableFinalRowCnt,
	(CASE WHEN @ErrorCount = 0 THEN 'Y' ELSE 'N' END) );
END TRY
BEGIN CATCH
	PRINT 'Maintenance.dbo.DimAudit FAILED, it may be unaccessible.';
END CATCH;

IF (select COUNT(*) FROM sys.tables h INNER JOIN sys.tables b on b.history_table_id = h.object_id WHERE h.name = REPLACE(@TableName,'dbo.','') ) = 0
BEGIN
	INSERT INTO Maintenance.dbo.DimAudit (ParentAuditKey,TableName,PkgName,PkgGUID,ExecStartDT,ExecStopDT,InsertRowCnt,ErrorRowCnt,TableInitialRowCnt,TableFinalRowCnt,SuccessfulProcessingInd)
	VALUES(@ParentAuditKey,@TableName,'Error: History linking',@PackageGUID,@ExecStartDT,getutcdate(),0,2,0,0,'N');
END;

PRINT @ErrorMsg;
SELECT @iLoop = @iLoop-1, @tSql='', @ErrorMsg='', @ExecStartDT=null,@InsertRowCnt=0,@TableInitialRowCnt=0,@TableFinalRowCnt=0, @ErrorCount=0;
END;

Lastly, I will update my log table and send an email if any part of the process threw an error.

-- Update the audit table
	BEGIN TRY
	SELECT @ErrorCount = SUM(ErrorRowCnt)
	FROM Maintenance.dbo.DimAudit
	WHERE ParentAuditKey = @ParentAuditKey;

	UPDATE Maintenance.dbo.DimAudit
	SET ParentAuditKey = @ParentAuditKey,
		ExecStopDT = GETUTCDATE(),
		SuccessfulProcessingInd = (CASE WHEN @ErrorCount = 0 THEN 'Y' ELSE 'N' END)
	WHERE AuditKey = @ParentAuditKey;
	END TRY
	BEGIN CATCH
		PRINT 'Maintenance.dbo.DimAudit FAILED, it may be unaccessible.';
	END CATCH;

	select top(1000) * from Maintenance.dbo.DimAudit 
	WHERE ParentAuditKey = @ParentAuditKey
	ORDER BY 1 DESC;

	-- Send the email if needed.
	IF (@ErrorCount > 0 or @emailFailuresOnly = 0) 
	BEGIN  
		SELECT @eBody = CONCAT(@eBody, ' The job was run and returned ', convert(varchar(24),@ErrorCount),' errors.'); 
		-- SEND EMAIL CODE GOES HERE
		begin try
			EXEC msdb.dbo.sp_send_dbmail
			@profile_name = @eProfile_name,
			@recipients = @eRecipients,
			@subject = @eSubject,
			@body = @eBody; 
		end try
		begin catch
			Print 'Error Email failed: adm_HistoricTablesArchive.';
		end catch;
	END;
END;

Agent Job

My SQL AGENT job will run on Tuesdays at 5 AM. I chose this time because many Holidays fall on a Monday, I come into work at 7AM, and I can immediately address anything that may have errored.

Troubleshooting and Logs

I will piggyback off other projects I’ve created in the past and store my logging in the dimAudit table in my MAINTENANCE database. Here is a sample from one of my weekly logs.

Sample Log

I will repost the code for the dimAudit table here:

CREATE TABLE [dbo].[DimAudit](
	[AuditKey] [int] IDENTITY(1,1) NOT NULL,
	[ParentAuditKey] [int] NOT NULL,
	[TableName] [varchar](50) NOT NULL,
	[PkgName] [varchar](50) NOT NULL,
	[PkgGUID] [uniqueidentifier] NULL,
	[PkgVersionGUID] [uniqueidentifier] NULL,
	[PkgVersionMajor] [smallint] NULL,
	[PkgVersionMinor] [smallint] NULL,
	[ExecStartDT] [datetime] NOT NULL,
	[ExecStopDT] [datetime] NULL,
	[ExecutionInstanceGUID] [uniqueidentifier] NULL,
	[ExtractRowCnt] [bigint] NULL,
	[InsertRowCnt] [bigint] NULL,
	[UpdateRowCnt] [bigint] NULL,
	[ErrorRowCnt] [bigint] NULL,
	[TableInitialRowCnt] [bigint] NULL,
	[TableFinalRowCnt] [bigint] NULL,
	[TableMaxDateTime] [datetime] NULL,
	[SuccessfulProcessingInd] [char](1) NOT NULL,
 CONSTRAINT [PK_dbo.DimAudit] PRIMARY KEY CLUSTERED 
(
	[AuditKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[DimAudit] ADD  CONSTRAINT [DF__DimAudit__TableN__1DE57479]  DEFAULT ('Unknown') FOR [TableName]
GO

ALTER TABLE [dbo].[DimAudit] ADD  CONSTRAINT [DF__DimAudit__PkgNam__1ED998B2]  DEFAULT ('Unknown') FOR [PkgName]
GO

ALTER TABLE [dbo].[DimAudit] ADD  CONSTRAINT [DF__DimAudit__ExecSt__1FCDBCEB]  DEFAULT (getdate()) FOR [ExecStartDT]
GO

ALTER TABLE [dbo].[DimAudit] ADD  CONSTRAINT [DF__DimAudit__Succes__20C1E124]  DEFAULT ('N') FOR [SuccessfulProcessingInd]
GO

ALTER TABLE [dbo].[DimAudit]  WITH CHECK ADD  CONSTRAINT [FK_dbo_DimAudit_ParentAuditKey] FOREIGN KEY([ParentAuditKey])
REFERENCES [dbo].[DimAudit] ([AuditKey])
GO

ALTER TABLE [dbo].[DimAudit] CHECK CONSTRAINT [FK_dbo_DimAudit_ParentAuditKey]
GO