Change Data Capture (CDC) Setup, Update, Reenable

There is a great post on MSSQLTIPS on how to RESTORE a database and keep the CDC (Change Data Capture) information. But what happens when this fails, or something goes wrong, and you need to reenable it, or you’re setting it up for the first time?

One company I work for, uses native backups made on an AWS RDS server that are copied to S3. Those backups are moved to a QA environment where code can be tested against real data. Sometimes in this process the CDC fails to set up due to user error. We don’t typically need all the production CDC data anyway, so it seems quicker to just reenable CDC rather than dropping the databases and restoring them again.

Additionally, there are instances where I DO NOT want PRODUCTION CDC DATA moved. For example, in a DEV environment I’m going to clean all the PI data. In that case, it makes more sense to RESTORE the database initially without CDC data and then enable CDC after the fact.

In this solution, I’ve created two stored procedures. One that can check to verify that CDC is enabled and defined for all tables in the database, and a one that will reenable CDC and set up table definitions when needed.

I have already enabled CLR and Ad Hoc Distributed Queries using sp_configure or in AWS RDS I’ve set this up in my RDS Parameter Options.

The CDC TABLE CHECKER

This PROCEDURE runs through a list of tables and checks for CDC configurations. It will return two datasets: The first, a list of CDC table definitions, the last, a list of tables without a CDC table definition.

CREATE OR ALTER   PROCEDURE [dbo].[CdcTablesCreateChecker]
AS
BEGIN

DECLARE @TableList as TABLE (ID INT NOT NULL IDENTITY(1,1), 
	SchemaName varchar(250), 
	TableName varchar(500));
DECLARE @CdcList as TABLE  (ID INT NOT NULL IDENTITY(1,1), 
	source_schema varchar(128), 
	source_table varchar(128),
	capture_instance varchar(128),
	object_id bigint,
	source_object_id bigint,
	start_lsn varchar(500),
	end_lsn varchar(500),
	supports_net_changes bit,
	has_drop_pending bit,
	role_name varchar(128),
	index_name varchar(128),
	filegroup_name varchar(128),
	create_date	datetime,
	index_column_list varchar(500),
	captured_column_list varchar(max)
	);

-- Grab any existing cdc tables into a list
INSERT INTO @CdcList 
EXEC sys.sp_cdc_help_change_data_capture;

select * from @CdcList;

SELECT S.name AS SchemaName, 
	T.name AS TableName,
	1 AS Cdc_Not_Setup
	FROM sys.tables AS T
	INNER JOIN sys.schemas AS S on  T.schema_id = S.schema_id
	LEFT JOIN @CdcList AS C ON C.source_schema = S.name and C.source_table = T.name
	WHERE T. schema_id < 1000 and S.name not like '%cdc%'
	AND C.source_table IS NULL
ORDER BY S.name, T.name;

END;
GO

If I run this PROCEDURE on a database where CDC has not been set up, it will return the following error message, along with the two datasets, the first being empty and the second, a list of all tables in the database.

The CDC Table Creator

This PROCEDURE will set up CDC if it’s not already set up. It will also allow me to setup tables in batches. I’ve set the default to 100 tables per batch, but I’m going to use 25 in this demo since my AdventureWorks database only has 71 tables.

/****** Object:  StoredProcedure [dbo].[CdcTablesCreate]    Script Date: 12/8/2023 9:18:40 AM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:		David Speight
-- Create date: 20231207
-- Description:	This will create CDC for all tables in THIS database, or create cdc for missing tables.
-- This DOES NOT update tables that have drifted from original cdc creation

-- Required:	sp_configure 'Ad Hoc Distributed Queries', 1 | This setting is required.
-- For AWS RDS set the option in the PARAMATER OPTIONS on the RDS dashboard.
-- =============================================
CREATE OR ALTER   PROCEDURE [dbo].[CdcTablesCreate] 
@cdcRoleName varchar(100)='db_cdcadmin', 
@MaxBatch int=100				-- This allows you to create the cdc tables in batches in case there are a lot.
AS
BEGIN

SET NOCOUNT ON;

DECLARE @setupSql varchar(2500);
DECLARE @cdcSql varchar(2500);
DECLARE @iLoopCount int;
DECLARE @TableList as TABLE (ID INT NOT NULL IDENTITY(1,1), 
	SchemaName varchar(250), 
	TableName varchar(500));
DECLARE @CdcList as TABLE  (ID INT NOT NULL IDENTITY(1,1), 
	source_schema nvarchar(128), 
	source_table nvarchar(128),
	capture_instance nvarchar(128),
	object_id bigint,
	source_object_id bigint,
	start_lsn nvarchar(128),
	end_lsn nvarchar(128),
	supports_net_changes bit,
	has_drop_pending bit,
	role_name nvarchar(128),
	index_name nvarchar(128),
	filegroup_name nvarchar(128),
	create_date	datetime,
	index_column_list nvarchar(128),
	captured_column_list nvarchar(max)
	);


IF EXISTS (SELECT * FROM sys.database_principals  where name = @cdcRoleName)
BEGIN
SELECT ' EXISTS'
END
ELSE
BEGIN
SELECT @setupSql = CONCAT(' CREATE ROLE [',@cdcRoleName,'];')
EXEC (@setupSql);
END;

BEGIN TRY
exec sys.sp_cdc_help_change_data_capture;
END TRY
BEGIN CATCH
IF EXISTS (SELECT * FROM sys.databases where name = 'rdsadmin')
BEGIN
SELECT @setupSql = CONCAT('exec msdb.dbo.rds_cdc_enable_db ''',DB_NAME(),''';')
END
ELSE
BEGIN
SELECT @setupSql = 'Exec sys.sp_cdc_enable_db;';
END;
EXEC (@setupSql);
WAITFOR DELAY '00:00:15'
END CATCH;

-- Grab any existing cdc tables into a list
INSERT INTO @CdcList 
EXEC sys.sp_cdc_help_change_data_capture;

-- Get a list of all tables in the database that are NOT in the above list.
INSERT INTO @TableList 
SELECT S.name, 
	T.name
FROM sys.tables AS T
INNER JOIN sys.schemas AS S on  T.schema_id = S.schema_id
LEFT JOIN @CdcList AS C ON C.source_schema = S.name and C.source_table = T.name
WHERE T. schema_id < 1000 and S.name not like '%cdc%'
AND C.source_table IS NULL
ORDER BY S.name, T.name;

-- Loop thorugh table and create a cdc table for each item.	
SELECT @iLoopCount = MAX(ID) FROM @TableList;

WHILE @iLoopCount > 0
BEGIN
SELECT @cdcSql = CONCAT('exec sys.sp_cdc_enable_table @source_schema = N''',SchemaName,''',  @source_name= N''',TableName,''',  @role_name= N''',@cdcRoleName,'''')
FROM @TableList
WHERE ID = @iLoopCount and TableName not like 'sys%';

If @iLoopCount <= @MaxBatch
BEGIN
BEGIN TRY
EXEC(@cdcSql);
END TRY
BEGIN CATCH
PRINT(@cdcSql);
END CATCH
END;

SELECT @iLoopCount = @iLoopCount-1, @cdcSql='';
END;

exec sys.sp_cdc_help_change_data_capture;

END;
GO

Example code from my first run using:

EXEC dbo.CdcTablesCreate @cdcRoleName ='db_cdcadmin', @MaxBatch=25;

EXEC dbo.CdcTablesCreateChecker;

My output from the checker returns what has been enabled and what has not.

My final output looks like the screenshot below, where I notice that I have excluded any tables that are prefixed with “sys.”

Final Check

I always double check my SQL SERVER AGENT to make sure the two required Agent Jobs have been created.

If not, I can use this code to create them.

Use AdventureWorks2019;
GO
exec sys.sp_cdc_add_job 'capture';
GO
exec sys.sp_cdc_add_job 'cleanup';
GO

End of post.