apply cdc to existing data warehouse

hi all,

I'm building an education data warehouse. I have various dimensions that i would like to track changes to i.e. DimStudent, DimClass, DimCollege, DimExamPaper etc. etc. My solution currently drops keys, truncates tables and recreates keys before loading staging tables, which then go on to load Dimension and fact tables.

Therefore, my question is what are the minimum physical changes i need to put in place to --> truncate staging tables prior to load, track changes between staging tables, dim tables and fact tables, to incrementally load rows to dim tables and fact tables? Do i need to add a table/tables to track changes for EACH dimension or fact table? I have looked at examples but there seems to be a fair amount of work for just one table. I have 11 dimensions, 4 fact tables and 22 staging tables.

more ▼

asked Jun 04, 2014 at 12:16 PM in Default

avatar image

1.1k 56 60 66

How are you currently capturing history? Are you currently using type 2 slowly changing dimensions? Or is this the sort of thing you're looking to move towards.

Jun 04, 2014 at 01:54 PM @SQLShark

i'm not tracking any history at the moment. The DWH purely dumps and then loads data, all tables are truncated prior to loading. i would like to implement SCD type 2. Do you use SCD component instead of CDC? or do you use SCD with CDC?

Jun 04, 2014 at 01:57 PM jhowe

Personally I have not used CDC I load my type 2 with TSQL MERGE. I have 2 tables one using an ETL schema and the other not. I load data in to the ETL version (staging) then clean the data (if needed). Then using a TSQL MERGE I take the data from ETL to the DW. I use all the standard history markers (Active flag, date from, date to).

Alex Whittles has a good blog on loading SCD using MERGE.


Jun 04, 2014 at 02:19 PM @SQLShark

thx for comment, yes i have used merge before, but it is very difficult to debug merge issues. For visibility purposes i would like to use CDC so it will be easier to maintain...

Jun 04, 2014 at 03:37 PM jhowe
(comments are locked)
10|1200 characters needed characters left

1 answer: sort voted first

after a lot of investigation it looks as if CDC has to be enabled on a table by table basis, it is also recommended that you use a CDC state table for each CDC enabled table. It is also recommended to use a package per table for this process and may not be suitable for a large amount of tables.

more ▼

answered Jun 12, 2014 at 09:08 AM

avatar image

1.1k 56 60 66

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: Jun 04, 2014 at 12:16 PM

Seen: 946 times

Last Updated: Jun 12, 2014 at 09:08 AM

Copyright 2018 Redgate Software. Privacy Policy