question

jhowe avatar image
jhowe asked

apply cdc to existing data warehouse

hi all, I'm building an education data warehouse. I have various dimensions that i would like to track changes to i.e. DimStudent, DimClass, DimCollege, DimExamPaper etc. etc. My solution currently drops keys, truncates tables and recreates keys before loading staging tables, which then go on to load Dimension and fact tables. Therefore, my question is what are the minimum physical changes i need to put in place to --> truncate staging tables prior to load, track changes between staging tables, dim tables and fact tables, to incrementally load rows to dim tables and fact tables? Do i need to add a table/tables to track changes for EACH dimension or fact table? I have looked at examples but there seems to be a fair amount of work for just one table. I have 11 dimensions, 4 fact tables and 22 staging tables.
ssissql-serversql-server-2012data-warehousecdc
4 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

thx for comment, yes i have used merge before, but it is very difficult to debug merge issues. For visibility purposes i would like to use CDC so it will be easier to maintain...
1 Like 1 ·
How are you currently capturing history? Are you currently using type 2 slowly changing dimensions? Or is this the sort of thing you're looking to move towards.
0 Likes 0 ·
i'm not tracking any history at the moment. The DWH purely dumps and then loads data, all tables are truncated prior to loading. i would like to implement SCD type 2. Do you use SCD component instead of CDC? or do you use SCD with CDC?
0 Likes 0 ·
Personally I have not used CDC I load my type 2 with TSQL MERGE. I have 2 tables one using an ETL schema and the other not. I load data in to the ETL version (staging) then clean the data (if needed). Then using a TSQL MERGE I take the data from ETL to the DW. I use all the standard history markers (Active flag, date from, date to). Alex Whittles has a good blog on loading SCD using MERGE. http://www.purplefrogsystems.com/blog/2012/04/automating-t-sql-merge-to-load-dimensions-scd/
0 Likes 0 ·

1 Answer

· Write an Answer
jhowe avatar image
jhowe answered
after a lot of investigation it looks as if CDC has to be enabled on a table by table basis, it is also recommended that you use a CDC state table for each CDC enabled table. It is also recommended to use a package per table for this process and may not be suitable for a large amount of tables.
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.