I'm working on a Data Warehouse implementation using SSIS 2008. The source system is in SQL 2000.
The code to find deleted records and insert them into a deleted records table is as follows:
Basically I am using Left Outer Join (where null) twice to find records that exist in a numbers table, but don't exist in the source table, nor are already listed as deleted in the DeleteRecords table.
I'd love to call this whole thing from one stored procedure and just pass in the source table name. However, I know you shouldn't do that and no performance benefit will be gained by using a SP to run dynamic SQL.
I have approximately 300 tables which will have to be checked this way.
Should I actually create 300 separate stored procedures, one for each table?
Use SQL that is dynamically generated in SSIS to pull newly deleted records per source table?
I'd love to use delete triggers, but that won't be allowed on this project.
asked Jun 20, 2011 at 08:35 AM in Default
You wouldn't get a performance benefit, but if you're going to use a DeletedRecords table, this could be a case where dynamic SQL could make sense.
If it's not too late, though, I would recommend looking at type-2 slowly changing dimensions and Todd McDermid's SSIS plug-in to manage Kimball-style warehouse dimensions. The type-2 setup, by including effective and expiration dates, lets you handle deleted records without recourse to a separate table, so your fact tables can still maintain foreign key references to dimensions and guarantee referential integrity, something you could not do with a main table and a deleted table.
But again, if that's out of the question (which would be a shame--the setup listed above is problematic, in my opinion), using dynamic SQL to create a deleted record loader and passing in a table name wouldn't strike me as a bad idea. It's a really bad idea if end users had some level of control, but the rules are a bit looser for ETL implementations, and when you know that your warehouse is going to get loaded in one way and by one load process (so you don't have a number of people entering data online, or three totally separate mechanisms for populating the warehouse tables, etc.).
answered Jun 20, 2011 at 08:53 AM