The typical pattern for using T-SQL Merge for Type 2 SCD columns is:
This is a generalization of course, and I've omitted several steps for brevity. I'm only showing the Expiration portion of the merge. However, since I'm using a Type 2 Merge statement for both my inserting of completely new records (not shown) and handling the expiration and new records for the Type 2 changes, why can't I move the RowIsCurrent portion into the On Clause? Therefore, we would have:
Is there something I'm missing here? In my own testing, the performance is the same until the number of expired records starts to overtake the number of current rows in the Dimension. Then the performance of this method just gets better and better.
Also, it solves another issue I've had which is non-continuous entities in my Dimension. Suppose that there are entities in the Dimension that should be expired and have NO current record? With the traditional method, the Merge statement ignores this and does NOT insert a new record. However, this new method is only looking for Current Rows to begin with. If it doesn't find one, it inserts a new one as if it's an entirely new entity to the Dimension.
Am I missing something here? I don't know why everyone isn't doing this.
asked Jul 29, 2013 at 02:57 PM in Default
The TechNet article Optimizing MERGE Statement Performance explicitly warns about this:
There are further details in the TechNet article Inserting, Updating, and Deleting Data by Using MERGE:
Both articles explain how the syntax of
answered Jul 30, 2013 at 12:39 AM
I agree that the expired target rows should not be evaluated - they have no value in the operation (they're not logically part of the "target"). As to the issue you mentioned - imho type 2 merge statements need to have NOT MATCHED BY SOURCE and NOT MATCHED BY TARGET clauses to handle both situations.
answered Jul 29, 2013 at 03:36 PM