question

Bhavpreet Singh avatar image
Bhavpreet Singh asked

SSIS - dealing with multiple databases and following row level commitment

hi, Consider the scenario:

I have two databases. From a table in first database I need to read some rows and insert\update the same to a table in second database and after inserting\updating the same I wana acknowledge the source database by setting a column let’s say ‘isProcessed’ to ‘Y’. I wana do this on row level basis. So, that if the package fails somehow in between we need not restart from the beginning.

I tried implementing the same using the following steps(explaining it giving an example):

  1. adding a data flow task and in that adding a OLE DB data source
  2. adding a OLE DB command to execute insert statement for each row in the data source above
  3. adding a derived column to fetch the acknowledgment returned by the stored proc executed above (this is added after step 1 and b4 step 2)
  4. based on the result set achieved above updating the source database

as you can see that the step 4 is performed only after step 3 is executed for all the rows obtained from source in step 1.

Please provide a solution.

ssisdatabase
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Superminister avatar image
Superminister answered

I'd suggest a rather different approach for not having to restart each time: Is there any unique or Key column in your source data?

If not, could you identify such unique criteria, even by using compound criteria?

Then simply sort your input data and in the step updating the destination output the criteria value you have just used (for compund sorting criteria I'd suggest using ROW_NUMBER() instead).

Store this return value in a package variable. When the insert/update fails, output this variable to a temp table or to your log file.

Use it as filter criteria, e.g. sourcekey >= (at)variable

If you use sorted input data, you could even switch from using single rows to batches, likely improving performance and still allowing to restart with the last batch processed instead of the whole.

Hope that helps Cheers Christian

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Tom Staab avatar image
Tom Staab answered

There are several ways this can be done, but before I get into that I want to point out a potential performance issue. If you truly need to process these rows and acknowledge them one at a time, this becomes more of a RBAR (Row-By-Agonizing-Row) solution rather than the preferred set-based methodology. You would be repeating the following 2 steps for each row.

  • Insert/update a row in database #2
  • acknowledge that row in database #1

But do you really have to do that? What is the likelihood that you will lose contact with database #2 during the process? Perhaps you could insert a batch and then acknowledge the batch. If errors occur, delay handling them until after both steps have completed. There are various ways to accomplish this depending on your specific requirements. You mentioned inserts and updates, and you also mentioned an OLE DB insert command. I believe the updates will be the trickier part for handling errors. What could cause an error with those? Do (or can) you insert any timestamp or other data that indicates which records have been inserted (or updated) in database #2?

More information will help us give you a good recommendation. Otherwise, the danger is that a bad recommendation is sometimes worse than none at all.

10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.