SSIS VS. SQL: where to check record's validity before inserting?
Hi, I have an SSIS package that processes an incoming spreadsheet. Before inserting/updating records I need to check if a date in each of them - say, "Job Close Date" - is not before a corresponding date - say, "Job Open Date" - that is stored, for each "Job" in a table is the database. I use a store procedure that is called from SSIS package and performs dates check and, if "Job Close Date" is valid, goes on and performs insert/update. My boss does not like this technique, he says that the best practice would be to put this logic into the package using, say, conditional split. Is this correct that performing check-ups and insert/update via SQL stored procedure is worse than doing this in the SSIS package? If yes, why? And, anyway, I don't see how conditional split could check "Job Close Date" record by record... could it? Thank you, - Tatyana
i agree with Kannan, it would be better to perform the checks on the data before inserting them into the table. Use a conditional split, and ignore the data that is invalid. If you are inserting the data into a table, then use SSIS, however if you are updating existing data, then you should always insert the data into a temp table and then use SQL to perform the updates. Using the OLEDB Command transform to perform updates only updates one row at a time (RBAR) and should be avoided. And, because you are removing the data you don't need in the SISS package, you can remove the where clauses in the update statement that you would normally use.
My preference would be to do this in a SSIS package. The Conditional split gives you two outputs ( conditions based on the validity of Job close date) The out puts can then be inserted/updated parallelly. Yes it does check the condition for each row. Based on properties like DefaultBufferMaxRows and DefaultBufferSize in the data flow task you could further enhance the performance. Thanks, Kannan