Updating a Table - What is best way to add/update, keeping performance in mind?
Hello, I was hoping someone could answer a few questions of mine regarding table updates as I'm an SQL noob. I have a table on a linked server (Oracle) that I'm trying to duplicate in SQL Server... Initially I had simply created a procedure like this: DROP TABLE xxx SELECT * INTO DBO.DETAIL FROM XXX //Using OpenQuery as is improved performance a lot. But it's taking quite a long time to run... so I'm looking at other ways of doing this. I ran across the UPDATE command and had tried: UPDATE DBO.DETAIL SET field = B.field, etc... FROM DBO.DETAIL A, xxx B Where A.Key = B.Key This command had updated every single line as well, which is alright I guess, but it seems to be unnecessary (I just need changes to be updated, not everything). In addition, it won't "add". For example, if table B has a new record, I'd like that to be added to table A - I imagine there is a simple add/update command, could anyone shed some light on the best way of doing this? If you're also able to tell me whether or not the records will be locked during the update, that'd be appreciated so I know if I have to schedule this at night or if I can do this periodically throughout the day for the most up-to-date data. Any comments/suggestions are very much appreciated. Thanks! Sheep
Lots of options. You could use SSIS to gather the data and import it. I wouldnt drop the table, I'd simply truncate it. OK the Insert stament needs to change a litte and its only preference. How many rows are invovled? In your update have a comparison on the columns that are changed so that you only update rows that need it. IE UPDATE DBO.DETAIL SET field = B.field, etc... FROM DBO.DETAIL A, xxx B Where A.Key = B.Key and A.ColOne B.ColOne -- where ColOne is the column that has changed in Oracle and you want it updated in SQL If the SQL Table is big then you may want to consider dropping and recreating indexes. If the issue invovled new rows too the you should also investigate the MERGE command in SQL 2008 as it handles both updates and inserts in one command
What you are describing is also known as an upsert (an update and insert in one). You didn't mention which version of SQL Server you are using, so here are two possible ways to do what you are wanting: /* Upsert method - works on SQL 2000 and above */ UPDATE trg SET Col1 = src.Col1 FROM dbo.Detail trg INNER JOIN XXX src ON src.[Key] = trg.[Key] INSERT INTO dbo.Detail ([Key], Col1) SELECT src.[Key], src.Col1 FROM XXX src LEFT JOIN dbo.Detail trg ON trg.[Key] = src.[Key] WHERE trg.[Key] IS NULL This does the `UPDATE`, and straight away a separate `INSERT` to add the missing data. AS you may realise, this is two separate operations and not really that great performance wise. The next way is to use the `MERGE` command (SQL 2008 and above) MERGE dbo.Detail trg USING (SELECT [Key], Col1 FROM XXX) src ON (src.[Key] = trg.[Key]) WHEN MATCHED AND (trg.Col1 src.Col1) THEN UPDATE SET trg.Col1 = src.Col1 WHEN NOT MATCHED THEN INSERT ([Key], Col1) VALUES (src.[Key], src.Col1) The `MERGE` command is optimised to fulfill this exact request