I HAVE 40 MILLION RECORDS IN MY PERSON TABLE I HAVE 100000 RECORDS FROM MY SOURCE TABLE WHICH I NEED TO COMPARE WITH PERSON TABLE AND DO AN INSERT,UPDATE OR DELETE ACCORDINGLY. UPDATE IF PERSONID EXISTS ALREADY INSERT IF NOT EXISTS I AM USING SSIS 2008 AND LOOK UPS TO DO THIS,Can you suggest me how to do this most efficiently ?
As @Kev Riley mentioned in comment, the best performing way would be [MERGE (Transact-SQL)]. You can even call it in Execute SQL Task. In case you really have to use the Data Flow, the Be sure, to load only primary keys into the Lookup and not the whole table. If you would like to detect also changes and update only the changed rows, then load into the lookup only the primary key and HASH of the columns where change can occur. Also for the source data include the HASH of the columns to compare.. Then simply compare the hashes to detect changes. Additionally when loading the data into the lookup you can reduce the set loaded into the lookup, by getting minimum and maximum key from the source table and then load into the lookup only the keys and hashes for the range between min and max keys. But again, if you expect lots of updates, than again you should insert rows with data to update into separate table and the do the update by T-SQL in the Execute SQL Task. So again based on the above, you will end with T-SQL for the update. So as mentioned in the beginning, if there is no real reason to use Data Flow, you should go the MERGE way. :