ETL Strategies: Identity Insert vs. Use Identity Logic
One of my ETL moves about 18 Million rows from one server to another for further processing. I am using the following options `FAST LOAD` For the Identity Column, I have two options: - Use `IDENTITY INSERT` - Don't set any input for the Identity Column, thereby forcing SQL Server to generate a new IDENTITY for each row inserted The value of the Identity column does not matter (it is not used in JOINS) Which option should I choose?
If you just want to speed up the **process of ETL**, and provided with the identity column values are not used, give a try to omit the IDENTITY column from ETL process and add the IDENTITY column as a separate process later. Cheers.
I would lean towards using identity insert. But as usual, I would try with substantially smaller data sets before finally deciding. If possible, I would also split the data set into smaller chunks to gain performance.