How to delete a HUGE amount of data without timing out
I have a table with 44 million records and a PK on location. I am trying to delete a single location that has 1.5 million records:

```sql
DELETE FROM myBigTable WHERE location = '01234';
```

I have tried with and without a transaction, and it still times out. What is the best approach for deleting HUGE amounts of data? Break it up somehow, i.e. get the count, delete the top 100,000, then the next 100,000, etc.?
Table partitioning is great for this sort of thing because of the SWITCH option. Other than that, deleting based on the clustered index is best, and it sounds like you already are. Deleting in subset ranges of the PK might be your best bet:

```sql
DECLARE @DeleteLocation varchar(50) = '01234';
DECLARE @BatchSize int = 10000;

WHILE @@ROWCOUNT > 0
BEGIN
    DELETE TOP (@BatchSize) FROM myBigTable
    WHERE location = @DeleteLocation;
END;
```

@@ROWCOUNT returns 1 after the assignment of @BatchSize, so the loop starts; it then stops once there are no more rows to delete.
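A possible refinement of that loop, sketched here using the table and column names from the question: tracking the row count in a variable and pausing briefly between batches gives log backups (or checkpoints under the SIMPLE recovery model) a chance to reclaim transaction log space while the delete runs.

```sql
-- Batched delete with an explicit row counter and a pause between batches.
-- Table/column names are from the question; adjust to your schema.
DECLARE @DeleteLocation varchar(50) = '01234';
DECLARE @BatchSize int = 10000;
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    DELETE TOP (@BatchSize) FROM myBigTable
    WHERE location = @DeleteLocation;

    SET @Rows = @@ROWCOUNT;

    -- Short pause so log maintenance can keep up between batches.
    WAITFOR DELAY '00:00:01';
END;
```

Each DELETE is its own implicit transaction here, so a failure mid-way only rolls back the current batch rather than the whole 1.5 million rows.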
SET ROWCOUNT limits how many rows each statement affects, so this deletes in batches:

```sql
SET ROWCOUNT 500;

delete_more:
DELETE FROM MySalesOrderDetail
WHERE SalesOrderID BETWEEN 43659 AND 44000;

IF @@ROWCOUNT > 0 GOTO delete_more;

SET ROWCOUNT 0;
```

The script above deletes 500 records per batch; you can change the batch size to suit your requirement. Remember to reset with SET ROWCOUNT 0 afterwards.
I helped a developer who ran into timeouts when deleting a few million records from a table with some 500 million records. In that case, there was quite some latency on writing to tempdb and to the transaction log due to the way the disks were set up. I advised chunking the delete into pieces, like Tom Staab suggests in his answer. But that didn't really help much: even when deleting smaller chunks, every row still had to be written to the transaction log.

When examining the execution plans for the delete (both when deleting the whole set at once and when deleting smaller chunks), we saw a lot of spill to tempdb, because of a sort operation needed to delete the rows from all nonclustered indexes. In that case, I tried disabling all nonclustered indexes prior to the delete, deleted the rows, and then rebuilt the nonclustered indexes. It was a huge timesaver for us.

Whether that applies to your situation depends on what's going on under the hood. If you look at the actual execution plan of a delete of some million records, you will see both whether you get sort operations and whether these (or other operations) spill to tempdb (displayed as a warning on the graphical representation of the execution plan in Management Studio). If they do, it might or might not save you some time to disable indexes prior to the delete and rebuild them afterwards. If you're on Standard edition, your index rebuilds will be done offline, which might or might not be OK with your non-functional requirements.

If you're on Enterprise edition, it might be worth investing some time in partitioning the table. From SQL Server 2016 you won't even have to switch a partition out to a staging table in order to truncate a subset; you can truncate a partition directly. Partitioning is an advanced feature, though. A much less advanced feature of Enterprise edition is compression, which would save you time, disk space, and memory.
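As a rough sketch of the disable/delete/rebuild approach described above (the index name IX_myBigTable_location is purely illustrative; disable each of your own nonclustered indexes, and never the clustered one, since disabling it makes the table inaccessible):

```sql
-- Disable the nonclustered index(es) so the delete doesn't have to
-- maintain them row by row. Index name is hypothetical.
ALTER INDEX IX_myBigTable_location ON myBigTable DISABLE;

DELETE FROM myBigTable WHERE location = '01234';

-- Rebuilding re-enables the index and repopulates it in one pass.
ALTER INDEX IX_myBigTable_location ON myBigTable REBUILD;

-- On SQL Server 2016+ with a partitioned table, and assuming the rows
-- to remove align with partition boundaries, a whole partition can be
-- truncated directly (minimally logged), e.g.:
-- TRUNCATE TABLE myBigTable WITH (PARTITIONS (3));
```

A disabled nonclustered index takes no maintenance during the delete but also can't be used by queries until it is rebuilt, so plan the window accordingly.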
With row compression on both clustered and nonclustered indexes, you'll probably reduce disk space by some 30%, which directly benefits the execution time of a delete operation (smaller size = less information to write to the transaction log and less IO on the data pages). Sorry to give such a broad "it depends"-style answer, but as so often, there's no one-size-fits-all solution.
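If compression is an option, enabling row compression on an existing table might look like the following sketch (index name illustrative; note that from SQL Server 2016 SP1 onwards, compression is available in editions other than Enterprise as well):

```sql
-- Estimate the savings first rather than trusting a rough 30% figure:
EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'myBigTable',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'ROW';

-- Rebuild the heap/clustered index with row compression...
ALTER TABLE myBigTable REBUILD WITH (DATA_COMPRESSION = ROW);

-- ...and each nonclustered index separately (name is hypothetical).
ALTER INDEX IX_myBigTable_location ON myBigTable
    REBUILD WITH (DATA_COMPRESSION = ROW);
```

The rebuilds themselves are heavy operations on a 44-million-row table, so this is something to do once in a maintenance window, not as part of the delete itself.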