Hi, I have to move billions of rows from one table to another on a live system, so we perform it in batches. When I run the query that fetches the data on its own, it uses the proper clustered index and finishes in milliseconds. However, when the same query is used in the loop of that batch process, it seems to completely abandon the proper query plan and performs massive scans. I cannot find the reason why it is happening like that. Any insights appreciated.
Grant, here it is:

while 1 = 1
begin
    insert into #batch -- the problematic query
    select top 50000 tab_id
    from db..tableA with (nolock)
    where tab_id > @min_id
      and tab_id <= 1768364873
    order by tab_id
    option (recompile);

    set @v_rowcount = @@rowcount; -- capture the rows inserted by this batch
    if @v_rowcount = 0 break;

    ------------------------------
    -- ... inserting to different table
    ------------------------------

    select @min_id = max(tab_id) from #batch; -- #batch's column is tab_id, not id
    truncate table #batch;

    -- print message
    print 'Processed....cnt, rows, time';
end
Without seeing the query, it's impossible to say. However, a loop or a cursor is a very poor mechanism for data movement; you're very likely to hit performance issues. You'd be better off with a batch-based process that uses set-based logic, not cursors. [Here's a good article] on using a tally table as a mechanism to avoid loops.
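To illustrate the tally-table idea Grant is pointing at, here's a minimal sketch. The names tableA, tableB, and tab_id come from this thread; the tally CTE, the batch size, and the #batch_ranges temp table are illustrative assumptions, not code from the linked article:

```sql
-- Sketch only: pre-compute batch ID ranges with a tally (numbers) table,
-- so each batch is one set-based INSERT instead of a cursor walking rows.
declare @batch_size int = 50000;

;with tally as
(
    -- generate batch numbers 0..999 from a cross-joined system view
    select top (1000)
           row_number() over (order by (select null)) - 1 as n
    from sys.all_objects a
    cross join sys.all_objects b
)
select n * @batch_size + 1   as first_id,
       (n + 1) * @batch_size as last_id
into #batch_ranges
from tally;

-- each range then drives one set-based statement, e.g.:
-- insert into tableB select ... from tableA
-- where tab_id between @first_id and @last_id;
```

This assumes the IDs are reasonably dense; with sparse IDs the ranges would process uneven row counts per batch.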
Maybe I haven't expressed it properly enough: batch processing = batch-based process. The query is really simple; it's a select that seeks on the clustered index, and the result is then inserted into the other table. What really surprises me is that this select, executed outside the loop, properly uses the index and causes very few reads, yet when it's executed in the batch-based way (the IDs in the where condition change, of course, to fetch the next batches) it takes really long and causes massive reads, as if it were performing index scans instead of index seeks.
Is it a parameter or a local variable? If it's a local variable, the optimizer uses an average from the statistics to generate a plan, rather than the specific value that would produce a more accurate plan. Assuming the query is simple, you might try using a RECOMPILE hint. That will sniff the variable's value and possibly improve the plan choices.
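A minimal sketch of that difference, reusing the table and column names from the thread (the density-based guess is standard SQL Server behavior; the exact plan you get depends on your statistics):

```sql
declare @min_id bigint = 1768000000; -- illustrative value

-- With a plain local variable, the optimizer cannot see the value at
-- compile time; for an inequality it falls back to a fixed guess
-- (roughly 30% of the rows), which can push it toward a scan.
select top 50000 tab_id
from db..tableA
where tab_id > @min_id
order by tab_id;

-- OPTION (RECOMPILE) compiles the statement at execution time with the
-- actual value of @min_id, so the estimate reflects the real row count
-- past that ID -- typically producing an index seek here.
select top 50000 tab_id
from db..tableA
where tab_id > @min_id
order by tab_id
option (recompile);
```

Comparing the estimated row counts in the two actual execution plans is the quickest way to confirm this is what's happening.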
Grant, your suggestion was right. I removed the condition that was there just as an upper limit and moved the check to the beginning of the loop, so currently the code looks like this:

while 1 = 1
begin
    if @min_id > 1768364873 break;

    insert into #batch -- the problematic query
    select top 50000 tab_id
    from db..tableA with (nolock)
    where tab_id > @min_id
    order by tab_id
    option (recompile);

    ....

and now it is running like a champ! Thank you very much for the time spent and your help!