Haque asked:

SELECT INTO performance is slow

I’m creating a table from another table. The source table has about 400 million rows and the final table has about 110 million. I do a lot of calculation on the source data and then insert the results into a new table using a SELECT INTO statement. The query takes about 2 hours 40 minutes to complete; I was expecting it to finish in about an hour. Looking at the execution plan, I found that the insert is taking 77% of the cost. The estimated number of rows is 400 million but the actual number of rows is 110 million. Will the difference between estimated and actual rows slow down performance? How can I bring the estimated row count down? Also, would I get better performance if I created the table first and then did an INSERT INTO, rather than SELECT INTO? Are there any other ways I can improve the performance?
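For reference, the two approaches being compared look roughly like this; every table and column name below is an illustrative placeholder, not taken from the actual query:

```sql
-- Approach 1: SELECT INTO creates the target table on the fly.
SELECT src.KeyCol,
       SUM(src.Amount) AS TotalAmount   -- stand-in for the real calculations
INTO   dbo.Target
FROM   dbo.Source AS src
GROUP BY src.KeyCol;

-- Approach 2: create the table first, then INSERT ... SELECT.
-- On SQL Server 2008 R2, WITH (TABLOCK) on an insert into an empty heap
-- can enable minimal logging under the simple or bulk-logged recovery model.
INSERT INTO dbo.Target WITH (TABLOCK) (KeyCol, TotalAmount)
SELECT src.KeyCol, SUM(src.Amount)
FROM   dbo.Source AS src
GROUP BY src.KeyCol;
```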
performance
4 comments

ruancra commented:
Try doing the insert in batches. Start with batches of 50,000 rows and bump the size up to see which batch size gives the best performance.
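A sketch of what batching could look like; `dbo.Source`, `dbo.Target`, and `KeyCol` are assumed placeholder names, and the sketch assumes `KeyCol` is unique and increasing so each pass can resume where the last one stopped:

```sql
-- Illustrative batching loop, not the poster's actual query.
DECLARE @BatchSize INT = 50000;
DECLARE @LastKey BIGINT = 0;
DECLARE @Rows INT = 1;

WHILE @Rows > 0
BEGIN
    INSERT INTO dbo.Target (KeyCol, Amount)
    SELECT TOP (@BatchSize) src.KeyCol, src.Amount
    FROM dbo.Source AS src
    WHERE src.KeyCol > @LastKey
    ORDER BY src.KeyCol;

    SET @Rows = @@ROWCOUNT;

    -- Advance the watermark to the highest key loaded so far.
    IF @Rows > 0
        SELECT @LastKey = MAX(KeyCol) FROM dbo.Target;
END;
```

Smaller batches also keep individual transactions (and the log) smaller, which is often where the win comes from.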
ThomasRushton ♦♦ commented:
What version of SQL Server?
Geeth commented:
SQL Server 2008 R2
Grant Fritchey ♦♦ commented:
All answers that are helpful to you should be indicated by clicking on the thumbs up next to those answers. Any one answer that helped you solve your problem should be marked by clicking next to the check mark on that answer. You've been asking questions here for almost six months. You should understand how things work by now.
ThomasRushton answered:
Estimates are created based on the statistics information that SQL Server holds. Are your statistics up-to-date?
2 comments

Grant Fritchey ♦♦ commented:
SQL Server does automatically update statistics, but by default only after 500 rows plus 20% of the existing rows have changed. For larger data sets that may not be often enough. You can manually update statistics in one of two ways: run sp_updatestats, which refreshes (using a sampled scan) any statistics object that has had at least one row modified, or target specific statistics or indexes with UPDATE STATISTICS. You can also add WITH FULLSCAN to the UPDATE STATISTICS command to get a complete scan rather than a sampled average. And yes, it makes a giant difference in performance, because the optimizer bases most of its decisions on the statistics you provide it.
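For anyone following along, the commands mentioned above look like this; the table and statistics names are placeholders, not objects from the original post:

```sql
-- Sampled refresh of every statistics object with modified rows:
EXEC sp_updatestats;

-- Target a specific table and scan every row instead of sampling
-- (dbo.Source is a placeholder name):
UPDATE STATISTICS dbo.Source WITH FULLSCAN;

-- Inspect a statistics object: last update time, rows sampled, histogram
-- (the statistics name here is hypothetical):
DBCC SHOW_STATISTICS ('dbo.Source', 'IX_Source_KeyCol');
```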
Geeth commented:
How do I check the statistics?
Grant Fritchey answered:
Beyond @ThomasRushton's suggestion to update your statistics, I couldn't possibly make any suggestions without at least seeing the execution plan and the query. But 70% of the time spent on writes in a query of this sort actually makes plenty of sense when you think about it. Most of the work is going to be writing the data to disk, splitting and rearranging pages as necessary, and so on. All the work of maintaining data on disk is almost always more expensive than reading from the disk.
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Grant Fritchey ♦♦ commented:
The partitioning rules should handle splitting the data for you, but other than that, again, without seeing the plan itself I can't intelligently comment on it. Sorry. Costs are just estimates based on statistics and the operation in question, so they do no more than give you guideposts for where to look for issues. Sorts are always done in tempdb, and if they are large, you get spills from memory to disk. One thing you can look at is whether there is an index you can use to make the data access ordered, instead of relying on the engine to sort the data later.
ThomasRushton answered:
Are there indexes in place on the target table? If so, perhaps disable them all before starting the bulk insert.
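A sketch of that approach, assuming a hypothetical nonclustered index name (note: disabling a clustered index makes the whole table inaccessible, so this applies only to nonclustered indexes):

```sql
-- IX_Target_KeyCol and dbo.Target are placeholder names.
ALTER INDEX IX_Target_KeyCol ON dbo.Target DISABLE;

-- ... perform the large insert here ...

-- Rebuilding re-enables the index and recreates it from the loaded data.
ALTER INDEX IX_Target_KeyCol ON dbo.Target REBUILD;
```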
1 comment

Haque commented:
No indexes are in place on the target table, but there are indexes on the source table.