Kevin Feasel asked

Improving SSIS Insert Performance

I have an SSIS package which loads a fact table and am trying to improve the package's performance. I have gotten to the point where the slowest part of the package load is inserting the records into the warehouse fact table. Here's what I have so far:

- I am using an OLE DB data connection (instead of ADO) to connect to the SQL Server.
- I am performing a bulk insert (fast table load) and unchecked the "check constraints" box. I set the batch load to 10,000 rows.
- Before the data load, I disable all non-clustered indexes, and then rebuild them after the load completes.
- My clustered index is on an identity key.
- The database and log are sized appropriately, so I don't run into any auto-growth issues while loading the table.
- The destination table is on a SQL Server 2008 installation.
- Each row inserted is no greater than 308 bytes (I have a couple of varchar columns for the business key).

Are there any additional suggestions for improving insert performance? I'm at the point now where the table loads in 2 hours (which is half of my load window and roughly where I want to be), but the source query and transformations finish in closer to 40 minutes if I attach a row count instead of doing the insert, so I was hoping that maybe I was missing something which could shave off some more time.
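For reference, the disable/rebuild step described above looks something like this in T-SQL (the table and index names here are hypothetical):

```sql
-- Before the load: disable a nonclustered index (repeat per index)
ALTER INDEX IX_FactTable_BusinessKey ON dbo.FactTable DISABLE;

-- After the load completes: rebuild the index, which also re-enables it
ALTER INDEX IX_FactTable_BusinessKey ON dbo.FactTable REBUILD;
```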
Tags: ssis, bulk-insert


Pavel Pawlowski answered
If the batch load of 10,000 corresponds to the Rows per batch setting of the OLE DB Destination, then it's probably wrong. Rows per batch should be as close as possible to the total number of records being stored in the destination table. It is passed as the ROWS_PER_BATCH hint of the bulk insert operation and is used to optimize the plan for that operation.

If you run SSIS on the same machine as the SQL Server instance, you can use a SQL Server destination for better performance.

If you are loading a very large amount of data and using the Enterprise edition of SQL Server, you can try to parallelize the ETL process to insert into several partitions and then switch the partitions into a single table.

If you are using the full recovery model, you can think about taking a log backup prior to the ETL and switching to the bulk-logged recovery model for the duration of the ETL.
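A minimal sketch of the recovery-model switch (the database name and backup paths are hypothetical; take a log backup on either side so the log chain stays intact):

```sql
-- Back up the log before switching, so the log chain is preserved
BACKUP LOG DW TO DISK = N'D:\Backups\DW_preload.trn';

-- Switch to bulk-logged for the duration of the ETL
ALTER DATABASE DW SET RECOVERY BULK_LOGGED;

-- ... run the SSIS load here ...

-- Switch back and take another log backup to restore point-in-time recovery
ALTER DATABASE DW SET RECOVERY FULL;
BACKUP LOG DW TO DISK = N'D:\Backups\DW_postload.trn';
```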
2 comments

Thanks much for the set of tips. It looks like [my rows per batch setting might just be ignored anyway](http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/86efc54e-45bd-4ccc-baf1-a7b34f80fe32/) because I do have a Maximum Insert Commit Size, but it's good to know that I was wrong in picking that number... I've always heard bad things about the SQL Server data flow destination, even from the SSIS development team webcasts. I do have SSIS running on the same machine as the instance, though, so I'll look into that some more. I'm only loading something on the order of 35-40 million records, so I'll keep partitioning in mind. I actually have this one on my agenda, so I may need to push that one up a bit. Switching to bulk logging is something I didn't even think about. That move alone should have at least a fair impact, I would imagine. This will be the first thing I try on Monday. Thanks again.
Switching to bulk logging and editing the rows per batch setting shaved off about 10 minutes. I still have to do partitioning, but that's still a 7-10% gain, so I'm good with that for now...
Fatherjack answered
If you are looking for pure "[balls-out][1]" speed then you ought to take a look at some compression technology like RedGate's HyperBac. It sits between the application and the disk and compresses/decompresses the data on the fly. You simply assign a file extension to be monitored by HyperBac and it works. I have seen HUGE speed gains using this and BCP. There's a free 14-day trial from their site if you want to test/see it in action. http://www.red-gate.com/products/dba/sql-hyperbac/

[Edit] Just re-read this and realised I linked to the wrong product; I didn't mean HyperBac, it's SQL Storage Compress that you want. They are all in the same family of DBA tools that RedGate do... sorry! http://www.red-gate.com/products/dba/sql-storage-compress/

[1]: http://en.wikipedia.org/wiki/Centrifugal_governor
2 comments

I'll take a look at that tool. The load isn't pushing the available window, though, so I probably wouldn't have business justification to try to get it, but it does look intriguing.
Pavel's suggestions are free and out of the box, and he has way more SSIS experience than me, so be sure to try his points first. But if you do try it out, let me know how you get on.
aRookieBIdev answered
Try changing the values of the DefaultBufferMaxSize and DefaultBufferMaxRows properties. The default value of DefaultBufferMaxSize is 10 MB, and its upper and lower boundaries are constrained by two internal SSIS properties, MaxBufferSize (100 MB) and MinBufferSize (64 KB). DefaultBufferMaxRows specifies the default maximum number of rows in a buffer.
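As a rough sanity check (using the ~308-byte rows mentioned in the question), you can estimate how many rows actually fit in one default 10 MB buffer; if DefaultBufferMaxRows is set higher than this, the size limit wins:

```sql
-- Hypothetical sizing arithmetic: 10 MB buffer divided by 308-byte rows
DECLARE @RowSize    int = 308;                  -- max row width from the question
DECLARE @BufferSize int = 10 * 1024 * 1024;     -- DefaultBufferMaxSize default
SELECT @BufferSize / @RowSize AS RowsPerBuffer; -- roughly 34,000 rows
```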
