question

Raj More avatar image
Raj More asked

How do I consume more resources than this

I have a daily ETL process in SSIS that builds my warehouse so we can provide day-over-day reports. I have two servers - one for SSIS and the other for the SQL Server Database. The SSIS server (SSIS-Server01) is an 8CPU, 32GB RAM box. The SQL Server database (DB-Server) is another8CPU, 32GB RAM box. Both are VMWare virtual machines. In its oversimplified form, the SSIS reads 17 Million rows (about 9GB) from a single table on the DB-Server, unpivots them to 408M rows, does a few lookups and a ton of calculations, and then aggregates it back to about 8M rows that are written to a brand new table on the same DB-Server every time (this table will then be moved into a partition to provide day-over-day reports). I have a loop that processes 18 months worth of data at a time - a grand total of 10 years of data. I chose 18 months based on my observation of RAM Usage on SSIS-Server - at 18 months it consumes 27GB of RAM. Any higher than that, and SSIS starts buffering to disk and the performance nosedives. I am using [Microsoft's Balanced Data Distributor][1] to send data down 8 parallel paths to maximize resource usage. I do a union before starting work on my aggregations. Here is the task manager graph from the SSIS server ![alt text][2] Here is another graph showing the 8 individual CPUs ![alt text][3] As you can see from these images, the memory usage slowly increases to about 27G as more and more rows are read and processed. However the CPU usage is constant around 40%. The second graph shows that we are only using 4 (sometimes 5) CPUs out of 8. I am trying to make the process run faster (it is only using 40% of the available CPU). How do I go about making this process run more efficiently (least time, most resources)? [1]: http://technet.microsoft.com/en-us/sqlserver/hh369962 [2]: http://img155.imageshack.us/img155/8453/resources01.jpg [3]: http://img219.imageshack.us/img219/2013/resources02.jpg
ssissql-server-2008-r2etl
8 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Scot Hauder avatar image Scot Hauder commented ·
For more utilization, spawn more threads. Break the process up into smaller chunks and run them asynchronously. If 18 mo is your benchmark, make two 9 mo data flows and run them at the same time. Also look at William has said, it can often be difficult to know what is really going on with VMs
1 Like 1 ·
Scot Hauder avatar image Scot Hauder commented ·
Looks good to me, you don't want % above 85% for extended periods of time...I wish we had your server for our SSIS. Remove some of your RAM if you want the cpu % to max
0 Likes 0 ·
Kev Riley avatar image Kev Riley ♦♦ commented ·
Is it really an 8 CPU box, or a 4 CPU, with HyperThreading?
0 Likes 0 ·
Raj More avatar image Raj More commented ·
@Kev this is a VMWare virtual. I can find out on Monday what the setup actually is.
0 Likes 0 ·
Raj More avatar image Raj More commented ·
@Scot I want to process the data faster. Looks to me like I'm wasting available CPU by not using the 45% to take me to 85% usage.
0 Likes 0 ·
Show more comments
WilliamD avatar image
WilliamD answered
You can start by looking at perfmon charts to see if/what things are blocking you. Take a look at logical/physical disk counters. If you are seeing disk bottlenecks, as @ThomasRushton is offering as a possible reason, then Disk Time % will most likely be hovering around 100%. You would also see queue lengths rising higher than normal (depending the disk system, this may not be an issue). If disk write/read counters are going into constant values during the load process, then that would also indicate bottlenecks, as something is pushing to a limit and holding everything else back. You mention that the machine is a VM. You have to be *really* careful there, as you may be sharing resources without knowing it. Talk to the VM/SAN admins, find out what the host systems are up to - are you on shared storage that is being used elsewhere? What do the SAN performance counters say? Do you know the underlying VM host setup at all? Is there room for optimisation? SAN hardware can deliver a lot, but SQL server and SSIS can kick the crap out of most standard SAN setups without too much trouble. This is especially the case when the manufacturers defaults are followed and the SAN Admin is not a SAN ninja! I suggest you also take a look at the great stuff on SAN and VM config over at [ brentozar.com][1], you may find something there that helps out too. [1]: http://www.brentozar.com/sql/sql-server-san-best-practices/
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Håkan Winther avatar image
Håkan Winther answered
You should also take a look at the wait stats to see what the server is waiting for. Select * from sys.dm\_os_wait\_stats
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Scot Hauder avatar image Scot Hauder commented ·
+1 This could be valuable since the MBDD blocks all paths if one is blocked
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.