I am running an ETL that moves data and generates some output tables. Sizes range from 20GB to 8GB to 4GB all the way down to 1MB (about 25 tables total).
We are on VMWare virtual machines so I do not have the options of putting filegroups on individual spindles. Processing power is decent - 4 CPU cores and 32 GB RAM.
I need indexes on these tables since they are going to be used in other processes. I only have one clustered index - all other indexes are non clustered.
I am running SSIS, I have one SQL task to run all the index generation. However, I figure I can save time (and put the hardware I have to hard work) by running these index generations in parallel.
How do I determine which ones to run in parallel and which ones to not?
asked Nov 03 '11 at 06:28 PM in Default
I believe it purely depends upon the environment. If you can live with high CPU utilization, slow response, you can try multiple index generation in parallel on different tables. Since you are in VM environment, you may never know when you would be striving for resources.
I agree parallel execution could save you time if your resources are under utilized. But at the same time some combinations could increase your execution time, if they would be striving for the resources. As far as the combination is concerned, I would try different combinations (like one big 8 GB and some minor tables, one huge 20 GB index only etc) according to the environment, workload and then implement the most feasible combination. I would also try different combinations with different max degree of parallelism (MAXDOP). Cheers.
answered Nov 04 '11 at 02:40 AM
If all of the tables you are talking about exist on a single physical harddrive, you are unlikely to see any benefit from running them in parallel. Indexing is an IO intensive task, but it is generally not processor or ram intensive (though there may be some limited exceptions, such as indexing a view which includes a requirement to derive complex results or indexing a computed column).
However, as Usman said the best way to know is to test it, and you are likely to see some benefit if the tables in question did exist on multiple hard drives.
answered Nov 04 '11 at 09:59 AM