x
login about faq Site discussion (meta-askssc)

Creating Indexes on multiple tables in parallel

I am running an ETL that moves data and generates some output tables. Sizes range from 20GB to 8GB to 4GB all the way down to 1MB (about 25 tables total).

We are on VMWare virtual machines so I do not have the options of putting filegroups on individual spindles. Processing power is decent - 4 CPU cores and 32 GB RAM.

I need indexes on these tables since they are going to be used in other processes. I only have one clustered index - all other indexes are non clustered.

I am running SSIS, I have one SQL task to run all the index generation. However, I figure I can save time (and put the hardware I have to hard work) by running these index generations in parallel.

How do I determine which ones to run in parallel and which ones to not?

more ▼

asked Nov 03 '11 at 06:28 PM in Default

Raj More gravatar image

Raj More
1.7k 74 78 82

(comments are locked)
10|1200 characters needed characters left

2 answers: sort voted first

I believe it purely depends upon the environment. If you can live with high CPU utilization, slow response, you can try multiple index generation in parallel on different tables. Since you are in VM environment, you may never know when you would be striving for resources.

I agree parallel execution could save you time if your resources are under utilized. But at the same time some combinations could increase your execution time, if they would be striving for the resources. As far as the combination is concerned, I would try different combinations (like one big 8 GB and some minor tables, one huge 20 GB index only etc) according to the environment, workload and then implement the most feasible combination. I would also try different combinations with different max degree of parallelism (MAXDOP). Cheers.

more ▼

answered Nov 04 '11 at 02:40 AM

Usman Butt gravatar image

Usman Butt
13.8k 6 8 14

My MAXDOP is currently set to 0 (zero). I'd love to understand a scientific way to tweak these settings rather than just playing around and not really understanding why.

Nov 04 '11 at 10:24 AM Raj More

@raj Moore - the whole thing about SQL Server and the settings you are talking about is that they are usage specific. You have to change the values to suit your system and data. Set, observe, repeat that is scientific too.

Nov 04 '11 at 12:28 PM WilliamD
(comments are locked)
10|1200 characters needed characters left

If all of the tables you are talking about exist on a single physical harddrive, you are unlikely to see any benefit from running them in parallel. Indexing is an IO intensive task, but it is generally not processor or ram intensive (though there may be some limited exceptions, such as indexing a view which includes a requirement to derive complex results or indexing a computed column).

However, as Usman said the best way to know is to test it, and you are likely to see some benefit if the tables in question did exist on multiple hard drives.

more ▼

answered Nov 04 '11 at 09:59 AM

TimothyAWiseman gravatar image

TimothyAWiseman
14.3k 16 20 29

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments



Facebook logo Follow Ask SSC on Facebook
Find Ask SSC on Google+
linkedin logo Find us on LinkedIn

Topics:

x750
x394
x111
x28
x21

asked: Nov 03 '11 at 06:28 PM

Seen: 928 times

Last Updated: Nov 03 '11 at 06:28 PM

Copyright © 2002-2012 Simple Talk Publishing. All Rights Reserved. If you have any queries, please contact the site administrators.
Ask SQL Server Central is a community service provided by Red Gate.