Order of Data Load and Index Creation

I have one clustered and several non-clustered indexes on a data warehouse table with about 18 M rows 170 columns - about 18GB in size. The table comes from a Legacy system, and I cannot change it's structure.

This table is truncated and loaded on a daily basis.

For optimum performance, which is there an order to loading data and creating indexes?

Also, what do I do to this massive table in order to prep it for the indexes?
more ▼

asked Dec 12 '11 at 01:02 PM in Default

Raj More gravatar image

Raj More
1.7k 77 82 84

Its too flipant to be an answer but my suggestion would be to test both and see which works best for you, with your data.
Dec 13 '11 at 02:10 AM Fatherjack ♦♦
(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first

It depends on if the load system needs the indexes as part of it's process. But assuming that it runs faster without them (a fairly common batch load scenario) I would suggest that you build the clustered indexes first, then the nonclustered indexes. Otherwise, you're effectively building the nonclustered indexes twice. They'll get built once for a heap table, then when the table gets transformed to a cluster, the nonclustered indexes have to be completely rebuilt with pointers to the cluster key.

And what @Fatherjack says.
more ▼

answered Dec 13 '11 at 03:53 AM

Grant Fritchey gravatar image

Grant Fritchey ♦♦
91k 19 21 74

+1 Ah, you were a minute faster than me! :)
Dec 13 '11 at 03:56 AM Håkan Winther
Ha! Shere luck & maybe some quick fingers.
Dec 13 '11 at 04:00 AM Grant Fritchey ♦♦
(comments are locked)
10|1200 characters needed characters left

The fastest solution would be truncating the table, disable the indexes, load the data and rebuilld the indexes to enable them.

Do you really need to reload every record? How many records are actually changed? Can you use bulk load and extract only the new/changed records (MERGE maybe)?

Test is the key to be sure!
more ▼

answered Dec 13 '11 at 03:54 AM

Håkan Winther gravatar image

Håkan Winther
15.5k 33 37 48

@Hakan Doesn't disabling a Clustered Index deny access to the data?
Dec 13 '11 at 05:20 AM Raj More
@Raj More After truncating the data, would any data be available?
Dec 13 '11 at 05:32 AM Usman Butt
@Usman I get an error Msg 8655, Level 16, State 1, Line 1 The query processor is unable to produce a plan because the index 'Idx_TblCIdxTest_Id' on table or view 'TblCIdxTest' is disabled.
Dec 13 '11 at 01:41 PM Raj More
If you don't want the clustered index to get a heap instead, you actualy have to drop the index, but then you have to create the index with the column specification that you had before.
Dec 14 '11 at 12:48 AM Håkan Winther

@Raj More I think you did not get my point. The reason why I second to @Hakan Winther 's suggestion i.e. to disable all the indexes is to reduce the maintenance of keeping up to date with the definition of your several non-clustered indexes. Obviously, If you would disable the clustered index neither data could be read nor data insertion would be possible. For the first part, since I have already truncated the data, this should be the case as I would not want a report/query etc to be ran at that time and user gets an empty output. For second part, I would have to either execute the rebuild command or drop command for the clustered index only which is to be done in each suggestion.

At the end, all suggestions should be tested and you should implement what you are comfortable with and performs best in the testing. Hope it clears the ambiguity.
Dec 15 '11 at 01:46 AM Usman Butt
(comments are locked)
10|1200 characters needed characters left

Is the data that you want to load ordered in any way? If it's loaded according to the clustered index, you could load without disabling the clustered index, and then rebuild the non clustered indexes.

If the data is not (more or less) ordered according to the clustered key, go with suggestions from @Håkan Winther and @Grant Fritchey.
more ▼

answered Dec 13 '11 at 04:15 AM

Magnus Ahlkvist gravatar image

Magnus Ahlkvist
15.9k 15 19 32

+1 That is true, if the data is sorted according to clustered index, keep the clustered index, otherwise drop/disable it
Dec 13 '11 at 05:35 AM Håkan Winther
Eactly if data comes in order of clustered index, then keep the index. Also if that data are coming using BUIL operation, you nca try to play with Trace Flag 610 to get the insert operation Minimally Logged. (ORDER needs to be specified for the BULK operation). Then depending on the system you can gain much better performance.
Dec 13 '11 at 01:05 PM Pavel Pawlowski
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: Dec 12 '11 at 01:02 PM

Seen: 1913 times

Last Updated: Dec 12 '11 at 01:02 PM