question

Raj More avatar image
Raj More asked

Order of Data Load and Index Creation

I have one clustered and several non-clustered indexes on a data warehouse table with about 18 M rows 170 columns - about 18GB in size. The table comes from a Legacy system, and I cannot change it's structure. This table is truncated and loaded on a daily basis. For optimum performance, which is there an order to loading data and creating indexes? Also, what do I do to this massive table in order to prep it for the indexes?
sql-server-2008-r2indexing
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Fatherjack avatar image Fatherjack ♦♦ commented ·
Its too flipant to be an answer but my suggestion would be to test both and see which works best for you, with your data.
0 Likes 0 ·
Grant Fritchey avatar image
Grant Fritchey answered
It depends on if the load system needs the indexes as part of it's process. But assuming that it runs faster without them (a fairly common batch load scenario) I would suggest that you build the clustered indexes first, then the nonclustered indexes. Otherwise, you're effectively building the nonclustered indexes twice. They'll get built once for a heap table, then when the table gets transformed to a cluster, the nonclustered indexes have to be completely rebuilt with pointers to the cluster key. And what @Fatherjack says.
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Håkan Winther avatar image Håkan Winther commented ·
+1 Ah, you were a minute faster than me! :)
0 Likes 0 ·
Grant Fritchey avatar image Grant Fritchey ♦♦ commented ·
Ha! Shere luck & maybe some quick fingers.
0 Likes 0 ·
Håkan Winther avatar image
Håkan Winther answered
The fastest solution would be truncating the table, disable the indexes, load the data and rebuilld the indexes to enable them. Do you really need to reload every record? How many records are actually changed? Can you use bulk load and extract only the new/changed records (MERGE maybe)? Test is the key to be sure!
5 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Usman Butt avatar image Usman Butt commented ·
@Raj More I think you did not get my point. The reason why I second to @Hakan Winther 's suggestion i.e. to disable all the indexes is to reduce the maintenance of keeping up to date with the definition of your several non-clustered indexes. Obviously, If you would disable the clustered index neither data could be read nor data insertion would be possible. For the first part, since I have already truncated the data, this should be the case as I would not want a report/query etc to be ran at that time and user gets an empty output. For second part, I would have to either execute the rebuild command or drop command for the clustered index only which is to be done in each suggestion. At the end, all suggestions should be tested and you should implement what you are comfortable with and performs best in the testing. Hope it clears the ambiguity.
1 Like 1 ·
Raj More avatar image Raj More commented ·
@Hakan Doesn't disabling a Clustered Index deny access to the data?
0 Likes 0 ·
Usman Butt avatar image Usman Butt commented ·
@Raj More After truncating the data, would any data be available?
0 Likes 0 ·
Raj More avatar image Raj More commented ·
@Usman I get an error `Msg 8655, Level 16, State 1, Line 1 The query processor is unable to produce a plan because the index 'Idx_TblCIdxTest_Id' on table or view 'TblCIdxTest' is disabled.`
0 Likes 0 ·
Håkan Winther avatar image Håkan Winther commented ·
If you don't want the clustered index to get a heap instead, you actualy have to drop the index, but then you have to create the index with the column specification that you had before.
0 Likes 0 ·
Magnus Ahlkvist avatar image
Magnus Ahlkvist answered
Is the data that you want to load ordered in any way? If it's loaded according to the clustered index, you could load without disabling the clustered index, and then rebuild the non clustered indexes. If the data is not (more or less) ordered according to the clustered key, go with suggestions from @Håkan Winther and @Grant Fritchey.
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Håkan Winther avatar image Håkan Winther commented ·
+1 That is true, if the data is sorted according to clustered index, keep the clustered index, otherwise drop/disable it
1 Like 1 ·
Pavel Pawlowski avatar image Pavel Pawlowski commented ·
Eactly if data comes in order of clustered index, then keep the index. Also if that data are coming using BUIL operation, you nca try to play with Trace Flag 610 to get the insert operation Minimally Logged. (ORDER needs to be specified for the BULK operation). Then depending on the system you can gain much better performance.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.