x
login about faq Site discussion (meta-askssc)

How to reduce fragmentation on a clustered index

I have a database that is used to store data collected from a number of channels. It is real time data so each data point has an associated DateTime value. In its most basic form it it looks like this:

CREATE TABLE [dbo].[TestTable](
    [ChannelId] [int] NOT NULL,
    [DataTimeStamp] [datetime] NOT NULL,
    [Data] [float] NOT NULL
)

I have a clustered index on the ChannelId and DataTimeStamp to allow retrieval of data over a given time span. This works well for queries but the index fragments, and rebuilding is getting slower as the table grows in size.

I've looked changing the fill factor but I think this would be effective only if data was inserted with a random DataTimeStamp, rather than monotonically increasing.

Is there anything I can do to reduce the fragmentation? The only option I can see would be to create a separate table for each channel.

more ▼

asked Feb 02 '10 at 06:09 AM in Default

Stewy gravatar image

Stewy
21 1 1 1

I should add that most of the time the data will be queried for a single channel with a specified date range.

Feb 02 '10 at 08:22 AM Stewy
(comments are locked)
10|1200 characters needed characters left

1 answer: sort voted first

Change the clustered index to have DateTimeStamp as it's first key. You will still be able to query a channel for a given date range efficiently, with the added bonus that you will be able to query all channels for a given date range efficiently.

It will have a negative effect on querying for a particular channel with no specified date range, however.

I would still give it a fill factor appropriate to how your data is populated (i.e. if late data can be inserted then a fill factor would alleviate page splits in that scenario).

more ▼

answered Feb 02 '10 at 06:14 AM

Matt Whitfield gravatar image

Matt Whitfield ♦♦
29.2k 56 63 87

Would it also be good to have the datetimestamp as desc?

Feb 02 '10 at 08:02 AM sp_lock

No I don't think so, because then new rows would want to go before existing rows, and pages naturally fill up forwards...

Feb 02 '10 at 08:04 AM Matt Whitfield ♦♦

This would have the bonus of making queries for all channels over a given date range efficient, but querying for data from a single channel within a date range would then be slower.

Feb 02 '10 at 08:20 AM Stewy

Any difference would be totally marginal. You would still get a clustered index seek for a date range for a specific channel ID. As I said in the answer, the negative impact would be when searching for a specific channel ID with no date range.

Feb 02 '10 at 09:42 AM Matt Whitfield ♦♦

My concern was with the performance when retreiving data for a channelID over a large date range. There would certainly be a lot more disk IO as the data for each channel is spread throughout the table.

Feb 02 '10 at 10:28 AM Stewy
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments



Facebook logo Follow Ask SSC on Facebook
Find Ask SSC on Google+
linkedin logo Find us on LinkedIn

Topics:

x1834
x220
x111
x18

asked: Feb 02 '10 at 06:09 AM

Seen: 1868 times

Last Updated: Feb 02 '10 at 06:09 AM

Copyright © 2002-2012 Simple Talk Publishing. All Rights Reserved. If you have any queries, please contact the site administrators.
Ask SQL Server Central is a community service provided by Red Gate.