How to reduce fragmentation on a clustered index

I have a database that is used to store data collected from a number of channels. It is real time data so each data point has an associated DateTime value. In its most basic form it it looks like this:

CREATE TABLE [dbo].[TestTable](
    [ChannelId] [int] NOT NULL,
    [DataTimeStamp] [datetime] NOT NULL,
    [Data] [float] NOT NULL

I have a clustered index on the ChannelId and DataTimeStamp to allow retrieval of data over a given time span. This works well for queries but the index fragments, and rebuilding is getting slower as the table grows in size.

I've looked changing the fill factor but I think this would be effective only if data was inserted with a random DataTimeStamp, rather than monotonically increasing.

Is there anything I can do to reduce the fragmentation? The only option I can see would be to create a separate table for each channel.

more ▼

asked Feb 02, 2010 at 06:09 AM in Default

avatar image

21 1 1 3

I should add that most of the time the data will be queried for a single channel with a specified date range.

Feb 02, 2010 at 08:22 AM Stewy
(comments are locked)
10|1200 characters needed characters left

1 answer: sort voted first

Change the clustered index to have DateTimeStamp as it's first key. You will still be able to query a channel for a given date range efficiently, with the added bonus that you will be able to query all channels for a given date range efficiently.

It will have a negative effect on querying for a particular channel with no specified date range, however.

I would still give it a fill factor appropriate to how your data is populated (i.e. if late data can be inserted then a fill factor would alleviate page splits in that scenario).

more ▼

answered Feb 02, 2010 at 06:14 AM

avatar image

Matt Whitfield ♦♦
29.5k 62 66 88

Would it also be good to have the datetimestamp as desc?

Feb 02, 2010 at 08:02 AM sp_lock

No I don't think so, because then new rows would want to go before existing rows, and pages naturally fill up forwards...

Feb 02, 2010 at 08:04 AM Matt Whitfield ♦♦

This would have the bonus of making queries for all channels over a given date range efficient, but querying for data from a single channel within a date range would then be slower.

Feb 02, 2010 at 08:20 AM Stewy

Any difference would be totally marginal. You would still get a clustered index seek for a date range for a specific channel ID. As I said in the answer, the negative impact would be when searching for a specific channel ID with no date range.

Feb 02, 2010 at 09:42 AM Matt Whitfield ♦♦

My concern was with the performance when retreiving data for a channelID over a large date range. There would certainly be a lot more disk IO as the data for each channel is spread throughout the table.

Feb 02, 2010 at 10:28 AM Stewy
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: Feb 02, 2010 at 06:09 AM

Seen: 2988 times

Last Updated: Feb 02, 2010 at 06:09 AM

Copyright 2018 Redgate Software. Privacy Policy