x

How to reduce fragmentation on a clustered index

I have a database that is used to store data collected from a number of channels. It is real time data so each data point has an associated DateTime value. In its most basic form it it looks like this:

CREATE TABLE [dbo].[TestTable](
    [ChannelId] [int] NOT NULL,
    [DataTimeStamp] [datetime] NOT NULL,
    [Data] [float] NOT NULL
)

I have a clustered index on the ChannelId and DataTimeStamp to allow retrieval of data over a given time span. This works well for queries but the index fragments, and rebuilding is getting slower as the table grows in size.

I've looked changing the fill factor but I think this would be effective only if data was inserted with a random DataTimeStamp, rather than monotonically increasing.

Is there anything I can do to reduce the fragmentation? The only option I can see would be to create a separate table for each channel.

more ▼

asked Feb 02, 2010 at 06:09 AM in Default

Stewy gravatar image

Stewy
21 1 1 1

I should add that most of the time the data will be queried for a single channel with a specified date range.
Feb 02, 2010 at 08:22 AM Stewy
(comments are locked)
10|1200 characters needed characters left

1 answer: sort voted first

Change the clustered index to have DateTimeStamp as it's first key. You will still be able to query a channel for a given date range efficiently, with the added bonus that you will be able to query all channels for a given date range efficiently.

It will have a negative effect on querying for a particular channel with no specified date range, however.

I would still give it a fill factor appropriate to how your data is populated (i.e. if late data can be inserted then a fill factor would alleviate page splits in that scenario).

more ▼

answered Feb 02, 2010 at 06:14 AM

Matt Whitfield gravatar image

Matt Whitfield ♦♦
29.4k 61 65 87

Would it also be good to have the datetimestamp as desc?
Feb 02, 2010 at 08:02 AM sp_lock
No I don't think so, because then new rows would want to go before existing rows, and pages naturally fill up forwards...
Feb 02, 2010 at 08:04 AM Matt Whitfield ♦♦
This would have the bonus of making queries for all channels over a given date range efficient, but querying for data from a single channel within a date range would then be slower.
Feb 02, 2010 at 08:20 AM Stewy
Any difference would be totally marginal. You would still get a clustered index seek for a date range for a specific channel ID. As I said in the answer, the negative impact would be when searching for a specific channel ID with no date range.
Feb 02, 2010 at 09:42 AM Matt Whitfield ♦♦
My concern was with the performance when retreiving data for a channelID over a large date range. There would certainly be a lot more disk IO as the data for each channel is spread throughout the table.
Feb 02, 2010 at 10:28 AM Stewy
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x1950
x251
x130
x20

asked: Feb 02, 2010 at 06:09 AM

Seen: 2631 times

Last Updated: Feb 02, 2010 at 06:09 AM