|
I have a database that is used to store data collected from a number of channels. It is real time data so each data point has an associated DateTime value. In its most basic form it it looks like this:
I have a clustered index on the ChannelId and DataTimeStamp to allow retrieval of data over a given time span. This works well for queries but the index fragments, and rebuilding is getting slower as the table grows in size. I've looked changing the fill factor but I think this would be effective only if data was inserted with a random DataTimeStamp, rather than monotonically increasing. Is there anything I can do to reduce the fragmentation? The only option I can see would be to create a separate table for each channel.
(comments are locked)
|
|
Change the clustered index to have It will have a negative effect on querying for a particular channel with no specified date range, however. I would still give it a fill factor appropriate to how your data is populated (i.e. if late data can be inserted then a fill factor would alleviate page splits in that scenario). Would it also be good to have the datetimestamp as desc?
Feb 02 '10 at 08:02 AM
sp_lock
No I don't think so, because then new rows would want to go before existing rows, and pages naturally fill up forwards...
Feb 02 '10 at 08:04 AM
Matt Whitfield ♦♦
This would have the bonus of making queries for all channels over a given date range efficient, but querying for data from a single channel within a date range would then be slower.
Feb 02 '10 at 08:20 AM
Stewy
Any difference would be totally marginal. You would still get a clustered index seek for a date range for a specific channel ID. As I said in the answer, the negative impact would be when searching for a specific channel ID with no date range.
Feb 02 '10 at 09:42 AM
Matt Whitfield ♦♦
My concern was with the performance when retreiving data for a channelID over a large date range. There would certainly be a lot more disk IO as the data for each channel is spread throughout the table.
Feb 02 '10 at 10:28 AM
Stewy
(comments are locked)
|


I should add that most of the time the data will be queried for a single channel with a specified date range.