x

Data Partitions, Old Data and Performance

All things being equal do you get better throughput on old data if you keep a year in a single file or if you use 12 monthly partitions? Assume for the sake of argument that you need to access this data for trend analysis and "old" data will be utilized in 60% of the workloads. Current year will go on monthly partitons regardless.

My first intinct is that the monthly partitions are smaller and more quickly traversed. On the other hand you may experience disk thrashing if months are on the same drives.

Of course, more files means more threads. Maybe that can focus processors better and up throughput.

Also there is the maintenance overhead as you run into hundreds of files to support monthly partitions once you house a few years worth of data.

So would you combine or separate?

more ▼

asked Feb 12, 2010 at 06:36 PM in Default

Blackhawk-17 gravatar image

Blackhawk-17
11.9k 28 31 37

(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first

I would say that it depends. You probably don't get any performance boost (unless you can spread your data to different disk groups), but you can get some other benefits.

  • If your queries doesn't need data from all tables, it is faster to use small tables then one huge table.
  • As a rule of thumb, smaller tables are easier to maintain (index & statistics), but probably your index doesn't need rebuilds.
  • You can use filegroup backup (if your tables are spread over different filegroups) that is faster than a full (at the cost of complexity of course)
  • A performance benefit of moving "old" data to a separate partition is that you can probably set the filegroup to "read only" and SQL server doesn't need to handle locks on the table/tables.
more ▼

answered Feb 14, 2010 at 04:03 PM

Håkan Winther gravatar image

Håkan Winther
15.6k 35 37 48

(comments are locked)
10|1200 characters needed characters left

I would say stick with one data storage set for the old data, but ensure it's on a filegroup with a few files. Reason is that it won't be added to, so there's no benefit in terms of reindexing (seeing as the data won't change). Secondly, there's little difference between it using several threads to read data from the same filegroup to using several threads to read data from different filegroups. Trend analysis would be unlikely to want data from a single partition only, so no benefit there.

It's weird too, because the MS page on partitioning uses language that doesn't inspire, in terms of performance gains: Partitioned Table and Index Concepts

more ▼

answered Feb 12, 2010 at 08:59 PM

Matt Whitfield gravatar image

Matt Whitfield ♦♦
29.5k 61 65 87

(comments are locked)
10|1200 characters needed characters left

Doesn't it largely depend on the access mechanisms? If the access for older data is by year as opposed to by month, then it might make sense. Otherwise, I'd continue the same storage methods that you started with.

more ▼

answered Feb 12, 2010 at 08:24 PM

Grant Fritchey gravatar image

Grant Fritchey ♦♦
103k 19 21 74

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x251
x53

asked: Feb 12, 2010 at 06:36 PM

Seen: 1287 times

Last Updated: Feb 12, 2010 at 06:36 PM