All things being equal do you get better throughput on old data if you keep a year in a single file or if you use 12 monthly partitions? Assume for the sake of argument that you need to access this data for trend analysis and "old" data will be utilized in 60% of the workloads. Current year will go on monthly partitons regardless.
My first intinct is that the monthly partitions are smaller and more quickly traversed. On the other hand you may experience disk thrashing if months are on the same drives.
Of course, more files means more threads. Maybe that can focus processors better and up throughput.
Also there is the maintenance overhead as you run into hundreds of files to support monthly partitions once you house a few years worth of data.
So would you combine or separate?
asked Feb 12 '10 at 06:36 PM in Default
I would say that it depends. You probably don't get any performance boost (unless you can spread your data to different disk groups), but you can get some other benefits.
answered Feb 14 '10 at 04:03 PM
I would say stick with one data storage set for the old data, but ensure it's on a filegroup with a few files. Reason is that it won't be added to, so there's no benefit in terms of reindexing (seeing as the data won't change). Secondly, there's little difference between it using several threads to read data from the same filegroup to using several threads to read data from different filegroups. Trend analysis would be unlikely to want data from a single partition only, so no benefit there.
It's weird too, because the MS page on partitioning uses language that doesn't inspire, in terms of performance gains: Partitioned Table and Index Concepts
answered Feb 12 '10 at 08:59 PM
Matt Whitfield ♦♦
Doesn't it largely depend on the access mechanisms? If the access for older data is by year as opposed to by month, then it might make sense. Otherwise, I'd continue the same storage methods that you started with.
answered Feb 12 '10 at 08:24 PM
Grant Fritchey ♦♦