Hi All I have just downloaded the "Project Real" sample from Microsoft and noticed that they have offered two partitioning solutions for the Data Warehouse, either: 1. Multiple Table (MT) Format 2. Partitioned table In the multiple table format they basically create tables and views etc dynamically (with a date on the end of the table name). Has anyone else used a similar design? Is it manageable? Or there any issues that anyone has come across? Thanks in advance...
As with all partitioning, you are only really going to get exceptional performance if you plan very carefully; in particular, how often do you need to visit data in older partitions, or in more than one partition? In a multiple-table setup, you often need to give your queries a bit of help to stay performant. A control table that keeps the physical table name and minimum/maximum-value of each clustered key might be a reasonable start (provided your database design uses append-friendly, sequential keys -- IDENTITY columns or 'guaranteed' date-based keys); at least then (with a bit of extra code) you'd be able to direct your updates and queries to the right sub-table first time. Simple things (such as counts) become complex and filters that don't directly target clustered indices have to be run against all sub-tables. You'll also need to consider re-partitioning (when to reorder the data within the tables; when to create the next table; when to recreate unifying views I would prefer partitioned tables, if Enterprise Edition is an option, as much of this managed for you. Of course, that's what you pay the big bucks for. Otherwise, you might want to consider locking-down direct access to the database itself and providing access only via a front-end web service that can access the data in the 'best' way you can. Perhaps, if you were to go this route, you may be interested in 'sharding' across separate databases instead. Something like the [Enzo Shard Library](
http://enzosqlshard.codeplex.com/) might be of interest to you *(full disclosure: don't work there, never used it, read good things across the web, been looking for an opportunity/excuse to try it)*
Partitioned tables and also the Multiple Table Format (Partitioned Views) have it's pluses and minuses. - Both can bring you a performance gain on very large table - Partitioned tables are enterprise feature, while Partitioned views you can created even on Standard version - Both have extra management cost - For partitioned tables you have to manage partition functions and partitione schemes - For partitioned views you have to manage check constraints on tables and in most cases instead of... triggers on the views - Both require a careful thinking on on the design - Partitioned table allows only horizontal partitioning. - Partitioned views allows you also vertical partitioning - Partitioned views allows you even distributed partitioning among several servers - With partitioned tables you can easily switch partitions between tables where in partitioned views you have to alter the view. - On partitioned tables you can use batch import, while on partitioned views not - and many more. In both scenarios you have to analyze the needs and design the clustered keys carefully. On enterprise version on server if the partitioning should be done on a single server a partitioned tables are better and less problematic solution. Even with partitioned views you can gain very good performance, as if you have correctly designed primary keys and Check constraints, query optimizer will recognize which tables are necessary to fulfill the query and will not touch the other not necessary. If you go using partitioned views and/or tables, in both situation you have to think clearly about the design and both scenarios have higher management costs, but both are manageable. I was using the partitioned views cca 2 times without any problems. With proper design you can have even unique IDENTITY column in the partitioned views scenario.