Hi all,

I have a table which I load every month with new incoming data. The loading is done in an SSIS package, which disables the non-clustered indexes (except UNIQUE indexes) on the table, loads the data, and then rebuilds the non-clustered indexes. The table has grown quite a bit, so the index rebuild gets slower by the month: loading the data takes 20 minutes, but rebuilding the indexes takes six hours.

I'm looking at solving this by partitioning the table, loading each month's data into a staging table, and switching it into the main table. I don't really have any problems with creating the partition function and scheme etc., but I'm kind of stuck on the clustered index. Right now I have an identity column as the PK and clustered index. I also have a unique, non-clustered index on date+code. I did have date+code as the primary key initially, but that caused too many page splits, because the data source is a text file which is not in any way sorted by code.

When I put the table onto the partition scheme, I understand I'll have to drop the clustered index and rebuild it on the partition scheme, and then I'll need the date in the clustered index. One idea I have is to make the clustered primary key consist of ID+date. That would solve the page splits, and it would also save me from having to work out which identity seed to use in the loading job.

But I'm guessing I'm not the only one who has run into this scenario. Any suggestions on how to continue? I don't really want to experiment too much with it, as rebuilding the clustered index will take more or less forever :) So I'd like to have your input before I decide which way to go.
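For concreteness, here is a minimal sketch of the ID+date idea, with hypothetical names throughout (pfMonthly, psMonthly, dbo.MainTable, LoadDate, Code are mine, not from the original post) and example boundary dates:

```sql
-- Hypothetical partition function/scheme; adjust boundaries to your months.
CREATE PARTITION FUNCTION pfMonthly (date)
AS RANGE RIGHT FOR VALUES ('2012-01-01', '2012-02-01', '2012-03-01');

CREATE PARTITION SCHEME psMonthly
AS PARTITION pfMonthly ALL TO ([PRIMARY]);

-- The partitioning column (LoadDate) must be part of the clustered key,
-- hence the proposed ID + date combination.
CREATE TABLE dbo.MainTable
(
    ID       bigint IDENTITY(1,1) NOT NULL,
    LoadDate date        NOT NULL,
    Code     varchar(20) NOT NULL,
    -- ...other columns...
    CONSTRAINT PK_MainTable PRIMARY KEY CLUSTERED (ID, LoadDate)
        ON psMonthly (LoadDate)
);

-- The unique non-clustered index must contain the partitioning column to be
-- partition-aligned (a requirement for switching); (LoadDate, Code) already is.
CREATE UNIQUE NONCLUSTERED INDEX UX_MainTable_Date_Code
    ON dbo.MainTable (LoadDate, Code)
    ON psMonthly (LoadDate);
```

Note that adding LoadDate to the clustered key keeps inserts appending at the end of each partition (the identity still leads), which is what avoids the page splits mentioned above.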
I would test my partitioning keys on a new (empty) table to find out which key works best. How large is your file? It doesn't sound that big if it takes only 20 minutes to load. The best option would be to load the data into a staging table with the same structure as the target table, minus the non-clustered indexes, and then add the non-clustered indexes after the load. Once that's done, just switch the data in with partition switching; it will take a fraction of a second, because it's a metadata-only change and not a DML operation.
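The load-then-switch step could look roughly like this. This is a sketch under the same hypothetical names and dates as above (dbo.StagingTable, pfMonthly, LoadDate, the March 2012 boundary are all assumptions, not from the thread):

```sql
-- The staging table must match the target exactly: columns, clustered key,
-- and the same filegroup as the target partition.
CREATE TABLE dbo.StagingTable
(
    ID       bigint IDENTITY(1,1) NOT NULL,
    LoadDate date        NOT NULL,
    Code     varchar(20) NOT NULL,
    -- ...other columns...
    CONSTRAINT PK_StagingTable PRIMARY KEY CLUSTERED (ID, LoadDate)
) ON [PRIMARY];

-- A trusted CHECK constraint proving every row belongs to the target
-- partition is required before the switch is allowed.
ALTER TABLE dbo.StagingTable WITH CHECK
ADD CONSTRAINT CK_Staging_Month
CHECK (LoadDate >= '2012-03-01' AND LoadDate < '2012-04-01');

-- Bulk load the file here, then build the non-clustered indexes...
CREATE UNIQUE NONCLUSTERED INDEX UX_Staging_Date_Code
    ON dbo.StagingTable (LoadDate, Code);

-- ...and switch the loaded month into the main table (metadata-only).
ALTER TABLE dbo.StagingTable
SWITCH TO dbo.MainTable PARTITION $PARTITION.pfMonthly('2012-03-01');
```

The six-hour rebuild then shrinks to indexing just one month of data in the staging table, while the switch itself is near-instant regardless of table size.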