I am new to SQL Server but I have to store 20 million rows per day (1MB per row) in a table that will then be queried later on. The table has about 30 columns and the query can be based on any of these columns. The table will store data for up to 7 years before it is archived.
In order to get sub-second response time with 40 concurrent users what is the correct approach:
1) Is creating an index on every column the right way to go? 2) Based on the data size should the table be partitioned?
asked Aug 01, 2012 at 05:48 PM in Default
There is no simple answer to this question, it all depends on a lot of things, but I give you some recommendations. I would recommend you to partition the table. By using partitioning, SQL server may use partition elimination to exclude the partitions that will not be used.
Adding an index for every column might help you a little bit, BUT it will not solve every query (and will decrease insert performance and will probably give you a lot of index fragmentation) as you'll probably get a lot of index seek + key lookup to get all the columns you need in the result. In theory, You can of course put the other columns in the include clause, but that will increase the amount of data 30 times and will decrease your insert performance. In reallity you can't to that.
Column store index implemented in SQL server 2012 for this type of scenario (adhoc querying). If you don't need to update or delete old data, I would recommend you to create a Columnstore index, with every column included. That will increase the performance very much, and it is a good reason to upgrade to SQL Server 2012 Enterprise Edition.