gotqn avatar image
gotqn asked

How to define if table is a good candidate for a clustered columnstore index?

I have read ([here][1],[here][2] and [here][3]) about clustered columnstore indexes introduces in SQL Server 2014. Basically, now: - Column store indexes can be updatable - Table schema can be modified (without drop column store indexes) - Structure of the base table can be columnar - Space saved by compression effects (with a column store index, you can save between 40 to 50 percent of initial space used for the table) In addition, they support: - Row mode and Batch mode processing - BULK INSERT statement - More data types AS I have understood there are some restricts like: 1. not supported data types 2. other indexes cannot be created but as it it said: > With a clustered column store index, all filter possibilities are > already covered; Query Processor, using Segment Elimination, will be > able to consider only the segments required by the query clauses. On > the columns where it cannot apply the Segment Elimination, all scans > will be faster than B-Tree index scans because data are compressed so > less I/O operations will be required. I am interested in the following things: - Does the statement above say that a clustered column store index is always better for extracting data then a B-Tree index when a lot of duplicated values exist - What about the performance between clustered column store index and non-clustered B-Tree `covering` index, when the table has many columns for example - Can I have a combination of clustered and non clustered columnstores indexes on one table and most importantly, can anyone tell how to define if a table is a good candidate for a columned stored index? It is said that the best candidates are tables for which update/delete/insert operations are not performed often. For example, I have table with storage size above 17 GB (about 70 millions rows) and new records are inserted and deleted constantly. On the other side, a lot of queries using its columns are performed. Or I have a table with storage size about 40 GB (about 60 millions rows) with many inserts performed each day - it is not queried often but I want to reduce its size. I know the answer is mostly in running production tests but before that I need to pick the better candidates. [1]: [2]: [3]:
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

Grant Fritchey avatar image
Grant Fritchey answered
If you need a combination of columnstore and non-columnstore indexes, you can't use the clustered columnstore index. The performance between the two types of storage units, b-tree and columnstore is not directly comparable. If you are doing point look up operations, retrieving a row, the b-tree structure is going to be radically better than the columnstore. If you are doing any type of aggregation, totals, averages, large set-based operations, the columnstore is going to be radically better. I would only consider the columnstore or clustered columnstore when I have that kind of clear difference in the data retrieval. If it's a common, mixed-use table, or an OLTP oriented table, I would not suggest using columnstrore at all.
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

gotqn avatar image gotqn commented ·
So, I should use index when I am performing aggregations and extracting large amount of records, but what about tables where regular updates occur? Could this really harm the benefits of using column store index? I am asking because as I have read the deletes are handle pretty clever (by using `deleted bitmap`).
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.