x

Filtered Statistics on columns with no indexes

Anyone can explain how would I benefit from filtered statistics on columns that do not have an index?

Update:

Based on [this link][1], query plans will differ as the optimizer will estimate more accurately. I see the differences in plans when I run under the AdventureWorks sample DB.

But I still do not see the benefit as the IO logical reads are still the same with and without filtered statistics.

[1]: http://msdn.microsoft.com/en-us/library/ms190397.aspx#CreateStatistics
more ▼

asked Oct 27 '10 at 02:27 PM in Default

ozamora gravatar image

ozamora
1.4k 2 3 5

(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first

If you have queries with complex but predictable where clauses which include multiple columns then creating filtered statistics might make sense. They are cheaper than indexes as far as the system resources are concerned, but still provide information to the engine about so-called cardinality estimates (a fancy word for selectivity). The information is stored for each prefix column of the statistics, meaning that if the queries fail to include the prefix column in the predicates then the statistics are useless, just like the indexes are useless when the first column of the index is not included.

I am sure that someone will have a better answer, but these are my 2 cents.

Oleg
more ▼

answered Oct 27 '10 at 03:17 PM

Oleg gravatar image

Oleg
15.9k 2 4 24

@Oleg - good answer. I am trying to get an example put together and the bloody thing won't do what I want it to.

I suspect I am getting some sort of plan reuse when I don't want it, but I cannot seem to fix it.
Oct 27 '10 at 03:25 PM WilliamD

@WilliamD If it is dev DB then you can just clear all them out:

dbcc dropcleanbuffers with no_infomsgs;
dbcc freeproccache with no_infomsgs;
dbcc freesystemcache('All') with no_infomsgs;
go
Ladies and Gentlemen! All existing plans have left the building :)
Oct 27 '10 at 03:35 PM Oleg
@Oleg - tried that, didn't help with my example. I think I'll post it as a wiki here and someone can take a look.
Oct 27 '10 at 03:37 PM WilliamD
@Oleg - +1 - not sure how anyone is going to have a better answer than that...
Oct 27 '10 at 03:39 PM Matt Whitfield ♦♦
Thanks guys. I understand the theory but still don't get the idea of filtered statistics for a table that has no indexes. At the end, the optimizer will do a cluster index scan or table scan. What do you think?
Oct 27 '10 at 05:44 PM ozamora
(comments are locked)
10|1200 characters needed characters left

I cannot seem to get this example to work as I wanted. Maybe it can be improved upon - it is just too late for me today.

I play in my test DB SandBox, that it all you will need to change to get this to work.

I clear the cache in this script too - DO NOT PLAY WITH THIS ON A PRODUCTION SYSTEM!!!!!!

/* Turn off Statistics creation and updates */
ALTER DATABASE SandBox SET AUTO_CREATE_STATISTICS OFF, AUTO_UPDATE_STATISTICS OFF

/* Create a test table */
CREATE TABLE dbo.TestTable
(Col1 int NOT NULL)
;
/* Insert a row */
INSERT INTO dbo.TestTable
        (Col1)
SELECT TOP 1 ROW_NUMBER() OVER (ORDER BY (SELECT NULL) ) FROM sys.columns
;


/* Now turn on Actual Estimation Plan*/
SET STATISTICS XML ON

/* Query Test Table */
SELECT * FROM dbo.TestTable
;
/* Estimate Rows = 1 , Actual Rows = 1*/

/* Insert lots of rows */
INSERT INTO dbo.TestTable
        (Col1)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL) )+1 FROM sys.columns
;
/* Query Test Table */
SELECT * FROM dbo.TestTable
;
/* Estimate Rows = 486 , Actual Rows = 486*/

/* Hmmm - no difference for table scans with no WHERE clause.... */


/* Query Test Table - with a WHERE clause */
SELECT * FROM dbo.TestTable WHERE col1<100
;
/* Estimate Rows = 145.8 , Actual Rows = 99 */

/* AHA!! The execution plan is different - SQL Server even warns that there are missing Stats on the execution plan */

/* Create a filtered statistic */
;
CREATE STATISTICS stat ON dbo.TestTable (Col1) WHERE col1 <100 
;
/* Query Test Table - with a WHERE clause */
SELECT * FROM dbo.TestTable WHERE col1<100
;
/* Estimate Rows = 145.8 , Actual Rows = 99 */

/* The execution plan is the same as before - SQL Server even warns that there are missing Stats on the execution plan - why is that? */

/* 
    Of course - the execution plan from the first select with WHERE clause has been cached and reused - PARAMETER SNIFFING!
    Let's clear the proc cache and try that again! Brute force cleanup, it can be done for a certain plan though
*/
-- Uncomment these lines to do the buffer cleanup
--dbcc dropcleanbuffers with no_infomsgs;
--dbcc freeproccache with no_infomsgs;
--dbcc freesystemcache('All') with no_infomsgs;

/* Query Test Table - with a WHERE clause */
SELECT * FROM dbo.TestTable WHERE col1<100
;
/* Estimate Rows = 145.8 , Actual Rows = 99 */


/* Drop table, reset statistics settings for DB */
DROP TABLE dbo.TestTable;
ALTER DATABASE SandBox SET AUTO_CREATE_STATISTICS ON, AUTO_UPDATE_STATISTICS ON
more ▼

answered Oct 27 '10 at 03:42 PM

WilliamD gravatar image

WilliamD
25.8k 17 19 41

@WilliamD

/* The execution plan is the same as before - SQL Server even warns that there are missing Stats on the execution plan - why is that? */
Is it posssible that this is because of the AUTO_UPDATE_STATISTICS OFF? What I mean is that though statistics are created, they are not actually there yet? In other words, if you remove the AUTO_UPDATE_STATISTICS OFF from the beginning of the script, maybe it will then do what you expect.
Oct 27 '10 at 03:52 PM Oleg
@WilliamD In my situation, your script did not need any modifications, it just ran as expected. What I mean is that after I uncommented dbcc lines and executed them, the next select with the predicate had shown consistent numbers (99) for both estimated and actual number of rows. The warning also disappeared. The only difference was that I used AdventureWorks database with few extra tables I created for some other tests and did not drop yet, so the number of inserted records happened to be 1329.
Oct 27 '10 at 04:53 PM Oleg

@Oleg - thanks for testing this. I dont know about the AUTO_UPDATE_STATISTCS OFF affecting this, as I create the statistic manually and then clear the cache - this should then go ahead and re-create the exec plan, and use any stats that are available.

Last night I had it work once when I was going through it, but then after that it didn't. It was past midnight and I was a little bit too tired to fix it then..... maybe another day, another blog post.
Oct 27 '10 at 11:53 PM WilliamD
(comments are locked)
10|1200 characters needed characters left
Thanks @WillianD. Seems that filtered stats are still not helping the optimizer. I will do some additional research about this. The
more ▼

answered Oct 27 '10 at 05:52 PM

ozamora gravatar image

ozamora
1.4k 2 3 5

You can invalidate plans by issuing

OPTION (RECOMPILE)
Oct 27 '10 at 08:22 PM ozamora
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x33
x1

asked: Oct 27 '10 at 02:27 PM

Seen: 1114 times

Last Updated: Oct 27 '10 at 08:21 PM