question

ozamora avatar image
ozamora asked

Filtered Statistics on columns with no indexes

Anyone can explain how would I benefit from filtered statistics on columns that do not have an index? Update: Based on [this link][1], query plans will differ as the optimizer will estimate more accurately. I see the differences in plans when I run under the AdventureWorks sample DB. But I still do not see the benefit as the IO logical reads are still the same with and without filtered statistics. [1]: http://msdn.microsoft.com/en-us/library/ms190397.aspx#CreateStatistics
statisticsfiltered
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image
Oleg answered
If you have queries with complex but predictable where clauses which include multiple columns then creating filtered statistics might make sense. They are cheaper than indexes as far as the system resources are concerned, but still provide information to the engine about so-called cardinality estimates (a fancy word for selectivity). The information is stored for each prefix column of the statistics, meaning that if the queries fail to include the prefix column in the predicates then the statistics are useless, just like the indexes are useless when the first column of the index is not included. I am sure that someone will have a better answer, but these are my 2 cents. Oleg
7 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
@WilliamD If it is dev DB then you can just clear all them out:
dbcc dropcleanbuffers with no_infomsgs;
dbcc freeproccache with no_infomsgs;
dbcc freesystemcache('All') with no_infomsgs;
go
Ladies and Gentlemen! All existing plans have left the building :)
1 Like 1 ·
WilliamD avatar image WilliamD commented ·
@Oleg - good answer. I am trying to get an example put together and the bloody thing won't do what I want it to. I suspect I am getting some sort of plan reuse when I don't want it, but I cannot seem to fix it.
0 Likes 0 ·
WilliamD avatar image WilliamD commented ·
@Oleg - tried that, didn't help with my example. I think I'll post it as a wiki here and someone can take a look.
0 Likes 0 ·
Matt Whitfield avatar image Matt Whitfield ♦♦ commented ·
@Oleg - +1 - not sure how anyone is going to have a better answer than that...
0 Likes 0 ·
ozamora avatar image ozamora commented ·
Thanks guys. I understand the theory but still don't get the idea of filtered statistics for a table that has no indexes. At the end, the optimizer will do a cluster index scan or table scan. What do you think?
0 Likes 0 ·
Show more comments
WilliamD avatar image
WilliamD answered
I cannot seem to get this example to work as I wanted. Maybe it can be improved upon - it is just too late for me today. I play in my test DB SandBox, that it all you will need to change to get this to work. I clear the cache in this script too - ***DO NOT PLAY WITH THIS ON A PRODUCTION SYSTEM!!!!!!*** /* Turn off Statistics creation and updates */ ALTER DATABASE SandBox SET AUTO_CREATE_STATISTICS OFF, AUTO_UPDATE_STATISTICS OFF /* Create a test table */ CREATE TABLE dbo.TestTable (Col1 int NOT NULL) ; /* Insert a row */ INSERT INTO dbo.TestTable (Col1) SELECT TOP 1 ROW_NUMBER() OVER (ORDER BY (SELECT NULL) ) FROM sys.columns ; /* Now turn on Actual Estimation Plan*/ SET STATISTICS XML ON /* Query Test Table */ SELECT * FROM dbo.TestTable ; /* Estimate Rows = 1 , Actual Rows = 1*/ /* Insert lots of rows */ INSERT INTO dbo.TestTable (Col1) SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL) )+1 FROM sys.columns ; /* Query Test Table */ SELECT * FROM dbo.TestTable ; /* Estimate Rows = 486 , Actual Rows = 486*/ /* Hmmm - no difference for table scans with no WHERE clause.... */ /* Query Test Table - with a WHERE clause */ SELECT * FROM dbo.TestTable WHERE col1<100 ; /* Estimate Rows = 145.8 , Actual Rows = 99 */ /* AHA!! The execution plan is different - SQL Server even warns that there are missing Stats on the execution plan */ /* Create a filtered statistic */ ; CREATE STATISTICS stat ON dbo.TestTable (Col1) WHERE col1 <100 ; /* Query Test Table - with a WHERE clause */ SELECT * FROM dbo.TestTable WHERE col1<100 ; /* Estimate Rows = 145.8 , Actual Rows = 99 */ /* The execution plan is the same as before - SQL Server even warns that there are missing Stats on the execution plan - why is that? */ /* Of course - the execution plan from the first select with WHERE clause has been cached and reused - PARAMETER SNIFFING! Let's clear the proc cache and try that again! Brute force cleanup, it can be done for a certain plan though */ -- Uncomment these lines to do the buffer cleanup --dbcc dropcleanbuffers with no_infomsgs; --dbcc freeproccache with no_infomsgs; --dbcc freesystemcache('All') with no_infomsgs; /* Query Test Table - with a WHERE clause */ SELECT * FROM dbo.TestTable WHERE col1<100 ; /* Estimate Rows = 145.8 , Actual Rows = 99 */ /* Drop table, reset statistics settings for DB */ DROP TABLE dbo.TestTable; ALTER DATABASE SandBox SET AUTO_CREATE_STATISTICS ON, AUTO_UPDATE_STATISTICS ON
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
@WilliamD
/* The execution plan is the same as before - SQL Server even warns that there are missing Stats on the execution plan - why is that? */
Is it posssible that this is because of the AUTO_UPDATE_STATISTICS OFF? What I mean is that though statistics are created, they are not actually there yet? In other words, if you remove the AUTO_UPDATE_STATISTICS OFF from the beginning of the script, maybe it will then do what you expect.
0 Likes 0 ·
Oleg avatar image Oleg commented ·
@WilliamD In my situation, your script did not need any modifications, it just ran as expected. What I mean is that after I uncommented dbcc lines and executed them, the next select with the predicate had shown consistent numbers (99) for both estimated and actual number of rows. The warning also disappeared. The only difference was that I used AdventureWorks database with few extra tables I created for some other tests and did not drop yet, so the number of inserted records happened to be 1329.
0 Likes 0 ·
WilliamD avatar image WilliamD commented ·
@Oleg - thanks for testing this. I dont know about the AUTO_UPDATE_STATISTCS OFF affecting this, as I create the statistic manually and then clear the cache - this should then go ahead and re-create the exec plan, and use any stats that are available. Last night I had it work once when I was going through it, but then after that it didn't. It was past midnight and I was a little bit too tired to fix it then..... maybe another day, another blog post.
0 Likes 0 ·
ozamora avatar image
ozamora answered
Thanks @WillianD. Seems that filtered stats are still not helping the optimizer. I will do some additional research about this. The
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

ozamora avatar image ozamora commented ·
You can invalidate plans by issuing OPTION (RECOMPILE)
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.