question

Gustavo avatar image
Gustavo asked

Full Text Search Optimization

Hi, I am having a hard time trying to optimize a **Full Text Search** ( FTS ) on a ~1 billion rows table ( around 18Gb of reserved space used ). The main problem seems the amount of rows returned and not the search itself. For example, when searching for "*sales*" i got **1811766** rows and it takes **22secs**, when searching for "*pineapple*" i got **9** rows and it takes **0.1secs** **So far i tryed the following tips and optimizations:** - Separeted Disk with its own raid system ( low IO concurrency ) - clustered index on the FTS key. - timestamp column to enhance incremental update ( even if i usually do full updates at night ) - daily table index and FTS rebuild - using CONTAINSTALE instead of CONTAINS - config: max full-text crawl range set to the number of cpu cores ( 4x4 = 16 ) - config: resource_usage set to 5 ( from 3 ) - config: ft crawl bandwidth (max) set to 0 ( unlimited ) - config: ft notify bandwidth (max) set to 0 ( unlimited ) - Config: max server memory set to 44Gb ( 4Gb left to SO + fdhost.exe ) I am running on a win2003 with 4x quad-core CPUs, 48GbRam. SQL 2005 SP3, with all cumulative updates ( except last one or two ). I have very few bottlenecks right now, cpu never allways under 80%, our new fiber channel solved most IO problems, and Memory usage also seems ok ( Page Life Expectancy never goes too down, and buffer cache hit ratio is 98+ ). **These are some of my sources of information so far:** - http://technet.microsoft.com/en-us/library/cc917695.aspx - http://technet.microsoft.com/en-us/library/ms142560.aspx - http://sqlserverpedia.com/wiki/FTS_-_Performance **Question: What else should i do to get a faster response time ?** Any help is appreciated, Thanks.
sql-server-2005full-textoptimizationtuning
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

WilliamD avatar image
WilliamD answered
Gustavo, What does the system look like when you run the slower query (I/O, CPU etc.)? What does your storage subsystem look like (RAID Levels, # of physical disks per volume)? Have you ensured the partitions are aligned? This is a common problem, as the default behaviour can cause performance problems - [Technet Whitepaper][1] [1]: http://technet.microsoft.com/en-us/library/cc966412.aspx#EEAA
5 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Hi William, Thanks for answering. When i reach a lot of simultaneous requests cpu goes up and 'page life expectancy' goes down. Meaning that i will have problems to scale this in the future, but right now i am able to handle them without a clear bottleneck. I have two separate raid system, both aligned and build of 12 + 12 Hds( raid 10 ). FTS is on the second raid ( usually used for failover only ). The Problem seems to be the number of returned rows, since searching for "pineabple" is instant, but for "sales" takes ~22secs.
0 Likes 0 ·
Gustavo, it sounds like the FTS is working as it should (as I understand it). The smaller search is faster, because there really is less data to plough through to get your results - it may even still be in cache with sub-second speeds. The larger query is slower because you are looking for more data (probably off disk). What I am not too sure about with FTS is the behaviour in terms of caching. If Page Life Expectancy is dropping, this would suggest that the cache is being emptied to allow the data from the FTS engine to be cached. This would be pumping data off disk and into your RAM and be the reason for 20+ second queries. Can you check that your I/O subsystem is up to the task when you issue such a large search (or two)? I would not be surprised to see full whack on that array whilst it tries to pull all the data off disk.
0 Likes 0 ·
Willian, Indeed, while searching for "sales" a lot more of IO and memory resources are used, the behavior is completly different when searching for "pineapple". But that amount of resources needed should be supported by the actual hardware and the setups i have made. Is hardware scaling my only viable solution in this case ?
0 Likes 0 ·
Gustavo, one possibility would be to try partitioning your table (if logically possible and you have enterprise edition). Once you have your partitions, you could then create indexed views on these and Full Text Indexes on these partitioned views. How does that sound?
0 Likes 0 ·
Its worth a try thanks. I have also request some more RAM, upgrading from 48Gb to 64Gb might give some benefits too. I will write next time with the results.
0 Likes 0 ·
evgenik avatar image
evgenik answered
SQL 2005 IS THE BOTTLENECK. Move to 2008 R2 or better 2012. The FTS is dramatically faster and diffrent there. 2012 is ~1.5 faster than 2008 R2. Also: cache in application the one word top N results for all possible prefixes. In my ~80M words case it loaded only 1Gb of memory.
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

ThomasRushton avatar image
ThomasRushton answered
The problem is, as mentioned in the comments, most likely to be the sheer volume of data being returned - there's a limit to how fast traditional HDDs can shift data off disk, through the server, and back out to the client application. An alternative to sheer hardware scaling might be (if that sort of query is run often enough) to partition the data so that this particular table & Full Text Index data are on SSD rather than a normal HDD/array.
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.