x

Full Text Search Optimization

Hi,

I am having a hard time trying to optimize a Full Text Search ( FTS ) on a ~1 billion rows table ( around 18Gb of reserved space used ).

The main problem seems the amount of rows returned and not the search itself. For example, when searching for "*sales*" i got 1811766 rows and it takes 22secs, when searching for "*pineapple*" i got 9 rows and it takes 0.1secs

So far i tryed the following tips and optimizations:

  • Separeted Disk with its own raid system ( low IO concurrency )

  • clustered index on the FTS key.

  • timestamp column to enhance incremental update ( even if i usually do full updates at night )

  • daily table index and FTS rebuild

  • using CONTAINSTALE instead of CONTAINS

  • config: max full-text crawl range set to the number of cpu cores ( 4x4 = 16 )

  • config: resource_usage set to 5 ( from 3 )

  • config: ft crawl bandwidth (max) set to 0 ( unlimited )

  • config: ft notify bandwidth (max) set to 0 ( unlimited )

  • Config: max server memory set to 44Gb ( 4Gb left to SO + fdhost.exe )

I am running on a win2003 with 4x quad-core CPUs, 48GbRam. SQL 2005 SP3, with all cumulative updates ( except last one or two ).

I have very few bottlenecks right now, cpu never allways under 80%, our new fiber channel solved most IO problems, and Memory usage also seems ok ( Page Life Expectancy never goes too down, and buffer cache hit ratio is 98+ ).

These are some of my sources of information so far:

Question: What else should i do to get a faster response time ?

Any help is appreciated, Thanks.

more ▼

asked Jul 22 '10 at 01:53 PM in Default

Gustavo gravatar image

Gustavo
592 4 4 7

(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first

Gustavo,

What does the system look like when you run the slower query (I/O, CPU etc.)?

What does your storage subsystem look like (RAID Levels, # of physical disks per volume)?

Have you ensured the partitions are aligned? This is a common problem, as the default behaviour can cause performance problems - [Technet Whitepaper][1]

[1]: http://technet.microsoft.com/en-us/library/cc966412.aspx#EEAA
more ▼

answered Jul 23 '10 at 12:20 AM

WilliamD gravatar image

WilliamD
25.8k 17 19 41

Hi William, Thanks for answering.

When i reach a lot of simultaneous requests cpu goes up and 'page life expectancy' goes down. Meaning that i will have problems to scale this in the future, but right now i am able to handle them without a clear bottleneck.

I have two separate raid system, both aligned and build of 12 + 12 Hds( raid 10 ). FTS is on the second raid ( usually used for failover only ).

The Problem seems to be the number of returned rows, since searching for "pineabple" is instant, but for "sales" takes ~22secs.
Jul 23 '10 at 06:17 AM Gustavo

Gustavo,

it sounds like the FTS is working as it should (as I understand it). The smaller search is faster, because there really is less data to plough through to get your results - it may even still be in cache with sub-second speeds.

The larger query is slower because you are looking for more data (probably off disk). What I am not too sure about with FTS is the behaviour in terms of caching. If Page Life Expectancy is dropping, this would suggest that the cache is being emptied to allow the data from the FTS engine to be cached. This would be pumping data off disk and into your RAM and be the reason for 20+ second queries.

Can you check that your I/O subsystem is up to the task when you issue such a large search (or two)? I would not be surprised to see full whack on that array whilst it tries to pull all the data off disk.
Jul 23 '10 at 06:54 AM WilliamD

Willian,

Indeed, while searching for "sales" a lot more of IO and memory resources are used, the behavior is completly different when searching for "pineapple".

But that amount of resources needed should be supported by the actual hardware and the setups i have made.

Is hardware scaling my only viable solution in this case ?
Jul 26 '10 at 07:25 AM Gustavo

Gustavo, one possibility would be to try partitioning your table (if logically possible and you have enterprise edition).

Once you have your partitions, you could then create indexed views on these and Full Text Indexes on these partitioned views.

How does that sound?
Jul 27 '10 at 12:41 AM WilliamD

Its worth a try thanks.

I have also request some more RAM, upgrading from 48Gb to 64Gb might give some benefits too.

I will write next time with the results.
Jul 28 '10 at 11:02 AM Gustavo
(comments are locked)
10|1200 characters needed characters left

SQL 2005 IS THE BOTTLENECK. Move to 2008 R2 or better 2012. The FTS is dramatically faster and diffrent there. 2012 is ~1.5 faster than 2008 R2.

Also: cache in application the one word top N results for all possible prefixes. In my ~80M words case it loaded only 1Gb of memory.
more ▼

answered Sep 04 '12 at 04:49 AM

evgenik gravatar image

evgenik
0

(comments are locked)
10|1200 characters needed characters left

The problem is, as mentioned in the comments, most likely to be the sheer volume of data being returned - there's a limit to how fast traditional HDDs can shift data off disk, through the server, and back out to the client application.

An alternative to sheer hardware scaling might be (if that sort of query is run often enough) to partition the data so that this particular table & Full Text Index data are on SSD rather than a normal HDD/array.
more ▼

answered Sep 04 '12 at 07:53 AM

ThomasRushton gravatar image

ThomasRushton ♦
33.3k 13 20 44

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x1933
x43
x34
x19

asked: Jul 22 '10 at 01:53 PM

Seen: 2413 times

Last Updated: Sep 04 '12 at 07:53 AM