x

Significant difference between contains and containstable?

I have a full-text index created upon a column of type varchar(max) with filestream enabled. The filestream contains data such as JPG's, TIF's, PDF's, and XML's (although this is mostly irrelevant to the question, I believe).

I have two queries that I've created that allow me to search upon the index.

Full-text search #1--

select
  parentObj.ObjectID as 'GroupingID',
  parentObj.Name as 'Grouping',
  childObj.ObjectID as 'FileObjID', 
  childObj.Name as 'FileName',
  fs.FileStreamID
from dbo.dat_FileStream fs
inner join dbo.dat_Object childObj
    on fs.ObjectID = childObj.ObjectID
inner join dbo.dat_Collection c
    on fs.ObjectID = c.ObjectID
inner join dbo.dat_Object parentObj
    on c.ParentID = parentObj.ObjectID
where contains(FileStreamData, @srchTerm)
    and parentObj.ObjectTypeID = 1
ORDER BY 'Grouping'

Full-text search #2--

select
  KEY_TBL.RANK,
  parentObj.ObjectID as 'GroupingID',
  parentObj.Name as 'Grouping',
  childObj.ObjectID as 'FileObjID',
  childObj.Name as 'FileName',
  fs.FileStreamID
from dbo.dat_FileStream fs
inner join containstable(dbo.dat_FileStream, FileStreamData, @srchTerm, 1000) as KEY_TBL
    on fs.FileStreamID = KEY_TBL.[KEY]
inner join dbo.dat_Object childObj
    on fs.ObjectID = childObj.ObjectID
inner join dbo.dat_Collection c
    on fs.ObjectID = c.ObjectID
inner join dbo.dat_Object parentObj
    on c.ParentID = parentObj.ObjectID
where parentObj.ObjectTypeID = 1
ORDER BY 'Grouping'

The only significant difference between the two full-text searches is query #1 uses contains and query #2 uses containstable.

My problem is that the two queries do not always yield the same results. For example, were I to search on the phrase "Independent contractors", query #1 would yield a result set of 10 different documents (PDF's and XML's), while query #2 would only yield a result set of 6. This seems to be the rule: query #1 always yields a few more than query #2, and query #2 always yields the exact same matches that query #1 does.

Query #1 - search on "Independent contractors" yields:

4262    AAA-00-12  4561 AAA-00-12.pdf   4235
4316    AAA-00-15  4753 AAA-00-15.pdf   4427
4316    AAA-00-15  4754 AAA-00-15.xml   4428
3873    AAA-00-19  4784 AAA-00-19.pdf   4458
3903    AAA-00-22  6795 AAA-00-22.pdf   6459
3953    AAA-00-24  6899 AAA-00-24.pdf   6563
3953    AAA-00-24  6900 AAA-00-24.xml   6564
4842    AAA-00-9   4905  AAA-00-9.pdf 4577
4842    AAA-00-9   4906  AAA-00-9.xml 4578
4057    AAA-0001   4260  AAA-0001.pdf 3936

Query #2 - search on "Independent contractors" yields:

19  4262 AAA-00-12   4561  AAA-00-12.pdf    4235
126 4316    AAA-00-15  4754 AAA-00-15.xml   4428
126 4316    AAA-00-15  4753 AAA-00-15.pdf   4427
116 3873    AAA-00-19  4784 AAA-00-19.pdf   4458
125 3903    AAA-00-22  6795 AAA-00-22.pdf   6459
57  3953 AAA-00-24   6900  AAA-00-24.xml    6564
57  3953 AAA-00-24   6899  AAA-00-24.pdf    6563
Thanks for any help that I receive on this!!
more ▼

asked Nov 17, 2010 at 10:45 AM in Default

Jagd gravatar image

Jagd
21 3 3 3

(comments are locked)
10|1200 characters needed characters left

1 answer: sort voted first

Jagd - from reading the information in [BOL][1] I understand the main difference between CONTAINS and CONTAINSTABLE to be the option of further filtering the FT search when using CONSTAINSTABLE. You can tell the search to only look near to a word when using CONTAINSTABLE, you can also control the thesaurus lookup with CONTAINSTABLE better.

I would imagine that is what you are seeing, as CONTAINSTABLE is returning a subset of CONTAINS.

I think the explanation in BOL above is quite clear.

HTH

[1]: http://technet.microsoft.com/en-us/library/ms187787.aspx
more ▼

answered Nov 17, 2010 at 11:19 AM

WilliamD gravatar image

WilliamD
25.9k 17 19 41

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x991
x46
x21

asked: Nov 17, 2010 at 10:45 AM

Seen: 2161 times

Last Updated: Nov 17, 2010 at 10:45 AM