question

Jagd avatar image
Jagd asked

Significant difference between contains and containstable?

I have a full-text index created upon a column of type varchar(max) with filestream enabled. The filestream contains data such as JPG's, TIF's, PDF's, and XML's (although this is mostly irrelevant to the question, I believe). I have two queries that I've created that allow me to search upon the index. Full-text search #1-- select parentObj.ObjectID as 'GroupingID', parentObj.Name as 'Grouping', childObj.ObjectID as 'FileObjID', childObj.Name as 'FileName', fs.FileStreamID from dbo.dat_FileStream fs inner join dbo.dat_Object childObj on fs.ObjectID = childObj.ObjectID inner join dbo.dat_Collection c on fs.ObjectID = c.ObjectID inner join dbo.dat_Object parentObj on c.ParentID = parentObj.ObjectID where contains(FileStreamData, @srchTerm) and parentObj.ObjectTypeID = 1 ORDER BY 'Grouping' Full-text search #2-- select KEY_TBL.RANK, parentObj.ObjectID as 'GroupingID', parentObj.Name as 'Grouping', childObj.ObjectID as 'FileObjID', childObj.Name as 'FileName', fs.FileStreamID from dbo.dat_FileStream fs inner join containstable(dbo.dat_FileStream, FileStreamData, @srchTerm, 1000) as KEY_TBL on fs.FileStreamID = KEY_TBL.[KEY] inner join dbo.dat_Object childObj on fs.ObjectID = childObj.ObjectID inner join dbo.dat_Collection c on fs.ObjectID = c.ObjectID inner join dbo.dat_Object parentObj on c.ParentID = parentObj.ObjectID where parentObj.ObjectTypeID = 1 ORDER BY 'Grouping' The only significant difference between the two full-text searches is query #1 uses contains and query #2 uses containstable. My problem is that the two queries do not always yield the same results. For example, were I to search on the phrase "Independent contractors", query #1 would yield a result set of 10 different documents (PDF's and XML's), while query #2 would only yield a result set of 6. This seems to be the rule: query #1 always yields a few more than query #2, and query #2 always yields the exact same matches that query #1 does. Query #1 - search on "Independent contractors" yields: 4262 AAA-00-12 4561 AAA-00-12.pdf 4235 4316 AAA-00-15 4753 AAA-00-15.pdf 4427 4316 AAA-00-15 4754 AAA-00-15.xml 4428 3873 AAA-00-19 4784 AAA-00-19.pdf 4458 3903 AAA-00-22 6795 AAA-00-22.pdf 6459 3953 AAA-00-24 6899 AAA-00-24.pdf 6563 3953 AAA-00-24 6900 AAA-00-24.xml 6564 4842 AAA-00-9 4905 AAA-00-9.pdf 4577 4842 AAA-00-9 4906 AAA-00-9.xml 4578 4057 AAA-0001 4260 AAA-0001.pdf 3936 Query #2 - search on "Independent contractors" yields: 19 4262 AAA-00-12 4561 AAA-00-12.pdf 4235 126 4316 AAA-00-15 4754 AAA-00-15.xml 4428 126 4316 AAA-00-15 4753 AAA-00-15.pdf 4427 116 3873 AAA-00-19 4784 AAA-00-19.pdf 4458 125 3903 AAA-00-22 6795 AAA-00-22.pdf 6459 57 3953 AAA-00-24 6900 AAA-00-24.xml 6564 57 3953 AAA-00-24 6899 AAA-00-24.pdf 6563 Thanks for any help that I receive on this!!
t-sqlfull-textsearch
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

·
WilliamD avatar image
WilliamD answered
Jagd - from reading the information in [BOL][1] I understand the main difference between `CONTAINS` and `CONTAINSTABLE` to be the option of further filtering the FT search when using CONSTAINSTABLE. You can tell the search to only look near to a word when using `CONTAINSTABLE`, you can also control the thesaurus lookup with `CONTAINSTABLE` better. I would imagine that is what you are seeing, as `CONTAINSTABLE` is returning a subset of `CONTAINS`. I think the explanation in BOL above is quite clear. HTH [1]: http://technet.microsoft.com/en-us/library/ms187787.aspx
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.