question

Jon Crawford avatar image
Jon Crawford asked

How to use LIKE when multiple patterns need matched

I have a list of acceptable values that I need to search for within a text field. I can whittle the list down for records that are potential matches, but I need to know exact matches to decide when/if to include them in the results. Currently I'm using this terrible join, surely there's a better way to do this that I'm just not thinking of? IF OBJECT_ID('tempdb..#acceptable') IS NULL BEGIN CREATE TABLE #acceptable ( iRow int identity(1, 1), value varchar(15) ); INSERT INTO #acceptable ( value ) SELECT 'Frodo' UNION SELECT 'Gimli' UNION SELECT 'Gandalf' UNION SELECT 'Boromir' UNION SELECT 'Aragorn' UNION SELECT 'Legolas' UNION SELECT 'Sam' UNION SELECT 'Bill' UNION SELECT 'Merriweather' UNION SELECT 'Peregrin'; END IF OBJECT_ID('tempdb..#tableToCheck') IS NULL BEGIN CREATE TABLE #tableToCheck ( iRow int identity(1, 1), columnToCheck varchar(500) ); INSERT INTO #tableToCheck ( columnToCheck ) SELECT 'Once there was a Hobbit named Frodo Baggins' UNION SELECT 'He had some friends who took him camping' UNION SELECT 'These friends included Aragorn, who had never lived up to his father''s father''s father''s expectations,' UNION SELECT 'Boromir the Bastard, who ends up being tarred and feathered, sans tar;' UNION SELECT 'Some old git named Gandalf the Grey, because he never took a bath until he fell into a mine pit' UNION SELECT 'and a few other folks like Merriweather, Legolas and most importantly Bill. Oh, and Smeagol, but whatever.'; END SELECT * FROM #tableToCheck JOIN #acceptable ON #tableToCheck.columnToCheck LIKE '%' + #acceptable.value + '%' ORDER BY #acceptable.iRow
t-sqljoinslike
2 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Full-Text indexing?
0 Likes 0 ·
I must have spelled it wrong, I said "better". In all seriousness though, not sure that's an option unless I copy our production data from the vendor db into our db and set it up there. I'll look into it more.
0 Likes 0 ·

1 Answer

· Write an Answer
Oleg avatar image
Oleg answered
If the version of SQL Server in question is 2016 or higher then [string_split][1] may be used to split the values in the table to check so that more detailed info can be retrieved via an equality join. For example: -- Find all matches with the iRow repeated for each found match should -- the columnToCheck value result in multiple matches for the same row select t.iRow, s.[Value] from #tableToCheck t cross apply string_split(replace(replace(columnToCheck, ',', ''), '.', ''), ' ') s inner join #acceptable a on a.[value] = s.[Value]; If the SQL Server version is earlier than 2016, the[ SQL 8K CSV Splitter Funciton][2] by @Jeff Moden may be used in place of **string\_split**. With this approach, it is pretty straightforward to get more detailed info about the match results: -- Display the number of matching words count along with the list of all found words ;with matches as ( select t.iRow, s.[Value], count(1) over (partition by t.iRow) MatchCount from #tableToCheck t cross apply string_split(replace(replace(columnToCheck, ',', ''), '.', ''), ' ') s inner join #acceptable a on a.[value] = s.[Value] ) select t.iRow, max(t.MatchCount) MatchCount, max(stuff(m.List, 1, 2, '')) List from matches t cross apply ( select ', ' + [Value] from matches where iRow = t.iRow for xml path('') ) m(List) group by t.iRow; Of course, **matches** with cross applied **matches** is a bit too heavy because the cross apply kicks in for each row and then the results are aggregated. Substituting the **matches** with the actual table (before cross applying) will reduce the work because there will be no need to aggregate anything. Based on the sample data in question, the second query produces the following results: iRow MatchCount List ----------- ----------- ------------------------------ 1 1 Frodo 3 1 Aragorn 4 1 Boromir 5 1 Gandalf 6 3 Merriweather, Legolas, Bill Hope this helps. Oleg [1]: https://docs.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql [2]: http://www.sqlservercentral.com/articles/Tally+Table/72993/
1 comment
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

+1. Definitely would go for it but if the purpose could have been different. I would rather use the CASE statement to find if there are exactly matches after the pattern matching is already done. But that is just my perception.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.