First off, this is my first time posting on this forum. I've got a database with a long list of strings, what I'd like to do is find how many of these strings contain other strings in the same table, and if they are sorted in descending order, what's it's index within that sub table?
For instance, if I have the strings a, await, wait, waiting, walk then the index for wait should be 2 (or 1 if we start counting from 0).
I was able to count how many occurrences there were (in this case 3).
Each name in the table is unique.
The query only needs to be run once every few months and has a list of about 2000 string.
Answer by Phil ·
I thought up something that I've done before using SQL and Excel. I used the following query: Select B.name, A.name From types as A, types as B WHERE A.name LIKE '%' + B.name + '%' ORDER BY B.name, A.name Which gave me the list of As which matched something in B. Then I exported to Excel and computed a new column. This column would start at 0 if the row above it containing A was not equal to the current rows A. Otherwise, it would add one. I then create another column which is either 1 if A and B of the current row match, otherwise it is empty. By sorting by this new column, and deleting all other rows with no entries on this column, I am left with a list of search strings and indexes, which is exactly what I want. It's a bit of a hack but considering the infrequency I have to do this, I'm fine with this solution. If there's a better way through SQL only though, I'd love to hear it.
Answer by Benjamin ·
This answer requires SQL Server 2005 or above. I'm not sure of the exact output you were looking forward, but I'm sure you can tweak that having this solution.
I didn't know the structure of your table so I can only present an example solution and not the exact solution with your table name and field names. Skip down to the SELECT portion to see how this works.
--create a table to hold the terms to search
CREATE TABLE [dbo].[Terms] (
[Word] varchar(30) NOT NULL )
--add a unique key that is clustered (for default ordering purposes)
CREATE UNIQUE CLUSTERED INDEX CX_Terms
ON [dbo].[Terms] ( [Word] )
--add some values
INSERT INTO [dbo].[Terms] VALUES ('a')
INSERT INTO [dbo].[Terms] VALUES ('await')
INSERT INTO [dbo].[Terms] VALUES ('wait')
INSERT INTO [dbo].[Terms] VALUES ('waiting')
INSERT INTO [dbo].[Terms] VALUES ('walk')
--view the list of terms
SELECT [Word] FROM [dbo].[Terms]
--view the list of terms with the accompanying terms they are found in, the count and the
index position in the sub-table
SELECT
a.[Word] AS [Search Word],
b.[Word] AS [Found in Word],
COUNT(b.[Word]) OVER (PARTITION BY a.[Word]) AS [Count], --count of word containing search text including the word you are searching
ROW_NUMBER() OVER (ORDER BY a.[Word]) - RANK() OVER (ORDER BY a.[Word]) AS [Index] --the index of the word in the ordered list, 0=the word itself 1=first word which contains the searched word, etc.
FROM
[dbo].[Terms] a
INNER JOIN
[dbo].[Terms] b
ON
CHARINDEX(a.[Word], b.[Word]) > 0
--a.[Word] <> b.[Word] --don't match the same word to itself
ORDER BY
a.[Word],
b.[Word]
/*--results
Search Word Found in Word Count Index
a a 5 0
a await 5 1
a wait 5 2
a waiting 5 3
a walk 5 4
await await 1 0
wait await 3 0
wait wait 3 1
wait waiting 3 2
waiting waiting 1 0
walk walk 1 0
*/
--clean up after test
DROP TABLE [dbo].[Terms]
Answer by nigelrivett ·
select a.string, num = count(*)
from tbl a
join tbl b
on b.string like '%' + a.string + '%'
Shouldn't wait be 3 in your sample? await, wait, waiting all contain wait.
This won't be quick.