question

Katie 1 avatar image
Katie 1 asked

query performance tuning

All, In the database we have a table where all the file names are stored.This table is queried every 1 minute by the application to see a file name already exists or not, So the query is designed as, select filename from the filenametable where filename like ('%abctext%') The filename is around 190 charecters in length and there are about 100k records in the table and it is effecting performance, I have put an index on the filename, but still it does not show a significant change on the perfromance. All 100k records needs to stay in the table, for a month or so. but even for first 100k itself, i see it cracking up. any way we can fine tune this query?
sql-server-2008sqlqueryperformanceserver
4 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

SQL Server when dealing with character-based columns, starts the index search based on the letters starting from the left (if we're talking about English). Therefore, LIKE 'abctext%' would potentially use an index but that assumes the file name starts with abctext. As soon as you start with % (LIKE '%abctext' or LIKE '%abctext%'), you're telling SQL Server 0 or more characters before abctext, at which point it's going to have to look at the entry for that column for each row. Think about trying to go through the phonebook knowing that "hit" was in the name, but you didn't know what characters were before or after. You could match this with White, Whit, Whitehead, Chittering, or Doughit. How exactly do you find all matches in the phone book without going through every entry? You don't. You have to consider every single one. And that's why SQL Server won't choose to use an index or be able to make good use for one even if it does select it.
2 Likes 2 ·
Can you share the query plan? (if it doesnt have any sensitive info, ofcourse)
0 Likes 0 ·
How long is it taking? I have a 264,000 filename table and the same query takes less than 100ms
0 Likes 0 ·
The problem is, the application pings every 1 min to that table and trys to check if there are any files existing. This is causing a overhead. As we have a requirement to process the files as soon as the file is available, so i cannot mess with the timing.
0 Likes 0 ·
Grant Fritchey avatar image
Grant Fritchey answered
Are the file searches all wild card searches. I can just type in 'a' and you're going to find aardvark, apple, can, zap...? For 100,000 records? Talk to the business. Convince them that the only way to get for sure performance is to modify the requirement so that if you type in 'a' the search will only return aardvark & apple. That, or, if this is really the way this must behave, you may want to look at one of the NoSQL databases out there that are specifically targeted to doing wild card text searches. Especially if you're only doing a name/value pair search as the example query implies.
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

DaniSQL avatar image
DaniSQL answered
The index you created on filename can not be used since your pridicate is not [sargable][1]. Does your filename start with 'abc'? If yes your index will be used since the where clause will be sargable, just modify your query to select filename from the filenametable where filename like 'abctext%' [1]: http://en.wikipedia.org/wiki/Sargable
3 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

So actually there are two queries select filename from the filenametable where filename like '%abctext1%' and select filename from the filenametable where filename like '%abctext2%' , all other charecters in the filename change every minute.
0 Likes 0 ·
You mean SARGable. :-)
0 Likes 0 ·
@K. Brian Edited! good catch :-)
0 Likes 0 ·
weitzera avatar image
weitzera answered
There's little to nothing you can do without changing the query. If you have a way of determining within SQL server what portion of the filename is variable and what portion is not, for example if the abctext portion always is 7 characters beginning after a period, then you could make a computed column that has abctext in it and apply an index to that: http://msdn.microsoft.com/en-us/library/ms189292.aspx Otherwise, you could split a filename that looks like [variablePrefix][staticName][variableNumber][variableSuffix] into several columns in your table, apply an index to the static portion and the variable number, and adjust the application to split the filename accordingly when it is adding or querying.
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kevin Feasel avatar image
Kevin Feasel answered
I'd look into using a [full-text index][1] for this query. If you have the server power to keep that index up-to-date, performance would be much faster than the %something% search for the reason @DaniSQL brought up. Rebuilding the index can be pretty slow, but it might be worth giving this a try. [1]: http://msdn.microsoft.com/en-us/library/ms142571.aspx
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

WilliamD avatar image
WilliamD answered
Is there a pattern to the filenames that could be used to create a computed column from? You say the files all have the same pattern inside them, the rest is different every minute. If that is so, you could create the computed column to fish out the relevant filename part, then index that column. You then change your query to access the computed column and it should fly. @Kevin feasel's suggestion of full text indexing would be my next step if my suggestion didn't work out.
1 comment
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

just realised that ist exactly what @weitzera wrote almost a whole day ago! Stupid me, must read *all* answers before posting
0 Likes 0 ·
weitzera avatar image
weitzera answered
Apparently I don't have enough "karma" to comment on an individual post... Anyway, I may be wrong, but I don't believe a fulltext index is applicable in this case for two reasons. I don't think you're allowed to search inside a given "word," the closest is the beginning of the word: http://msdn.microsoft.com/en-us/library/cc879300.aspx Even if you could, I expect the full text index query would be just as expensive as the like query in this case, because the filenames are probably not made of many words. Someone please correct me if I'm wrong...
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

davehants avatar image
davehants answered
It sounds like your business needs a better naming convention for file names. What you are doing is very inefficient...
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.