question

Pradeep407 avatar image
Pradeep407 asked

Long running query.

Hello,

We have a table with half million records. There's a column which is storing xml data but the datatype of that column is varchar. This column alone is consuming 4097 bytes.

When i query the below table it is taking a long time to fetch the results (several hours) and i eventually cancel the query.

select <big column> from table (nolock)

Server memory is 32 GB, SQL memory is 22 GB, PLE is huge.

Any suggestions on how do i handle this situation? Thanks.

xmlquery-tuningquery-optimisation
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

·
WRBI avatar image
WRBI answered

If you can somehow get your column to be an XML data type, you can use XML Indexes.

Although be warned, the last time I created an XML index on an existing column, it had to be scheduled for out of hours, because it took forever and a day.

Here's a link to a good redgate blog post. XML Indexes.

4 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Pradeep407 avatar image Pradeep407 commented ·

@WRBI, thanks for the response. Really appreciate it. We will try to implement the recommendation in days to come. But for now can you please share your thoughts on what we are running and how do we optimize it..

Our Developer is running a query something like this for the last 10 hours and it is still running...

Big_Column is hosting xml data but the datatype of that column is varchar. The size of the table is 1 TB, size of each row is 4503 bytes and this table has 3.5 M records. Most of the time Wait_Type shows PAGEIOLATCH_SH

Query

select some_col1,some_col2,some_col3 FROM Archive_table with (Nolock) WHERE (Charindex('sometext-A', Big_Column) > 0) or (Charindex('sometext-B', Big_Column) > 0) or (Charindex('sometext-C', Big_Column) > 0) or (Charindex('sometext-D', Big_Column) > 0) or (Charindex('sometext-E', Big_Column) > 0) or (Charindex('sometext-F', Big_Column) > 0) or (Charindex('sometext-G', Big_Column) > 0) or (Charindex('sometext-H', Big_Column) > 0) or (Charindex('sometext-I', Big_Column) > 0) or (Charindex('sometext-J', Big_Column) > 0)

0 Likes 0 ·
KenJ avatar image KenJ Pradeep407 commented ·

The query, as presented is not tunable. You have a 1TB table, a query that requires SQL to literally crawl through almost every byte of that searching for string values with both leading and trailing wildcards, and only 22GB of memory in which to do it (most if which is already dedicated to other processes, based on your huge PLE). In theory, an SSD storage subsystem could help with the I/O performance, but you really need to reduce the amount of data you are reading to help with this.

I see you're working the XML column/index angle now. Another option is to create a full text catalog and index for this table then rewriting the query to use contains(). This could make a huge improvement without having to change the column data type. If the data type change would be a development bottleneck, the full text approach could make a good interim solution. If each approach is just as convenient, then store the XML as XML. you can always add a full text index to that if the xml index route doesn't help enough.

0 Likes 0 ·
WRBI avatar image WRBI KenJ commented ·
@KenJ

Love the idea of FullText as alternative/complimentary option! I'm keeping this one in mind for the future.

1 Like 1 ·
Show more comments

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.