xml query very very slow

DavidD asked:

Hi, I am being given an XML document to import into a database. My method is to receive the XML document as an xml-typed parameter, select the contents out into a user-defined table type (UDT), and then merge the UDT into a physical table.

I seem to be having an issue inserting the results into the UDT. Selecting all the contents out of the XML works fine and fast for around 1,000 or so records, but if I try insert @udt select ... from the XML document, it basically takes forever and a half; in fact it never finishes and I have to stop it. The query looks like this:

    select o.value('itmID[1]', 'nchar(18)')    as itmID,
           o.value('vin[1]',   'nvarchar(18)') as vin,
           o.value('engNo[1]', 'nvarchar(20)') as engNo,
           o.value('regNo[1]', 'nvarchar(9)')  as regNo
           -- more fields here...
    from @ItemsXml.nodes('/Items/item') x(o)

Can anyone tell me why the plain select works fine but an insert @var select from the XML is so bad? The XML document is not malformed, and looking at it after I've selected everything out I can't find anything wrong with it.
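[Editor's note: for readers who want to reproduce the shape of this, here is a minimal, self-contained sketch of the pattern described. The sample data and the table variable standing in for the user-defined table type are illustrative, not the asker's actual schema.]

    declare @ItemsXml xml = N'<Items>
      <item><itmID>ITM001</itmID><vin>VIN001</vin></item>
      <item><itmID>ITM002</itmID><vin>VIN002</vin></item>
    </Items>';

    declare @Items table   -- stands in for the user-defined table type
    (
        itmID nchar(18),
        vin   nvarchar(18)
    );

    -- the pattern the question describes: shred the xml parameter into the table variable
    insert @Items (itmID, vin)
    select o.value('itmID[1]', 'nchar(18)'),
           o.value('vin[1]',   'nvarchar(18)')
    from @ItemsXml.nodes('/Items/item') x(o);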
Tags: sql-server-2008, xml

DavidD commented:
If I insert the XML into a temp table's xml field first and then insert into the UDT via a select from that table with cross apply, it works. Why would that be?
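[Editor's note: a sketch of that workaround, reusing the illustrative names from the sketch above, so again hypothetical rather than the actual schema.]

    -- stage the xml parameter in a temp table first
    create table #staging (doc xml);
    insert #staging (doc) values (@ItemsXml);

    -- then shred out of the staged column via cross apply
    insert @Items (itmID, vin)
    select o.value('itmID[1]', 'nchar(18)'),
           o.value('vin[1]',   'nvarchar(18)')
    from #staging s
    cross apply s.doc.nodes('/Items/item') x(o);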
Matt Whitfield commented:
The plan has the answer... :) Without looking at the plan it's pretty much anyone's guess...
Matt Whitfield answered:
Have you tried the same insert into a temporary table instead of a table variable, and compared the execution plans? Inserting into a table variable can cause parallelism to be disabled, and that might be what's biting you here.
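[Editor's note: one way to run Matt's comparison, assuming the illustrative @ItemsXml sketch from the question. Turn on the actual execution plan and run each variant separately.]

    -- variant 1: table variable, as in the question
    declare @t table (itmID nchar(18));
    insert @t
    select o.value('itmID[1]', 'nchar(18)')
    from @ItemsXml.nodes('/Items/item') x(o);

    -- variant 2: temp table, same shredding query
    create table #t (itmID nchar(18));
    insert #t
    select o.value('itmID[1]', 'nchar(18)')
    from @ItemsXml.nodes('/Items/item') x(o);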

DavidD commented:
I have; it was the first thing I tried. It behaves exactly the same, and eventually I have to kill it.
Matt Whitfield commented:
Awesome. So what do the plans look like? Can you post some short-ish SQL that replicates it?
DavidD commented:
There are about a thousand nested loops. How do I post it?
Matt Whitfield commented:
@David - with great difficulty, in that case... Can you replicate it with some simpler SQL?
Oleg answered:
Looks like Matt's suggestion is correct. I had the exact same problem with large XML inputs, and a temp table was the solution. Here is my scenario: a huge XML file (hundreds of MB) is fed into the proc as a parameter. In fact, the input is so big that there is no way to even open it in any available editor (only SlickEdit opens it, and still very, very slowly). Now, the proc has logic like this: the XML holds deltas for a day or so; some data is new, some is existing but updated. For the new data,

    -- pseudo code
    insert into dbo.DestTable
    select *
    from xml
    left join dbo.DestTable on some_join_condition
    where dbo.DestTable.FirstColumn is null;

showed extremely bad performance. Changing it to

    select
        item.value('SomeNode[1]', 'varchar(10)') as ColName
        -- etc.
    into #temp
    from @xml.nodes('//some_path') items(item);

    -- and then doing the normal insert
    insert into dbo.DestTable
    select *
    from #temp
    left join dbo.DestTable on some_join_condition
    where dbo.DestTable.FirstColumn is null;

made it incomparably faster.

Further attempts showed that it can be made even faster by simply changing the way the XML is passed into the proc. Instead of feeding it an xml parameter, I started passing just a path to the file, and having the proc read the XML contents into a local xml variable before inserting the contents into the temp table. One small problem to overcome is that openrowset does not accept a variable for the path, so:

    declare @sql nvarchar(500);
    declare @xml xml;

    -- @path is the file-path parameter passed to the proc
    select @sql = '
    select @xmlOut = t.BulkColumn
    from openrowset(bulk ''' + @path + ''', single_blob) t;';

    exec sp_executesql @sql, N'@xmlOut xml out', @xmlOut = @xml out;

Once the above is done, the @xml variable holds the file contents read from disk and is ready to be interrogated. Please note that single_blob for openrowset is faster than single_clob; I read somewhere in BOL that binary-represented XML data is the easiest for SQL Server.

Hope this helps, Oleg
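[Editor's note: to make that last snippet concrete, here is a sketch of what the reworked proc might look like under this approach. The proc name, parameter, and node names are made up for illustration; the posts above don't show the real ones.]

    create procedure dbo.ImportFromXmlFile   -- hypothetical name
        @path nvarchar(260)                  -- path to the XML file, replacing the old xml parameter
    as
    begin
        declare @sql nvarchar(500);
        declare @xml xml;

        -- openrowset(bulk ...) won't accept a variable for the path, hence the dynamic SQL
        select @sql = N'
        select @xmlOut = t.BulkColumn
        from openrowset(bulk ''' + @path + N''', single_blob) t;';

        exec sp_executesql @sql, N'@xmlOut xml out', @xmlOut = @xml out;

        -- shred into a temp table, then merge/insert into the destination as before
        select item.value('SomeNode[1]', 'varchar(10)') as ColName
        into #temp
        from @xml.nodes('//some_path') items(item);
    end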

DavidD commented:
Great post. Yes, that's pretty similar to what I do: I receive all the XML by parameter, insert it into the table variable, then merge the table with the physical table. It doesn't explain why I get bad performance when I insert into a table variable or a temp table, though; they are both slow. Interestingly, if I write the XML parameter into a temp table first, then select the data out of that single xml field into another temp table defined with all the correct columns, it's fine. I'd post my execution plans if I knew how to; they're extremely big, though.
ThomasRushton commented:
Execution plans: post these by exporting the plan to a .sqlplan file, opening that in your favourite text editor, and copying & pasting (it's just XML under the hood). If it's too big for that, then look at posting it as a Google Document. If you've got the screen real estate to make the whole plan visible, then consider taking a screenshot and posting that.
Alan answered:
Nice response, Oleg. Two observations. First, //pizza is different from /pizza: the first searches the whole hierarchy for elements named pizza, while the second selects only a root-level pizza element; /pizzas/pizza selects all pizza elements inside the pizzas root element. Maybe you know this, but I am clarifying because it can have a great impact on your query. Second, while working through this slowness issue, I broke my big XML down into parts. Instead of one select over /pizzas/pizza/toppings/topping, I selected /pizzas/pizza into a temp table and then, in a second step, processed each pizza from the temp table. This reduced the processing time from 25 seconds to 2 seconds, and that was for creating only 220 rows!
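[Editor's note: a minimal sketch of that two-step approach, using the pizza example. Element and column names here are illustrative, not Alan's actual schema.]

    declare @xml xml = N'<pizzas>
      <pizza><name>margherita</name>
        <toppings><topping>tomato</topping><topping>basil</topping></toppings>
      </pizza>
    </pizzas>';

    -- step 1: one row per pizza, keeping each pizza fragment as xml
    select p.value('(name)[1]', 'varchar(50)') as PizzaName,
           p.query('.')                        as PizzaXml
    into #pizzas
    from @xml.nodes('/pizzas/pizza') x(p);

    -- step 2: shred the toppings from each stored fragment
    select pz.PizzaName,
           t.value('.', 'varchar(50)') as Topping
    from #pizzas pz
    cross apply pz.PizzaXml.nodes('/pizza/toppings/topping') y(t);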