i am being given an xml doc to import into a database. my method of doing this is to recevie the xml doc as a xml type param, select the contents out into a user defined table type, and then merge the udt into a physical table. i seem to be having an issue inserting the results into the udt. selecting all the contents out of the xml works fine and fast for around 1000 or so records but if i try and insert @udt select blah blah blah from xml document it basically takes forever and a half. in fact it never finishes i have to stop it. the query looks like this:
select o.value('itmID','nchar(18)') as itmID, o.value('vin','nvarchar(18)') as vin, o.value('engNo','nvarchar(20)') as engNo, o.value('regNo','nvarchar(9)') as regNo, more fields here..... from @ItemsXml.nodes('/Items/item') x(o)
can anyone tell me why the select works fine but a insert @var select from xml is so bad? the xml document is not malformed and looking at it after ive selected everything out i cvnat find anything wrong with it.
asked Oct 20, 2010 at 03:32 PM in Default
Looks like Matt's suggestion is correct. I had exact same problem with large xml inputs, and temp table was the solution. Here is my scenario: huge xml (100s of MBs), is fed into the proc as a parameter. In fact, the input is so big that there is no way to even open it by any available editor (only SlickEdit opens it but still very-very slowly).
Now, the proc has the logic like this: xml has deltas for a day or so, some data is new, some is existing but updated, so for new data:
showed extremely bad performance. Changing it to:
-- and then doing the normal
made it incomparably faster. Further attempts showed that it can be even faster than that by simply changing the way to pass xml into the proc. Instead of feeding it xml parameter, I started using just a path to it as a parameter to the proc and then making it read xml contents into local xml variable first before inserting the contents into the temp table. One small problem to overcome is that openrowset does not take a path as a parameter, so
Once the above is done, the @xml variable is read from disk and is ready to be interrogated. Please note that single_blob for openrowset is faster than single_clob. I read somewhere in BOL that binary represented XML data is the easiest for SQL Server.
Hope this helps,Oleg
answered Oct 20, 2010 at 05:09 PM
Have you tried the same into a temporary table instead of a table variable, and compared the execution plans?Inserting into a table variable can cause parallelism to be disabled, and this might be biting you here.
answered Oct 20, 2010 at 03:36 PM
Matt Whitfield ♦♦
Nice response Oleg. Two observations: //pizza is different than /pizza. The first searches the whole hierarchy to find elements with pizza name while the second selects the root pizza element only. /pizzas/pizza will select all pizza inside pizzas root element. Maybe you know this, but I am just clarifying as this may have great impact on your query. Second is that while going thru this slowness issue, I broke down my big xml into parts. So instead of having one select /pizzas/pizza/toppings/topping, I selected /pizzas/pizza into a temp table, then, on a second step, I processed each pizza from the temp table. This reduced the processing from 25 seconds to 2 seconds. And that was for creating only 220 rows!
answered Jun 12, 2012 at 02:36 PM