|
Hi, I am in a situation, where I am using the famous 8K Splitter function developed by well-known, well-regarded Mr. Jeff Moden. Often it is referred on this forum as well. So, I believe I must share my findings and seek valuable suggestions/corrections from this great community. To start with, the original function mentioned in the article has been further optimized and is called “[DelimitedSplit8K_T1]” (BTW, we are using it currently). It is available in the attachment section of the aforementioned link. But for ready reference I am posting it here So far I have come up with some modified versions but 2 of them are performing better than the DelimitedSplit8K_T1. But those would need some testing and there could be more room for improvement. I am posting them here. I have used some comments to give the main idea and differences. Any suggestions,corrections,optimization would be highly appreciated. I am using the same performance tests available in the above said article. Thanks in advance.
(comments are locked)
|
|
We have managed to complete the 10000 rows, 1133 elements i.e. near to 8k characters performance test (described in the above-mentioned pointer) in 19 secs approximately. Which is quite a drastic improvement for us. The major change was the conversion of tally table column's data type from BIGINT to INT. Another change (which shows marginally increased performance) was to add pre-calculated column which have the value for N + 1. Following performance test includes the two versions we have come up with Since my version is faster and I have not find any bugs yet. Marking my answer as accepted ;)
May 31 '12 at 02:25 PM
Usman Butt
Let me be the first one to upvote it :)
May 31 '12 at 02:29 PM
Oleg
@Oleg Thanks. I know you are quite busy these days (read it on another thread), but I will post the VARCHAR(MAX) testing results as well. Your xml solution and RCTE solution (surprisingly) worked really well. ;)
Jun 01 '12 at 05:58 AM
Usman Butt
(comments are locked)
|
|
As I said, we did some testing on VARCHAR(MAX) versions as well. As tally table solution does not behave well with wider strings. Two of them performed really well
But surprisingly, RCTE was winner on some occasions with good margin as compared to losing. If I do the same 10000 rows, 1133 elements with 125-150 characters width, RCTE beat the XML solution marginally. But if the numbers of elements are decreased to 500, the RCTE was almost twice as fast as XML. But I would not count out any of the solutions, as they could fit according the requirement and environment. For RCTE solution, I must say it is more resource intensive. The index spool and SORT operators in the execution plan clearly indicates that it would hit the memory, tempdb and processors hard. It also reflects that with more CPUs, memory and capacity planned tempdb, the solution is viable to a better performance as compared to other solutions. Having said that, I still may tilt towards the XML solution. To me, why the RCTE performs much better than the tally table solution (both used the Charindex and Substring), is because the tally table solution compare each character with the delimiter, and due to the Out of Row phenomenon, this does not scale well. Whereas, RCTE solution does it for only the required number of times. Moreover, till VARCHAR(8000), the is dealt as a predicate in addition to the Seek predicate. But for VARCHAR(MAX), this is divided into two steps i.e. A Seek predicate followed by a Filter predicate which decreases the performance quite a bit. Now the performance test for RCTE and XML solutions As always, everyone's mileage may differ. Please note that testing was done on SQL Server 2005. For @Usman Thank you for bringing my function back to life. @Pavel Pawlowski's CLR solution is probably the best for this specific problem but if wheeling out a CLR is not an option, it is nice to explore other alternatives. My mileage did vary on the box I use here at work (4 cores, 8 GB of RAM, Windows 7 64 bit). For some reason, the xml solution beat the RCTE, maybe my box is tired :) : /*== dbo.Split_XML ==*/ SQL Server Execution Times: CPU time = 118857 ms, elapsed time = 120583 ms. /*== dbo.Split_RCTE ==*/ SQL Server Execution Times: CPU time = 134660 ms, elapsed time = 138365 ms. I also checked my inline TVF xml function, but really sucks, despite the fact that it is inline: create function dbo.udf_SplitXmlInlineTvf
(
@Parameter varchar(max), @Delimiter char(1)
)
returns table with schemabinding as
return
select cast(row_number() over (order by
(select null)) as int) ItemNumber,
R.Item.value('text()[1]', 'varchar(max)') ItemValue
from (select cast(''+replace(@Parameter,
@Delimiter, '')+'' as xml)) X(N)
cross apply N.nodes('//r') R(Item);
goI guess the cross apply of the cross apply is not a good way to go.
Jun 08 '12 at 03:22 PM
Oleg
@Oleg I totally agree that CLR is the best option. Moreover, thanks for sharing the test results. As I said earlier, everyone's mileage may differ ;) But what version of SQL Server you tested upon? What MAXDOP setting you had?
Jun 11 '12 at 07:33 AM
Usman Butt
Nicely done. It would be nice to see what the results from your runs actually were because I had to stop the rCTE solution after more than 3 minutes. Also, the collation trick didn't help me at all because I use the default collation. It’s still a good idea, though. Last but not least, the real key to the performance improvement was switching from the Itzik-style cteTally to a real Tally Table. The DelimitedSplit8K-T1 function runs in the same amount of time as your variations when it's changed to a Tally Table. Don’t forget to add the length check in when you do. The CLR is still going to beat it but we're not doing bad for a T-SQL only solution.,Nicely done. It would be nice to see what the results from your runs actually were because I had to stop the rCTE solution after more than 3 minutes. Also, the collation trick didn't help me at all because I use the default collation. Last but not least, the real key to the spead improvement was switiching from the Itzik-style cteTally to a real Tally Table. The DelimitedSplit8K function runs in the same amout of time as your variations when it's changed to a Tally Table.
Jun 16 '12 at 04:31 AM
Jeff Moden
@Usman Here is what I used: OS: Windows 7 SP1
Jun 19 '12 at 04:21 PM
Oleg
Just noticed a bug in The line
should read
Jul 04 '12 at 03:19 PM
Kev Riley ♦♦
(comments are locked)
|


@Usman There are 5 pages worth of pretty intersting discussion on the main site starting from here. Jeff Moden's 8K splitter, my attempt on it with XML and Pavel Pawlowski's CLR solution are pretty thoroughly examined there and Pavel's CLR function is a clear winner.
Many thanks Oleg for pointing to a very informative discussion. And sorry for the delay.
@Oleg I forgot to let know that CLR is not permissible, so had to stick with TSQL solution.
Since my version is faster and I have not find any bugs yet. Marking my answer as accepted. Last day for the bounty as well ;)