|
A bit of history to this : I posted this question elsewhere about 12 months ago, and got some very interesting and useful answers, but now I want to see what the wonderful community here can come up with. --- In SQL 2005, I have some data and want to get the result so essentially the data is aggregated on id, with the refs concatenated together, separated by a slash '/', but with any common prefix removed so if the data was like I would want to get the result I know I can simply concatenate the refs by using to give but it's the bit that removes the common element that I'm after..... expected result --- Code for test data : expected results
(comments are locked)
|
|
This is a bit clunky... :) Edited to include reference to a static tally table... @Matt - Wow! I like it. There is an evil worktable on my machine for the ROW_NUMBER() for the cte AllLengthCounts - you use sys.allobjects and that is a killer (for me at least)
Sep 23 '10 at 06:42 AM
WilliamD
Yeah I wasn't going for optimal, particularly, just flexible :) I think it would suck over a large amount of data...
Sep 23 '10 at 07:08 AM
Matt Whitfield ♦♦
I didn't want to be so blunt :o) I realised that my solution can be improved by dumping the result of string breaking into a "real" table as opposed to keeping it as a CTE.
Sep 23 '10 at 07:10 AM
WilliamD
Hmm, actually, performance wise they both suck. Over 256 rows, mine ran in 9.884 seconds and yours ran in 38.781... Now that's poor performance! I'm sure we could squeeze them to be a lot faster if we put some thought in :)
Sep 23 '10 at 07:53 AM
Matt Whitfield ♦♦
...although I had to make a tally table to get yours to run, so I changed mine to reference the tally table, and then it took 0.649 seconds.
Sep 23 '10 at 07:56 AM
Matt Whitfield ♦♦
(comments are locked)
|
can you give the definition of the
Sep 24 '10 at 06:10 AM
Kev Riley ♦♦
numbers table is just a simple tally table. create table numbers (num int, primary key (num))
Sep 24 '10 at 07:40 AM
Squirrel
(comments are locked)
|
|
As promised here's the answer I got originally, thanks to Quassnoi over at StackOverflow Matt assures me that his solution is faster over a large enough data set, although I still marvel at the simplicity if this solution! I was wondering about this since I suspected that someone would just use math and subtract the numbers and sequentially tack on the remainders. (The numbers may be in text format, but they're numbers nevertheless.) But no one tried that, although it I think it would have been easier that way.
Sep 30 '10 at 07:39 AM
Mark
@Kev Riley +1 This is a very good solution, but why spoil it with such a strange way (100-deep recursion) it uses to generate 101 tally records? if the hier definition can be replaced with something else, like ;with hier (cnt) as
(
select top 101
row_number() over (order by [object_id]) cnt
from sys.objects -- or something
)then the soluition will look even better. Of course, with just 100-deep recursion into a single column set there is no big performance hit, but the plan for recursive cte is somewhat more complex :)
Sep 30 '10 at 08:03 AM
Oleg
(comments are locked)
|
|
I hope I understood you right, so here is my go at it. (v1.0 Lunchtime hack done in about 10 minutes and only tested on your example data). Edit 14:37 CET - Fix added to comply with Kev's comment (extra CTE called MatchPos) Edit 15:25 CET - Final fix so I really comply with Kev ;o) Edit 15:39 CET - Kev really is playing with us! :o) Those repeating numbers made me reverse the position finding in the CTE LeftMostMatch, but it works now. Edit 24-09-2010 09:26 - Made a further change to make sure that the results come out in the right order, even for permanent tables as opposed to table variables (order by added to the XML concatenation). The solution as a whole can be sped up by moving SplitRef into a permanent table (Is that allowed?). @WilliamD close, but not quite right on the concat for id=2, should be 3536757628/29/30, but you have 3536757628/9/30
Sep 23 '10 at 05:13 AM
Kev Riley ♦♦
@WilliamD, sorry but now you have made id=1 wrong....should be 3536757616/7/8, but you have 3536757616/17/18. The common root can be of any length >1
Sep 23 '10 at 06:25 AM
Kev Riley ♦♦
@WilliamD, can I be picky? the order of the concatenated parts is slightly out, and for id=4 the result isn't correct
Sep 23 '10 at 07:51 AM
Kev Riley ♦♦
(comments are locked)
|
Cyborg, if the grouping goes round a 2 digit ending it breaks. Try adding: to the test data, and you get an error with your solution.
Sep 23 '10 at 06:10 AM
WilliamD
(comments are locked)
|


what should be the expected result for 1 ? is it 3536757616/7/8 or 3536757616/17/18?
yeah, kind of with @Oleg here - is it always the last character that you want to consider for concatenation of is it possible to have a root of less than 9 characters
the result for 1 is 3536757616/7/8 as the root is common all the way up to the '1'. Yes the root can be less than 9 chars, in fact at least it has to be 1 char
... but match it as far as possible. meaning that an unbroken series from 353675616 to 353675639 would be returned in 3 groups, starting as 35367561x, 35367562x and 35367563x respectively .. ? if they had the same ID
no, I want one row per id, so an unbroken series would have a common root of 3536756, and would go 353675616/17/18..../38/39. Like in the example for id 2, it crosses over from 29 to 30....