question

Murali avatar image
Murali asked

how to split data

DECLARE @var VARCHAR(MAX) = '(17.685895196738677,75.926513671875;State Highway 151, Shoaaaa, Maaaaaa, Iaaaaa)_(17.675427818339383,77.5634765625;Naaaa Haaa 9, AaaaaPa, Iaaaaaa)_ (18.375379094031824,74.564208984375;Saaaaa 151, Maaaaaa, Iaaaa)' the above data...needs to be split into row wise...like... 17.685895196738677 75.926513671875 State Highway 151, Shoaaaa, Maaaaaa, Iaaaaa it should not consider '( ' and '_' in that string....
t-sql
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

ThomasRushton avatar image ThomasRushton ♦♦ commented ·
Not sure about the separators - some of them do split the row, some don't. Are there any rules?
0 Likes 0 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
How many rows need this process applied and how often will you need to do it?
0 Likes 0 ·
Fatherjack avatar image
Fatherjack answered
I'd use a tally table method to do this if there are many rows to be processed, so long as the split occurs on every comma http://www.sqlservercentral.com/articles/T-SQL/62867/
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

WilliamD avatar image
WilliamD answered
To add to @Fatherjack's suggestion, here is an example that will at least split the example data into the intitial three major groups. You can go from there to get the split down to what you really want. **However, as this is a complicated split, it will probably be better done in SQLCLR using RegEx. I think @Matt Whitfield may swing by and suggest the same. It will be quicker and less hassle to do it that way (if you can code in C# or VB.Net).** The SplitString function comes from the article linked by @Fatherjack, the Nums function is a dynamic tally inline table function and works like the Tally table that is mentioned in the article. You can see how clunky it is already in my example, and you would need to use SplitString again to achieve the final result. IF EXISTS (SELECT * FROM sys.objects WHERE name ='StringSplit' AND type = 'IF') BEGIN DROP FUNCTION dbo.StringSplit END GO CREATE FUNCTION dbo.StringSplit (@String varchar(max), @Splitter varchar(10)) RETURNS TABLE RETURN (SELECT SUBSTRING(@String, N + 1, CHARINDEX(@Splitter, @String, N + 1) - N - 1) data FROM dbo.Nums(LEN(@String)) N2 WHERE N < LEN(@String) AND SUBSTRING(@String, N, 1) = @Splitter) GO IF EXISTS (SELECT * FROM sys.objects WHERE name ='Nums' AND type = 'IF') BEGIN DROP FUNCTION dbo.Nums END GO CREATE FUNCTION dbo.Nums (@m AS bigint) RETURNS TABLE AS RETURN WITH t0 AS (SELECT n = 1 UNION ALL SELECT n = 1), t1 AS (SELECT n = 1 FROM t0 AS a, t0 AS b), t2 AS (SELECT n = 1 FROM t1 AS a, t1 AS b), t3 AS (SELECT n = 1 FROM t2 AS a, t2 AS b), t4 AS (SELECT n = 1 FROM t3 AS a, t3 AS b), t5 AS (SELECT n = 1 FROM t4 AS a, t4 AS b), result AS (SELECT ROW_NUMBER() OVER (ORDER BY n) AS n FROM t5) SELECT n FROM result WHERE n <= @m GO DECLARE @Splitter char(1) DECLARE @Parameter varchar(8000) SELECT @Splitter = '_', @Parameter = '(17.685895196738677,75.926513671875;State Highway 151, Shoaaaa, Maaaaaa, Iaaaaa)_(17.675427818339383,77.5634765625;Naaaa Haaa 9, AaaaaPa, Iaaaaaa)_(18.375379094031824,74.564208984375;Saaaaa 151, Maaaaaa, Iaaaa)' SELECT @Parameter = @Splitter + REPLACE(REPLACE(@Parameter, ')', ''), '(', '') + @Splitter ; SELECT data FROM dbo.StringSplit(@Parameter, @Splitter) SS
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

ThomasRushton avatar image ThomasRushton ♦♦ commented ·
@Murali I suspect that @WilliamD knows this, and that is why he wrote the paragraph above the code. In case you missed it, it says:
You can see how clunky it is already in my example, and you would need to use SplitString again to achieve the final result.
2 Likes 2 ·
Murali avatar image Murali commented ·
thanks for your response...but it is not satisfying my requirment. actually as i mentioned in the above...the result should come like that...this is returning the data like below... 17.685895196738677,75.926513671875;State Highway 151, Shoaaaa, Maaaaaa, Iaaaaa 17.675427818339383,77.5634765625;Naaaa Haaa 9, AaaaaPa, Iaaaaaa 18.375379094031824,74.564208984375;Saaaaa 151, Maaaaaa, Iaaaa
0 Likes 0 ·
ThomasRushton avatar image ThomasRushton ♦♦ commented ·
If you want the rows split within a field, then you can always replace the penultimate line of @WilliamD's response with SELECT LEFT(data, Charindex(',', data) - 1) + Char(13) + Char(10) + Replace(RIGHT(data, Len(data) - Charindex(',', data)), ';', Char(13) + Char(10))
0 Likes 0 ·
Matt Whitfield avatar image
Matt Whitfield answered
If you can use the CLR, do - as @WilliamD says. You haven't said anything about the data volume, whether this is a one-time operation etc etc. You can find an example of a regular expression matcher in my [CLR Presentation from SQL Bits][1]. The regular expression to parse that is: \(([0-9]+\.[0-9]+),([0-9]+\.[0-9]+);([^)]+)\) [1]: https://www.atlantis-interactive.co.uk/blog/post/2010/09/08/CLR-Demystified-Presentation.aspx
5 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
@WilliamD One of the best resources to get going is the [Regular Expression Library][1]. It has a very good collection of regexes. Most of the expressions they post there are very good. Occasionally, someone submits a poorly written one, but not too often. Surely, some of the expressions are bound to be ridiculously difficult, almost impossible to write (such as a true valid date or a true valid email address), but eveb then usually people choose to use somewhat simplified versions of those anyway, the ones covering most common scenarios. [1]: http://www.regexplib.com/
2 Likes 2 ·
Matt Whitfield avatar image Matt Whitfield ♦♦ commented ·
The single most ultra brilliant thing for regular expressions is [Expresso][1] - it has a deconstructor and test mode so you can really get to understand what's going on. My advice is to start simple, and build up from there. The way expresso formats things in trees certainly helps with that. When you get to things like non consuming read-aheads, that's a bit interesting :) [1]: http://www.ultrapico.com/Expresso.htm
1 Like 1 ·
WilliamD avatar image WilliamD commented ·
+1 That is exactly what i was searching for on your blog. What do you use to construct your regular expressions, a reference or learned the syntax as is?
0 Likes 0 ·
WilliamD avatar image WilliamD commented ·
The makers of Expresso have a similar logo to Atlantis Interactive.... coincidence or conspiracy?? ;o)
0 Likes 0 ·
Matt Whitfield avatar image Matt Whitfield ♦♦ commented ·
Conspiracy - definitely! :)
0 Likes 0 ·
TimothyAWiseman avatar image
TimothyAWiseman answered
There are several very fine answers here. But I am going to particularly agree with Matt Whitfield that regex is probably the way to go. In addition to the fine article he linked to, I have found [ http://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/][1] to be very helpful in the past. Also, depending on how you ultimately plan to use this data it may make sense to do it in the application layer instead of trying to do it within the database. While you certainly can, as many posters here expertly point out, string manipulation is not a strength of SQL. [1]: http://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Matt Whitfield avatar image Matt Whitfield ♦♦ commented ·
+1 - nice link :)
0 Likes 0 ·
Magnus Ahlkvist avatar image Magnus Ahlkvist commented ·
Haven't seen that article before. Saved. +1
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.