how to split data

DECLARE @var VARCHAR(MAX) = '(17.685895196738677,75.926513671875;State Highway 151, Shoaaaa, Maaaaaa, Iaaaaa)_(17.675427818339383,77.5634765625;Naaaa Haaa 9, AaaaaPa, Iaaaaaa)_
(18.375379094031824,74.564208984375;Saaaaa 151, Maaaaaa, Iaaaa)'

the above data...needs to be split into row wise...like...

State Highway 151, Shoaaaa, Maaaaaa, Iaaaaa  
it should not consider '( ' and '_' in that string....
more ▼

asked Jan 27, 2011 at 01:49 AM in Default

Murali gravatar image

906 103 117 120

Not sure about the separators - some of them do split the row, some don't. Are there any rules?
Jan 27, 2011 at 01:54 AM ThomasRushton ♦
How many rows need this process applied and how often will you need to do it?
Jan 27, 2011 at 03:56 AM Fatherjack ♦♦
(comments are locked)
10|1200 characters needed characters left

4 answers: sort voted first
I'd use a tally table method to do this if there are many rows to be processed, so long as the split occurs on every comma http://www.sqlservercentral.com/articles/T-SQL/62867/
more ▼

answered Jan 27, 2011 at 02:06 AM

Fatherjack gravatar image

Fatherjack ♦♦
42.8k 75 79 108

(comments are locked)
10|1200 characters needed characters left

To add to @Fatherjack's suggestion, here is an example that will at least split the example data into the intitial three major groups. You can go from there to get the split down to what you really want.

However, as this is a complicated split, it will probably be better done in SQLCLR using RegEx. I think @Matt Whitfield may swing by and suggest the same. It will be quicker and less hassle to do it that way (if you can code in C# or VB.Net).

The SplitString function comes from the article linked by @Fatherjack, the Nums function is a dynamic tally inline table function and works like the Tally table that is mentioned in the article.

You can see how clunky it is already in my example, and you would need to use SplitString again to achieve the final result.

IF EXISTS (SELECT * FROM sys.objects WHERE name ='StringSplit' AND type = 'IF')
        DROP FUNCTION dbo.StringSplit
    (@String varchar(max),
     @Splitter varchar(10))
    (SELECT SUBSTRING(@String, N + 1, CHARINDEX(@Splitter, @String, N + 1) - N - 1) data
     FROM   dbo.Nums(LEN(@String)) N2
     WHERE  N < LEN(@String)
            AND SUBSTRING(@String, N, 1) = @Splitter)
IF EXISTS (SELECT * FROM sys.objects WHERE name ='Nums' AND type = 'IF')
        DROP FUNCTION dbo.Nums
CREATE FUNCTION dbo.Nums (@m AS bigint)
    WITH    t0
              AS (SELECT    n = 1
                  UNION ALL
                  SELECT    n = 1),
              AS (SELECT    n = 1
                  FROM      t0 AS a,
                            t0 AS b),
              AS (SELECT    n = 1
                  FROM      t1 AS a,
                            t1 AS b),
              AS (SELECT    n = 1
                  FROM      t2 AS a,
                            t2 AS b),
              AS (SELECT    n = 1
                  FROM      t3 AS a,
                            t3 AS b),
              AS (SELECT    n = 1
                  FROM      t4 AS a,
                            t4 AS b),
              AS (SELECT    ROW_NUMBER() OVER (ORDER BY n) AS n
                  FROM      t5)
    SELECT  n
    FROM    result
    WHERE   n <= @m

DECLARE @Splitter char(1)
DECLARE @Parameter varchar(8000)

SELECT @Splitter = '_',
       @Parameter = '(17.685895196738677,75.926513671875;State Highway 151, Shoaaaa, Maaaaaa, Iaaaaa)_(17.675427818339383,77.5634765625;Naaaa Haaa 9, AaaaaPa, Iaaaaaa)_(18.375379094031824,74.564208984375;Saaaaa 151, Maaaaaa, Iaaaa)'
SELECT  @Parameter = @Splitter + REPLACE(REPLACE(@Parameter, ')', ''), '(', '') + @Splitter ;

SELECT  data
FROM    dbo.StringSplit(@Parameter, @Splitter) SS
more ▼

answered Jan 27, 2011 at 02:33 AM

WilliamD gravatar image

25.9k 17 19 41

thanks for your response...but it is not satisfying my requirment. actually as i mentioned in the above...the result should come like that...this is returning the data like below...

17.685895196738677,75.926513671875;State Highway 151, Shoaaaa, Maaaaaa, Iaaaaa 17.675427818339383,77.5634765625;Naaaa Haaa 9, AaaaaPa, Iaaaaaa 18.375379094031824,74.564208984375;Saaaaa 151, Maaaaaa, Iaaaa
Jan 27, 2011 at 02:59 AM Murali

@Murali I suspect that @WilliamD knows this, and that is why he wrote the paragraph above the code. In case you missed it, it says:

You can see how clunky it is already in my example, and you would need to use SplitString again to achieve the final result.
Jan 27, 2011 at 03:12 AM ThomasRushton ♦

If you want the rows split within a field, then you can always replace the penultimate line of @WilliamD's response with

SELECT LEFT(data, Charindex(',', data) - 1) + Char(13) + Char(10) + Replace(RIGHT(data, Len(data) - Charindex(',', data)), ';', Char(13) + Char(10))
Jan 27, 2011 at 03:36 AM ThomasRushton ♦
(comments are locked)
10|1200 characters needed characters left

If you can use the CLR, do - as @WilliamD says. You haven't said anything about the data volume, whether this is a one-time operation etc etc.

You can find an example of a regular expression matcher in my [CLR Presentation from SQL Bits][1].

The regular expression to parse that is:

[1]: https://www.atlantis-interactive.co.uk/blog/post/2010/09/08/CLR-Demystified-Presentation.aspx
more ▼

answered Jan 27, 2011 at 04:10 AM

Matt Whitfield gravatar image

Matt Whitfield ♦♦
29.5k 61 65 87

+1 That is exactly what i was searching for on your blog.

What do you use to construct your regular expressions, a reference or learned the syntax as is?
Jan 27, 2011 at 04:54 AM WilliamD

@WilliamD One of the best resources to get going is the [Regular Expression Library][1]. It has a very good collection of regexes. Most of the expressions they post there are very good. Occasionally, someone submits a poorly written one, but not too often. Surely, some of the expressions are bound to be ridiculously difficult, almost impossible to write (such as a true valid date or a true valid email address), but eveb then usually people choose to use somewhat simplified versions of those anyway, the ones covering most common scenarios.

[1]: http://www.regexplib.com/
Jan 27, 2011 at 05:10 AM Oleg

The single most ultra brilliant thing for regular expressions is [Expresso][1] - it has a deconstructor and test mode so you can really get to understand what's going on.

My advice is to start simple, and build up from there. The way expresso formats things in trees certainly helps with that. When you get to things like non consuming read-aheads, that's a bit interesting :)

[1]: http://www.ultrapico.com/Expresso.htm
Jan 27, 2011 at 05:37 AM Matt Whitfield ♦♦
The makers of Expresso have a similar logo to Atlantis Interactive.... coincidence or conspiracy?? ;o)
Jan 27, 2011 at 06:03 AM WilliamD
Conspiracy - definitely! :)
Jan 27, 2011 at 07:52 AM Matt Whitfield ♦♦
(comments are locked)
10|1200 characters needed characters left

There are several very fine answers here. But I am going to particularly agree with Matt Whitfield that regex is probably the way to go. In addition to the fine article he linked to, I have found [http://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/][1] to be very helpful in the past.

Also, depending on how you ultimately plan to use this data it may make sense to do it in the application layer instead of trying to do it within the database. While you certainly can, as many posters here expertly point out, string manipulation is not a strength of SQL.

[1]: http://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/
more ▼

answered Jan 27, 2011 at 10:21 AM

TimothyAWiseman gravatar image

15.6k 21 23 32

+1 - nice link :)
Jan 28, 2011 at 03:09 AM Matt Whitfield ♦♦
Haven't seen that article before. Saved. +1
Jan 28, 2011 at 04:11 AM Magnus Ahlkvist
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.



asked: Jan 27, 2011 at 01:49 AM

Seen: 591 times

Last Updated: Jan 27, 2011 at 01:55 AM