x

Split comma delimited string into separate columns

I have a file that was sent to me with the data separated by commas in 1 field. I want to separate the data into separate columns.

Ex: field contains: Nissan Dealer, 123 Main St, Chicago, IL, 76533
I have created col1, col2, col3, col4, col5. I want Nissan Dealer to go to col1, 123 Main St to col2, Chicago to col3, IL to col4 and 76533 to col5. I have a parse string that will split but you have to select which element you want. It would be nicer if I could make it build the columns on the row as it parsed or split them instead of my having to alter the table to add all of the columns or actually build a separate table with them parsed. Thx
more ▼

asked Apr 04, 2011 at 02:01 PM in Default

tredd68 gravatar image

tredd68
61 4 4 5

(comments are locked)
10|1200 characters needed characters left

6 answers: sort voted first

Check out this question. One of the answers includes the link to [Jeff Moden article][2] about splitting delimited values with T-SQL.

Oleg

[2]: http://www.sqlservercentral.com/articles/T-SQL/62867/
more ▼

answered Apr 04, 2011 at 02:13 PM

Oleg gravatar image

Oleg
15.9k 2 4 24

@tredd68 Yes, there is. Here is the [link to the thread][1] where the split functions are discussed (Jeff's DelimitedSplit8K, @Pavel Pawlowski's CLR and mine using xml). Both Pavel and Jeff posted extensive performance tests. The discussion spans over 5 pages. Please check it out.

[1]: http://www.sqlservercentral.com/Forums/Topic943562-338-2.aspx
Apr 04, 2011 at 03:30 PM Oleg
(comments are locked)
10|1200 characters needed characters left
If you have BIDS installed you can easily import this file with an SSIS project. You could also use BCP to do it. With SSIS, you just create your file import, chose your destination (SQL TABLE) and possibly have to do a data conversion task. You can chose your mapping of source to destination and then schedule the SSIS package as a SQL job if this will be an ongoing task you need to perform. If you have used DTS in SQL 2000, it is much the same with SSIS. There are many options to you. SQLCMD, BCP, SSIS, DTS, Powershell, etc. If it is a one time think you could also use the import wizard with SSMS. You could also use the Import Wizard and save the import project as SSIS if you don't want the hassle of doing it from scratch with BIDS. (Business Intelligence Design Studio)
more ▼

answered Apr 04, 2011 at 02:11 PM

Tim gravatar image

Tim
36.4k 35 41 139

(comments are locked)
10|1200 characters needed characters left

I think Oleg probably has the reference for the best answer for you, but I always like to have options, so here is an implementation using XML.

create table testTable3 (id int, col1 varchar(512))

insert into testTable3
    values (1, 'Nissan Dealer, 123 Main St, Chicago, IL, 76533')

insert into testTable3
    values (2, 'some place, 1234 Somewhere St, Stillater, OK, 123456')

; with cte1 as (

       select 
         t2.id,
         t3.split,
         ROW_NUMBER() over (partition by t2.id order by t2.id) as num
       from
         (
         select *,
         CAST('<X>'+replace(t.col1,',','</X><X>')+'</X>' as XML) as xmlfilter
         from testTable3 t
         ) t2
       cross apply
       (
         select col1data.D.value('.','varchar(50)') as split
         from t2.xmlfilter.nodes('X') as col1data(D)) t3
         )


select id, [1], [2], [3], [4], [5]
from 
(
    select id, num, split
    from CTE1
) as sourcetable
pivot
(max(split) for num IN ([1], [2], [3], [4], [5])) as pivottable

there is a discussion about techniques like this at [http://www.sqlservercentral.com/articles/XML/66932/][1]

[1]: http://www.sqlservercentral.com/articles/XML/66932/
more ▼

answered Apr 04, 2011 at 03:31 PM

TimothyAWiseman gravatar image

TimothyAWiseman
15.5k 20 23 32

Hey Timothy, congrats on the 10k :) Its great having your answers on this forum, thanks for spending some of your time with us.
Apr 05, 2011 at 12:30 AM Fatherjack ♦♦

@tredd68 You just need to entitize the xml unfriendly characters before casting them to xml. For example, take Timothy's answer and try to run is as is but change one of the values to become unfriendly (make Nissan Dealer something like Nissan Dealer <<&). This will make the query fail. Now, restate the part inside of the CTE definition to read this:

select 
    t2.id,
    t3.split,
    row_number() over (partition by t2.id order by t2.id) as num
    from 
    (
        select 
            id, cast('' + replace(
            xmlfilter, ',', '') + '' as xml) xmlfilter  
            from #testTable3 cross apply 
            (
                select '' + col1 for xml path('')
            ) t(xmlfilter)
    ) t2
    cross apply
    (
        select col1data.D.value('.','varchar(50)') as split
        from t2.xmlfilter.nodes('x') as col1data(D)
    ) t3
and it will address the issue you have. The idea is to select the col1 values for xml path forcing the engine to entitize unfriendly characters BEFORE adding X nodes part and casting it as xml :)
Apr 05, 2011 at 10:17 AM Oleg
@tredd68 The part select '' + col1 for xml path('') deserves a little explanation. The dummy '' is the key here. Without it, selecting data for xml path will add the <col1> to the left of the value and close this node at the end. Adding an emply string to the left forces the column to become nameless so the engine will not add the col1 node. Because the columns cannot be referenced if they are nameless, the name is given to the column AFTER the selection for xml path has already taken place. This is why the t(xmlfilter) is present.
Apr 05, 2011 at 10:24 AM Oleg

@TimothyAWiseman Adding the xml path (like in my comment above) could be a better option though because it will handle all xml unfriendly characters, not just the ampersand. Whatever needs to be entitized will be entitized on the way in and un-entitized on the way out (in the bottom cross apply). In fact, you can even have the column values including <X> itself and any other nodes and still splitting correctly by adding xml path part. No way this can be done by attempting to replace all the possible unfriendly combinations with replace function.

Congratulations on moving to the 5-digits Timothy! Sorry it is 145 points too late. I have learned a lot from your answers on this site as I am sure did many others.
Apr 05, 2011 at 10:57 AM Oleg

@tredd68 Here is Timothy's answer restated to handle unfriendly characters. Please note the value of the first record which forces the original script in the answer to break.

create table #temp (id int, col1 varchar(512));

insert into #temp select 1, 'Nissan Dealer&>>>&&, 123 Main St, Chicago, IL, 76533' union all select 2, 'some place, 1234 Somewhere St, Stillater, OK, 123456';

;with cte as ( select t2.id, t3.split, row_number() over ( partition by t2.id order by t2.id) as num from ( select id, cast('' + replace( xmlfilter, ', ', '') + '' as xml) xmlfilter
from #temp cross apply ( select '' + col1 for xml path('') ) t(xmlfilter) ) t2 cross apply ( select item.value('.','varchar(50)') as split from t2.xmlfilter.nodes('x') as R(item) ) t3 ) select id, [1], [2], [3], [4], [5] from ( select id, num, split from cte ) as src pivot (max(split) for num in ([1], [2], [3], [4], [5])) as pvt

drop table #temp;
Apr 05, 2011 at 02:02 PM Oleg
(comments are locked)
10|1200 characters needed characters left
Just thought I'd let you all know... I've solved the problem with longer strings and wider elements that Tally Table splitters used to suffer with. I'm in the process of writing an article about it for SQLServerCentral.com.
more ▼

answered Apr 05, 2011 at 08:02 PM

Jeff Moden gravatar image

Jeff Moden
1.7k 2 4 8

I believe this is worth a "Huzzah!"
Apr 06, 2011 at 03:45 AM Kevin Feasel
(comments are locked)
10|1200 characters needed characters left

create function dbo.GetDomain( @source varchar(1024), @delimiter varchar(10), @domain int ) returns varchar(1024) as begin declare @returnValue varchar(1024) declare @workingOn int declare @length int set @workingOn=0 while @workingOn<@domain begin set @source=substring(@source,charindex(@delimiter,@source)+1,1024) set @workingOn+=1 end set @length=charindex(@delimiter,@source) set @returnValue=substring(@source,1,case when @length=0 then 1024 else @length-1 end) return @returnValue end go

select t.col1, dbo.GetDomain(t.col1,',',0) as col2, dbo.GetDomain(t.col1,',',1) as col3, dbo.GetDomain(t.col1,',',2) as col4 from #test t go
more ▼

answered Feb 04 at 10:20 AM

bibinsami gravatar image

bibinsami
1

(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x1943
x985
x33

asked: Apr 04, 2011 at 02:01 PM

Seen: 14401 times

Last Updated: Feb 04 at 10:20 AM