x

How do I extract data from a variable length delimited string after a certain 'phrase' appears?

I wish to extract certain pieces of information that appear 5 pieces after a certain bit of data appears in the delimited string.

For example D|1|2|3|4|5|6|7|8|9|D|1|2|3|4|5|6|7|8|9.... etc etc (48 D's)

I wish to basically write a query which pulls say the 5th value in the delimited string after every D so I need to say every time D is found, go 5 along (maybe 10 if the pipe counts as a value) and extract that information into a new column in a table, I wish for this information to be displayed along in rows rather than down in a column.

NOTE: There will always be 48 bits of information to be pulled from each cell, no more or less.
more ▼

asked Jul 03 '12 at 08:47 AM in Default

sqlrd222 gravatar image

sqlrd222
30 1 1 2

(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first

Presume there's not always 9 chars after the 'D', otherwise this would be a simple list of substrings at fixed positions

i.e.

select substring(@YourString,10,1),
select substring(@YourString,20,1),
select substring(@YourString,30,1),
....

You can use a tally table to split the string into individual characters, find the position of the 'D's and move forward 10 chars from that. Then pivot the results to get your columns. I've limited it here to 4 columns for brevity, but it gives you the idea

declare @YourString char(960)
set @YourString = 'D|1|2|3|4|X|6|7|8|9|D|1|2|3|4|Y|6|7|8|9|D|1|2|3|4|Z|6|7|8|9|D|1|2|3|4|5|6|7|8|9'
;with tally as
(
select top 1000
    ROW_NUMBER() over(order by sv1.number) N
from
    master.dbo.spt_values sv1,
    master.dbo.spt_values sv2
)


select 
 [1],[2],[3],[4]
from(
select
 row_number() over (order by N) as rownum,
 substring(@YourString,N+10,1) as chardata
from tally 
where N< = len(@YourString)
and substring(@YourString,N,1) = 'D'
) p
pivot
(max (chardata)
for 
rownum in ([1],[2],[3],[4])
) as PVT

gives us the result

1    2    3    4
---- ---- ---- ----
X    Y    Z    5

Edit: added example using a table as the source

declare @YourTable table (ID int, YourString char(960))
insert into @YourTable select 1,'D|1|2|3|4|X|6|7|8|9|D|1|2|3|4|Y|6|7|8|9|D|1|2|3|4|Z|6|7|8|9|D|1|2|3|4|5|6|7|8|9'
insert into @YourTable select 2,'D|1|2|3|4|A|6|7|8|9|D|1|2|3|4|B|6|7|8|9|D|1|2|3|4|C|6|7|8|9|D|1|2|3|4|E|6|7|8|9'
insert into @YourTable select 3,'D|1|2|3|4|M|6|7|8|9|D|1|2|3|4|N|6|7|8|9|D|1|2|3|4|O|6|7|8|9|D|1|2|3|4|P|6|7|8|9'

;with tally as
(
select top 1000
    ROW_NUMBER() over(order by sv1.number) N
from
    master.dbo.spt_values sv1,
    master.dbo.spt_values sv2
)


select 
 ID, [1],[2],[3],[4]
from(
select
 ID,
 row_number() over (partition by ID order by N) as rownum,
 substring(YourString,N+10,1) as chardata
from tally 
join @YourTable on N<= len(YourString)
and substring(YourString,N,1) = 'D'
) p
pivot
(max (chardata)
for 
rownum in ([1],[2],[3],[4])
) as PVT

order by id

Edit: added example using OPs field names

;with tally as
(
select top 1000
    ROW_NUMBER() over(order by sv1.number) N
from
    master.dbo.spt_values sv1,
    master.dbo.spt_values sv2
)


select 
 ID, [1],[2],[3],[4]
from(
select
 ID,
 row_number() over (partition by ID order by N) as rownum,
 substring(RAWCLOB,N+10,1) as chardata
from tally 
join DATAFLOW_CLOB on N>= len(RAWCLOB)
and substring(RAWCLOB,N,1) = 'D'
) p
pivot
(max (chardata)
for 
rownum in ([1],[2],[3],[4])
) as PVT

order by id
more ▼

answered Jul 03 '12 at 09:21 AM

Kev Riley gravatar image

Kev Riley ♦♦
50.7k 43 49 76

I'm not largely experience with SQL but I'm testing your code out and its erroring saying Line 20 'An expression of non-boolean type specified in context where condition is expected, near ';' - is there something I should change in my code?
Jul 03 '12 at 09:28 AM sqlrd222
Argh it's the T-SQL formatter chewing up my code on this site look for the bit that says N&lt; = and change it to 'N <='
Jul 03 '12 at 09:30 AM Kev Riley ♦♦
It's saying line 20 which is 'where N< = len(@YourString)' - I can't see any '<' signs here?
Jul 03 '12 at 09:37 AM sqlrd222

And now it's encoded your reply - I can't tell which way round you meant!

change line 20 to `where N less_than_or_equal_to len(@YourString)'

replacing 'less_than_or_equal_to' with the correct operator
Jul 03 '12 at 09:42 AM Kev Riley ♦♦
Sorted - thank you so much :)
Jul 03 '12 at 09:44 AM sqlrd222
(comments are locked)
10|1200 characters needed characters left

Thank you, you're so helpful.

I now have this code -

declare @YourTable table (ID int, YourString char(960))

insert into @YourTable select 'DATAFLOW_CLOB.RAW_CLOB'

;with tally as ( select top 1000 ROW_NUMBER() over(order by sv1.number) N from master.dbo.spt_values sv1, master.dbo.spt_values sv2 )

select ID, [1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23],[24],[25],[26],[27],[28],[29],[30],[31],[32],[33],[34],[35],[36],[37],[38],[39],[40],[41],[42],[43],[44],[45],[46],[47],[48] from( select ID, row_number() over (partition by ID order by N) as rownum, substring(RAWCLOB,N+10,1) as chardata from tally join DATAFLOW_CLOB on N<= len(RAWCLOB) and substring(RAWCLOB,N,1) = 'D' ) p pivot (max (chardata) for rownum in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17],[18],[19],[20],[21],[22],[23],[24],[25],[26],[27],[28],[29],[30],[31],[32],[33],[34],[35],[36],[37],[38],[39],[40],[41],[42],[43],[44],[45],[46],[47],[48]) ) as PVT

order by id

But I get an error saying column name or number does not match table definition on line 3 so I took line 3 out (did't know if I needed it) and then I get an error saying invalid object name (dataflow_clob) on line 6. Not sure what's wrong really.
more ▼

answered Jul 03 '12 at 11:06 AM

sqlrd222 gravatar image

sqlrd222
30 1 1 2

No don't use any reference to @YourTable - that was just me trying to demonstrate how to run this against a table. Just use what I posted as the last example...

Also you need to identify an identifer on the row (I've called it ID in the script)
Jul 03 '12 at 11:10 AM Kev Riley ♦♦

OK I took the top bit out so the query just starts with ';with tally as...' but I'm still getting an error saying invalid object name 'DATAFLOW_CLOB' but on line 2? I can't see any reference to the table name on line 2. Is this not because I need to reference which database I am taking it from?

The ID will be the date for each string of data when this bit gets working. The date is contained in the string, it is the 12th bit of information along.
Jul 03 '12 at 11:14 AM sqlrd222
I have solved that by saying 'USE GATEKEEPER' above the code you gave me. But it's now erroring saying that ID is an invalid column name, I've been trying to declare it but for some reason I'm not getting it right?
Jul 03 '12 at 11:54 AM sqlrd222
No - the ID I'm referring to needs to be something that identifies the row, before you start to strip out the string. Are there any other fields you can use?
Jul 03 '12 at 12:00 PM Kev Riley ♦♦
The first column in the table I'm using is called DC_INDEX_FX and is a type of unique ID for each string. Is that why the ID part of the code won't work for me? Thanks
Jul 03 '12 at 12:02 PM sqlrd222
(comments are locked)
10|1200 characters needed characters left

OK posted as new answer, as the clarification that the data is not just characters means a slightly different approach. Still using the tally table, but this time to split the string out using the '|' (pipe) character, and cope with 'bits of data' that are not just single characters.

declare @YourTable table (ID int, YourString char(960))
insert into @YourTable select 1,'SPH|1.0|2ed|345|4|X|69|7dddess|8.0.98|9re|SPH|1.0|2ed|345|4|XX|69|7dddess|8.0.98|9re|SPH|1.0|2ed|345|4|XXX|69|7dddess|8.0.98|9re|SPH|1.0|2ed|345|4|XXXX|69|7dddess|8.0.98|9re|'

;with tally as
(
select top 1000
    ROW_NUMBER() over(order by sv1.number) N
from
    master.dbo.spt_values sv1,
    master.dbo.spt_values sv2
)
,
splitoutdata as
(
select
 ID,
 row_number()over(partition by ID order by N) as rownum,
 substring('|'+YourString+'|',N+1,charindex('|','|'+YourString+'|',N+1)-N-1) as bitofdata
from tally 
join @YourTable on N&lt; len('|'+YourString+'|')
and substring('|'+YourString+'|',N,1) = '|'
)

select 
 ID, [1],[2],[3],[4]
from(
select
 ID,
 row_number()over(order by rownum) as rownum,
--this line here gets the 5th element, so if in future you want a different one 
--simply change the integer value
 (select bitofdata from splitoutdata so2 where so2.rownum = so1.rownum +5) as bitofdata
from splitoutdata so1
where bitofdata = 'SPH'
) p
pivot
(max (bitofdata)
for 
rownum in ([1],[2],[3],[4])
) as PVT

order by id
more ▼

answered Jul 03 '12 at 02:00 PM

Kev Riley gravatar image

Kev Riley ♦♦
50.7k 43 49 76

Thank you for that, I apologise that I didn't clarify it in the first place. It still doesn't seem to like 'with tally as' as I'm getting an error for that line that says Invalid length parameter passed to the left or substring function - I assume the substring as there is no left function.
Jul 03 '12 at 02:12 PM sqlrd222
Would it be possible to give me an example of the real data? Please message me (envelope icon at the top of the page) for an email address if you don't want to post here.
Jul 03 '12 at 02:19 PM Kev Riley ♦♦
Did you receive my message?
Jul 03 '12 at 02:54 PM sqlrd222
Yes, looking at it now.....
Jul 03 '12 at 02:56 PM Kev Riley ♦♦
An update: the OP sent me a copy of the real data, and I arranged a solution around the query given above, substituting the OP's actual object names and data types
Jul 04 '12 at 08:32 AM Kev Riley ♦♦
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x33
x26
x22
x9

asked: Jul 03 '12 at 08:47 AM

Seen: 2499 times

Last Updated: Jul 04 '12 at 08:32 AM