question

John_Learns avatar image
John_Learns asked

Parse XML that includes escape (< and &gt)

I have the following XML column in a table. My hope is to parse LocationResponseBodyType into its individual elements.

<LocationResponseBodyType xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <LocationResponseText xmlns="urn:nena:xml:ns:LoggingDataTypes:2.0"><locationResponse xmlns="urn:ietf:params:xml:ns:geopriv:held">
  <locationUriSet expires="2018-04-28T17:28:38Z">
    <locationURI>https://svr-adv.xx.svcTEL.us:2001/?LbyR=fc43a892-b70f-4495-8281-a4219cc369a3</locationURI>
  </locationUriSet>
  <presence entity="pres:temp@svr.xx.svcTEL.us" xmlns="urn:ietf:params:xml:ns:pidf">
    <tuple id="lds-995f">
      <contact>tel:+18575551212</contact>
      <timestamp>2018-04-28T10:22:45Z</timestamp>
    </tuple>
  </presence>
</locationResponse></LocationResponseText>
  <DirectionValuesCode xmlns="urn:nena:xml:ns:CodeList:2.0">incoming</DirectionValuesCode>
  <LocationQueryId xmlns="urn:nena:xml:ns:LoggingDataTypes:2.0">https://svr-adv.xx.svcTEL.us:2001</LocationQueryId>
</LocationResponseBodyType>

In the attached code, you can see that I am able to parse the "normal" XML items. I cannot figure out how to also parse the elements that are included between the < and > escape characters.

with  xmlnamespaces        ('urn:nena:xml:ns:LoggingDataTypes:2.0'    AS ns1
                            ,'urn:ietf:params:xml:ns:geopriv:held'    AS ns2
                            ,'urn:nena:xml:ns:CodeList:2.0'            AS ns3
                            )
select    top 10    logID
        --LocationResponseBodyType
            ,eventBody.value('(/LocationResponseBodyType//ns1:LocationResponseText)[1]','nvarchar(max)')    AS 'LoSTResponseBodyType'
            ,eventBody.value('(/LocationResponseBodyType//ns3:DirectionValuesCode)[1]','nvarchar(max)')        AS 'DirectionValuesCode'
            ,eventBody.value('(/LocationResponseBodyType//ns1:LocationQueryId)[1]','nvarchar(max)')            AS 'LocationQueryId'
from   dbo.LogEvent

Thank you for taking the time to read this. -John

tsqlxmlparse
4 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Kev Riley avatar image Kev Riley ♦♦ commented ·

Has that XML pasted into the question correctly? It seems like it is not valid XML

XML parsing: line 8, character 15, end tag does not match start tag

which seems to be around the /status tag

0 Likes 0 ·
John_Learns avatar image John_Learns commented ·

Sorry about that. /Status should not be in there. The XML row I was trying to ask the question about is 7200 characters long. I wanted to trim it down to just enough to ask my question, but missed that tag. I corrected it now.

0 Likes 0 ·
Kev Riley avatar image Kev Riley ♦♦ commented ·

Ok that's better - now what would you want as the output, given that input?

0 Likes 0 ·
John_Learns avatar image John_Learns commented ·

My hope is to be able to extract the items that are surrounded by the escaped less than and greater than markers. I will write my solution into a table or a temp table so it can be analyzed and reported on.

<locationUri> =https://svr-adv.xx.svcTEL.us:2001/?LbyR=fc43a892-b70f-4495-8281-a4219cc369a3

<contact> = tel:+18575551212

<timestamp> = 2018-04-28T10:22:45Z etc. If I can figure out how to take parse these few values, I can then, repeat the process for the enormous row that I am dealing with. My struggle lies in the fact that I can easily pull out the elements when they are set up with the XML formatting that I am used to seeing. When it switches to the escape characters, I am at a loss of how to reach in and pluck the data inside.

In looking at my code block above, it converted all of the "<_and_>" symbols to proper < and > .

I have attached the text file that preserves the escaped symbols.sqlcentral.txt.

0 Likes 0 ·
sqlcentral.txt (1.8 KiB)

1 Answer

·
Kev Riley avatar image
Kev Riley answered

Hopefully this helps in some way, and that the data hasn't been mangled by posting it here.

If you treat the escaped symbols as XML, then you can simply use XML functions to parse the values, however, you also need to declare the extra namespace. I'm not going to post the data again as I don't trust the formatting on here, but my final query was:

with  xmlnamespaces        ('urn:nena:xml:ns:LoggingDataTypes:2.0'    AS ns1
                            ,'urn:ietf:params:xml:ns:geopriv:held'    AS ns2
                            ,'urn:nena:xml:ns:CodeList:2.0'            AS ns3
                            ,'urn:ietf:params:xml:ns:pidf'    as ns4
                            )

select 
	OrigQuery.logid, 
	OrigQuery.LoSTResponseBodyType,
	cast(OrigQuery.LoSTResponseBodyType as xml).value('(/ns2:locationResponse/ns4:presence/@entity)[1]','nvarchar(max)') as entity,
	cast(OrigQuery.LoSTResponseBodyType as xml).value('(/ns2:locationResponse/ns4:presence/ns4:tuple/ns4:contact)[1]','nvarchar(max)') as contact,
	cast(OrigQuery.LoSTResponseBodyType as xml).value('(/ns2:locationResponse/ns4:presence/ns4:tuple/ns4:timestamp)[1]','nvarchar(max)') as timestamp,
	OrigQuery.DirectionValuesCode, 
	OrigQuery.LocationQueryId
from
(
select    top 10    logID
        --LocationResponseBodyType
            ,eventBody.value('(/LocationResponseBodyType//ns1:LocationResponseText)[1]','nvarchar(max)')    AS 'LoSTResponseBodyType'
            ,eventBody.value('(/LocationResponseBodyType//ns3:DirectionValuesCode)[1]','nvarchar(max)')        AS 'DirectionValuesCode'
            ,eventBody.value('(/LocationResponseBodyType//ns1:LocationQueryId)[1]','nvarchar(max)')            AS 'LocationQueryId'
from   YourTable
)OrigQuery

If that seems to work then you can change this around so that the original query is defined in a CTE or whatever you feel is best

2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

John_Learns avatar image John_Learns commented ·

Wow! My suffering has ended! Thank you so much for your time, @Kev Riley.

0 Likes 0 ·
Kev Riley avatar image Kev Riley ♦♦ John_Learns commented ·

Glad I could help

0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.