question

GPO avatar image
GPO asked

Is there a better way to replace this HTML encoding than this scalar function?

I have a column of data of type NVARCHAR(2000) in which strings of characters that include HTML encoding are written. So 10/27/2015 00:00:00 is stored as 10%2F27%2F2015%2000%3A00%3A00. I have created a scalar function to convert the data, but it looks pretty nasty (not least because it's a scalar function). Am I going about this all wrong? CREATE FUNCTION replace_html_encoding ( @string nvarchar(4000) ) RETURNS nvarchar(4000) AS BEGIN DECLARE @result nvarchar(4000) SELECT @result = replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace( replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace( @string ,'%20',' '),'%3C',''),'%24','$'),'%60','`'),'%7D','}') ,'%3C','] tsqlstring-functionurlscalar-functions
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

srutzky avatar image
srutzky answered
Before mentioning any options, I should point out that this question (and even the function name) are mislabeled as "HTML" encoding when really this topic is about "URL" / "URI" encoding. HTML encoding uses `&` notation (i.e. `©` or `*` ) while URL/URI encoding uses `%` notation. There are a few options here. 1. If you like the behavior of your current function, but just don't like that it is a scalar UDF, then it can be converted into an Inline TVF rather easily: CREATE FUNCTION UriDecode ( @UriEncodedString NVARCHAR(4000) ) RETURNS TABLE AS RETURN SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE( REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE( @UriEncodedString ,'%20',' '),'%3C','<'),'%5C','\'),'%2F','/'),'%3A',':'),'%252F','/'),'%3D','='),'%2B','+'),'%7B','{') ,'%2520',' '),'%3F','?'),'%2C',','),'%25','%'),'%26','&'),'%3E','>'),'%24','$'),'%60','`'),'%7D','}') ,'%3C','<'),'%27',''''),'%5B','['),'%5D',']'),'%7C','|'),'%23','#'),'%22','"'),'%40','@') AS [DecodedString]; GO Test with: SELECT data.[URL], uri.DecodedString FROM (SELECT N' http://place.com/page.aspx?var1=hi%20there%3f&var2=%7Bbob%2520villa%7D' AS [URL] UNION ALL SELECT N'another%20%3E%20row %7C+or%2Bnot!') data CROSS APPLY dbo.UriDecode(data.[URL]) uri; Returns: > **DecodedString** > \------------------ > http://place.com/page.aspx?var1=hi there?&var2={bob villa} > another > row |+or+not! 1. The link in @Tom's answer to sqlteam.com points to a function that should get all encoded values, not just these pre-defined ones, BUT: 1. Now you are back to using a scalar UDF 1. It doesn't handle the double-encoded values of `%252F` and `%2520` that your current function does. 1. Your best bet is still to use SQLCLR since it will be faster, even as a scalar function since SQLCLR scalar functions ( if not doing data access _and_ marked as `IsDeterministic = true` ) can participate in parallel execution plans (which is not possible with T-SQL scalar functions). 1. If you are going to write your own, you should use [Uri.UnescapeDataString][1] (in `System`), in which case you will also need to do a `Replace('+', ' ')` on the string _before_ passing it to `Uri.UnescapeDataString`. This option will work across all versions of SQL Server that support SQLCLR (i.e. 2005 and newer). 1. If you are using SQL Server 2012 or newer, AND have updated the .NET Framework on the server running SQL Server such that the version is at least 4.5, then you have the option of using [System.Net.WebUtility.UrlDecode][2] 1. Do not use [System.Web.HttpUtility.UrlDecode][3] as the `System.Web` library is not one of the [supported .NET Framework libraries][4] in any version of SQL Server, hence you would have to manually load it and it is not guaranteed to always work. 1. If you don't want to mess with coding and deploying the Assembly, you can use the **INET_URIDecode** function that is available in the Free version of [SQL#][5] (which I am the author of, but again, this function -- and many others -- is free). [1]: https://msdn.microsoft.com/en-us/library/system.uri.unescapedatastring.aspx [2]: https://msdn.microsoft.com/en-us/library/system.net.webutility.urldecode.aspx [3]: https://msdn.microsoft.com/en-us/library/adwtk1fy.aspx [4]: https://msdn.microsoft.com/en-us/library/ms403279.aspx [5]: http://SQLsharp.com
1 comment
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Wow! Brilliant. Re HTML vs URI/URL, I changed the tagging but haven't edited question because then your fantastic answer would seem out of context. Very fortunate indeed to have you AND Tom Staab chipping in. Thanks again to you both.
1 Like 1 ·
Tom Staab avatar image
Tom Staab answered
I found this SQL method that might work better for you:
[ http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=88926][1] If CLR is an option, you could leverage built-in .NET methods.
Prior to version 4.0:
System.Web.HttpUtility.HtmlDecode() With .NET 4.0:
System.Net.WebUtility.HtmlDecode() [1]: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=88926
1 comment
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Heh! No CLR but SwePeso's solution looks very robust. Many thanks Tom.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.