question

stewak avatar image
stewak asked

Using REPLACE with wildcard matching pattern

I have a table with content for my website and there are website addresses within the content that all end in something similar to:

-c.123.html

-c.134.html

-c.12.html

-c.345.hml

so the full website address would be something like


www.website.co.uk/websitepage-c.123.html

the numbers are different inbetween the -c. and the html

I want to remove all of the -c.***.html

sql queryreplacewildcard
1 comment
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

REPLACE doesn't do that. What you might be better off doing is a bit of string slicing - take the left part of the string up to the -c bit, and then adding back the last few characters.

Either that, or you could find / write a CLR routine to do the job. See, for example, this set of posts: https://www.sqlservercentral.com/blogs/regex-in-sql-server-for-replacing-text

1 Like 1 ·
srutzky avatar image
srutzky answered

If I am understanding the request correctly, then all you need is a simple SUBSTRING since there is a common, static piece of text that you can match on — "-c." — so you can ignore the part with the numbers that you want to get rid of. For example:


DECLARE @Values TABLE
(
   [Value] NVARCHAR(100) NOT NULL
);

INSERT INTO @Values ([Value]) VALUES (N'https://www.website.co.uk/websitepage-c.123.html');
INSERT INTO @Values ([Value]) VALUES (N'https://www.section-c.com/c-123.html');

;WITH cte AS
(
   SELECT [Value], (CHARINDEX(N'-c.', [Value]) - 1) AS [Pos]
   FROM   @Values
)
SELECT SUBSTRING([Value], 1, IIF(cte.[Pos] > 0, cte.[Pos], LEN([Value])))
FROM    cte


;WITH cte AS
(
   SELECT [Value], (PATINDEX(N'%-c.[0-9]%', [Value]) - 1) AS [Pos]
   FROM   @Values
)
SELECT SUBSTRING([Value], 1, IIF(cte.[Pos] > 0, cte.[Pos], LEN([Value])))
FROM    cte

That returns:

https://www.website.co.uk/websitepage
https://www.section

and then:

https://www.website.co.uk/websitepage
https://www.section-c.com/c-123.html


If for some reason have more complex requirements that can't easily be done using T-SQL, then you can use RegEx if SQLCLR is an option (and it definitely is not if you are using either Azure SQL DB single DB or SQL Server 2017+ on AWS RDS). Several pre-compiled RegEx functions are available in the free version of SQL# (SQLsharp), a SQLCLR library containing many functions and stored procedures (that I wrote).



10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Cebisa avatar image
Cebisa answered

Have you thought about using data masking?

10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Magnus Ahlkvist avatar image
Magnus Ahlkvist answered

To me, this would be one of the rare cases for using CLR-functions (as Thomas Rushton suggests). In .NET you have Regex-methods which you can expose in a CLR Function.
Using substring as srutzky suggests is not a safe way to solve this problem, as it can't handle multiple matches, it'll use the leftmost match och '-c.'. It would return incorrect results for this one for example: https://www.section-c.com/c-123.html But with RegEx functions, you can create more complex pattern-matching for replace.

10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Jeff Moden avatar image
Jeff Moden answered

               

This should do it.

 SELECT testdata.WebSite
        ,Cleaned = CASE
                    WHEN testdata.WebSite LIKE ('%[-]c.[0-9]%[0-9].h%')
                    THEN SUBSTRING(testdata.WebSite,1,PATINDEX('%[-]c.[0-9]%[0-9].h%',testdata.WebSite)-1)
                    ELSE testdata.WebSite
                    END
   FROM (VALUES --Simulating the readily consumable test data you forgot to post
         ('www.website.co.uk/websitepage-c.123.html')
        ,('www.website.co.uk/websitepage-c.134.html')
        ,('www.website.co.uk/websitepage-c.12.html')
        ,('www.website.co.uk/websitepage-c.345.hml')
        ,('www.website.co.uk/websitepage-calls.345.hml')
        )testdata(WebSite)
;


10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.