question

Paul 3 asked

Help using REPLACE with wildcard matching pattern....

Hi all,

Can anyone help me with a little problem I can't solve? I am trying to update a phone_number column to replace any character that is not 0-9 with an empty string, i.e.:


PHONE_TABLE

phone_number

01234-567-890 (result i want = 01234567890)

012345 6789ext (result i want = 0123456789)

n/a (result i want = )

...12345..... (result i want = 12345)


I can identify which records have non-numeric values in the phone number using...

select *
from PHONE_TABLE
where phone_number like '%[^0-9]%'

but I want to update this table and remove non-numeric characters in the phone_number field something along the lines of....

update PHONE_TABLE
set phone_number = replace(phone_number, [^0-9], '')

Any help would be greatly appreciated

Many thanks,

Paul

Tags: t-sql, replace

Kev Riley answered

OK just for fun, here's a TSQL method, using Jeff Moden's Tally table, and a SQL2005 (and later) trick for concatenating the results back

--Jeff Moden Tally table
    if object_id('dbo.tally') is not null 
    drop table dbo.tally

    select top 10000 --change to fit max length of phone number
            identity(int,1,1) as n
       into dbo.tally
       from master.dbo.syscolumns sc1,
            master.dbo.syscolumns sc2

    -- add pk to maximize performance
      alter table dbo.tally
        add constraint pk_tally_n 
            primary key clustered (n) with fillfactor = 100

declare @phonetable table 
        (uniqueid int identity(1,1), phone_number varchar(500))
insert into @phonetable (phone_number)
        select '01234-567-890'
union   select '012345 6789ext' 
union   select 'n/a'
union   select '...12345.....'

;with cte (uniqueid, phone_number, N, goodchar, badchar) as(
select
    uniqueid, phone_number, N, -- N is kept so the digits can be put back in order
    case when substring(phone_number,N,1) not like '%[^0-9]%' 
         then substring(phone_number,N,1) end as goodchar,
    case when substring(phone_number,N,1) like '%[^0-9]%' 
         then substring(phone_number,N,1) end as badchar

from @phonetable , Tally
where phone_number like '%[^0-9]%'
and N <= len(phone_number)
)

SELECT distinct
        phone_number,
        isnull(
        stuff ( ( SELECT
                          '' + goodchar
                  FROM
                          cte t1
                  where t1.UniqueID = t2.UniqueID
                  order by t1.N -- without this the concatenation order is not guaranteed
        FOR XML PATH ( '' ) ) , 1 , 0 , '' )
        ,'') as clean_phone_number
from cte t2

gives

phone_number        clean_phone_number
...12345.....        12345
012345 6789ext       0123456789
01234-567-890        01234567890
n/a 

Fatherjack commented:
+1 - Nice. I was trying to think how a tally table could get involved but didn't have enough time to sit down and work on it. Thanks.
Kev Riley commented:
Cheers, although it would need some serious tuning if it were to be used for real....

Fatherjack answered

Do you have a lot of different non-numeric characters? If not, I would simply go for

USE [adventureworks]
GO    
SELECT phone, REPLACE([c].[Phone],'-','') 
FROM [Person].[Contact] AS c

If there are lots of different ones and you need to repeat this process regularly, then you might need to get involved with regex, functions and even CLR...


Paul 3 commented:
Hi, yes, I have lots of non-numeric characters in there, from a-z and/or special characters, and about 650000 records. Unfortunately they are all coming in from a data load, so I wanted to do a cleanup exercise. Thanks for your help though; I have no idea where to start with regex etc.

Mukesh answered
CREATE FUNCTION dbo.ExtractInteger ( @String VARCHAR(2000) )
RETURNS VARCHAR(2000) -- same size as the input so no digits are truncated
AS BEGIN 
    DECLARE @Count INT 
    DECLARE @IntNumbers VARCHAR(2000) 
    SET @Count = 1 -- SUBSTRING positions start at 1
    SET @IntNumbers = '' 
    WHILE @Count <= LEN(@String) 
        BEGIN 
            -- keep the character only if it falls between '0' and '9'
            IF SUBSTRING(@String, @Count, 1) >= '0'
                AND SUBSTRING(@String, @Count, 1) <= '9' 
                BEGIN 
                    SET @IntNumbers = @IntNumbers + SUBSTRING(@String, @Count, 1) 
                END 
            SET @Count = @Count + 1 
        END 
    RETURN @IntNumbers 
   END
-- SELECT dbo.ExtractInteger('0323-111-CALL')
-- gives  0323111

Srikant Maurya answered
DECLARE  @s      VARCHAR(1000), 
         @pos    INT, 
         @result VARCHAR(1000) 
SELECT @s = '012345 6789ext' 
SET @result = '' 
SET @pos = Patindex('%[0-9]%',@s) 
-- Loop until no digit is left in @s. Counting up to Datalength(@s) stops
-- too early, because @s gets one character shorter on every pass.
WHILE (@pos > 0) 
  BEGIN 
    SET @result = @result + Substring(@s,@pos,1) 
    SET @s = Stuff(@s,@pos,1,'') 
    SET @pos = Patindex('%[0-9]%',@s) 
  END 
SELECT @result

Srikant Maurya commented:
You can paste the above code into a function and call the function in a SELECT statement.
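
As a rough sketch of that suggestion, the loop could be wrapped in a scalar function along these lines. The function name dbo.DigitsOnly is made up for illustration and does not come from the thread, and the loop here stops when PATINDEX finds no more digits. It has not been tuned for large tables.

```sql
-- Hypothetical function name; not from the original posts.
CREATE FUNCTION dbo.DigitsOnly ( @s VARCHAR(1000) )
RETURNS VARCHAR(1000)
AS
BEGIN
    DECLARE @result VARCHAR(1000), @pos INT
    SET @result = ''
    SET @pos = PATINDEX('%[0-9]%', @s)
    WHILE @pos > 0  -- stop once no digits are left in @s
    BEGIN
        SET @result = @result + SUBSTRING(@s, @pos, 1)  -- keep the digit
        SET @s = STUFF(@s, @pos, 1, '')                 -- remove it from the input
        SET @pos = PATINDEX('%[0-9]%', @s)
    END
    RETURN @result
END
GO

-- Call it from a SELECT to preview the result...
SELECT phone_number, dbo.DigitsOnly(phone_number) AS clean_phone_number
FROM PHONE_TABLE

-- ...or use it in the UPDATE from the question
UPDATE PHONE_TABLE
SET phone_number = dbo.DigitsOnly(phone_number)
WHERE phone_number LIKE '%[^0-9]%'
```

The WHERE clause limits the update to rows that actually contain a non-numeric character, which is the same filter used in the question.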
TimothyAWiseman answered

Kev and Mukesh both have very good answers that you may want to consider. Another option is to write a CLR function in C# or VB.NET that uses regex to do what you want; you can then run it against the table from within SQL Server.


jacked answered
Why not?

Select Replace(Replace(phone_number, '-', ''), ' ', '') from phone_table

Kev Riley commented:
That would only replace the '-' and ' ' characters. The original question was to replace all non-numerics.
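
A quick check with one of the sample values from the question illustrates the point. This is only a demonstration on a literal string, not a fix:

```sql
-- Nested REPLACE removes only the characters that are listed explicitly
SELECT REPLACE(REPLACE('012345 6789ext', '-', ''), ' ', '') AS result
-- returns 0123456789ext: the space is gone but 'ext' remains
```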
jgrizzle answered
create function dbo.GiveMeNumbers(@input varchar(max))
returns varchar(max)
as
begin
    declare @results varchar(max) = ''
    declare @position int = 1
    declare @current varchar(1)
    -- walk the string one character at a time and keep only the digits
    while @position <= len(@input)
    begin
        set @current = substring(@input, @position, 1)
        if @current like '[0-9]'
            set @results = @results + @current
        set @position += 1;
    end
    return @results
end
go

select dbo.GiveMeNumbers(columnName) from tableName

GPO commented:
This is almost certainly not a good use for a WHILE loop in SQL Server. In fact there are very very few good uses for WHILE loops in SQL Server. Have you tried running it over a million rows?
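
For comparison, one set-based alternative can be sketched by combining the tally table and FOR XML PATH technique from Kev Riley's answer with the UPDATE from the question. This assumes dbo.tally (with column n) has already been created as in that answer and has at least as many rows as the longest phone_number. It is an untuned sketch, so it would be worth testing on a copy of the data first.

```sql
-- Assumes dbo.tally (n = 1, 2, 3, ...) from Kev Riley's answer already exists.
UPDATE p
SET phone_number = isnull(
        ( SELECT '' + substring(p.phone_number, t.n, 1)   -- one row per digit
          FROM dbo.tally AS t
          WHERE t.n <= len(p.phone_number)
            AND substring(p.phone_number, t.n, 1) LIKE '[0-9]'
          ORDER BY t.n                                     -- keep original digit order
          FOR XML PATH('') )                               -- concatenate the rows
        , '')                                              -- no digits at all gives ''
FROM PHONE_TABLE AS p
WHERE p.phone_number LIKE '%[^0-9]%'
```

Because the digits are picked out per row in a single statement, there is no WHILE loop and no scalar function call for each record.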