question

Ved avatar image
Ved asked

Extract 'AA' or 'BB' type pattern using SQL Server

Problem Statement:The table is like this Emp_ID Name 1 Aakash 2 Aashish 3 Varun 4 Vikas 5 Jiignesh I have to extract the Names which have alphabets consecutively e.g. Aakash has AA Aashish has AA Jiignesh has ii and reject other names ........Please help me out with this
sql-server-2005t-sqldata-cleansing
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Fatherjack avatar image
Fatherjack answered
firstly create a table to contain all of the combinations you want to locate CREATE TABLE DoubleLetters ( combo CHAR(2) PRIMARY KEY ) GO INSERT INTO DoubleLetters VALUES ( 'aa' ), ( 'bb' ), ( 'cc' ), ( 'dd' ), ( 'ee' ), ( 'ff' ), ( 'gg' ), ( 'hh' ), ( 'ii' ), ( 'jj' ), ( 'kk' ), ( 'll' ), ( 'mm' ), ( 'nn' ), ( 'oo' ), ( 'pp' ), ( 'qq' ), ( 'rr' ), ( 'ss' ), ( 'tt' ), ( 'uu' ), ( 'vv' ), ( 'ww' ), ( 'xx' ), ( 'yy' ), ( 'zz' ) go then write a function to return a result when it finds a match CREATE FUNCTION ufn_HasDoubleLetter ( @thename CHAR(50) ) RETURNS TABLE AS return ( SELECT case when patindex('%' + combo + '%', @thename) > 0 THEN @thename END AS r FROM doubleletters ) go then use it SELECT [n].[empID] , [n].[FName] FROM [dbo].[NAMES] AS n CROSS APPLY [dbo].[ufn_HasDoubleLetter]([n].[FName]) AS c WHERE [c].[r] IS NOT null to give results like this empID FName 1 Aakash 2 Aashish 5 Jiignesh
6 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
@Fatherjack @Matt Whitfield +1 This is a very elegant solution! The problem with multi-duplicates within same name can be cured by restricting the TVF to only return one record (now it always returns 26 per each record in the table), i.e.
;with doubleletters(combo)
as
(
select top 26
	replicate(char(96 + 
		row_number() over (order by [object_id])), 2) combo
	from sys.objects
)
	select e.*
		from dbo.names e cross apply
		(
			select top 1 1 r
				from doubleletters 
					where patindex('%' + combo + '%', e.name) > 0
		) tvf;
2 Likes 2 ·
Matt Whitfield avatar image Matt Whitfield ♦♦ commented ·
+1- maybe want to add a group by to avoid duplicates of names with two doubles? Like aaron aaronson? :)
1 Like 1 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
@Oleg - liking this a lot, why dont you post it as an answer?
1 Like 1 ·
Fatherjack avatar image Fatherjack ♦♦ commented ·
@Matt, good call. Thats something that would come after you review the results. although this query isnt going to find Fname and SName combos as its only passing the FName col. Point taken though.
0 Likes 0 ·
Oleg avatar image Oleg commented ·
@Fatherjack Because that would be stealing :) I really like your answer, very elegant one indeed. All I did was just a restatement of your answer so it fits in the comment box. Now when I look at my comment I realize that **select top 1 1 r** looks somewhat ridiculous, maybe I should consider changing statements like this to something more readable, like **select top 1 'X' as r** :) <\!-- **Begin Edit** I can edit my comments, just not for too long. I think that 1 hour is the limit, so yes, it is too late to edit the **select top 1 1 r** thingy. \--> **End Edit**
0 Likes 0 ·
Show more comments
TimothyAWiseman avatar image
TimothyAWiseman answered
Oleg's modification to Fatherjack's answer is probably the best way to go, and I applaud them both for their elegance. But for completeness you might also want to consider writing a CLR assembly in C#/ VB.NET and importing it into SQL Server. This would permit you to use RegEx in your solution and avoid the need to create a table. Phil Factor has a fantastic article on doing this at [ http://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/][1] [1]: http://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image Oleg commented ·
@TimothyAWiseman Wow, this is a really great article! Thank you. Then again, I don't believe that it is possible to find any article by Phil Factor which is less than a **must read**.
2 Likes 2 ·
TimothyAWiseman avatar image TimothyAWiseman commented ·
Some of his less technical articles vary in how interesting they are, but all of the technical articles are fantastic.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.