x

Strip all but alpha chars out of a string

Hi all

I need to find a way use t-SQL to strip all characters out of a string except [A-Z] and [a-z]. So I might start off with "!QA@WD#EF$RG%TH1qa2ws3ed4rf", but I'd want to end up with "QAWDEFRGTHqawsedrf". Any suggestions gratefully received. I'd rather not use a WHILE loop or a cursor either ;-)
more ▼

asked Jun 30 '11 at 07:59 PM in Default

GPO gravatar image

GPO
1.9k 32 35 39

Well, I was going to answer the question, but unfortunately since you are not interested in a while loop or a cursor, you may be out of luck. This is not a relational problem so unless you are open to implementing RegEx via CLR you're going to have to rely on an iterative solution rather than set-based. If you really want to avoid loops in SQL then maybe you could use a loop or some fancy RegEx at the application layer.
Jun 30 '11 at 08:30 PM AaronBertrand
Thanks @AaronBertrand I'm almost (well half-way) there. It might be a set-based problem if I think about the string as a "set" of characters. I'll post the first half of my code (the "normalizing" bit) below shortly.
Jun 30 '11 at 08:36 PM GPO
(comments are locked)
10|1200 characters needed characters left

4 answers: sort voted first

So just as a polite suggestion, don't be so quick to dismiss proven methods just because you've heard that they're bad, and also don't be so quick to assume that magic methods like numbers tables will always perform better. Just because it doesn't look like a loop doesn't mean it's not a loop.

Finer words could not have been spoken. Just because someone uses a Tally Table, doesn’t mean that the solution will automatically be faster. The Tally Table is a “different animal” and if someone uses the same ol’ tired methods and formulas they use in loops in conjunction with a Tally Table, then the Tally Table code will frequently be slower.

Before we get started, let’s have a little test for the folks reading this in the USA. C’mon… it’ll be fun! ;-) Ready? Answer these as quickly as possible.

  1. How many sides does a STOP sign have? Good! You’ve been paying attention. STOP signs have 8 sides.
  2. What COLOR is a STOP sign? Again, good! Most everyone in the USA knows that STOP signs are RED. If they've really been paying attention, they also know they have WHITE letters.
  3. How many sides does a YIELD sign have? Excellent! Yes, a YIELD sign has 3 sides. You folks are really on a roll!
  4. What COLOR is a YIELD sign?

70% of you in the USA answered “Yellow” and most of those added “with Black letters”… and you’re all WRONG! ;-) The official colors of Yield signs changed, more than 20 years ago, to Red and White. Sure, you may find a wicked old sign here and there that’s still Yellow, but almost all of them are Red and White and all of the new ones (less than 23 years old) are Red and White.

That’s the real problem, isn’t it? People keep “thinking” the same way and, as a result, they make that same type of mistake in their code. They end up overriding the “magic” of the Tally Table with the same inefficient methods and formulas they use with loops because that’s what they’ve always known and that’s how they’ve always thought. As I've said many times, "Before you can think outside the box, you must FIRST realize... YOU'RE IN A FREAKIN' BOX!!!" ;-) and people just don’t realize what kind of box they’re in when it comes to using the Tally Table or “set based” programming methods, in general.

I’ll also make the polite, yet, opposing suggestion that folks should usually dismiss “proven methods” if they involve a loop. Learn how to think when it comes to “set based” code and the use of the Tally Table. Learn to think in columns instead of rows. Think in “words” instead of “characters”. Learn to use the very high performance natural looping (I use R. Barry Young’s term for this… PSEUDO CURSOR) that occurs within a simple SELECT. Stop living in the same ol’ box, folks.

I’ll make another polite suggestion that, if you really want to know how something will perform, you really, really need to learn how to make scads of test data because things like the use of GO 50000 skew the test a bit. What the use of GO 50000 will cause is 50000 reevaluations of whether or not the current execution plan can be reused or not and since that process is sometimes a little slower than some think, it can cause Set-based code to look like it’s running slower than looping code which always evaluates the reuse of the execution plans it made for the first row. For example… turn on the “Actual Execution Plan” and run the following simple code…

SELECT TOP (1) N FROM dbo.Tally
GO 10

See? 10 separate execution plans. Now, if you’re testing for singleton use of a function for use by a GUI, that might be ok, but it’s NOT ok when you’re trying to measure the performance a function will have when it’s used against a table containing a large number of rows. Only one execution plan per code snippet (SELECT, INSERT, UPDATE, DELETE, MERGE) should be created for a batch run (which is part of the reason why loops are so bad because they create multiple execution plans for each iteration).

Ok… with that in mind, let’s create enough test data to make our tests worthwhile. As is normal for me, most of the details are included as comments in the code. The code will look much better in SSMS where wrapping doesn't take place.

--=====================================================================================================================
--      Create substantial quantities of highly randomized test data so our tests can more easily measure the
--      performance of various methods even on fast machines.
--
--      Note that this section of code is NOT a part of the solution.  We're just building test data here.
--=====================================================================================================================
--===== Conditionally drop the test table to make reruns easier in SSMS
     IF OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL DROP TABLE #TestTable;

--===== Create and populate the test table on-th-fly with random length random data with randomly placed characters
     -- including some letters, digits, and some symbols (dashes in this case).  The length of the data varies from
     -- 36 to 72 characters and the position of the symbols also varies because of the randomized lengths.
 SELECT TOP (100000)
        RowNum     = IDENTITY(INT,1,1),
        SomeString = RIGHT(CAST(NEWID() AS VARCHAR(36)),ABS(CHECKSUM(NEWID()))%19+18)
                   + RIGHT(CAST(NEWID() AS VARCHAR(36)),ABS(CHECKSUM(NEWID()))%19+18)
   INTO #TestTable
   FROM sys.all_columns ac1
  CROSS JOIN sys.all_columns ac2
;
--===== Now, let's make all the even-numbered rows lower case so we can make sure the case sensitivity works
 UPDATE #TestTable
    SET SomeString = LOWER(SomeString)
  WHERE RowNum %2 = 0
;
--===== Show the Min/Max length of the SomeString column.
 SELECT MIN(LEN(SomeString)), MAX(LEN(SomeString))
   FROM #TestTable
;
--===== Show the first 10 rows of data we've built just to get a feel for what's going on.
 SELECT TOP (10)
        *
   FROM #TestTable
;

As a bit of a sidebar, the real reason why most people don’t create test data is because {drum roll please}… they think they have to use a While Loop to create randomized data and While Loops just take too long in most folks eyes. With a While Loop, you have to first create a table and then you have to populate it. With a While Loop, the code to populate the table actually takes a fair bit of coding because you have to control the start of the process, the process itself, and the end of the process. Make a mistake and the code could run “forever”.

Remember… the Yield sign isn’t Yellow, anymore. ;-)

Now, if someone is going to claim that a given set-based technology, like the Tally Table, is worse than any form of RBAR (pronounced “ree-bar” like the metal rods stuck in cement forever, means “Row By Agonizing Row”, and, yeah… I’m the person who coined that acronym on SQLServerCentral in 2005 http://www.sqlservercentral.com/articles/Advanced+Querying/calculatingworkdays/1660/ ) for a given problem, they at least need to build the principle component, the Tally Table itself, correctly. There MUST be a Clustered Index with a FILL FACTOR of 100 on the “N” column to truly make use of its “magic”. Without that Clustered Index, you’re just taking pot-shots with where the Table Scan decides to quit especially if you’re not using the numbers from the Tally Table in sequential order.

The following method for building an 11,001 row, zero based (starts at 0) Tally Table is guaranteed to work because sys.all_columns has at least 4,000 rows (no guarantee past that, though) in it even on a brand new installation if you’ve installed the whole ball-of-wax. Why do I use the max number of 11,000? Read my article (http://www.sqlservercentral.com/articles/T-SQL/62867/ ) to find out. ;-) It’ll change your life if you don’t know how a Tally Table actually works even if you’re currently using it.

Here’s the code to build a proper Tally Table. Yeah… you can use ROW_NUMBER() with an ISNULL to make it a NOT NULL column and subtract 1 from it to start at Zero instead of one, but why bother? That’s too much typing. ;-)

--===================================================================
--      Create a Tally table from 0 to 11000
--===================================================================
--===== Create and populate the Tally table on the fly.
 SELECT TOP 11001
        IDENTITY(INT,0,1) AS N
   INTO dbo.Tally
   FROM Master.sys.ALL_Columns ac1
  CROSS JOIN Master.sys.ALL_Columns ac2
;
--===== Add a CLUSTERED Primary Key to maximize performance
  ALTER TABLE dbo.Tally
    ADD CONSTRAINT PK_Tally_N 
        PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
;
--===== Allow the general public to use it
  GRANT SELECT ON dbo.Tally TO PUBLIC
;
GO

Ok… we’ve got a decent amount of test data and we have a properly constructed Tally Table complete with the required Clustered Index. On with the show!

As I said previously, the reason why the While Loop method (specifically the StripNonAlpha function) was able to beat the Tally Table methods that have been submitted, so far, is because people keep trying to use the same ol’ tired methods of doing the same thing as they do in loops (not to mention the crippling effect of not having the Clustered Index on the Tally Table). They’ve taken the first step in replacing the While Loop with the Pseudo-Cursor produced with a join to the Tally Table, but that’s all they’ve done. They haven’t replaced the truly bad technology, which might be called “CBAC” or “Character By Agonizing Character”. The Tally Table can do so much more if you let it.

They’ve also violated one of the most important concepts of “set based” programming there is… they handle and move too much data. All of the methods given, so far, strip out one character, determine if it’s a “keeper”, and then concatenate it into a “keeper string”. That means the data is handled/moved at least twice per character, “keeper or not”.

With that in mind, let’s write a Tally Table function that will wipe out only the unwanted characters using the STUFF function. And, no… it’s not an Inline Function. Not all Inline Functions are faster than Scalar Functions. In the true spirit that we all know only too well, “It Depends” and the Yield sign isn't Yellow, anymore. ;-)

Here’s my rendition of what a string cleaner that uses a Tally Table should look like.

CREATE FUNCTION dbo.CleanString
/*****************************************************************************************
 Purpose:
 Given a string, this function removes unwanted characters according to a single 
 character, case sensitive "LIKE" pattern.

 Reference:
 http://ask.sqlservercentral.com/questions/75404/strip-all-but-alpha-chars-out-of-a-string

 Revision History:
 Rev 00 - 17 Jul 2005 - Jeff Moden
        - Production code with certain non-relevant functionality removed for
          demonstration purposes on this site.
*****************************************************************************************/
--===== Declare the I/) parameters for the function
        (
        @pString  VARCHAR(8000),
        @pPattern VARCHAR(300)
        )
RETURNS VARCHAR(8000)
     AS 
  BEGIN
        --===== Find and "Stuff-out" only the characters found by the "pattern".
             -- Note that the pattern is "case sensitive" due to the collation used.
             -- Also notice how very simple and compact this code actually is. If it
             -- ever does need maintenance, it will be easy for someone to do
         SELECT @pString = STUFF(@pString,t.N,1,'') 
           FROM dbo.Tally t
          WHERE t.N BETWEEN 1 AND LEN(@pString)
            AND SUBSTRING(@pString,t.N,1) LIKE @pPattern COLLATE Latin1_General_BIN
          ORDER BY t.N DESC
        ;
 RETURN @pString
    END
;

Claims of performance aren’t valid nor tolerated without an actual test. With that thought in mind, run the code (further above on this post) that creates the test table and then run this test harness.

--===== Conditionally drop the test results tables to make reruns easier in SSMS
     IF OBJECT_ID('tempdb..#CleanStringResults'  ,'U') IS NOT NULL DROP TABLE #CleanStringResults;
     IF OBJECT_ID('tempdb..#StripNonAlphaResults','U') IS NOT NULL DROP TABLE #StripNonAlphaResults;

--===== Setup the environment so that the While Loop code has a chance at being performant
    SET NOCOUNT ON; --Supresses the auto-display of row counts.

--===== Declare a "timer" variable
DECLARE @StartTime DATETIME;

--===== Test the Tally Table string-cleaner =======================================================
     -- Start the "timer"
 SELECT @StartTime = GETDATE();

     -- Run the test
 SELECT RowNum, CleanString = dbo.CleanString(SomeString,'[^A-Za-z]')
   INTO #CleanStringResults
   FROM #TestTable
;
     -- Display the duration
  PRINT 'Duration(ms) for Tally Table: ' + CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR(10))
;
--===== Test the While Loop string-cleaner =======================================================
     -- Start the "timer"
 SELECT @StartTime = GETDATE();

     -- Run the test
 SELECT RowNum, NonAlphaString = dbo.StripNonAlpha(SomeString)
   INTO #StripNonAlphaResults
   FROM #TestTable
;
     -- Display the duration
  PRINT 'Duration(ms) for While Loop:  ' + CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR(10))
;

Here are the results from my poor ol’ 9 year old, single CPU desktop…

Duration(ms) for Tally Table: 15313
Duration(ms) for While Loop:  26170

I’m sure some person smarter than me will be able to turn the function into an iTVF (Inline Table Valued Function) for a bit of extra speed or maybe do the same thing with a CLR which is sometimes better for string handling, but this simple Tally Table code is nearly twice as fast as the While Loop function. Even my method uses “RBR” (Row-By-Row) technology because it’s a difficult thing to manipulate strings at a character level for each row without some form of “RBR”. You can, however, avoid exasperating the problem by keeping “RBAR” (Row-By-AGONIZING-Row) out of the function (if you understand the difference), itself. Just make sure you don’t include some “Hidden RBAR” that looks “Set Based” but really isn’t.

Again, if I may politely suggest, there are going to be a whole lot of people that don’t know how to use the technology of the Tally Table correctly and there are going to be at least that number that use that fact to denounce it’s use. They’ll continue to tell folks to "not dismiss {sic} 'proven methods'” using a loop and they’ll seemingly prove it with code, which is the only true judge you should listen to. ;-) The problem is that those same people are normally so used to using RBAR methods that they can’t figure out how to use more advanced technology correctly and the people following the thread will frequently agree because… they don’t necessarily know how to use it correctly, either.

Write it down and post it on your monitor… tattoo it to the back of your hand if you have to. If it has RBAR in it, whether it’s a discreet loop or “looks-like-set-based” recursion or “Triangular Joins”, there’s a VERY good chance (there ARE exceptions, of course. I'll see if I can come up with one in a couple of days) you’ve done it wrong despite what other people supposedly “prove” with code. ;-) How do I know this for sure? Heh… I’m also guilty of being a mere carbon-based life-form with bad habits to break. Take a look at the following article and see what I mean. Up until that article, I thought Yield signs were Yellow. ;-)

[http://www.sqlservercentral.com/articles/Tally+Table/72993/][3]

Thanks for listening, folks. And, yes… the dust bunnies all say “Hi”. ;-)

--Jeff Moden

p.s. Magnus and GPO… VERY nice tries. Magnus... your code would be a lot faster (but still not as fast as the "Stuff" method I used) if it selected only the good characters instead of making a CASE decision on every character. Like this...

 CREATE FUNCTION dbo.fnGetAlphaModified
        (@pString VARCHAR(8000))
RETURNS TABLE
     AS
 RETURN
 SELECT AlphaString = 
            (
             SELECT SUBSTRING(@pString, t.N, 1)
               FROM dbo.Tally t 
              WHERE t.n BETWEEN 1 AND LEN(@pString)
                AND SUBSTRING(@pString, t.N, 1) LIKE '[A-Za-z]' COLLATE Latin1_General_BIN
              ORDER BY t.n
                FOR XML PATH('')
            )
;
[3]: http://www.sqlservercentral.com/articles/Tally+Table/72993/
more ▼

answered Jul 17 '11 at 05:14 PM

Jeff Moden gravatar image

Jeff Moden
1.7k 2 8

Speaking of +10 moments...
Jul 17 '11 at 11:41 PM Magnus Ahlkvist
Where would we be without people like Jeff Moden who give of their knowledge so often and so selflessly? Many thanks Jeff.
Jul 18 '11 at 07:48 PM GPO
Gosh... thanks, Magnus and GPO. I've been given many opportunities over time. I'm just "passing it forward". Very glad I could be of service.
Jul 18 '11 at 07:59 PM Jeff Moden
I added a parenthetical comment to the post above. I came on very strong about there being a "VERY good chance you've done it wrong" if you used RBAR and I stand my ground on that. I want to make sure that Aaron's message isn't totally lost in all of that and that everyone understands there WILL be exceptions where a While loop actually CAN beat setbased thinking. BUT, you really have to think outside the box to get a While Loop to beat a Tally Table. Almost everyone on this thread started by doing a character-by-character parse of the data and the reassembling it. Thinking differently allowed us to come up with the very high performance Tally Table method. Now, let's break that mold, as well. I'll be back.
Jul 23 '11 at 09:38 PM Jeff Moden
(comments are locked)
10|1200 characters needed characters left

Here's a function which uses a numbers table.

--First create Tally
select identity(int,1,1) as N into Tally
from sys.all_columns 
GO

CREATE FUNCTION[dbo].[fnGetAlpha](@s VARCHAR(100))
RETURNS VARCHAR(100)
AS
BEGIN
    RETURN (
    SELECT 
       CASE 
         WHEN ASCII(UPPER(SUBSTRING(@s,t.n,1))) BETWEEN 65 AND 90 
              THEN SUBSTRING(@s,t.n,1) 
         ELSE '' 
       END
    FROM   Tally t 
    WHERE  t.n<=len(@s)
    ORDER BY 
         t.n
    FOR XML PATH(''))
END
more ▼

answered Jul 01 '11 at 12:34 AM

Magnus Ahlkvist gravatar image

Magnus Ahlkvist
15.9k 15 19 32

Very cool idea. Another thing to test.
Jul 01 '11 at 03:56 AM GPO
I have performance tested my function and the function suggested by @AaronBertrand and found that they perform equally well. With one small byt very important addition: You must add a primary key to the Tally table, otherwise my function is outperformed with a 20 magnitude by @AaronBertrand 's function.
Jul 01 '11 at 04:22 AM Magnus Ahlkvist
+1 on the above comment and good catch (I'd give it a +10 if it would let me). That's one of the single biggest reasons why some folks think that a Tally Table can be beat by a While Loop. As a side bar, I modified your good code and provided an explanation as to "why" at the end of my rather long winded answer on this thread. VERY well done all around, Magnus.
Jul 17 '11 at 06:42 PM Jeff Moden
(comments are locked)
10|1200 characters needed characters left

Assuming a numbers table like this (you probably have one with more than 5000 rows, but this is just for this example):

CREATE TABLE dbo.Numbers(n INT PRIMARY KEY);

INSERT dbo.Numbers(n)
    SELECT ROW_NUMBER() OVER (ORDER BY [object_id]) 
    FROM sys.all_columns; -- should be > 5000 rows

And a #temp table with some fun input:

CREATE TABLE #demog
(
    PK INT IDENTITY(1,1),
    StringValue VARCHAR(4000)
);

INSERT #demog(StringValue) 
    SELECT '!QA@WD#EF$RG%TH1qa2ws3ed4rf'
    UNION SELECT 'fdsfdsfdsf23132%$&#$fsds';
GO

Then you could write a FOR XML PATH query like this to split and re-join the strings. Still seems dirty to me, but hey, you don't have to write DECLARE CURSOR or WHILE, so it must be far superior than any loop, right?

;WITH x AS 
(
    SELECT 
       d.PK, 
       c = SUBSTRING(d.StringValue, n.n, 1),
       rn = ROW_NUMBER() OVER (PARTITION BY d.PK ORDER BY n.n)
    FROM #demog AS d
    CROSS JOIN dbo.Numbers AS n
    WHERE n.n <= LEN(d.StringValue)
),
y AS 
(
    SELECT PK, c, rn, 
       rn2 = ROW_NUMBER() OVER (PARTITION BY PK ORDER BY rn)
       FROM x 
       WHERE ASCII(UPPER(c)) BETWEEN 65 AND 90
)
SELECT PK, s = (
    SELECT y2.c + ''
       FROM y AS y2 WHERE y2.PK = y.PK
       ORDER BY PK, rn
       FOR XML PATH(''), TYPE).value('.[1]', 'VARCHAR(4000)')
    FROM y WHERE rn2 = 1;

The more natural way for me to do this would be to create a simple, scalar UDF that loops through the string one character at a time, and builds a new one with only the characters that are A-Z or a-z. Here is my function:

CREATE FUNCTION dbo.StripNonAlpha
(
    @s VARCHAR(4000)
)
RETURNS VARCHAR(4000)
AS
BEGIN
    DECLARE 
       @x INT = 1, 
       @y INT = LEN(@s),
       @p CHAR(1),
       @s_out VARCHAR(4000) = '';

    WHILE @x <= @y
    BEGIN
       SELECT @p = SUBSTRING(@s, @x, 1);

       SELECT @s_out += CASE 
         WHEN ASCII(UPPER(@p)) BETWEEN 65 AND 90 
         THEN @p ELSE '' END;

       SET @x += 1;
    END
    RETURN (@s_out);
END
GO

Just to test my theory that the non-loop, numbers table version will be less efficient (both to perform and to maintain), I ran the following code:

CREATE TABLE #hold(PK INT, s VARCHAR(4000));

SELECT GETUTCDATE();
GO

INSERT #hold SELECT PK, dbo.StripNonAlpha(StringValue) FROM #demog;
GO 50000

SELECT GETUTCDATE();
GO

;WITH x AS 
(
    SELECT 
       d.PK, 
       c = SUBSTRING(d.StringValue, n.n, 1),
       rn = ROW_NUMBER() OVER (PARTITION BY d.PK ORDER BY n.n)
    FROM #demog AS d
    CROSS JOIN dbo.Numbers AS n
    WHERE n.n <= LEN(d.StringValue)
),
y AS 
(
    SELECT PK, c, rn, 
       rn2 = ROW_NUMBER() OVER (PARTITION BY PK ORDER BY rn)
       FROM x 
       WHERE ASCII(UPPER(c)) BETWEEN 65 AND 90
)
INSERT #hold
SELECT PK, s = (
    SELECT y2.c + ''
       FROM y AS y2 WHERE y2.PK = y.PK
       ORDER BY PK, rn
       FOR XML PATH(''), TYPE).value('.[1]', 'VARCHAR(4000)')
    FROM y WHERE rn2 = 1;
GO 50000

SELECT GETUTCDATE();
GO

The results? The scalar UDF ran in less than 21 seconds (about 2,381 rows per second). The XML / numbers table version (which - while admittedly could probably be a bit cleaner - surely must be faster than a loop, right?) took about 3 times as long - 64 seconds - about 781 rows per second.

So just as a polite suggestion, don't be so quick to dismiss proven methods just because you've heard that they're bad, and also don't be so quick to assume that magic methods like numbers tables will always perform better. Just because it doesn't look like a loop doesn't mean it's not a loop.
more ▼

answered Jun 30 '11 at 10:39 PM

AaronBertrand gravatar image

AaronBertrand
905 1 3

Nice work Aaron. I'll take a bit of time to digest this and then post back. You've gone to a tremendous amount of effort. Many thanks.
Jun 30 '11 at 11:58 PM GPO
(comments are locked)
10|1200 characters needed characters left

I've just used a Moden Tally table. Don't have the link handy.

    ;WITH normalized as 
        ( 
   SELECT myTablePK
          ,myStringField
          ,SUBSTRING(myStringField,t.N,1) norm_output
          ,ROW_NUMBER() OVER(PARTITION BY myTablePK ORDER BY t.N) as seq
     FROM #demog
          CROSS JOIN dbo.Tally t
    WHERE t.N <= LEN(myStringField)
          and (SUBSTRING(myStringField,t.N,1) like '[A-Z]'
              or SUBSTRING(myStringField,t.N,1) like '[a-z]')
        )
   SELECT *
     FROM normalized
 ORDER BY norm_output
          ,seq
I'm struggling with the FOR XML PATH to join it all back up now.
more ▼

answered Jun 30 '11 at 08:43 PM

GPO gravatar image

GPO
1.9k 32 35 39

It's more commonly referred to as a numbers table, and FWIW Jeff Moden didn't invent its use, he just renamed it to tally for some reason. Anyway do you think this "set"-based approach is really set-based at all? Do you think it's less work than just looping through the characters in the string?
Jun 30 '11 at 09:45 PM AaronBertrand
Less work for me or the machine? I guess I won't know whether one method's better (quicker) than another until I compare them. Feel free to post your loop. Thanks again for the feedback.
Jun 30 '11 at 09:58 PM GPO
Mostly I meant work to write the query. The performance itself won't really matter unless you're doing this A LOT. In which case I'd probably still recommend doing this outside of the database, or at least creating a computed column or something so that the stripping is only done at insert/update time instead of every query time.
Jun 30 '11 at 10:16 PM AaronBertrand
Good points. I'm reporting off data that I can't change (unless push really came to shove), so inserts and updates are not an issue. It's just a matter of seeing whether the report will run respectably fast without having to go into tedious negotiations with people about strategies such as those you suggest.
Jun 30 '11 at 10:44 PM GPO

"It's more commonly referred to as a numbers table, and FWIW Jeff Moden didn't invent its use, he just renamed it to tally for some reason."

Correct, and I said so in the article I wrote about it. ;-) The purpose of the article was to do what so few have actually done... explain how and why it works to people who didn't know. For those that still don't know (maybe even if you're using it), here's the link to the article.

http://www.sqlservercentral.com/articles/T-SQL/62867/
Jul 17 '11 at 07:40 AM Jeff Moden
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x977
x33
x4
x3

asked: Jun 30 '11 at 07:59 PM

Seen: 2406 times

Last Updated: Jun 30 '11 at 07:59 PM