question

swarnava2004 avatar image
swarnava2004 asked

How to count distinct values in SQL with trailing space

DECLARE @testtable TABLE (name varchar(10))

insert into @testtable values('+91')

insert into @testtable values('+91 ')

insert into @testtable values('+91 ')

insert into @testtable values('abc')


select distinct name from @testtable , Getting output result as '+91' and 'abc'


I want output as '+91', '+91 ', '+91 ', 'abc'

How to achieve this?


Thanks in advance

countdistinct
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

·
Oleg avatar image
Oleg answered

To make the short story long, there is a reason why the trailing spaces are ignored. This is done so that the string comparison algorithms are ANSI compliant. The ANSI algorithm pads the strings to compare with trailing spaces until they are of the same length, and then compares the characters from start of the string, character by character until the difference is found. Suppose you want to compare 2 strings, say, one string which is just one character, letter A, for example, and the other string is the same letter followed by the tab character. To compare, the algorithm will pad the first string with one space so that they are both of the same length (2 characters in this case). Then comparison starts. First letter is the same, then it goes to the second character. The first string now has padded space which is 0x20, and the second one has the tab which is 0x09. Since hex 20 (32 in decimal) is greater than hex 9 (9 in decimal), the comparison will result in letter A greater than letter A plus tab. This might appear somewhat counterintuitive, but it makes perfect sense.

To make the long story short, if you replace the space with something unusual, the character which is never present in text, for example, the bell character char(7) for the sake of grouping, and then replace that character back with what it was (the space in this case) then you get the results you need. By the way, please DO NOT use the keywords as the column names. name is a keyword, there is no reason whatsoever to use it as the column name in your table variable. Here is the script:


select
    replace([name], char(7), char(32)) [name]
    from (
        select 
            replace([name], char(32), char(7)) [name] 
            from @testtable group by replace([name], char(32), char(7))
    ) t;

Hope this helps,

Oleg

1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

swarnava2004 avatar image swarnava2004 commented ·
@Oleg : thank you very much ...
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.