question

David 2 1 avatar image
David 2 1 asked

Grouping Count Values Once Across Multiple Columns

Hi there, Happy new year everyone. I am trying to get a count by ID where the ID is only counted in the result set once. For example, in the test data below the value 'Yes' appears a total of 7 times however if 'Yes' appears across an ID multiple times it is only counted once. e.g. The row with ID 4 has two 'Yes' values however it should only be counted once: CREATE TABLE Test( ID INT, Col1 VARCHAR(10), Col2 VARCHAR(10), Col3 VARCHAR(10)) INSERT INTO Test SELECT 1,'Yes','No','Maybe' UNION ALL SELECT 2,'No','No','Yes' UNION ALL SELECT 3,'Yes','Maybe','No' UNION ALL SELECT 4,'Yes','Yes','Maybe' UNION ALL SELECT 5,'Maybe','Maybe','Yes' UNION ALL SELECT 6,'Yes','No','No' I am looking to create the following result sets. Firstly, the grouped by result totals: Value Counts Maybe 4 No 4 Yes 6 Secondly, I'd like to list the ID's that make up these counts. In otherwords, list all ID's that have a 'Maybe' value, and all the ID's that have a 'No' value, and all the ID's that have a 'Yes' value. e.g. If an ID has a 'Yes', 'No', 'Maybe' then that ID will appear in each of the lists 'Yes', 'No' and 'Maybe'. Hopefully I've tried to explain this clearly. If not please let me know. TIA
tsqlpivotgroup-bycount
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Oleg avatar image
Oleg answered
I believe that in this case the set should be transformed ***before*** counting the number of occurrences of Yes, No, Maybe values, ***not after***. This will allow to avoid using the case statement to do this count. Also, I am not sure about the meaning of " list the ID's that make up these counts" in question, but let's assume that it means the comma-delimited list of ID values which make up the counts. Here is the complete script which produces desired output: ;with transformed as ( select ID, Result from Test src unpivot (Result for ColumnName in ([Col1], [Col2], [Col3])) upt group by ID, Result ) select t.Result [Value], count(1) Counts, max(stuff(ID.List, 1, 2, '')) ListOfID from transformed t cross apply ( select ', ' + cast(ID as varchar) from transformed where Result = t.Result for xml path('') ) ID(List) group by Result; Results: Value Counts ListOfID ---------- ----------- ----------------- Maybe 4 1, 3, 4, 5 No 4 1, 2, 3, 6 Yes 6 1, 2, 3, 4, 5, 6 P.S. It might be beneficial to pre-group the transformed set before cross apply. Otherwise, the engine is forced to waste its precious time calling (or perhaps re-using) the results of the same call with the same "parameters". To see what I mean, remove the group by from the bottom, count from select list and use just the ID.List rather than its max to see that the results of the ID.List are repetitive for each of the Yes, No, Maybe rows. Pre-grouping will make the query look slightly more complicated, but performance might be better. Please check both versions of the script, results are the same but performance might differ. ;with transformed as ( select ID, Result from Test src unpivot (Result for ColumnName in ([Col1], [Col2], [Col3])) upt group by ID, Result ), preGrouped as (select Result, count(1) Counts from transformed group by Result) select t.Result [Value], t.Counts, stuff(ID.List, 1, 2, '') ListOfID from preGrouped t cross apply ( select ', ' + cast(ID as varchar) from transformed where Result = t.Result for xml path('') ) ID(List); Hope this helps. Oleg
2 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

@Oleg happy new year and a big thank you very much for your answer. Apologies for the delay in replying, it's been a busy 2018. As I mentioned to Kev above, due to the number of columns that I'll need to work with your answer more suits perfectly with my report. As always, thanks for answering. Best.
0 Likes 0 ·
@Oleg just a quick follow up as I'm sitting at home digesting your answers further. So far I've not been able to test the performance difference (I'm currently being pulled between different projects) however I will do this when I get the chance and provide some comparisons. I understand that grouping or aggregation could be quite resource intensive especially across multiple columns so thanks for highlighting this.
0 Likes 0 ·
tzvikl avatar image
tzvikl answered
Here's a quick simple solution for your first question (not sure this is feasible with for a large table): select distinct Col1, count(*) from ( select Col1 from Test union all select Col2 from Test union all select Col3 from Test )t1 group by col1 Regarding your second question - I didn't quite understand what your'e trying to achieve, can you perhaps send an example resultset you looking for?
2 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

@tzvikl The script this answer does not produce desired results. If the question provides sample data and expected results then it is pretty easy to smoke-test the script before posting it, just to make sure that it works as expected. Otherwise, OP ends up with the "solution" which is misleading/incorrect, and does not help. In his question, @David 2 1 has specifically mentioned that "Yes" value appears 7 times but the goal is to count it 6 times because 2 "Yes" values in row 4 are desired to be counted only once, not twice as they are in your script. You appear to be pretty knowledgeable, and could be a great asset to this site, but I noticed on more than one occasion that your scripts are never tested, and thus produce incorrect results more often than not. Please test your scripts before posting if at all possible, you will then see your karma and acceptance rating improved. Thank you.
2 Likes 2 ·
@tzvikl thank you for taking the time to reply. I appreciate all insights learning TSQL however Oleg's answer above gave me the best result set for the number of columns I'll need to report across. Thanks again.
0 Likes 0 ·
Kev Riley avatar image
Kev Riley answered
I would create a a temporary structure that summarizes the data as select ID, cast(case when col1 = 'Yes' then 1 else 0 end + case when col2 = 'Yes' then 1 else 0 end + case when col3 = 'Yes' then 1 else 0 end as bit) as Yes, cast(case when col1 = 'No' then 1 else 0 end + case when col2 = 'No' then 1 else 0 end + case when col3 = 'No' then 1 else 0 end as bit) as [No], cast(case when col1 = 'Maybe' then 1 else 0 end + case when col2 = 'Maybe' then 1 else 0 end + case when col3 = 'Maybe' then 1 else 0 end as bit) as Maybe from test which gives ID Yes No Maybe ----------- ----- ----- ----- 1 1 1 1 2 1 1 0 3 1 1 1 4 1 0 1 5 1 0 1 6 1 1 0 (6 rows affected) this can then be used to answer your other questions. So for the first one, the totals, you simply unpivot an aggregation of this data. If we define the aggregation as with YesNoMaybeMatrix as ( select ID, cast(case when col1 = 'Yes' then 1 else 0 end + case when col2 = 'Yes' then 1 else 0 end + case when col3 = 'Yes' then 1 else 0 end as bit) as Yes, cast(case when col1 = 'No' then 1 else 0 end + case when col2 = 'No' then 1 else 0 end + case when col3 = 'No' then 1 else 0 end as bit) as [No], cast(case when col1 = 'Maybe' then 1 else 0 end + case when col2 = 'Maybe' then 1 else 0 end + case when col3 = 'Maybe' then 1 else 0 end as bit) as Maybe from test ) select sum(case when Yes=1 then 1 else 0 end) as Yes, sum(case when No=1 then 1 else 0 end) as [No], sum(case when Maybe=1 then 1 else 0 end) as Maybe from YesNoMaybeMatrix Yes No Maybe ----------- ----------- ----------- 6 4 4 (1 row affected) and then UNPIVOT it select vals, counts from ( select sum(case when Yes=1 then 1 else 0 end) as Yes, sum(case when No=1 then 1 else 0 end) as [No], sum(case when Maybe=1 then 1 else 0 end) as Maybe from YesNoMaybeMatrix ) a unpivot ( counts for vals in ([Yes],[No],[Maybe]) ) as unpvt Value counts ---------------------- ----------- Yes 6 No 4 Maybe 4 (3 rows affected) To get a list of IDs that make up the counts with YesNoMaybeMatrix as ( select ID, cast(case when col1 = 'Yes' then 1 else 0 end + case when col2 = 'Yes' then 1 else 0 end + case when col3 = 'Yes' then 1 else 0 end as bit) as Yes, cast(case when col1 = 'No' then 1 else 0 end + case when col2 = 'No' then 1 else 0 end + case when col3 = 'No' then 1 else 0 end as bit) as [No], cast(case when col1 = 'Maybe' then 1 else 0 end + case when col2 = 'Maybe' then 1 else 0 end + case when col3 = 'Maybe' then 1 else 0 end as bit) as Maybe from test ) select ID from YesNoMaybeMatrix where Yes = 1 ID ----------- 1 2 3 4 5 6 (6 rows affected) with YesNoMaybeMatrix as ( select ID, cast(case when col1 = 'Yes' then 1 else 0 end + case when col2 = 'Yes' then 1 else 0 end + case when col3 = 'Yes' then 1 else 0 end as bit) as Yes, cast(case when col1 = 'No' then 1 else 0 end + case when col2 = 'No' then 1 else 0 end + case when col3 = 'No' then 1 else 0 end as bit) as [No], cast(case when col1 = 'Maybe' then 1 else 0 end + case when col2 = 'Maybe' then 1 else 0 end + case when col3 = 'Maybe' then 1 else 0 end as bit) as Maybe from test ) select ID from YesNoMaybeMatrix where [No] = 1 ID ----------- 1 2 3 6 (4 rows affected) with YesNoMaybeMatrix as ( select ID, cast(case when col1 = 'Yes' then 1 else 0 end + case when col2 = 'Yes' then 1 else 0 end + case when col3 = 'Yes' then 1 else 0 end as bit) as Yes, cast(case when col1 = 'No' then 1 else 0 end + case when col2 = 'No' then 1 else 0 end + case when col3 = 'No' then 1 else 0 end as bit) as [No], cast(case when col1 = 'Maybe' then 1 else 0 end + case when col2 = 'Maybe' then 1 else 0 end + case when col3 = 'Maybe' then 1 else 0 end as bit) as Maybe from test ) select ID from YesNoMaybeMatrix where Maybe = 1 ID ----------- 1 3 4 5 (4 rows affected)
2 comments
10 |1200 characters needed characters left characters exceeded

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

@Kev Riley ♦♦ thank you very much for your answer Kev. Apologies for the delay in replying, it's been a busy new year so far. You have given me a good insight in your answer however due to the number of columns that I'll need to work with @Oleg answer more suits my problem. Thanks for answering.
0 Likes 0 ·
@David no worries - glad it got you thinking
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.