x

Distinct sums of case statements

This has stumped me. Not sure how to get a distinct sum inside a case statement.

Here's a sample of my data in the two fields (I'm actually doing this for a year but just 2 months here for easier reading):

empid  date
132 7/1/2012
143 7/1/2012
154 7/2/2012
154 7/3/2012
123 8/1/2012
197 8/1/2012
197 8/2/2012
187 8/3/2012

I need a count of each month in a column but each emp needs to count as one, like this:

Jul11 Aug11
3 3

but instead I get 4 and 4. Here is my code

SELECT     sum(case when month(hist_date)='07' and year(hist_date) = '2011' then 1 else 0 end) as Jul11, 
           sum(case when month(hist_date)='08' and year(hist_date) = '2011' then 1 else 0 end) as Aug11
FROM         empjobst

I've tried putting the word distinct in between Select and sum but that made it a 1 and 1.

Any help would be much appreciated.

Cyndi

more ▼

asked Nov 20, 2012 at 10:08 PM in Default

clynch gravatar image

clynch
50 1 1 1

(comments are locked)
10|1200 characters needed characters left

3 answers: sort voted first
Try adding a group by. Without a dataset in front of me I would try a group by month(hist_date)
more ▼

answered Nov 20, 2012 at 10:13 PM

Tim gravatar image

Tim
36.4k 35 41 139

(comments are locked)
10|1200 characters needed characters left

This is a two-step problem that you can solve with ROW_NUMBER.

Here's a sample solution:

declare @table table
(
 empid int,
 hist_date date
);

insert into @table(empid, hist_date) values
(132, '7/1/2012'),
(143, '7/1/2012'),
(154, '7/2/2012'),
(154, '7/3/2012'),
(123, '8/1/2012'),
(197, '8/1/2012'),
(197, '8/2/2012'),
(187, '8/3/2012'),
(132, '8/5/2012');

with rownums as
(
 select
 empid,
 hist_date,
 row_number() over (partition by empid, month(hist_date), year(hist_date) order by hist_date) as rownum
 from
 @table
)
select
 sum(case when month(hist_date)='07' and year(hist_date) = '2012' then 1 else 0 end) as Jul12, 
 sum(case when month(hist_date)='08' and year(hist_date) = '2012' then 1 else 0 end) as Aug12
from
 rownums
where
 rownum = 1;

The first part of the query is a common table expression which gets the row number for each employee ID based on the month and year in question. This is what you can use to filter out "duplicate" rows in the bottom expression: only get the ones where rownum = 1. The end results are what you are expecting, and it results in one scan of the table.

One thing you might think about doing is flipping the query around, though, especially if you're presenting it in a presentation layer like an SSRS matrix or Excel table. The query as you have it now works, but you would need to update each month (or write a somewhat-nasty dynamic SQL statement to display each month). Instead, you could use the natural solution:

with rownums as
(
 select
 empid,
 hist_date,
 month(hist_date) as base_month,
 year(hist_date) as base_year,
 row_number() over (partition by empid, month(hist_date), year(hist_date) order by hist_date) as rownum
 from
 @table
)
select
 base_month,
 base_year,
 count(*) as results
from
 rownums
where
 rownum = 1
group by
 base_month,
 base_year;

Then, put it in an SSRS matrix and you'll automatically get the month and year as columns. Or if it's going out to Excel, you can do the same in a pivot table.

EDIT - Fixed a bug in the partitioning. Now a person can show up in multiple months.
more ▼

answered Nov 20, 2012 at 10:23 PM

Kevin Feasel gravatar image

Kevin Feasel
6.1k 3 5 11

I tried your first one with a change of adding my table data into the @table instead of by hand it worked perfect.

Thanks for the quick answer, Cyndi
Nov 20, 2012 at 10:43 PM clynch

Huuummm...need help again

I have a test query to ensure I am getting the correct numbers (empid, count(histdate) - then look at the number of rows there are for that month - run one month at a time). The correct amounts are:

July=14 Aug=10 Sep=8 Oct=8 Nov=9 Dec=8

When I use your first query for July only I get 14. When I run it for Aug only I get 10 and so on. When I broaden my dates from 7/1/11 to 12/31/11, my results are different. I get instead:

July=14 Aug=9 Sep=6 Oct=8 Nov=8 Dec=5

What is causing this discrepency?
Nov 21, 2012 at 09:56 PM clynch
I think it has to do with the datetime...I have sql server 2005 so it would not accept date, I had to change it to datetime....I changed my date range to hist_date BETWEEN CONVERT(DATETIME, '2011-12-01 00:00:00', 102) AND CONVERT(DATETIME, '2011-12-31 23:59:59', 102)) hoping it was the time throwing it all off...but that did not change the results...
Nov 21, 2012 at 10:01 PM clynch
Something the test data didn't have: an employee ID in multiple months. The original partition was by employee ID and ordering by month and year. Because you have employee IDs showing up in different months, month and year need to be part of the partition statement and not just the ordering.
Nov 21, 2012 at 10:07 PM Kevin Feasel
I think I may have found it but not sure...when it assigns the row numbers it is not starting back over with row number 1 at the next month for a person... so how do I adjust the query to do this?
Nov 21, 2012 at 10:08 PM clynch
(comments are locked)
10|1200 characters needed characters left

If you don't need the separate dates in columns, you could write the query as:

DECLARE @Emp TABLE
(
 EmpID INT,
 hist_date DATETIME
)

INSERT @Emp(EmpID, hist_date)
VALUES (132, '7/1/2012'),
 (143, '7/1/2012'),
 (154, '7/2/2012'),
 (154, '7/3/2012'),
 (123, '8/1/2012'),
 (197, '8/1/2012'),
 (197, '8/2/2012'),
 (187, '8/3/2012'),
 (123, '9/1/2012'),
 (197, '9/1/2012'),
 (197, '9/2/2012'),
 (187, '9/3/2012');


SELECT *
FROM
(SELECT DISTINCT DATENAME(MONTH, hist_date) + CONVERT(VARCHAR(10), DATEPART(YEAR, hist_date)) AS Dt, EmpID
FROM @Emp) AS Src
PIVOT
(
COUNT(EmpID)
FOR Dt IN ([July2012], [August2012], [September2012])
) AS pvt
more ▼

answered Nov 21, 2012 at 10:04 PM

Beandon10 gravatar image

Beandon10
136 3

I need to go across not down...
Nov 21, 2012 at 10:13 PM clynch
I've updated the final select so the records go across, but this query will then need to be updated to accommodate any changes in months/years.
Nov 21, 2012 at 10:26 PM Beandon10
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x12

asked: Nov 20, 2012 at 10:08 PM

Seen: 1076 times

Last Updated: Nov 23, 2012 at 01:44 AM