Wednesday, November 7, 2018

SQL: Pay attention when using Group by and CASE WHEN statement together

When you are working with SQL and having case when statement together with 
group by then please have a close look at the what are you putting in the case
when statement. 

For example, Cases like below is pretty much straight forward:
 
Select cust_key,
       Cust_site_key,
       Max(rating_key) as rating_key
From Customer_rating
Where year=2018
Group by 1,2
 
However, Now let's look at the below code and see if you find any issue with
this code.

Select
event_key,
Event_code,
CASE Event_code
   When ‘start’ then min(f.event_date_key)
   When ‘Closed’ then max(f.event_date_key)
ELSE
   f.event_date_key
END as c 
FROM
Event_fact f
Join
event_dim ed
On
f.event_key=ed.event_key
Group
by event_key,Event_code


After running above code you will get error with "Failed
3504: Selected non-aggregate values must be part of the associated group."

What can go wrong? Well, you may think; event_date_key is using aggregate
function (min and max) so you don't need to use group by with 
event_date_key. However, if you look closely then will find ELSE part of the CASE
statement also using the event_date_key where no aggregation applied. It means 
we need to use group by with event_date_key.

Hence the correct code will be:
Select
event_key,
Event_code,
CASE Event_code
   When ‘start’ then min(f.event_date_key)
   When ‘Closed’ then max(f.event_date_key)
ELSE
   f.event_date_key
END as c 
FROM
Event_fact f
Join
event_dim ed
On
f.event_key=ed.event_key
Group
by event_key,Event_code,event_date_key