This post will help to understand QUALIFY ROW_NUMBER OVER (PARTITION BY ...) phrase with simple example.
We take an example like below:
We take an example like below:
INDUST_CODE
|
INDUSTRY_NAME
|
ACTIVATE_DATE
|
1011
|
GOOGLE
|
13-12-2009
|
1013
|
YOUTUBE
|
01-09-2008
|
1011
|
GOOGLE
|
15-12-2009
|
1012
|
MICROSOFT
|
14-10-1999
|
1011
|
GOOGLE
|
11-12-2009
|
The
data listed above have more than one row with Industry named ‘Google’. Business needs the industry name without duplication, they
don’t want one industry name more than once. They like to see the data like
below:
INDUST_CODE
|
INDUSTRY_NAME
|
|
1011
|
GOOGLE
|
|
1012
|
MICROSOFT
|
|
1013
|
YOUTUBE
|
How
do you do that?
You
are expert in SQL, so wrote the script like below which works fine:
select INDUST_CODE,INDUSTRY_NAME
from INDUSTRY
QUALIFY
row_number() Over
(partition by
INDUST_CODE order
by ACTIVATE_DATE
desc)=1
If you want to know how the code worked? Then please find the explanation step by step:
1) Group by (Partition by)with Industry Code (INDUST_CODE)
2) Order by with ACTIVATE_DATE which means latest date will be used to find the first row
3) Then uses ‘QUALIFY’ to select the row.
Lets explain the syntax :
Lets explain the syntax :
Make the translation easy:
qualify row_number() over (Partition and order by brace) = 1
which means,
Selecting first row from the result set after applying group by and order by function.