You may involve with a Data warehouse (DW) project where you find source system got many repeated columns in a table which can be reduced to a few columns table. It's not only reducing the number of columns rather you would like to find out attributes those can be aggregated for the reporting purpose, in DW world, which will be your fact attributes. Let's start with an example:
create table #HospitalBed (TranID int,SubmissionDate datetime, GenAcuteCom int,
ShortstayCrisisComp int, MoodAnxietyComp int, EatDisorderComp int)
insert into #HospitalBed
select 101,getutcdate(),10,3,9,4
union
select 102,getutcdate(),5,6,2,8
union
select 104,getutcdate(),6,3,9,1
union
select 105,getutcdate(),5,6,2,8
If you would like to get the data under, SELECT * FROM #HospitalBed
Now to fit this data for data warehouse model, we can clearly see columns need to turned into rows. These different type of bed should go under one column called 'BedType' and values for those columns under another column can be called 'complementbed'. Now how do we do it? We have function in SQL called 'unpivot', we are going to use the function to populate expected data set.
select u.tranID, u.Bedtype, u.complementbed
from #HospitalBed s
unpivot
(
complementbed
for Bedtype in (GenAcuteCom, ShortstayCrisisComp, MoodAnxietyComp,EatDisorderComp)
) u;
As you can see , after applying UNPIVOT you get more rows but column got reduced.
But what is the benefit? This will help to make dimensional model easy, where all these BedType can be kept in a dimension table and fact table can have the key of BedType along the with ComplementBed value. Below Fig (4) shows how dimensional model can be established for the earlier mentioned source table (Fig 1) eventually.
And consider the source table we started have 4 different type of bed, however, it could be around 20 columns and more. Dimensional model can be super complicated and even may not able to build proper dimensional model without transposing the columns into rows for the above scenario,
In Medical data, Hospital can have different type of emergency bed type e.g. below table holds bed availability data for different type of bed.
Fig 1: Hospital Bed data |
Let's make a table with above data, use below SQL query to populate the data.
ShortstayCrisisComp int, MoodAnxietyComp int, EatDisorderComp int)
insert into #HospitalBed
select 101,getutcdate(),10,3,9,4
union
select 102,getutcdate(),5,6,2,8
union
select 104,getutcdate(),6,3,9,1
union
select 105,getutcdate(),5,6,2,8
If you would like to get the data under, SELECT * FROM #HospitalBed
Fig 2: Data from Table |
Now to fit this data for data warehouse model, we can clearly see columns need to turned into rows. These different type of bed should go under one column called 'BedType' and values for those columns under another column can be called 'complementbed'. Now how do we do it? We have function in SQL called 'unpivot', we are going to use the function to populate expected data set.
select u.tranID, u.Bedtype, u.complementbed
from #HospitalBed s
unpivot
(
complementbed
for Bedtype in (GenAcuteCom, ShortstayCrisisComp, MoodAnxietyComp,EatDisorderComp)
) u;
Fig 3: After unpivot |
As you can see , after applying UNPIVOT you get more rows but column got reduced.
But what is the benefit? This will help to make dimensional model easy, where all these BedType can be kept in a dimension table and fact table can have the key of BedType along the with ComplementBed value. Below Fig (4) shows how dimensional model can be established for the earlier mentioned source table (Fig 1) eventually.
Fig 4: Sample Dimension and Fact table |
And consider the source table we started have 4 different type of bed, however, it could be around 20 columns and more. Dimensional model can be super complicated and even may not able to build proper dimensional model without transposing the columns into rows for the above scenario,
No comments:
Post a Comment