Saturday, August 24, 2019

Table variable using in SSIS packages

When you need it and how to use it?

As you may know table variable is similar to temp table in Microsoft SQL, however; table variable generally uses fewer resources. I am going to explain when table variable could be useful at the time of SSIS package development.

Scenario 1: When you need to store data like array and need to use it later in your SSIS package

For example, below scenario I only had to insert/update data where the periods exist in the source data and then making next operation depend on the value exist or not.

DECLARE @periodACD TABLE (periodACD char(2))



FROM [ETL].[SourceTable];

SELECT CASE WHEN periodACD='4' THEN 'Insert or update for April'  END /*You can use any transformation or calculation at THEN clause.*/
FROM @periodACD WHERE periodACD='4'

Scenario 2: Keep insert or updated row count for logging purposes while you use SQL Merge

For example, you are using merge to update the dimension and want to catch the number of rows you insert, you can use table variable to store the data for merge.

DECLARE @RowCount TABLE (InsertedID INT) --Declare Table variable

MERGE  [Dim].[Table] AS Target
 USING    [Staging].[Data] as Source
    ON Target.SourceID = Source.SourceID --BusinessKey

AND (ISNULL(Source.SourceName,'')<>ISNULL(target.[BrandDesc],'')
OR ISNULL(Source.SourceDate,0)<>ISNULL(Target.SourceDate,0)
        OR ISNULL(Source.SourceBit,'')<>ISNULL(Target.SourceBit,'')
   UPDATE SET Target.SourceName = Source.SourceName,
  Target.SourceDate = Source.SourceDate,
  Target.SourceBit = Source.SourceBit

OUTPUT inserted.SourceID 'InsertedID' INTO @RowCount; --output IDs for rowcount

FROM @RowCount

Saturday, July 20, 2019

New to Git and Github? How Azure DevOps connect with Git? Must know.

Those who have been using TFS for a long and now you know 'Git' is a 'thing' that you can't avoid.  When you start with Git, please don’t expect it will be exact same like TFS. When I have started with TFS 10 years back, it was super easy to start with, however for Git, start with a mindset that, command line is required.

As you know Git, other name come along which is 'GitHub' and then confusion grows. 
Many people think Git and GitHub are the same, But it’s not. Git is the version-Control software and GitHub is your storage with sharing capability. If you don’t want to share your code with anyone but still having versions maintenance you don’t need GitHub (which is unlikely though). Moreover, alternative to GitHub you can use Azure DevOps.

Those of you used to work with TFS, maybe already got to know about Azure Devops which give you option to either use Git or TFS as your version-control tool. It means in this scenario, you don’t need GitHub. Below picture depicts Git relate with Azure DevOps and GitHub.

Fig 1: Git, GitHub and Azure Devops

Let's start with beginner steps to work with Git.

Step 1: Install Git at your PC.
Follow the link and choose your right platform to download the file and then install it.

Step 2: Open the Git command window, which look like below

Fig 2: Installed Git software

Now, you would like to create a solution or file which you would like to add to the Git and would like to work with other colleagues.

Adding file to Git means, the folder where you keep the files/code should be recognized by Git. To achieve that, you need to do following:

For example, your code is located under C\Code and you want to make this as Git repository.

After opening the Git Bash, write below command:

1 cd C:\                                                                                                            

2 cd Code
3 git init

Now the folder is ready to use as Git repository

'Initialized empty local Git repository in C:/Code/.git/'

And you will a find a git folder created in the folder C:/Code/

Now you can copy all your files under the folder C:/Code and let's add a file to the local Git repository.

4. git add Testcode.txt

and Commit the code in local repository

5.  git commit -m 'added test file'

Now your solution or all code files is under git repository, If you work on your own and wanted to have version control so you can always go back to your earlier code, which is ready at this stage.

Step 3:
Let's get to the next step where you would like to push this code to either Github or Azure Devops. In my case I have uploaded the code to Azure DevOps.

How do you do this?

Login to the Azure devops :

Create repository , choose Type as 'Git'

Fig 3: Create repo under Azure Devops
After finishing the repository creation, you will find like below window, where you need to copy the command line code.

Now go back to your  'Git Bash' command window, and write two lines of code:

6. git remote add origin https://d*****
7. git push -u origin –all

Now if you login to the Azure DevOps, you will see the local file you have committed earlier is uploaded to the Azure DevOps.

Saturday, June 29, 2019

Creating script along with Data for database objects- Why require? and Easy way to accomplish.

Sometimes you may have a table which have populated by business or tech to support the solution you develop. This type of table can be called as Master data table which is not derived from any source system. So whenever you need to move table DDL/script to higher environment (from DEV to UAT or PROD) you need to move not only DDL but also data for that table.

Now let's find out How do you do it?

There are a few ways to do it but my preferable way to do it by using SQL Task, we mostly use this for generating scripts; however, it can also take the data.

Step 1: Select the database and find generate script option
Fig 1: Finding out generate scripts for the database
Step 2: Follow the wizard
Fig 2: Go thought the wizard
Step 3: Choose the database object, the table you would like to move to the higher environment. And click Next.
Fig 3: Choose your database object
Step 4: Before you generate the script, you also need to have data,so click at the 'Advanced' button.
Fig 4: Advanced scripting options
Step 5: Change 'Type of data to script' to Schema and data
Fig 5: get schema and data together

You are almost done, after completing the addition few steps in the wizard script will be ready with DDL and data. Now you can move the saved file to the upper environment and run the script, you will find table and data together.

Sunday, May 19, 2019

When and why unpivot require for your Data Warehouse(DW) project?

You may involve with a Data warehouse (DW) project where you find source system got many repeated columns in a table which can be reduced to a few columns table. It's not only reducing the number of columns rather you would like to find out attributes those can be aggregated for the reporting purpose, in DW world, which will be your fact attributes. Let's start with an example:

In Medical data, Hospital can have different type of emergency bed type e.g. below table holds bed availability data for different type of bed.

Fig 1: Hospital Bed data

Let's make a table with above data, use below SQL query to populate the data.

create table #HospitalBed (TranID int,SubmissionDate datetime, GenAcuteCom int, 
ShortstayCrisisComp int, MoodAnxietyComp int, EatDisorderComp int)

 insert into #HospitalBed
 select 101,getutcdate(),10,3,9,4
 select 102,getutcdate(),5,6,2,8
select 104,getutcdate(),6,3,9,1
 select 105,getutcdate(),5,6,2,8

If you would like to get the data under, SELECT * FROM #HospitalBed

Fig 2: Data from Table

Now to fit this data for data warehouse model, we can clearly see columns need to turned into rows. These different type of bed should go under one column called 'BedType' and values for those columns  under another column can be called 'complementbed'. Now how do we do it? We have function in SQL called 'unpivot', we are going to use the function to populate expected data set.

select u.tranID, u.Bedtype, u.complementbed
from #HospitalBed s
  for Bedtype in (GenAcuteCom, ShortstayCrisisComp, MoodAnxietyComp,EatDisorderComp)
) u;

Fig 3: After unpivot

As you can see , after applying UNPIVOT you get more rows but column got reduced.

But what is the benefit? This will help to make dimensional model easy, where all these BedType can be kept in a dimension table and fact table can have the key of BedType along the with ComplementBed value. Below Fig (4) shows how dimensional model can be established for the earlier mentioned source table (Fig 1) eventually.

Fig 4: Sample Dimension and Fact table 

And consider the source table we started have 4 different type of bed, however, it could be around 20 columns and more. Dimensional model can be super complicated and even may not able to build proper dimensional model without transposing the columns into rows for the above scenario,

Sunday, April 21, 2019

Misleading popup message in SSIS solution can give you trouble!!

SQL Server Data Tools (2017) is smart and trying to help you many ways. But sometime performance improvement popup message can lead you to some other issue.

I would like to share my experience recently, I was working with SSIS packages and at the end of the day commited and pushed the code to github successfully. But very next day when I opened the solution, surprisingly got incompatibility error (Fig 1), I was not able to open the project.

Fig 1: Solution incompatibility error.
I was really surprised, and trying to remember what I did wrong which may causes the issue. The project incompatible (Fig 1) generally happens when Visual studio version miss match. I did not do any update !! My mind was stuck for a while then recall that I clicked a pop up message in SSDT last day even can't remember what I responded. 

The pop up message was the below one (Fig 2). And I surely clicked 'Disable this extension'.

Fig 2: Misleading pop up
Pop up message in SSDT also says 'Disabling it may improve your experience' (Fig 2). I did not pay close attention and trusted the part where saying it may improve the performance which is misleading. And this causes the solution turned into incompatible.

How to solve this issue?

There are two ways you can solve:

1) Repair the software or 2) Enable the extension

1) Repair the software

You can execute the SQL server data tools (SSDT-Setup-ENU.exe) and click repair as like below figure 3, complete a few steps in wizard. When repair is done then open the solution, now compatibility error will go away.
Fig 3: Repair SSDT

2) Enable the extension
Please open SQL Server Data Tool and go to Tools->Extension and updates. You will find 'Micoroft Integration Services Project' extension is Disabled (Fig 4), so you need to enable the extension,

Fig 4: Extension and update

Saturday, March 16, 2019

Unions in OLEDB Source can kill performance in SSIS, How to handle this?

If you write SQL with many UNION in OLEDB source transformation then you will find the SSIS package is running slow.

For example, you may require many SQL unions to populate the data set as like below:
SELECT x,y,z,z1,z2

SELECT x,y,z,z1,z2

SELECT x,y,z,z1,z2

SELECT x,y,z,z1,z2

SELECT x,y,z,z1,z2

Fig 1: OLEDB source 

Instead of using the above SQL in OLEDB source, you can use Union All transformation which is much faster in compare with using the SQL code in OLEDB source.

Fig 2: Union All Transformation

Saturday, February 16, 2019

SQL Analytical Function: Teradata Vs. MS SQL syntax

A few years back, I posted about Teradata analytical function where I described how this works and why we need to use analytical function :

However, this post includes SQL syntax for MS SQL and how it's different from Teradata.

The SQL Code I have written in Teradata in my earlier post to find latest version of the data by using below:
row_number() Over (partition by INDUST_CODE order by ACTIVATE_DATE desc)=1

Do you think, the above syntax will work in MS SQL? Not really!! MS SQL doesn't have QUALIFY function. You need to write the syntax a little different way and  a few extra lines required to achieve same results.

Let's translate the above syntax for MS SQL:

) TT

As you see, for MS SQL we had to use sub query and RANK function to achieve the same result we got in Teradata.