It's all about Data

Sunday, March 26, 2017

How to fix your corrupted SSAS database?

If your SSAS database is corrupted, then you can either 1) restore the database or 2) delete the existing SSAS database and then process the cube to load the fresh data.

I would like to share the second one: Delete the existing database and then process the SSAS cube to load the fresh data.

For example, my SSAS database 'HelloWorld' got corrupted and now you would like to fix the SSAS database, so following steps should be taken to fix it.

Step 1: Delete the existing database

Fig: Find your corrupted database

Hit the delete button

Fig 2: Delete the database

Choose any one of the options that you prefer:

Fig 3:

Since database is deleted, now we can open SSAS package and hit the process button to deploy and process it as like below:

Fig 4: Deploy the SSAS Cube

You may unfortunate:

However, if we may unfortunate sometime when you will find error like :

Fig 5: Deployment error

Step 2: Delete the .xml file from data directory

To fix the error we need to delete .xml file from the data directoy, Depend on your Windows version (32/64 bit), you can go to data dictionary and delete the xml file

C:\Program Files\Microsoft SQL Server\MSAS11.TABULAR\OLAP\Data

The XML file format will look like: <CubeID>.<VersionNum>.db.xml, e.g. this case: HelloWorld 1.0.db.xml . You can also delete the folder as like same name: HelloWorld 1.0.db

Fig 6: Deletete the .xml file

Step 3: Restart the SSAS

Restart the SSAS

Fig 7: Restart the SSAS

Step 4: Run the SSAS package

You are all set to process the cube. So open your SQL Server Data Tools and SSAS solution, process the cube and you will be able to load fresh data structure and the data, it means your databse is back again and functional as like before.

Thursday, February 9, 2017

Measures(Calculation) in SSAS Tabular

While you work with SSAS Tabular or OLAP you have option to create measures. As found calculation in SSAS OLAP is easier than Tabular. However, I have recently made SSAS Tabular cube and going to share how to create Measures/Calculation in SSAS Tabular model.

SSAS tabular project look like below, I have marked the area for the calculation/measures.

Fig 1: Measures/Calculation area.

How do we make Measures?

Select any cell from above fig 1, and then as like excel you have option to put the calculation or formula at the formula bar as like below fig 2 (yellow shaded).

Fig 2: Calculation bar

Here, I am going to calculate how many product is sold in Canada region, So I have calculated number of rows where productno are not empty and filter the salesTerritory with Canada.

The measure name is: Prod_Cnt_Canada (product count in the canda region)

Prod_Cnt_Canada:=COUNTAX(FILTER(FactInternetSales;[SalesTerritoryKey]=6);[ProductKey])

The above example counts the number of nonblank rows in the column, ProductKey, using the table FactInternetSales that results from filtering the SalesTerritoryKey=6 (means canada).

COUNTAX function is to return the number of rows for non empty rows. And FILTER function is to do the filter.

Result after the calculation is show in the below figure: 3

Fig 3: Measure after the calcuation

Wednesday, January 18, 2017

MERGE in SQL is a good friend for Data Warhouse

If your need is SCD2 type table, where you would like to keep the history. Then SQL merge could be your good friend. You could produce history with different way but SQL merge is clean; it's one placeholder for inserting and updating the data.

Lets start with very simple example, in data warehouse many of your table must need to hold history where you take each full load at the first load and then you start loading delta each day. In data warehouse you have many layers but in my example, it's source and target table is taken.

Step 1: Creating Source Table
First make a source table:
Create table Source_Table (id int, ename varchar(50), UpdateDate DateTime);

Step 2: Creating Target table
We need to make Target table to load the data

Create table Target_Table (id int, ename varchar(50),EFF_DT date, END_DT date)

Step 3: Merge Actions in general
Merge have three different actions, those are INSERT, UPDATE and DELETE.

Below SQL query will show the current inserted row to the table, you don't need to write select * from syntax.

insert into Source_Table
OUTPUT INSERTED.*
values (1,'Diponkar',getdate()), (2,'San',getdate()), (3,'Derek',getdate()), (4,'Paul',GETDATE())

Fig 1: OUTPUT from the Insert

Step 4: Data warehouse Full load/Init load

Since we don't have any data in the target table. So, if we execute below merge query it will only affect the insert part of the merge.

Merge target_Table T
Using Source_Table S
on (S.id=T.id)
When NOT Matched
Then insert (id,ename,EFF_DT,END_DT)
values(S.id,S.ename,GETDATE(),'2999-12-31')
When Matched
Then Update SET T.ename=s.ename
OUTPUT $action, DELETED.*, INSERTED.*;

And we are able to see what is ACTION of this merge.

Fig 2: Merge Action

Above figure 2, clearly shown that only insert action is taken place by the merge,

Step 5: Delta load
Source data is daily delta data, so we can delete the old data from source table and insert two new rows. One with new key and other with existing key.

delete from Source_Table

insert into Source_Table (updated data with same business key)
values(1,'Dipu2',GETDATE())

insert into Source_Table (completely new row)
values(5,'Drina',GETDATE())

Data in the source table is like below:

Fig 3: Delta load

Now, lets run the below merge query and catch the action:

Merge target_Table T
Using Source_Table S
on (S.id=T.id)
When NOT Matched
Then insert (id,ename,EFF_DT,END_DT)
values(S.id,S.ename,GETDATE(),'2999-12-31')
When Matched
Then Update SET T.ename=s.ename
OUTPUT $action, DELETED.*, INSERTED.*;

There should be one insert and one update, lets have a look at the output:

Fig 4:Output after merge applied

Step 6: Data in the target table
As we see from the below fig: 5, merge query updated the target table accordingly.

Fig 5: Insert and Update shown

Friday, December 30, 2016

How to Fix SSAS database ID and Name miss match?

You can have a situation where your SSAS database ID and Name can be miss matched. If you try to manually change the Database Name but ID still the same; this could happen when you move solution from development to Test and then change the name but still ID did not change.

What could possibly happen if the ID and Name are not same:

1) If you deploy SSAS package via Visual Studio it will look like processing but actually you will not find updated processed cube. (you can look at the last processed property of the database)
2) If you have job agent and then try to run the job it will show like it's working but actually there will be no update of the cube. You can look at the last processed property timestamp.

So, it means your SSAS package will not be updated any more.

How to check Name and ID are miss matched?

After the deployment of your SSAS Package your SSAS database look like below; where you have Tabular and OLAP cube.

Fig 1: Analysis Server Database

If you look at the tabular database closely and see the property where name and Id are miss matched.

Fig 2: Name and Id miss match for the tabular cube

What is the fix?

At first, you need to create script of the database as like below, which will generate .XMLA file:

Fig 3: Create script of the Tabular database

Generated .xmla file will look like below where you can find ID and Name.

Secondly, You need to change the ID and Name from .xmla script and press 'F5' to execute the change. Now if you look into the databse you will find the ID and Name are completely matched.

Fig 4: ID and Name is updated

Now you will be able to deploy your package with new database name but with same metadata.

Fig 5: Deploy package with new database Name

You may not be happy at this point, since you wanted to have your original name back, well, it's simple just open SQL Server Data Tools and from the solution property change your Database name as like above screenshot (Fig 5), you can put your original Database name and then deploy; you are all set now. You may want to delete other database that you created, no worries; go ahead and delete the old one.

Monday, December 5, 2016

How to Insert data into a table with Identity column

In general, you don't insert data manually in a table which has an identity column. However, in obvious cases if you need to insert data into the table manually then writing general inset Query will failed with error:

For example, you have a table (DDL) like below:

Create table PurchaseOrder

(orderId int identity(1,1),

item nvarchar(100))

If you would like to insert data into the table, standard syntax for inserting data is like below:

Insert into PurchaseOrder(orderId,item)

values (101,'strawberry')

However, above query will failed with below error message:

Fig 1: Eror Inserting data into the table with Identiy column

To insert the data into a table with identity column you should not put any value for the identity column. e.g.

Insert into PurchaseOrder (orderId,item)
values ('strawberry')

And then you will see the result set like below:

Fig 2. Output of the above insert command

However, if you would like to explicity put value for the identity column. e.g. you would like to add a negative number as orderid to make an item invalid. (sometimes you may need for the data warehouse dimension and fact tables). SQL for inserting data with your own choice:

SET IDENTITY_INSERT PurchaseOrder ON
insert into PurchaseOrder(orderId,item)
values (-1,'An Island')
SET IDENTITY_INSERT PurchaseOrder OFF

Now the result will look like below:

Fig 3: Output after the above SQL command

Thursday, November 24, 2016

Is your Pre-Production database environment aligned with Production?

As database developer or BI developer you may get below questions many times:

1) Is our Pre-Production database has exact match with production database?

2) How do we ensure Test environment has good data quality?

3) Is our Test environment has big deviation than production system? and so on.

To answer those questions you have different ways of investigating the database tables. However, one most obvious and first step could be counting rows of two different databse environemnts. You may think it's going to take a lot hell of time to count all the tables's row count in two different environments and list them manually.

well, you don't need to do this task manually. There is a way where you can get row counts of all the tables in a database with small piece of code.

Please run the below code to find number of rows for each table in the chosen database.

USE AdventureWorks2012
GO
SELECT @@SERVERNAME AS Server_Name,
DB_NAME() AS Current_Database,
SCHEMA_NAME(A.schema_id) + '.' + A.Name as Table_Name,
SUM(B.rows) AS Row_Count
FROM sys.objects A
INNER JOIN sys.partitions B ON A.object_id = B.object_id

GROUP BY A.schema_id, A.Name

I have used AdventureWorks2012 databse and result set looks like below:

Fig 1: result set of the query

At the same way you can find out number of rows of the tables across different database environments. Now you can place the result side by side in excel and easily compare the row numbers and number of tables between two different database environments. If you wish to make this automated then you can put this SQL code in a SSIS package and load the data after comparison.

Friday, November 4, 2016

Fix the SSIS Errror: AcquireConnection method call to the connection manager (error code 0xC0202009)

I was doing ETL to get data from staging area to Data warehouse and while runnig the data flow task it's shown the error : The AcquireConnection method call to the connection manager [DatabseName] failed with error code 0xC0202009.

Fig 1: Error code 0xC0202009

Interestingly, the same SSIS package is running without error in Test and Production. After a little bit of interenet search as found, a few people mentioned about changing the Debug options from 64bit to 32Bit, It means you set Run64BitRuntime=False, as like below:

Fig 2: Change 64bitRuntime from debug options

However, it did not work for me, then tried a attribute called 'Delay validation'. You need to click the Data Flow Task's property and change the 'Delay Validation' from TRUE to FALSE.

Fig 3: Change Delay Validation Property

After changing the property you should able to run the package without error,

Pages