Saturday, January 16, 2016

Teardata Error Code 2652: Operation is not allowed and table is being Loaded, how to solve?

I was working with a SAS DI job and using transformation called Teradata Table Loader to load data into the teradata table named 'Table1'.

Fig 1: Job  (dummy) with Teradata Table Loader

After running the job I found error:  Operation is not allowed: 'Table1' is being Loaded. After investigating a little found the reason that when you are loading data into Teradata and using TPT (Teradata Parallel Transporter) utility with Teradata MultiLoad you can find such error. The Teradata Table Loader property in the job looks like below:

Fig 2: Property for transformation Teradata table loader.

 And the error means you will not able to make any operation to this table, it's locked. 

Then I started looking into the solution; so far I found a few solutions:

1) Use empty file and connect with target table and then run it so the error will disappear. Then run with the right source file.
2) Use code to release the lock, syntax is:
       RELEASE MLOAD Table1 ;
       OR
       RELEASE MLOAD Table1 IN APPLY
     
3) Drop and recreate the table.

I have tried  number 1 and 2 solution but did not work successfully.  DBA kill my SQL session but still it's same, did not fix the problem.

However, solution 3 works for me. Then I was able to run the job and load the data into the table.

Saturday, December 19, 2015

How to change CSV format from comma(,) delimited to pipe(|) delimited?

By default, when you save an excel file to .csv, it will be save as comma seperated file.
For example, you have data in excel that looks like:

Fig 1: Original format of data

Lets save this as .csv format:

Fig 2: Save as Comma separated file

After saving the file data will look like below:

Fig 3: Data after converting into CSV 

However, you need to have pipe delimited instead of comma delimited, how do you do that? Please follow the below steps to save file as pipe delimited.

If you are using Windows 10, then you need to go to Settings and then you will find below:


Fig 4: Settings in Windows 10

Now please click "Time and Language" option then you will find below screen, where you need to click "Additional time & regional settings" button.


Fig 5: Related Setting area
Now you will find below option:

Fig 6: More click
If you click on "Change date, time and number formats" under Region section; will find below window (left) and then if you click "Additional settings" then right side window will pop up.

Fig 7: Additional settings
Finally you need to change, List separator from comma(,) delimited to pipe (|) delimited.

Fig 8: List seperator
And now, if you save the same excel as CSV then it will look like below:

So achieved your goal. However, please remember to change your settings back if you want to have your data as comma delimited.



















Thursday, November 12, 2015

MS SQL: How to insert data into a table from Excel

One of my previous post I have written how to insert data from excel/csv file to Teradata table (URL: http://pauldhip.blogspot.dk/2015/02/insert-data-from-excel-to-terdata-table.html).This time, I am going to explain how to do that in MS SQL.

Lets say, we have a table in the database named Customer_dim which have following columns:

CREATE TABLE [dbo].[Customer_dim](
[Customer_id] [nchar](10) NULL,
[Customer_First_Name] [nchar](10) NULL,
[Customer_Last_Name] [nchar](10) NULL,
[Martial_Status] [nchar](10) NULL
) ON [PRIMARY]

The table need to filled by data from excel. To do so, we need to follow the following steps:


Step 1: Save MS excel file into CSV
The excel file should have same number of columns as in the table.

Fig 1: Data in the excel

Now save the excel data as .csv file.
Fig 2: Save as .csv





Step 2: Write code to insert data from CSV

BULK INSERT [dbo].[Customer_dim] FROM 'C:\BI Projects\Blog\Dim_Customer_CSV.csv'
WITH (
    CHECK_CONSTRAINTS,
    DATAFILETYPE='widechar',
    FIELDTERMINATOR='|',
    ROWTERMINATOR='\n',
    KEEPIDENTITY,
    TABLOCK
);

Since, my csv file is saved as pipe delimiter so that above SQL code has FIELDTERMINATOR='|'. You need to change that as per your csv format. As well as,vyou need to change your csv file loacation accordingly.

After running the above SQL code, you should able to see below records are inserted to the table.

Fig 3: Result set after inserting the data



Sunday, October 18, 2015

Teradata:How to check DDL(table structures) are in sync among different environments(development/Test/Production) ?

Title of this post will be explained with an example. For example, If your projects involve populating data from source to target for ten (10) new tables. Then you have to create the DDL for ten tables at least in three different environments(development/Test/Production). Well, you start with development environment, develop the jobs and populate data into target area and when those are working fine then you take DDL from development environment to Test. And if tester or business find that you need to add a new column or data type then you need to change. Moreover, maybe your go-live date is closer so same DDL that put into test also may go for deploy into Production. It means , any change caught by tester or you; now you need to change the SQL in three different environment to keep consistent. Offourse, you follow change order to keep track of that, but as a developer you will prefer to see if all three different environments's table structure looks same.

If you use database as Teradata you have options to test through different environment, you can compare tables, table's column and column's data type. So you can quickly find out if you have any inconsistency through different environments.

Below code will help you to find out the difference between Test and Production environments.

select
upper(trim(tablename)) Table_name,
upper(trim(columnname)) Column_name,
upper(trim(ColumnFormat)) Column_format
from dbc.columns

where upper(trim(tablename)) in ('Custmer', 'Cust_Addreess', 'Geography', ,'Address_X_Geo','Customer_X_Product', 'Customer_X_Sales')
and upper(trim(databasename))='DWH_T'
except
select
upper(trim(tablename)) Table_name,
upper(trim(columnname)) Column_name,
upper(trim(ColumnFormat)) Column_format
from dbc.columns
where upper(trim(tablename)) in ('Custmer', 'Cust_Addreess', 'Geography', ,'Address_X_Geo','Customer_X_Product', 'Customer_X_Sales')
and upper(trim(databasename))='DWH_P';


Sometimes it's challenging to find out if  table structures (DDL) for a particular projects in different environments in data warehouse has same DDL. And due to the inconsistent table structure especially between test and production can lead into ETL job failure in production.

As precaution, by running above mentioned code you can verify if your production DDL is different than Test DDL where your ETL job ran fine. So you can fix the DDL before the jobs failed in production.

Saturday, September 26, 2015

Develop job via SAS DI studio: What is SQL Pass Through and why you use it?

If you are an ETL Developer  and using SAS DI Studio as your ETL tool then this post can be interesting for you. SQL Pass-Through Facility is  SQL procedure that enables developer to send DBMS-specific statements to a DBMS and to retrieve data from database.
You can specify database SQL syntax instead of SAS SQL syntax when you use the Pass-Through Facility.

What is SQL Pass Through?
When we make job via SAS DI studio and use PROC SQL to make operation in database  then your SQL statement will directly communicate with database and make all the operation on your SQL database and final result will bring back to SAS. To activate the feature what you need to do is to set  your SAS DI studio property SQL Pass Through 'Yes'.

Fig 1: How to activate SQL Pass through



Why will you use SQL Pass Through?

1) It's handy when you use database specific function like QUALIFY, RANK etc.
2) If your SQL query has calculation then it will be on SQL database so it's faster
3) Very little processing time on SAS side
4) If you are using store procedure or SQL macro in your ETL job then SQL Pass Through will surely pay off.




Thursday, August 20, 2015

PERIOD: An useful function in Teradata

Let’s start with an example where a table is defined with a column with the type PERIOD.  PERIOD data type in Teradata can hold two dates or two time stamps. To ease your date and time calculation PERIOD can play a big role. The below example will show university students’ enrollment history with sample data. 

We created a table where column 'Enrolled_period' is defined as PERIOD type.


CREATE MULTISET TABLE person_University_period (
  person_id        INTEGER                          NOT NULL,
  University_Name varchar(40)                          NOT NULL,
  Enrolled_period  PERIOD(DATE) FORMAT 'YYYY-MM-DD' NULL
)

PRIMARY INDEX (person_id);

And insert some dummy data in the table:

Insert into TD8245.person_University_period
Values(24568,'Gothenburg University', PERIOD(DATE '2007-08-09',DATE '2009-08-25'));

Insert into TD8245.person_University_period
Values(24569,'Gothenburg University', PERIOD(DATE '2009-07-09',DATE '2011-06-25'));


Data looks like below:

Fig 1 : Student's enrollment data

From the above table we can see Enrolled_period holds start date and end date. Lets see different type of operation you can do on Enrolled_period field.

e.g. you to know how many months or year or hour student is enrolled to the university. To find out that you don't need to do use any formula. PERIOD can find out months or year or day from the column 'enrolled_period'. Below Syntax will find out the number of months the students have studied.

Select INTERVAL(Enrolled_period) MONTH as enrolled_month from person_University_period



Fig 2: Number of months studied by students


At the same way you can calculate interval in day, hour, minutes and seconds. 

From the above 'enrolled_period' column you may need to find out start date, so syntax goes like:

select begin(Enrolled_period) as Enrolled_dt from person_University_period


And to find end date SQL will be:

select END(Enrolled_period) as Enrolled_dt from person_University_period


This period function is helpful when you consider banking operation, e.g. you open a bank account and you should have account opening date and closing date. Generally in the database table we make  two fields to serve the purpose which are Effective date and End date.

Instead of using two fields in the table, we can use one column that can hold both start and end date and calculation on two dates or time stamp become easier.

As an example:

INSERT INTO Account
VALUES (
  21222250,
  'Paul Sandal',
  PERIOD(
    CURRENT_DATE,
    UNTIL_CHANGED
  )
);

Result will look like below:

Account_number Account_Name      Account_Period
21222250        Paul Sandal              ('2015-08-20','9999-12-31')

Fig 3: Date with current and until changed

Here, UNTIL_CHANGED set date as '9999-12-31'. If the account is updated for any reason; then the date '9999-12-31' will be updated with the date which account get updated.












Saturday, August 1, 2015

Teradata DBC : Easy way to get DDL of views, tables and macros!!

 As a database developer you may need to take back up of DDL of views, macros and tables for a particular database that you are working on. Or when your development work is done then you need to move table/macro structure (DDL) from development environment to test and test to production.

And those cases, how do you move DDL from Development environment to Test environment and test to Production?

You can copy DDL from database one by one and then paste it to a document to save it and then forward the document to the DBA to execute your DDL in Test or Production environment.

 If you would like to copy DDL of the table then you do like below:

Fig 1: Manually copy DDL one by one


If you have twenty(20) tables, 10 macros or 5 views then you have to do the above action 35 times which is time consumable, error prone and boring tasks.

Then, what could be the better way to do it?

You can use DBC to get DDL of views, macros and tables for one particular database by using below SQL:

select * from dbc.tables where databasename='Customer_Service'

order by TableKind desc


Result look like below:

Fig 2: DBC result set

From the above result set (Fig 2),  you will find a column named RequestText that holds DDL for all elements exist in the database.

To find DDL for table; SQL will look like below:

Select RequestText from dbc.tables where tablekind='T' and databasename='Customer_Service'

tablekind=T means table
tablekind=V means views
tablekind=M means macro

By changing tablekind column's value you can get DDL for different object in the database.