After moving into Teradata world, discovered a few challenges; one of them was not knowing about SET and MULTISET concept in Teradata. My story goes like this:
I was making ETL job where I had three different tables and inserting into a target table. While I have selected columns after joining three tables I got 8,000 rows. However, when I inserted them into target table then I got around 5000 rows which puzzled me.
After doing some investigation found the Teradata table looks like below:
I was making ETL job where I had three different tables and inserting into a target table. While I have selected columns after joining three tables I got 8,000 rows. However, when I inserted them into target table then I got around 5000 rows which puzzled me.
After doing some investigation found the Teradata table looks like below:
CREATE TABLE INTEREST_MAP
(
INT_ID,
BANK_NAME,
INT_RATE,
EFF_DT,
RATING_SCALE
)
PRIMARY INDEX(INT_ID);
Fig 1: DDL without MULTISET
If the table is made above way, then the table can't accept duplicate values. So, my select statement returned 8000 rows which has 3000 duplicate rows that can not be inserted into the table.
What will be solution??
The table DDL should look like below so that it can accept duplicate values. Added MULTISET in the DDL.
CREATE MULTISET TABLE INTEREST_MAP
(
INT_ID,
BANK_NAME,
INT_RATE,
EFF_DT,
RATING_SCALE
)
PRIMARY INDEX(INT_ID);
Fig 1: DDL with MULTISET
Now, I am able to insert 8000 rows into the table. So Lesson learned; without having MULTISET in your DDL you will not be able to insert duplicate rows.
NOTE: If you have unique index in your DDL; then MULTISET will be over ruled by the Unique Index.
We know MULTISET now; What is SET?
When you work with Teradata mode then by default it is SET. Since I was working with Teradata mode so It means above DDL in Fig-1 is work as SET though DDL did not mentioned explicitly SET.
And SET doesn't allow duplicate rows to be inserted into the table.
No comments:
Post a Comment