Tuesday, October 14, 2014

Teradata: How to update duplicate records

Generally update statement will not allow updating duplicated records; records must be unique if you want to update a table.

But sometimes your business needs duplicate records in a table. Then your task is to provide the solution. How do you do that?

You need to make a new column that can generate sequential numbers and you need to add that column to duplicate rows to make a unique row.

For example:
Your data look like below, say table name is Weather_DK:

Country
CTY
AVG_TEMP
AVG_Humidity
DK
CPH
12.3
83%
DK
CPH
12.3
83%
DK
CPH
12.3
83%
DK
Arhus
11.3
85%
DK
Aalborg
12.3
87%
                                        Fig 1.1: Table Weather_DK

For example, now you want to update the data in Weather_SE table which has same column name as Weather_DK table  and data like below:
Country
AVG_TEMP
AVG_Humidity
SE
19.3
43%
SE
23.3
83%
SE
12.3
53%
                                    Fig 1.2: Table Weather_SE

Say; your business informed, Swedish average tempreture and humidity is exactly same as Denmark's capital city Copenhagen's (CPH) average temperature and humidity.

So your job is to update the Weather_SE table’s two column AVG_TEMP and AVG_Humidity  from Weather_DK data. When you write SQL query like below:

Update Weather_SE
Set
AVG_TEMP= DK.AVG_TEMP
AVG_Humidity=DK.AVG_Humidity
from
(Select  AVG_TEMP , AVG_Humidity
From Weather_DK
group by 1,2) as DK
where DK.CTY='CPH'

When you will run the above script you will get  error :”Failure 7547 Target row updated by multiple source rows”

To eliminate the error, you need to have unique rows. So you add a new columns with your result set, you can select like below:

select ROW_NUMBER() over (order by AVG_TEMP) as row_S, AVG_TEMP, AVG_Humidity
from Weather_DK  as  DK

Now you can easily update.


2 comments:

Anonymous said...

It should be something like:

Update Weather_SE
Set
AVG_TEMP= UPD_ABG_TEMP
AVG_Humidity= UPD_AVG_Humidity
from
(Select AVG_TEMP , AVG_Humidity
From Weather_DK
group by 1,2
)as DK (UPD_ABG_TEMP, UPD_AVG_Humidity)
where CTY='CPH';

note - this is not tested as well.

Diponkar Paul said...

Many thanks!, Mobesar.