Tuesday, February 23, 2021

How to add local timestamp end of the files in ADF?

This post will describe how to add your local timestamp at the end of the each file in Azure Data Factory (ADF). In general, ADF gets UTC timestamp so we need to convert the timestamp from UTC to EST, since our local time zone is EST.

For example: Input Source file name in SFTP is: _source_Customer.csv and the expected outcome will be: _source_Customer_2021-02-12T133751.csv

It means, the pipeline should add '_2021-02-12T133751' end of each file. And this will work dynamically means any file you pass from source will add it by using ADF regular expression.

Let's set a simple pipeline and explain the scenario in a few steps. In this example, we receive files from event based trigger and hold the file in a parameter, main learning of this article is how to pad current date time end of the file name which we received from the event base trigger. Please note that, event based trigger will not be discussed here, If you like to know more about how to create trigger please follow the link.

Step 1: Add Copy Activity

Create a simple pipeline where need to have at least one Copy activity which connect source and sink as like below figure 1.0

                                           
                                                          Fig 1.0: Pipeline with Copy Activity

Step 2: Adding a parameter to receive file name from trigger

Before adding the expression, we need to have a parameter in the pipeline which will catch file name from trigger. For the time being just consider trigger will give us the file name and the parameter is to hold the file name.
 
                                                 
                                                              Fig 1.1: Adding parameter

Step 3: Prepare sink dataset

In Copy data activity you got Sink dataset, you need to add a parameter in the dataset, Click on Sink dataset and Open it then you will find as like below diagram 1.2

                                                                           
                                                                                Fig 1.2: Adding parameter to the dataset
 
To add parameter to the dataset, click New and ad the parameter name and select the type, I have selected string type. Now you you will see sink dataset look like below figure 1.3 where you need to add dynamic content.
                                             
                                                  Fig 1.3: Dynamic content to add the timestamp

 

Step 4: Setting up dynamic expression

Now, let's go the the dynamic expression. As soon you hit the 'Add dynamic content' like below figure 1.4 you will able to write the expression which will convert the UTC timestamp to EST and then pad end of the file.

                                         
                                                                  Fig 1.4: expression language
 
Hope you remember, pTriggerFile parameter we have created at step no 1 which hold the file name from event based trigger, we are applying all the necessary function to the parameter.

Let's have closer look into the expression:

@concat(replace(pipeline().parameters.pTriggerFile,'.csv',''), '_', formatDateTime(convertTimeZone(utcnow(),'UTC','Eastern Standard Time'),'yyyy-MM-ddTHHmmss'), '.csv')

The functions which have been used to accomplish the task: @concat  @replace @convertTimeZone and @formatDateTime

Explanation of the above expression?

1 : First we need to get the filename  from parameter :_source_Customer.csv

2: To replace the .csv with empty string : replace(pipeline().parameters.pTriggerFile,'.csv','') which will  return:  _source_Customer

3:  Convert utcnow() to EST  : convertTimeZone(utcnow(),'UTC','Eastern Standard Time')

4: want to format the date: formatDateTime(convertTimeZone(utcnow(),'UTC','Eastern Standard Time'),'yyyy-MM-ddTHHmmss') which will return 2021-02-12T133751

5: Concat from earlier steps and add .csv at the end  @concat(Step1,'_', 'Step4','.csv') which will return _source_Customer_2021-02-12T133751.csv

 

We learned how to add local timestamp end of any file, in this sample source file was .csv. However, you can follow the same process for .txt file where you only need to change '.csv' to '.txt' in the expression.

No comments: