Sunday, February 5, 2023

How to implement Continuous integration and delivery (CI/CD) in Azure Data Factory

Continuous Integration and Continuous Deployment (CI/CD) are integral parts of the software development lifecycle. As a Data Engineer/ Data Professional when we build the ETL pipeline using Azure Data Factory (ADF), we need to move our code from a Development environment to Pre-Prod and Production environment. One of the ways to do this is by using Azure DevOps.

Target Audience:

This article will be helpful for below mentioned two types of audiences.

Audience Type 1: Using DevOps code repository in ADF but CI/CD is missing

Audience Typ 2: Using ADF for ETL development but never used DevOps repository and CI/CD

For Audience Type 1 you can follow the rest of the blog to implement CI/CD. And for audience type 2, you need to first connect ADF with the DevOps repository, and to do so please follow this blog post and then follow the rest of this blog post for the CI/CD.

This blog post will describe how to set up CI/CD for ADF. There are two parts to this process one called 1) Continuous Integration (CI) and 2) Continuous Deployment (CD).

1) Continuous Integration (CI)

First, you need to log in to the Azure DevOps site from your organization and click the pipelines as shown in fig 1.

Pipeline creation

Fig 1: DevOps pipelines

And then click “New pipeline” as shown in fig 2

New Pipeline

Fig 2: New pipeline creation

And then follow the few steps to create the pipeline:

Step 1: Depending on where your code is located, choose the option. In this blog post, we are using the classic editor as shown below in fig 3.

Fig 3: Choose the right source or use the classic editor

Step 2: In this step, you need to choose the repository that you are using in ADF and make sure to select the default branch. We have chosen ADF_publish as a default branch for the builds.

                         

Fig 4: selecting the branch for the pipeline

Step 3: Create an Empty job by clicking the Empty job as shown in below figure 5

Fig 5: Selected Empty job

Step 4: You need to provide the agent pool and agent specification at this step. Please choose Azure Pipelines for the agent pool and windows latest for the Agent specification as shown in figure 6.

Fig 6: Agent pool and Agent specification

 

Now click "Save and Queue" to save the build pipeline with the name “Test_Build Pipeline” as shown below figure:

Fig 7: Saved the build pipeline

The build pipeline is completed, and the next part is creating the release pipeline which will do the continuous deployment (CD) means the movement of the code/artifacts from Development to the pre-prod and Production environment.

2) Continous Deployment (CD)

The very first step of Continuous Deployment is to create a new release pipeline. And this release pipeline will have a connection with the previously created build pipeline named "Test_Build Pipeline".


 Fig 8: Release Pipeline

As soon as you hit the new release pipeline following step will appear. Please close the right popup screen and then click +Add on the artifact as shown below figure:

 


  Fig 9: Create the artifact

Next step to connect the build pipeline and the release pipeline, you will find out the build pipeline that you created at the CI process earlier and choose the item “Test_Build Pipeline”

               

Fig 10: connecting the build pipeline from the release pipeline

Click on Stage 2 and select an empty job as shown in fig 10

empty job
  Fig 11: Empty job under release pipeline

After the last steps the pipeline will look like the below, now please click on 1 job, 0 tasks. We need to create an ARM template deployment task

job and tasks

Fig 12: Release pipeline job and task

Search for ARM template deployment, as shown in fig 13. ARM template Deployment task will create or update the resources and the artifacts e.g. Linked services, datasets, pipelines, and so on.

search for ARM template
  Fig 13: search for the ARM template

After adding the ARM template, you need to fill in the information for 1 to 12 as shown in the below diagram:

ARM template configuration

Fig 14: ARM Template information

  1. Display Name: Simply put any name to display for the deployment template
  1. Deployment Scope: You have options to choose the deployment scope from Resource Group, Subscription, or Management Group. Please choose Resource Group for ADF CI/CD.
  1. Azure Resouce Manager Connection: This is a service connection; if you already have a service principal you can set up a service connection or create a completely new one. The Cloud Infrastructure team in your organization can set it up for you. You will also find details about Service Connection in Microsoft Learn.
  1. Subscription: Please choose the right subscription where the resource group for the ADF instance resided for the pre-Prod or Production environment.
  1. Action: Please Choose "Create or Update resource group" from the list.
  1. Resource Group: The resource group name where the ADF instance is lying.
  1. Location: Resource location e.g. I have used Canada Central
  1. Template Location: Please choose either "Linked Artifact" or "URL of the file". I have chosen "Linked Artifact" which will connect the ARM template which is already built via Continuous Integration (CI) process.
  1. Template: Please choose the file "ARMTemplateForFactory.json"
  1. Template parameters: Please choose the template parameter file: "ARMTemplateParametersForFactory.json"
  1. OverrideTemplate Parameters: Make sure to overwrite all the parameters including pre-prod or production environment's database server details and so on. Go through all the parameters that exist in the Development environment you need to update those for the upper environments.
  1. Deployment mode: Please choose "Incremental" for the Deployment mode.

After putting all the above information please save it. Hence your release pipeline is completed.

However, to make this pipeline runs automated, means when you push the code from the master to publish branch you need to set up a Continuous deployment trigger as shown in fig 15

release trigger

fig 15: Continuous deployment Trigger

 

The last step of the CD part is creating a Release from the Release pipeline you just created. To do so please select the Release pipeline and create a release as shown in fig 16. 

Create release

Fig 16: Create a Release from the release pipeline

 You are done with CI/CD. Now to test it please go to your ADF instance choose the master branch and then click the 'Publish' button as shown below in figure 17. The C/CD process starts and the codes for pipelines, Linked Services, and Datasets move from the Development environment to Pre-Prod and then to Production.

publish from ADF

Fig 17: Publish from Azure Data Factory (ADF)

The blog post demonstrates step by step process of implementing CI/CD for ADF by using ARM templates. However, in addition to ARM template deployment; there is another ARM template named "Azure Data Factory Deployment (ARM)" which can also be used to implement CI/CD for ADF.

 

Friday, December 2, 2022

Step-by-step guidelines to provision Azure Synapse Analytics for your organization

This blog post will cover end-to-end deployment guidelines including the right roles and permission required to provision Azure Synapse Analytics for your organization.

Prerequisites: You need to have an Azure Portal login.

After logging into the Azure Portal search with the keyword "Synapse", you will find Azure Synapse Analytics as it is shown in the below diagram 1:


Fig 1: Finding out Azure Synapse Analytics


After you find it click the item and hit the "Create" button which will take you to the next screen which will look like below (Fig 2).

There are five different sections (tabs) you need to fill up to provision Azure Synapse Analytics. You will find each section elaborated on details below. 

 Basics: You need to fill up the basic information about the Synapse workspace.



Fig 2: Basic steps of Synapse configuration

1. Choose your subscription: Please choose the subscription where you want to provision the Azure Synapse Analytics resource

2. Choose or create a Resource Group: If you already have a resource group then you can choose the resource group or create a new one.

3. Managed resource group: Managed resource group will be automatically created, if you want to give a name please fill up this field otherwise it will be chosen automatically.

4. Workspace name: You need to pick up a globally unique name, it will suggest if the name is not unique.

5. Region: Please pick the region where you want to provision the Azure Synapse Analytics resource. Since I am in the Canadian region so put the region "Canada Central"

6. Data Lake Storage: Please choose Data Lake from the subscription or put it manually.

7. Data Lake Storage Account: This is the Data Lake Storage you have already created or if you need to create one please do so by choosing "Create new"

8.  Data Lake Storage File System: The file System name is a container name in the Data Lake storage.


This is how it looks after filling up the Basics information (shown in Fig 3)


Fig 3: After filling up the basic information


Security:

Let's move to the Security tab as shown below in figure 4:

Fig 4: Security 


The Security part is to connect with both serverless and dedicated SQL Pools. You can choose either local user and AAD login or only ADD login. I have chosen SQL and AAD login like in the old days when you provision SQL database instances. So you have both options available whenever or if required. 

And the check box "Allow network to Data Lake Storage Gen2 account" will be automatically chosen if you put the Data Lake Storage information the under the "Basics" tab. Synapse Serverless SQL pool required communication with Data Lake Storage and in this step Synapse workspace network allows to access a Data Lake Storage account.


Networking: You need to choose the right networking options for your organization. If you are provisioning for demo purposes then you can allow the public network or allow outbound traffic. However, if data security is top of your mind I would suggest following the below setup (fig 5) for the networking.

Fig 5: Networking

1. Managed virtual network: Enable the Managed Virtual network so that communication between resources inside Synapse happens through the private network.

2. Managed Private End Point: Create a managed private endpoint for the primary storage account (we did a storage account under the Basic tab and step #6)

3. Allow outbound Traffic:  I have set this "No" for not limiting only the approved Azure AD tenants. However, the data security is tightened through the next point #4

4. Public Network Access: Public network access has been disabled, which means there is no risk of exposing the data to the public network, and communication between resources will take place via private endpoints.

Tags: It's the best practice to have Tags. The tagging helps identify resources and deployment environments and so on.

Fig 6: Tags

Review and Create: It's the final steps that show you a summary for you to review. Please verify your storage account, database details, security, and networking settings before you hit the create button (shown below fig 7)


Fig 7: Review and Create



You have done with provisioning the Azure Synapse Analytics, as an Admin, you can work with the Azure Synapse Analytics. 

However, if any additional team members want to work with the Azure Synapse Analytics tool you need to do a few more steps.

You need to add a user to the Synapse workspace, as shown in fig 8




Fig 8: Synapse workspace
After clicking "Access control"  you will find "+Add" button to add user with the right role. At first you need to choose role as shown in below figure 9.



                                                           Fig 9: Choosing the right role
If users are data engineers or developers you may want to choose "contributor" role which I have chosen as shown in fig 9. After choosing the role you need to choose members, it can be individual members or AD group members.

Fig 10: Choosing the right member




The above fig 10 shown I have chosen a member and then click "Next' button to review and assign the role to the members. You have completed the steps for adding right role and members to the Synapse workspace.

After adding the right role and member to the Synapse workspace, you also need to add the user to the Azure Synapse Portal as shown in below fig 11. At first click "Access Control" and then by clicking "+Add" button you can assign members or AD group to the right role. If you are giving access to Data Engineers or Developers they will require Contributor role. In below fig 9, I have given Contributor role to the member.



Fig 11: Synapse administrator from the Synapse Portal

Hpwever, to have access to Serverless SQL Pool and Linked Service creation the members will require more permission. To know more about Synapse roles please go through this Microsoft documentation.

In summary, by following up the above step by step guidelines you can provision Azure Synapse Analytics for your organization. And please make sure through this process work closely with your organization's cloud infrastructure team who can guide you through all networking and security questions you may have.






Saturday, February 12, 2022

How do you secure sensitive data in a modern Data Warehouse?

In 2019 Canadian Broadcasting Corporation (CBC) news reported a massive data breach at the Desjardins Group, which is a Canadian financial service cooperative and the largest federation of credit unions in North America. The report indicated, a "malicious" employee copied sensitive personal information collected by Desjardins from their data warehouse. The data breach compromised the data of nearly 9.7 million Canadians.

Now the question is, how do we secure data warehouses so that employees of the same organization can't breach the data? We need to make sure sensitive data is protected inside the organization.

When any IT solution is built in an organization, there are two types of environment that exist, one is called non-production and the other is a production environment. Production and non-production environments are physically separated. A non-Production Environment means an environment for development, testing, or quality assurance, and the solution is not consumed by end-users daily basis from a non-production environment. However, the Production environment is a place where the latest version of the software or IT solution is available and ready to be used by the intended users.

As stated at the beginning, a rogue employee was involved in the massive data breach at the Desjardins Group. Hence; an organization building a data-driven IT solution needs to work on setting up both Production and Non-Production environments secure way. This article will describe how sensitive data can be protected in both the Production and Non-Production environments.

A. Protecting Sensitive Data in Non-Production Environment in a Data Warehouse:

In general, a Non-Production environment is not well guarded with security. Different personas can have access to a Non-Production environment in a data warehouse e.g. Developers, Testers, business stakeholders. 

So it's important to protect sensitive data inside the organization. The very first thing we need to do is whenever copying data from any application to a data warehouse (non-production environment) sensitive data need to be scrambled. 

Fig 1: Moving data from IT application to non-prod data warehouse

There are a few steps that can help us to scramble the data in  Non-Production Environment in a Data Warehouse:

Step 1: Business or data steward find the list of sensitive or Personal Identifiable Information (PII) data

Step 2: Data Engineer or ETL Developer will use any standard tool like Azure Data Factory (ADF) to mask the data and store it in the data warehouse.

Step 3: Either Test Engineer or Business Stakeholder will verify all the sensitive columns in the database before it releases to the rest of the team.


B. Protecting Sensitive Data in Production Environment in a Data Warehouse:

In a Production environment, we can't scramble the data in such a way that is irreversible. We need to keep the original data intact but make sure only the intended users have access to the data. So if we can mask the columns that hold sensitive or PII data in such a way so that only privileged users get access to the data. Below figure shows what is expected from the solution:




Fig 2: Protect sensitive data from Data Warehouse (PROD)


As shown in Fig 2, when users try to access the data via an application such as Power BI, only intended users will be able to see the intact data. Non-intended users will find the data obfuscated. The above-explained process can be done by using dynamic data masking provided by Microsoft Databases. The process only masks the data on the fly at the query time. If you would like to learn about dynamic data masking, please follow the Microsoft document.

In Summary, whenever PII data is taken from the operational system to the Non-Production environment to build any analytics solution data need to be scrambled. And in a Production environment, though dynamic data masking can prevent viewing the data by unintended users, however; it's important to properly manage database permission on the Data Warehouse. As well as, make sure to have auditing enabled to track all activities taking place in the Data Warehouse in the Production environment.



Thursday, December 16, 2021

How to integrate Azure DevOps with Azure Synapse Studio?

There are two ways you can develop and execute code in Azure Synapse Studio:

  1. Synapse live development
  2. Git enabled development.

By default, Synapse Studio uses Synapse live, as shown in Fig 1. With Synapse live you can't work in a group for the same codebase whereas by enabling Git collaboration, this becomes easy. This article will demonstrate a step-by-step guide to set up Git-enabled development in Synapse Studio.

Fig 1: Synapse live

With the Git enabled development approach either you can use Azure DevOps Git or GitHub. This article will guide you using Azure DevOps Git integration.

Prerequisites

There are two prerequisites before following along with this article:

  1. Permissions - You must need to have contributor or higher role in the Synapse workspace to connect, edit or delete the source code repository.
  2. Git Repository - You also need to create the Git repository. You will find more details about creating an Azure DevOps repository in this link.

Choose from Two Different Options

There are two ways you can connect Azure DevOps Git from Synapse Studio, either from the global bar or from manage hub. You will find details below how to choose from the two options.

Option 1: The global bar

If you follow the figure 2, select the "Synapse live" drop down menu then you will find "Set up code repository". Choose this option.

Fig 2: Setup code repo from global bar

Option 2: The Manage hub

From Synapse studio look at the left bottom menu, as shown in figure 3. Those the last icon that looks like a toolbox. This is the Manage selection. Then choose the Git configuration item int eh menu that is shown to the right of this icon. In the main pane, select configure.

Fig 3: setup code repo from manage hub

Either of the the above two options will take you to the next step, which look like Fig 4. By selecting Azure DevOps Git you connect Azure DevOps Git with the Synapse Studio.

Fig 4: Choose either DevOps Git or Github

At the next step you will find one more attribute populated, as shown in the below figure 5. Please select the appropriate Azure Active Directory from your organization.

Fig 5: Connect the AD tenants

After clicking "Next" you will enter all the necessary information to choose your Git repository which is already created in your organization. Each item shown in Fig 6 is explained below:

  1. Azure DevOps Organization: In the dropdown you may find more than one organization please select the appropriate organization. It's organization name which have been created when Azure DevOps repository is configured.
  2. ProjectName: There are more than one project in the list, select the relevant one. This is Azure DevOps repos project name which you created earlier.
  3. RepositoryName: Please select the right repository from the list or you can also create a repository.
  4. Collaboration branch: By default, it's master. This is the branch where all other branch will be created from. Code will be merged to this branch from other branches as well as you will publish the code from this branch.
  5. Publish branch: The Publish branch is the branch in your repository where publishing related ARM templates are stored and updated. In general adf_publish is your publish branch but you can also make any other branch as a publish branch.
  6. Root folder: Your root folder in your Azure Repos collaboration branch.

 

Fig 6: Configure repository

After completing all the above steps, click "Apply". When this process is successfully completed you should able to see the Git repository branches, as shown in Fig 7.

Fig 7: Synapse Studio after connecting with Azure DevOps Git

How to disconnect from Azure DevOps Git

To disconnect from Azure DevOps Git repo, you need to go to Manage-> Git configuration, as shown in Fig 8. There is a Disconnect menu item at the top.

Fig 8: Disconnect from Azure DevOps Git Repo

Please note that "Disconnect" option will be disabled if you on any other branch than master. So, make sure you choose master branch if you need to disconnect the Azure DevOps Git Repo.

In summary, there are two ways to Develop and execute code in Azure Synapse and collaboration is only possible with Git enablement. The article depicted how to connect Azure DevOps Git with Azure Synapse Studio as well as how to disconnect them whenever required.

Saturday, September 25, 2021

How to recover if Azure Data Factory AutoResolveIntegrationRuntime become corrupted?

I would like to share my recent experience with Azure Data Factory (ADF) where AutoResolveIntegrationRuntime become corrupted and how to fix it. I still don't know how the Integration Runtime (IR) is corrupted and don't expect this may happen to you but if it happens then this article will help you to solve the issue.

Problem statement:

In general, the ADF AutoResolveIntegrationRuntime should look like below fig 1.


Fig 1: AutoResolveIntegrationRuntime in Azure


As shown in figure 2, I found in ADF AutoResolve IR has been changes from ‘Public’ to ‘Managed Virtual Network” and Status of the IR said "Failed to get status" under the master branch.



Fig 2: Corrupted AutoResolveIntegrationRuntime

I was shocked, was not aware of any code changes that may impact AutoResolve IR. Due to AutoResolve IR corruption release pipeline stopped working, hence we were not able to push new changes to PROD.

Identify the Issue:

After looking into the DevOps code repo, as found below fig 3 is shown extra code has been added to the original code.

Fig 3:  Managed virtual network section has been added


Resolution:

Delete the below code as shown above fig 3 from the DevOps. This part of code changed the AutoResolve IR's Sub-type from 'Public' to 'Managed Virtual Network'.

"managedVirtualNetwork": {
            "type""ManagedVirtualNetworkReference",
            "referenceName""default"
        }


After deleting the part of the code from master branch, the issue seems resolved but not completely. As shown below fig 4, the IR changes back from 'Managed Virtual Network' to 'Public', however; still the status is showing error message.

Fig 4: Status still showing error 

At this stage, release pipeline started working means I was able push the changes to  PROD. However; I wanted to see error message disappear. To clean the error message I had to delete the AutoResolve IR code as shown below fig 5. To do so, logged into the Azure DevOps and have chosen the master branch and then under integrationRuntime folder there were two files one is AutoResolve IR and other one is selfhosted IR, I have deleted AutoResolve IR file.

Fig 5: Remove AutoResolveIntegrationRuntime from DevOps

After the file is deleted, checked ADF portal and then refresh it found the error is completely gone. So anytime you find AutoResolve IR is corrupted from your master branch you know how to fix it.


Saturday, August 28, 2021

How to Flatten JSON in Azure Data Factory?

When you work with ETL and the source file is JSON, many documents may get nested attributes in the JSON file. Your requirements will often dictate that you flatten those nested attributes. There are many ways you can flatten the JSON hierarchy, however; I am going to share my experiences with Azure Data Factory (ADF) to flatten JSON.

The ETL process involved taking a JSON source file, flattening it, and storing in an Azure SQL database. The attributes in the JSON files were nested, which required flattening them. The source JSON look like this:

{

"id": "01",

"name": "Tom Hanks",

"age": 20.0,

"email": "th@hollywood.com",

"Cars":

  {

  "make": "Bentley",

  "year": 1973.0,

  "color": "White"

  }

}

The above JSON document has a nested attribute, Cars. We would like to flatten these values that produce a final outcome look like below:

{

"id": "01",

"name": "Tom Hanks",

"age": 20.0,

"email": "th@hollywood.com",

"Cars_make":  "Bentley",

"Cars_year":  "1973.0",

"Cars_color":   "White"

}

How do we do it by using ADF?

Let's create a pipeline that includes the Copy activity, which has the capabilities to flatten the JSON attributes. Let's do that step by step.

First, create a new ADF Pipeline and add a copy activity.

Fig 1: Copy Activity in ADF


Next, we need datasets. You need to have both source and target datasets to move data from one place to another. In this case source is Azure Data Lake Storage (Gen 2). The target is Azure SQL database. The below figure shows the source dataset. We are using a JSON file in Azure Data Lake.

Fig 2: Source dataset

We will insert data into the target after flattening the JSON. the below figure shows the sink dataset, which is an Azure SQL Database.

Fig 3: Sink dataset

Please note that, you will need Linked services to create both the datasets, this article will not go into details about Linked Services, to know details you can look into the Microsoft document.


3. Flattening JSON

After you create source and target dataset, you need to click on mapping as shown below figure 4 and follow the steps:

Fig 4: Flattening JSON

a) At first import schemas
b) Make sure to choose value from Collection Reference
c) Toggle the Advanced Editor
d) Update the columns those you want to flatten

After you have done above, then save it and execute the pipeline. You will find flatten records are inserted to the database as shown in fig 5.


Fig 5: Saved data into the table after flattening


Be cautious

Make sure to choose "Collection Reference" as mentioned 3.b, if you forget to choose that then the mapping will look like below Fig 6:

Fig 6: Without putting collection reference


If you look at the mapping closely from the above figure 6, the nested item in the JSON from source side is: 'result'][0]['Cars']['make'] which means it will only take very first record from the JSON. If you execute the pipeline you will find only one record from JSON file is inserted to the database. So it's important to choose Collection Reference.

In summary, I found Copy Activity in Azure Data Factory make easier to flatten the JSON, you don't need to write any custom code which is super cool.



Sunday, July 25, 2021

Step by step guideline to install PostgreSQL in Azure cloud and Client tool to administrate the PostgreSQL

What is PostgreSQL?

PostgreSQL, also known as Postgres, is a free and open-source relational database management system.  The official PostgreSQL site mentioned, "The World's Most Advanced Open Source Relational Database".  PostgreSQL as Open Source database gained huge popularity in past few years, this article post will focus how to install PostgreSQL in Azure cloud and tools to interact with the database.


Installation of PostgreSQL in the Azure Cloud environment

At first, login to your Azure Portal and search for PostgreSQL, You will find different services to choose from, I have chosen “Azure Database for PostgreSQL flexible servers” from the below list as shown in Fig 1. This particular service will allow to add any extension you want to add to your database in future.


Fig 1: PostgresSQL services in Azure Cloud


As soon as you choose the option you will find below figure 2, which will allow to create the postgreSQL flexible server.


    Fig 2: PostgreSQL flexible server


After clicking  "Create Azure Database for PostgreSQL flexible server" as shown in above figure 2, you will have options to choose from four different plans as shown in figure 3. As per your need you can choose from anyone of them. "Single server" was best fit for my requirements since it's enterprise ready, fully managed and I can add extension to it.


Fig 3: Choose right plan for your database

 
As soon as you hit the 'Single server' as shown above figure 3, you will find details information to fill up as shown in figure 4.

Please follow the below steps, figure (4) indicates each step listed.

1) Choose the right subscription for your resource group
2) Please select resource group where you want to install the database server, if no resource group created then you need to create a resource group. Please find details how to create azure resource group
3) Put the server name for PostgreSQL
4) Choose the location where you would like to install the PostgreSQL, I have chosen Canada Central, however; you can choose which best fit for you.
5) Choose the version of PostgreSQL that you would like deploy in Azure
6) At this step fill up the administrator account information and save this credential; you will need this when you log into the database server.

Fig 4: PostgreSQL deployment config input





After filling up the above information, please click 'Review + Create'. It will take a few minutes to complete the installation and you will find below message when deployment is completed as shown in figure 5.


Fig 5: Deployment is completed


After the deployment if you click Go to Resource (as shown bottom link at Fig 6), you will find out more details about the resource that you just created. We will need these information when database server need to connect from On-Premise IDE.

Fig 6: resource details



How to connect PostgreSQL from On-Premise GUI?


PostgreSQL deployment is completed in Azure Cloud, however; Now we need to find out how to connect this PostgreSQL database server with a Graphical User Interface (GUI) and create any new databases. One of the popular GUI for PostgreSQL is pgAdmin.

Let's start installing pgAdmin to connect the database server and do rest of the operation. Please follow the link to install pgAdmin for Windows. You can choose latest version to of pgAmin, download it and then use wizard to install it.

When pgAdmin installation is completed, you will find below (Fig 7) if you search for the app from your computer.


                             Fig 7: pgAdmin installed in my PC



Now, we are going to use pgAdmin 4 to connect the deployed PostgreSQL database server. Open the app pgAdmin 4 and right click under server as below figure 8 is shown.



Fig 8: Create connection

And then you need to fill up the details to connect PostgreSQL database server which we deployed previously (fig 4). Details are shown in below figure 9, and fill up the information as suggested below:

1. Host name/Address: This is server name which can be found under the resource details (as shown in figure 5.)
2. Port by default should be set 5432, in case it's not then please put 5432.
3. Maintenance database: It's like master database if you are coming from SQL DB experiences, it should fill up automatically, if not then put: postgres
4. User Name: It's admin user name (see figure 4 or 6)
5. Password: The password you entered (fig 4)

As well as, under General tab, please give any name you like for the connection then hit Save button.



Fig 9: connection details need to fill up



Now you are connected your PostgreSQL database server in the Azure Cloud environment from PgAdmin GUI as shown in below figure 10. Everything is set, you can create new database, add new extension to it and whatever operations you want to make. 


Fig 10: PgAdmin GUI connected with PostgreSQL in the Azure Cloud


We learned how to deploy PostgreSQL in the Azure Cloud environment as well as how we can connect the database server from on-premise GUI called PgAdmin.