Sunday, July 26, 2020

How to deploy a SQL Server big data cluster by using Azure Data Studio

There are several ways to deploy a SQL Server 2019 big data cluster. This blog post uses Azure Data Studio to deploy one.

The first thing you need is Azure Data Studio. If you don't have it installed, download it from: https://docs.microsoft.com/en-us/sql/azure-data-studio/download-azure-data-studio?view=sql-server-ver15


Step 1: After you open Azure Data Studio, you will see the UI below:



Fig 1: Azure Data Studio

Step 2: Choose SQL Server Big Data Cluster


Fig 2: Choose SQL Server Big Data Cluster    

Step 3: Missing prerequisites

The wizard lists the required tools that are not installed yet:

  • kubectl
  • Azure CLI
  • azdata
Please install any that are missing.

Fig 3: Missing prerequisite
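
If you prefer to check the prerequisites from a terminal before opening the wizard, a minimal sketch like the one below may help. It assumes a POSIX shell (for example Git Bash or WSL on Windows) and that the tools are installed under their usual command names: kubectl, az (the Azure CLI) and azdata. The function name missing_tools is only illustrative.

```shell
#!/bin/sh
# Print the name of every tool passed as an argument that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        # 'command -v' exits non-zero when the tool cannot be found
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The Azure CLI is invoked as 'az'; kubectl and azdata use their own names.
missing=$(missing_tools kubectl az azdata)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found."
fi
```

This only tells you whether the commands can be found; the wizard may also check tool versions, which this sketch does not.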



After the prerequisites have been installed, you will see the UI below:

Fig 4: After the prerequisites are installed



Step 4: Deployment Configuration Profile



Fig 5: Configuration profile


Step 5.A: Azure Settings:
 These are the Azure settings under which the resource group will be created. Make sure you have permission to create a resource group; without it, the deployment will fail.

Fig 6: Azure settings

If it fails, the error message will look like the one below:

Fig 6.1: Error while creating resource group
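
One way to find out ahead of time whether your account may create resource groups is to try creating one with the Azure CLI. The sketch below is an illustration, not part of the wizard: the resource group name bdc-test-rg, the location eastus and the run helper are all assumptions made for this example. By default it only prints the command; set DRY_RUN=0 after signing in with 'az login' to actually execute it.

```shell
#!/bin/sh
# Hypothetical values for illustration; replace them with your own.
RESOURCE_GROUP="bdc-test-rg"
LOCATION="eastus"

# With DRY_RUN=1 (the default here) the command is only printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Try to create the resource group. If the account lacks permission,
# the Azure CLI reports an authorization error at this point.
run az group create --name "$RESOURCE_GROUP" --location "$LOCATION"
```

If the command succeeds, the same account should be able to create the resource group from the wizard as well.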

Step 5.B: Cluster Settings: Set the cluster name and the credentials. You will need them later to connect to the cluster, so please keep them safe.

Fig 7: SQL Server cluster settings



Step 5.C: Service and storage settings: These are filled in automatically, but please adjust them to your needs.

Fig 8: Service settings.



Step 6: Script to Notebook

This is the last step of the wizard. As soon as you click 'Script to Notebook', the notebook opens in Azure Data Studio.
Fig 9: Script to Notebook


However, if this is the first time you are deploying a big data cluster, Python needs to be installed, and you will see the UI below:
Fig 10: Install Python


When you are done, click 'Run All'.


Fig 11: Execute the script


You may get an error saying that pandas is still missing.


Fig 12: Pandas is missing

In that case, install the pandas package from Azure Data Studio. Go to the package manager:
Fig 13: Finding package manager in Azure data studio

Then find pandas in the package manager and, as soon as you find it, click the install button.

Fig 14: Install pandas
When the installation is done, click 'Run All' in the notebook again. This time it will start running successfully. One of the steps logs in to Azure: you are redirected to the portal automatically and will see a UI like the one below, where you don't need to do anything.

Fig 15: Azure login
When the deployment is over, your big data cluster should be ready to use, so the next thing to do is connect to it. All the endpoints are listed once the deployment has completed; use the endpoint for 'SQL Server Master Instance Front-End'.

Fig 16: Click to connect the Cluster

After you connect to the big data cluster, it will look like this:

Fig 17: Connect with big data cluster

If you would like to remove the cluster, you can delete it with the azdata command in a command shell.


Delete cluster:

azdata bdc delete -n mssql-cluster

Fig 18: delete cluster



Or you can completely remove the resource group from the Azure portal.
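
The resource group can also be removed from the command line instead of the portal. The sketch below assumes the Azure CLI is installed and signed in; the resource group name bdc-test-rg and the run helper are hypothetical and only there for illustration. It prints the command by default and executes it only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical name for illustration; use the resource group from Step 5.A.
RESOURCE_GROUP="bdc-test-rg"

# With DRY_RUN=1 (the default here) the command is only printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Deleting a resource group removes every resource inside it.
# Without --yes, 'az group delete' asks for confirmation first.
run az group delete --name "$RESOURCE_GROUP"
```

Because this removes everything in the resource group, double-check the name before running it with DRY_RUN=0.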