The first thing you need is Azure Data Studio. If you do not have it installed, download it from: https://docs.microsoft.com/en-us/sql/azure-data-studio/download-azure-data-studio?view=sql-server-ver15
Step 1: Open Azure Data Studio. You will see the UI below:
Fig 1: Azure Data Studio
Step 2: Choose SQL Server Big Data Cluster
Fig 2: Choose SQL Server Big Data Cluster
Step 3: Missing prerequisites
- kubectl
- Azure CLI
- azdata
Install any of these tools that are reported as missing.
Fig 3: Missing prerequisites
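If you want to double-check the three tools outside the wizard, a minimal Python sketch like the one below can help. It only looks for the executables on your PATH. The executable names (`kubectl`, `az` for Azure CLI, and `azdata`) are assumptions about a default installation, and `missing_tools` is a hypothetical helper, not part of Azure Data Studio.

```python
import shutil

# Assumed executable names for a default installation of each tool.
REQUIRED_TOOLS = {"kubectl": "kubectl", "Azure CLI": "az", "azdata": "azdata"}


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the display names of tools whose executable is not on PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```

A tool can be installed but not yet on PATH, so reopening the terminal or Azure Data Studio after installation may be needed before the check passes.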
After the prerequisites have been installed, you will see the UI below:
Fig 4: After the prerequisites are installed
Step 4: Deployment Configuration Profile
Fig 5: Configuration profile
Step 5.A: Azure Settings:
These settings determine where the resource group will be created. Make sure you have sufficient permissions to create a resource group; otherwise the deployment will fail.
Fig 6: Azure settings
If it fails, the error message will look like the one below:
Fig 6.1: Error while creating resource group
Step 5.B: Cluster Settings: Set the cluster name and credentials here. You will need them later to connect to the cluster, so keep them safe.
Fig 7: SQL Server cluster settings
Step 5.C: Service and storage settings: These are filled in automatically, but adjust them to suit your needs.
Fig 8: Service settings.
Step 6: Script to Notebook
This is the last step of the wizard. As soon as you click 'Script to Notebook', the notebook opens in Azure Data Studio.
Fig 9: Script to Notebook
However, if this is the first time you are deploying a big data cluster, Python needs to be installed, so you will see the UI below:
Fig 10: Install Python
When you are done, click 'Run All'.
Fig 11: Execute the script
You may get an error saying that pandas is still missing:
Fig 12: pandas is missing
In that case, install the pandas package from Azure Data Studio. Go to the package manager:
Fig 13: Finding the package manager in Azure Data Studio
Then find pandas in the package manager and click the install button.
Fig 14: Install pandas
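Before rerunning the whole notebook, you can confirm from a notebook cell that the kernel can now see pandas. The sketch below uses only the Python standard library. `is_installed` is a hypothetical helper name, and it assumes the cell runs on the same Python kernel that the deployment notebook uses.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the current Python kernel can locate the top-level package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("pandas"):
        print("pandas is available, the notebook can be run again.")
    else:
        print("pandas is still missing, install it from the package manager.")
```

If the check still reports pandas as missing right after installation, the package may have been installed into a different Python environment than the one the notebook kernel uses.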
When the installation is done, click 'Run All' in the notebook again. This time it will start running successfully. One of the steps logs in to Azure: you are automatically redirected to the portal and will see a UI like the one below, where you do not need to do anything.
Fig 15: Azure login
When the deployment is over, your big data cluster should be ready to use, and the next step is to connect to it. All the endpoints are listed after the deployment completes; take the endpoint for 'SQL Server Master Instance Front-End'.
Fig 16: Click to connect to the cluster
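If the connection does not succeed, it can help to check that the endpoint is reachable from your machine at all. The sketch below assumes the endpoint is shown in the usual SQL Server `host,port` form. The address `203.0.113.10,31433` is a made-up placeholder, and `parse_endpoint` and `is_reachable` are hypothetical helpers. It only tests the TCP connection; it does not log in to SQL Server.

```python
import socket


def parse_endpoint(endpoint):
    """Split a 'host,port' endpoint string into a (host, port) tuple."""
    host, _, port = endpoint.strip().rpartition(",")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host,port', got {endpoint!r}")
    return host, int(port)


def is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder value: replace with the endpoint listed after your deployment.
    host, port = parse_endpoint("203.0.113.10,31433")
    print("reachable" if is_reachable(host, port) else "not reachable")
```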
After you connect to the big data cluster, it will look like the UI below:
Fig 17: Connected to the big data cluster
If you would like to remove the cluster, you can either delete it with the azdata command in a command shell:
Delete cluster:
azdata bdc delete -n mssql-cluster
Or you can remove the entire resource group from the Azure portal.
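If you prefer to script the cleanup, the two options can be sketched in Python as below. The first command is the azdata command shown above. The second uses `az group delete` as a command-line alternative to removing the resource group in the portal, which is an assumption on my side and not something the wizard generates. The helper names and `my-resource-group` are hypothetical. By default the script only prints the command (dry run) so nothing is deleted by accident.

```python
import subprocess


def cluster_delete_command(cluster_name):
    """Build the azdata command that deletes the big data cluster."""
    return ["azdata", "bdc", "delete", "-n", cluster_name]


def resource_group_delete_command(resource_group):
    """Build the Azure CLI command that deletes a whole resource group."""
    return ["az", "group", "delete", "--name", resource_group, "--yes"]


def run(command, dry_run=True):
    """Print the command and execute it only when dry_run is False."""
    print(" ".join(command))
    if not dry_run:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Pick one of the two options; both lines are dry runs here.
    run(cluster_delete_command("mssql-cluster"))
    run(resource_group_delete_command("my-resource-group"))  # placeholder name
```

Deleting the resource group removes everything inside it, so use that option only when the group contains nothing else you need.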