Setting up your Azure Machine Learning services workspace and configuring needed resources
Requirements - In order to benefit from this tutorial, you will need:
pip install -r requirements.txt
Learning Objectives - By the end of this tutorial, you should be able to:
Motivations - This notebook covers the scenario in which a user defines components using YAML and then uses those components to build a pipeline.
# Import required libraries
from promptflow.azure import PFClient
We are using DefaultAzureCredential to get access to the workspace. When an access token is needed, it requests one using multiple identities (EnvironmentCredential, ManagedIdentityCredential, SharedTokenCacheCredential, VisualStudioCodeCredential, AzureCliCredential, AzurePowerShellCredential) in turn, stopping when one provides a token.
See the DefaultAzureCredential reference for more information.
DefaultAzureCredential should be capable of handling most Azure SDK authentication scenarios.
If it does not work for you, see the reference for all available credentials.
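The "try each identity in turn, stop at the first token" behavior described above can be illustrated with a minimal, dependency-free sketch. This is not how azure-identity is implemented; the names CredentialUnavailable, first_available_token, environment_source, and cli_source are hypothetical and exist only for this illustration.

```python
from typing import Callable, List, Sequence


class CredentialUnavailable(Exception):
    """Raised by a (hypothetical) credential source that cannot supply a token."""


def first_available_token(sources: Sequence[Callable[[], str]]) -> str:
    """Try each source in order and return the first token that is obtained."""
    errors: List[str] = []
    for source in sources:
        try:
            return source()  # stop as soon as one source provides a token
        except CredentialUnavailable as exc:
            errors.append(str(exc))  # remember why this source failed, then move on
    raise CredentialUnavailable("no source provided a token: " + "; ".join(errors))


def environment_source() -> str:
    # Hypothetical source that always fails, standing in for an unconfigured identity.
    raise CredentialUnavailable("environment variables not set")


def cli_source() -> str:
    # Hypothetical source that succeeds, standing in for a logged-in CLI session.
    return "token-from-cli"


token = first_available_token([environment_source, cli_source])
print(token)
```

The order of the sequence matters: the first source that succeeds wins, and later sources are never consulted.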
from azure.identity import (
    InteractiveBrowserCredential,
    DefaultAzureCredential,
)

try:
    credential = DefaultAzureCredential()
    # Check if the given credential can get a token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()
The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.
Check this notebook for creating a workspace.
To connect to a workspace, we need identifier parameters - a subscription, resource group and workspace name.
The config details of a workspace can be saved to a file from the Azure Machine Learning portal. Click on the name of the portal on the top right corner to see the link to save the config file.
This config file can be used to load a workspace using PFClient. If no path is specified, the path defaults to the current folder. If no file name is specified, the file name defaults to config.json.
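As a rough illustration of the defaults described above, the sketch below writes a config file with the three workspace identifiers and reads it back, falling back to the current folder and to config.json when no path or file name is given. The helper load_workspace_config is hypothetical and only mimics the described defaults; the real client's lookup logic may differ, and the placeholder values are not real identifiers.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def load_workspace_config(path: Optional[str] = None, file_name: str = "config.json") -> dict:
    """Hypothetical helper: read workspace identifiers from a JSON config file."""
    folder = Path(path) if path is not None else Path.cwd()  # default: current folder
    with open(folder / file_name, encoding="utf-8") as f:
        config = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config file is missing keys: {missing}")
    return config


# Write a placeholder config into a temporary folder, then load it back.
demo_config = {
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group": "<RESOURCE_GROUP>",
    "workspace_name": "<AML_WORKSPACE_NAME>",
}
with tempfile.TemporaryDirectory() as tmp_dir:
    (Path(tmp_dir) / "config.json").write_text(json.dumps(demo_config), encoding="utf-8")
    loaded = load_workspace_config(path=tmp_dir)

print(loaded["workspace_name"])
```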
try:
    pf = PFClient.from_config(credential=credential)
except Exception as ex:
    # NOTE: Update the following workspace information if it was not configured correctly beforehand
    client_config = {
        "subscription_id": "<SUBSCRIPTION_ID>",
        "resource_group": "<RESOURCE_GROUP>",
        "workspace_name": "<AML_WORKSPACE_NAME>",
    }

    if client_config["subscription_id"].startswith("<"):
        print(
            "please update your <SUBSCRIPTION_ID> <RESOURCE_GROUP> <AML_WORKSPACE_NAME> in notebook cell"
        )
        raise ex
    else:  # write and reload from config file
        import json, os

        config_path = "../.azureml/config.json"
        os.makedirs(os.path.dirname(config_path), exist_ok=True)
        with open(config_path, "w") as fo:
            fo.write(json.dumps(client_config))
        pf = PFClient.from_config(credential=credential, path=config_path)
print(pf)
To create an Azure Machine Learning job, you need a compute cluster as a prerequisite. The code below ensures that a compute cluster named cpu-cluster exists in your workspace.
from azure.ai.ml import MLClient
from azure.ai.ml.entities import AmlCompute

# MLClient uses the same configuration as PFClient
ml_client = MLClient.from_config(credential=credential)

# Specify the AML compute name.
cpu_compute_target = "cpu-cluster"

try:
    ml_client.compute.get(cpu_compute_target)
except Exception:
    print("Creating a new cpu compute target...")
    compute = AmlCompute(
        name=cpu_compute_target, size="STANDARD_D2_V2", min_instances=0, max_instances=4
    )
    ml_client.compute.begin_create_or_update(compute).result()
# TODO: set up connections