Configuration

This notebook sets up your Azure Machine Learning workspace and configures the resources it needs.

Requirements - To benefit from this tutorial, you will need:

An Azure account with an active subscription (you can create an account for free).
An Azure ML workspace.
A Python environment.
The dependent packages for the samples, installed via pip install -r requirements.txt.

Learning Objectives - By the end of this tutorial, you should be able to:

Connect to your AML workspace from the Python SDK using different auth credentials
Create a workspace config file

Motivations - This notebook covers the scenario in which users define components using YAML and then use these components to build a pipeline.

1. Import the required libraries & set dependent environment variables

In [ ]:

# Import required libraries
from promptflow.azure import PFClient

2. Configure credential

We use DefaultAzureCredential to get access to the workspace. When an access token is needed, it requests one using multiple identities (EnvironmentCredential, ManagedIdentityCredential, SharedTokenCacheCredential, VisualStudioCodeCredential, AzureCliCredential, AzurePowerShellCredential) in turn, stopping when one provides a token. See the reference here for more information.

DefaultAzureCredential should be capable of handling most Azure SDK authentication scenarios. If it does not work for you, see the reference here for all available credentials.

In [ ]:

from azure.identity import (
    InteractiveBrowserCredential,
    DefaultAzureCredential,
)

try:
    credential = DefaultAzureCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()
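
The "try each identity in turn, stop at the first one that provides a token" behavior described above can be illustrated with a small, self-contained sketch. This is not the azure-identity implementation: CredentialUnavailable, first_available_token, and the two stand-in sources are hypothetical names invented for illustration, and the token string is fake.

```python
from typing import Callable, List, Sequence, Tuple


class CredentialUnavailable(Exception):
    """Hypothetical error: a credential source cannot supply a token."""


def first_available_token(
    sources: Sequence[Tuple[str, Callable[[], str]]]
) -> Tuple[str, str]:
    """Try each (name, getter) pair in order and return the first token obtained."""
    errors: List[str] = []
    for name, get_token in sources:
        try:
            return name, get_token()
        except CredentialUnavailable as exc:
            # Record why this source failed, then move on to the next one.
            errors.append(f"{name}: {exc}")
    raise CredentialUnavailable("no source provided a token: " + "; ".join(errors))


def from_environment() -> str:
    # Stand-in for a source that is not configured on this machine.
    raise CredentialUnavailable("environment variables not set")


def from_cli() -> str:
    # Stand-in for a source that succeeds; the token value is fake.
    return "fake-token-from-cli"


chosen_name, chosen_token = first_available_token(
    [("environment", from_environment), ("cli", from_cli)]
)
print(f"token provided by: {chosen_name}")
```

The order of the list matters in this sketch: a source earlier in the list wins whenever it can supply a token, which is why a chain like this can pick up an unexpected identity on a machine where several sources are configured.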

3. Connect to Azure Machine Learning Workspace

The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run.

Check this notebook for creating a workspace.

To connect to a workspace, we need its identifier parameters: a subscription ID, a resource group, and a workspace name. The config details of a workspace can be saved to a file from the Azure Machine Learning portal. Click on the name of the portal on the top right corner to see the link to save the config file. This config file can be used to load a workspace using PFClient. If no path is given, it defaults to the current folder. If no file name is given, it defaults to config.json.

In [ ]:

try:
    pf = PFClient.from_config(credential=credential)
except Exception as ex:
    # NOTE: Update the following workspace information if it was not configured correctly before
    client_config = {
        "subscription_id": "<SUBSCRIPTION_ID>",
        "resource_group": "<RESOURCE_GROUP>",
        "workspace_name": "<AML_WORKSPACE_NAME>",
    }

    if client_config["subscription_id"].startswith("<"):
        print(
            "Please update <SUBSCRIPTION_ID>, <RESOURCE_GROUP>, and <AML_WORKSPACE_NAME> in this notebook cell."
        )
        raise ex
    else:  # write and reload from config file
        import json
        import os

        config_path = "../.azureml/config.json"
        os.makedirs(os.path.dirname(config_path), exist_ok=True)
        with open(config_path, "w") as fo:
            fo.write(json.dumps(client_config))
        pf = PFClient.from_config(credential=credential, path=config_path)
print(pf)
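
The write-then-reload fallback in the cell above can also be factored into a small helper that refuses to write placeholder values. The sketch below uses only the standard library. write_workspace_config and REQUIRED_KEYS are hypothetical names, the three keys are the ones used in the cell above, and the example values are made up. It writes into a temporary directory so it does not touch a real config file.

```python
import json
import tempfile
from pathlib import Path

# The three identifiers used in the workspace config shown in the cell above.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def write_workspace_config(config: dict, directory: Path) -> Path:
    """Validate the config and write it to <directory>/.azureml/config.json."""
    for key in REQUIRED_KEYS:
        value = config.get(key, "")
        # Reject missing values and untouched "<PLACEHOLDER>" strings.
        if not value or value.startswith("<"):
            raise ValueError(f"{key} is missing or still a placeholder")
    config_path = directory / ".azureml" / "config.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))
    return config_path


# Made-up example values, for illustration only.
example_config = {
    "subscription_id": "00000000-0000-0000-0000-000000000000",
    "resource_group": "example-rg",
    "workspace_name": "example-workspace",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    written_path = write_workspace_config(example_config, Path(tmp_dir))
    # Read the file back to confirm it round-trips as JSON.
    loaded_config = json.loads(written_path.read_text())

print(sorted(loaded_config))
```

In a real notebook, the returned path could be passed as the path argument to PFClient.from_config, as the cell above does with config_path.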

4. Retrieve or create an Azure Machine Learning compute target

To create an Azure Machine Learning job, you need a compute cluster as a prerequisite. The code below ensures that a compute cluster named cpu-cluster exists in your workspace.

In [ ]:

from azure.ai.ml import MLClient
from azure.ai.ml.entities import AmlCompute

# MLClient uses the same configuration as PFClient
ml_client = MLClient.from_config(credential=credential)

# Specify the AML compute name.
cpu_compute_target = "cpu-cluster"

try:
    ml_client.compute.get(cpu_compute_target)
except Exception:
    print("Creating a new cpu compute target...")
    compute = AmlCompute(
        name=cpu_compute_target, size="STANDARD_D2_V2", min_instances=0, max_instances=4
    )
    ml_client.compute.begin_create_or_update(compute).result()

In [ ]:

# TODO: set up connections