In this notebook, we will use the Modzy Python SDK to submit a batch of data to the Sentiment Analysis model for inference. We will generate a batch of data from Kaggle's Amazon Reviews dataset.
For more detailed usage documentation for our Python SDK, visit our GitHub page.
Create a virtual environment (venv, conda, or other preferred virtual environment) with Python 3.6 or newer.
Pip install the required packages (at minimum, the `modzy` SDK) in your environment.
Then install Jupyter Notebook in your preferred environment using the appropriate install instructions.
Insert your instance URL and personal API key to establish a connection to the Modzy API client.
# Import Libraries
from modzy import ApiClient, error
from pprint import pprint
# the url we will use for authentication
'''
Example: "https://test.modzy.url"
'''
API_URL = "https://<your.modzy.url>"
# the api key we will be using for authentication -- make sure to paste in your personal API access key below
API_KEY = "<your.api.key>"
# setup our API Client
client = ApiClient(base_url=API_URL, api_key=API_KEY)
In this notebook, we will run inference with the Sentiment Analysis model.
# Query model by name
auto_model_info = client.models.get_by_name("Sentiment Analysis")
pprint(auto_model_info)
{'author': 'Open Source',
 'description': 'This model gives sentiment scores showing the polarity and '
                'strength of the emotions in text.',
 'features': [{'description': 'This model has a built-in explainability '
                              'feature. Click '
                              '[here](https://arxiv.org/abs/1602.04938) to '
                              'read more about model explainability.',
               'identifier': 'built-in-explainability',
               'name': 'Explainable'}],
 'images': [{'caption': 'Sentiment Analysis',
             'relationType': 'background',
             'url': '/modzy-images/ed542963de/image_background.png'},
            {'caption': 'Sentiment Analysis',
             'relationType': 'card',
             'url': '/modzy-images/ed542963de/image_card.png'},
            {'caption': 'Sentiment Analysis',
             'relationType': 'thumbnail',
             'url': '/modzy-images/ed542963de/image_thumbnail.png'},
            {'caption': 'Open Source',
             'relationType': 'logo',
             'url': '/modzy-images/companies/open-source/company-image.jpg'}],
 'isActive': True,
 'isCommercial': False,
 'isRecommended': True,
 'lastActiveDateTime': '2022-05-24T02:11:45.229+00:00',
 'latestActiveVersion': '1.0.1',
 'latestVersion': '1.0.27',
 'modelId': 'ed542963de',
 'name': 'Sentiment Analysis',
 'permalink': 'ed542963de-open-source-sentiment-analysis',
 'snapshotImages': [],
 'tags': [{'dataType': 'Input Type',
           'identifier': 'text',
           'isCategorical': True,
           'name': 'Text'},
          {'dataType': 'Task',
           'identifier': 'label_or_classify',
           'isCategorical': True,
           'name': 'Label or Classify'},
          {'dataType': 'Tags',
           'identifier': 'sentiment_analysis',
           'isCategorical': False,
           'name': 'Sentiment Analysis'},
          {'dataType': 'Tags',
           'identifier': 'text_analytics',
           'isCategorical': False,
           'name': 'Text Analytics'},
          {'dataType': 'Subject',
           'identifier': 'language_and_text',
           'isCategorical': True,
           'name': 'Language and Text'}],
 'versions': ['0.0.28', '1.0.1', '1.0.27', '0.0.27'],
 'visibility': ApiObject({ "scope": "ALL" })}
# Define Variables for Inference
MODEL_ID = auto_model_info["modelId"]
MODEL_VERSION = auto_model_info["latestActiveVersion"]
INPUT_FILENAME = list(
    client.models.get_version_input_sample(MODEL_ID, MODEL_VERSION)["input"]["sources"]["0001"].keys()
)[0]
After reading in the 1000 samples in the test subset of the Amazon reviews dataset, we will create a batch of 500 reviews to submit for inference.
with open("test.ft.txt", "r", encoding="utf-8") as f:
    text = f.readlines()[:500]  # extract the first 500 reviews

# clean reviews before feeding them to the model: drop the fastText
# "__label__<n> " prefix and the trailing newline
text_cleaned = [t.split("__label__")[-1][2:].replace("\n", "") for t in text]

# map each review to a uniquely named input with the filename the model expects
sources = {"review_{}".format(i): {"input.txt": review} for i, review in enumerate(text_cleaned)}
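The cleaning step strips the fastText `__label__<n> ` prefix and the trailing newline from each line. A minimal self-contained check of that transformation, using a made-up review line in the same format (the text content is hypothetical, not taken from the dataset):

```python
# A hypothetical line in fastText format, mimicking a row of test.ft.txt
sample = "__label__2 Great CD: My lovely Pat has one of the GREAT voices.\n"

# Same cleaning logic as above: split off the "__label__" marker, drop the
# "<n> " label digit and space, and remove the newline
cleaned = sample.split("__label__")[-1][2:].replace("\n", "")
print(cleaned)  # Great CD: My lovely Pat has one of the GREAT voices.

# The batch dict then maps a unique input name to {filename: review text}
sources_sketch = {"review_{}".format(i): {"input.txt": r} for i, r in enumerate([cleaned])}
```

Note that this slicing assumes single-digit labels (`__label__1`, `__label__2`), which is the case for the two-class Amazon reviews dataset.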
def get_model_output(model_identifier, model_version, data_sources, explain=False):
    """Submit a text job and block until the results are ready.

    Args:
        model_identifier: model identifier (string)
        model_version: model version (string)
        data_sources: dictionary with the appropriate filename --> input text key-value pairs
        explain: boolean, defaults to False. If True, the model returns an explainable result
    """
    job = client.jobs.submit_text(model_identifier, model_version, data_sources, explain)
    result = client.results.block_until_complete(job, timeout=None)
    return result
model_results = get_model_output(MODEL_ID, MODEL_VERSION, sources, explain=False)
first_review_results = model_results["results"]["review_0"]["results.json"]["data"]["result"]
pprint(first_review_results)
{'classPredictions': [ApiObject({ "class": "neutral", "score": 0.716 }),
                      ApiObject({ "class": "positive", "score": 0.214 }),
                      ApiObject({ "class": "negative", "score": 0.07 })]}
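Each review's result is a list of class/score pairs, so reducing a result to its top label is just a matter of taking the highest-scoring entry. A minimal sketch, using plain dicts as a stand-in for the `ApiObject` entries shown above:

```python
# Stand-in for one result's classPredictions (plain dicts instead of ApiObject)
predictions = [
    {"class": "neutral", "score": 0.716},
    {"class": "positive", "score": 0.214},
    {"class": "negative", "score": 0.07},
]

# Pick the highest-scoring class for this review
top = max(predictions, key=lambda p: p["score"])
print(top["class"], top["score"])  # neutral 0.716
```

The same pattern can be applied across `model_results["results"]` to tally top labels for the whole batch.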