#!/usr/bin/env python
# coding: utf-8

# # ⭐ Scaling Machine Learning in Three Weeks course
# # - Week 3:
# ## Deployment
#
# **Prerequisite**
# Run the notebooks `week-3.0-data-prep-for-training` and `week-3.0-evaluate-and-automate-pipelines.ipynb` first.
#
# In this exercise, you will use:
# * deployment in a batch setting
#
# This exercise is part of the [Scaling Machine Learning with Spark book](https://learning.oreilly.com/library/view/scaling-machine-learning/9781098106812/)
# available on the O'Reilly platform or on [Amazon](https://amzn.to/3WgHQvd).
#

# In[17]:


import mlflow
import mlflow.spark
from pyspark.sql.types import ArrayType, StringType
from pyspark.sql.functions import col, struct
from pyspark.ml.regression import LinearRegression, LinearRegressionModel
from pyspark.sql import SparkSession


# In[18]:


spark = SparkSession.builder \
    .master('local[*]') \
    .appName("deployment") \
    .getOrCreate()


# ### ✅ **Task 1 :**
# ### Move model from Model folder to Best Model
#
# Now that we have a model that gives us good results, it's time to move it to the next phase.

# In[19]:


# Location of the model saved by the previous notebook
model_path = "../models/linearRegression_model"


# In[20]:


# Load the trained model back from disk
restored_mllib_model = LinearRegressionModel.load(model_path)


# In[21]:


# Promote the model by saving a copy under the best_model path.
# save() raises an error if the target path already exists;
# use .write().overwrite().save(...) when re-running this cell.
restored_mllib_model.save("../models/best_model")


# ### ✅ **Task 2 :** Use the model for prediction in production
#
# Imagine that `best_model` has been deployed to production.
# That means a new app will load the model and use it with Spark.
# So now there is a production DataFrame to score.
#
# Write the functionality to load the model and use it to generate predictions for the production DataFrame in a batch setting.

# In[22]:


# your code goes here
# ...


# How is it different from what you have done so far?
#
# Share your response in the chat!

# In[ ]: