#!/usr/bin/env python
# coding: utf-8

# # ⭐ Scaling Machine Learning in Three Weeks course
# # - Week 3:
# ##  Deployment
# 
# **Prerequisite**
# Run the notebooks `week-3.0-data-prep-for-training` and `week-3.0-evaluate-and-automate-pipelines.ipynb` before this one.
# 
# 
# In this exercise, you will use:
#  * deployments in a batch setting
# 
# This exercise is part of the [Scaling Machine Learning with Spark book](https://learning.oreilly.com/library/view/scaling-machine-learning/9781098106812/)
# available on the O'Reilly platform or on [Amazon](https://amzn.to/3WgHQvd).
# 

# In[17]:


import mlflow
import mlflow.spark
from pyspark.sql.types import ArrayType, StringType
from pyspark.sql.functions import col, struct
from pyspark.ml.regression import LinearRegression, LinearRegressionModel
from pyspark.sql import SparkSession 


# In[18]:


spark = SparkSession.builder \
    .master('local[*]') \
    .appName("deployment") \
    .getOrCreate()


#  ### ✅ **Task 1 :**  Move the model from the Model folder to Best Model
#  
#  Now that we have a model that gives us good results, it's time to move it to the next phase.

# In[19]:


model_path =  "../models/linearRegression_model"


# In[20]:


restored_mllib_model = LinearRegressionModel.load(model_path)


# In[21]:


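# write().overwrite() lets this cell be re-run; a plain save() raises an error if best_model already exists
restored_mllib_model.write().overwrite().save("../models/best_model")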
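

# A minimal sanity check for the copy made above, sketched under the assumption
# that both folders hold a saved `LinearRegressionModel`. The helper name
# `same_linear_model` is hypothetical and not part of the course material.
# It reloads both copies and compares the learned parameters.

# In[ ]:


from pyspark.ml.regression import LinearRegressionModel


def same_linear_model(path_a: str, path_b: str) -> bool:
    """Return True when two saved linear regression models hold identical parameters."""
    model_a = LinearRegressionModel.load(path_a)
    model_b = LinearRegressionModel.load(path_b)
    # Compare the coefficient vectors element by element, then the intercepts
    same_coefficients = (
        model_a.coefficients.toArray().tolist()
        == model_b.coefficients.toArray().tolist()
    )
    return same_coefficients and model_a.intercept == model_b.intercept


# Possible usage once the save cell above has run:
# same_linear_model("../models/linearRegression_model", "../models/best_model")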


# ### ✅ **Task 2 :**  Use the model for prediction in production
# 
# Imagine that best_model has been deployed to production.
# That means there is a new app that loads the model and leverages it with Spark.
# So now there is a production dataframe.
# 
# Write the functionality to load the model and use it to predict on the production dataframe in a batch setting.

# In[22]:


# your code goes here
# ...
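

# One possible sketch of the batch-scoring step, not the reference solution.
# Assumptions: the production dataframe already carries the same assembled
# feature vector column the model was trained on, and the function name
# `predict_batch` is hypothetical. The default path is the best_model folder
# written in Task 1.

from pyspark.ml.regression import LinearRegressionModel
from pyspark.sql import DataFrame


def predict_batch(production_df: DataFrame,
                  model_path: str = "../models/best_model") -> DataFrame:
    """Load the saved model and score the whole dataframe in a single batch."""
    model = LinearRegressionModel.load(model_path)
    # transform() returns a new dataframe with the model's prediction column
    # appended (named "prediction" unless predictionCol was changed)
    return model.transform(production_df)


# Possible usage; the input path is a placeholder, not a path from the course:
# production_df = spark.read.parquet("<path-to-production-data>")
# predict_batch(production_df).select("prediction").show(5)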


# How is it different from what you have done so far? 
# 
# Share your response in the chat!

# In[ ]: