Name: Ali Gowani
Contact: https://www.linkedin.com/in/aliagowani/
Title: Regression Experiment for Intelligent Contact Center Employee Performance
Pycaret Version: 2.1
Created: Monday, August 24, 2020
Updated: Thursday, September 3, 2020
Use Jupyter Notebook Viewer to view this notebook properly: https://nbviewer.jupyter.org/github/aliagowani/Pycaret_2.1_Regression_EmployeePerformance/blob/master/Pycaret_2.1_Regression_EmployeePerformance.ipynb
We are going to utilize a low-code Machine Learning Python library, Pycaret (version 2.1), to predict the First Call Resolution (FCR) metric for Customer Service Agents (Employees) in Call Centers. FCR is an important metric in a call center as it indicates the percentage of issues that were resolved when the customer called the first time. We want to ensure that customers do not keep calling back to resolve an issue as it costs the company money when the issue is not resolved the first time.
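As a quick illustration of the metric itself, FCR is simply the share of issues resolved on the first call, expressed as a percentage. A minimal sketch (the call counts here are hypothetical):

```python
# Minimal illustration of how FCR is computed (hypothetical counts).
def first_call_resolution(resolved_on_first_call, total_issues):
    """Return First Call Resolution as a percentage."""
    return 100 * resolved_on_first_call / total_issues

print(first_call_resolution(85, 100))  # → 85.0
```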
Below is the approach we will take to predict a Customer Service Agent's (Contact Agent's) FCR metric:
We will leverage real-world data from a business process outsourcer (BPO) that supports many Fortune 500 companies. *Note: as this is a real dataset, it has been sanitized of personal information.
Let's get started!
# Import libraries for data processing.
import numpy as np
import pandas as pd
import warnings
import time
warnings.filterwarnings('ignore')
# Import libraries for visualization and set default values.
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use(['seaborn'])
from sklearn import set_config
set_config(display='text')
# Install and import Pycaret library for transformation and classification.
# !pip install pycaret
from pycaret.regression import *
# Confirm Pycaret version is 2.1
from pycaret.utils import version
print('Confirm Pycaret version is 2.1.X?')
print('Pycaret Version: ', version())
Confirm Pycaret version is 2.1.X? Pycaret Version: 2.1.2
# Load Dataset.
url = 'https://raw.githubusercontent.com/aliagowani/Pycaret_2.1_Regression_EmployeePerformance/master/employee_performance.csv?token=AMLWIYQHZO4XANFWX3IP5B27LGQIY'
dataset = pd.read_csv(url)
# Check shape of dataset and view first few observations to ensure data loaded correctly.
print("Shape of dataset (observations, features):", dataset.shape)
dataset.head(5).round(2)
Shape of dataset (observations, features): (102, 19)
Agent_ID | Friday | Monday | Saturday | Sunday | Thursday | Tuesday | Wednesday | Site | Function_Field | tenure | Total number of calls | Assistance | Recommend | CSat | total coaching | total coaching improved | Actual Value | FCR Week before | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 384091 | 100.00 | 90.00 | 90.00 | 96.15 | 100.00 | 96.88 | 100.00 | Kuala Lumpur | Agent | 33 | 163 | 95.06 | 94.23 | 4.87 | 0 | 0 | 85.71 | 97.14 |
1 | 369185 | 94.00 | 100.00 | 96.87 | 96.87 | 96.10 | 95.89 | 98.33 | Kuala Lumpur | Agent | 33 | 306 | 95.08 | 94.67 | 4.83 | 0 | 0 | 100.00 | 88.64 |
2 | 360854 | 94.44 | 80.00 | 92.94 | 92.94 | 100.00 | 93.94 | 96.30 | Kuala Lumpur | Agent | 32 | 138 | 94.16 | 94.74 | 4.82 | 2 | 1 | 80.00 | 92.31 |
3 | 374407 | 100.00 | 95.00 | 96.63 | 96.63 | 92.50 | 95.65 | 100.00 | Kuala Lumpur | Agent | 32 | 148 | 98.65 | 95.83 | 4.84 | 0 | 0 | 94.44 | 100.00 |
4 | 372496 | 96.88 | 95.83 | 94.28 | 94.28 | 83.33 | 95.35 | 100.00 | Kuala Lumpur | Agent | 29 | 142 | 97.18 | 98.55 | 4.86 | 0 | 0 | 100.00 | 90.91 |
# Below is a high-level description of each feature. The dataset comprises 102 agents and 19 features. The data covers the period from June 1, 2020 to July 31, 2020, with 'Actual Value' holding the actual FCR value of each agent for August 7, 2020. The goal is to create a model that predicts the FCR an employee will have at the end of the week (Friday).
# 'Agent_ID': unique identifier of the employee or agent.
# 'Friday', 'Monday', 'Saturday', 'Sunday', 'Thursday', 'Tuesday', 'Wednesday': shows the FCR percent for each agent as an aggregate (mean). The higher the percentage the higher the rate at which the customer's issue was resolved in the first call.
# 'Site': city location of the call center.
# 'Function_Field': this is the role of the employee. In our case, it should all be Agents.
# 'tenure': tenure of the agent at the company in months.
# 'Total number of calls': number of all phone calls taken by the agent in the given timeframe.
# 'Assistance': percentage of time the agent needed to escalate to a supervisor for additional assistance.
# 'Recommend': percentage of time the agent would be recommended by the customer to resolve an issue.
# 'CSat': average customer-satisfaction survey response, on a scale from 1 (least satisfied) to 5 (most satisfied).
# 'total coaching': number of times the agent received coaching sessions from supervisor to improve FCR metric in the given timeframe.
# 'total coaching improved': number of times the agent's FCR value increased after a week from the initial coaching.
# 'Actual Value': the label we are trying to predict. It is the next Friday (August 7, 2020) FCR value for the agent.
# 'FCR Week before': a strong indicator is how well the agent performed the previous Friday. This is the FCR value for the Friday before the predicted value ('Actual Value').
# Describe the features in the dataset, such as, count, mean, standard deviation, min, max, etc.
dataset.describe().round(2)
Agent_ID | Friday | Monday | Saturday | Sunday | Thursday | Tuesday | Wednesday | tenure | Total number of calls | Assistance | Recommend | CSat | total coaching | total coaching improved | Actual Value | FCR Week before | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 | 102.00 |
mean | 371097.88 | 95.65 | 94.10 | 95.55 | 95.19 | 94.91 | 94.97 | 96.54 | 14.24 | 173.94 | 95.83 | 95.62 | 4.81 | 1.61 | 0.88 | 96.04 | 95.58 |
std | 10742.44 | 4.36 | 10.47 | 3.56 | 3.94 | 5.45 | 4.68 | 3.83 | 6.85 | 77.23 | 1.78 | 2.00 | 0.09 | 1.78 | 1.07 | 5.70 | 4.67 |
min | 353039.00 | 77.78 | 0.00 | 80.56 | 80.00 | 75.00 | 80.00 | 83.33 | 3.00 | 52.00 | 89.66 | 87.16 | 4.41 | 0.00 | 0.00 | 75.00 | 80.00 |
25% | 362108.00 | 93.75 | 92.45 | 93.87 | 93.28 | 92.31 | 92.94 | 95.00 | 8.00 | 116.25 | 94.71 | 94.67 | 4.77 | 0.00 | 0.00 | 94.12 | 92.92 |
50% | 371781.00 | 96.15 | 94.94 | 95.83 | 96.01 | 96.00 | 95.86 | 97.33 | 15.00 | 158.00 | 95.81 | 95.59 | 4.82 | 1.00 | 1.00 | 100.00 | 96.49 |
75% | 380697.75 | 100.00 | 100.00 | 97.56 | 97.57 | 100.00 | 98.15 | 100.00 | 17.00 | 226.25 | 96.99 | 97.11 | 4.87 | 2.00 | 1.00 | 100.00 | 100.00 |
max | 388627.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 33.00 | 388.00 | 100.00 | 100.00 | 4.98 | 9.00 | 4.00 | 100.00 | 100.00 |
# Create a correlation for the dataset.
dataset_corr = dataset.corr().round(4)
# Drop Agent_ID from correlation dataset.
dataset_corr = dataset_corr.drop(["Agent_ID"], axis=1)
dataset_corr = dataset_corr.drop(["Agent_ID"], axis=0)
# Create a mask for the upper triangle so only the lower-left half of the correlation matrix is shown.
mask = np.zeros_like(dataset_corr.round(4))
mask[np.triu_indices_from(mask)] = True
# Generate the correlation matrix (heatmap) using Seaborn.
with sns.axes_style("whitegrid"):
f, ax = plt.subplots(figsize=(12, 10))
ax = sns.heatmap(dataset_corr.round(2), mask=mask, vmax=1, center = 0, vmin=-1, square=True, cmap='PuOr', linewidths=.5, annot=True, annot_kws={"size": 12}, fmt='.1f')
plt.title('Heatmap (Correlations) of Features in the Dataset', fontsize=15)
plt.xlabel('Features', fontsize=15)
plt.ylabel('Features', fontsize=15)
plt.show()
# Visualize the FCR for each day in a probability density chart.
facet = sns.FacetGrid(dataset, aspect = 3, height=5)
facet.map(sns.kdeplot, 'Sunday', shade = True, color='#4E79A7')
facet.map(sns.kdeplot, 'Monday', shade = True, color='#F28E2B')
facet.map(sns.kdeplot, 'Tuesday', shade = True, color='#59A14F')
facet.map(sns.kdeplot, 'Wednesday', shade = True, color='#E15759')
facet.map(sns.kdeplot, 'Thursday', shade = True, color='#B6992D')
facet.map(sns.kdeplot, 'Friday', shade = True, color='#499894')
facet.map(sns.kdeplot, 'Saturday', shade = True, color='#B07AA1')
facet.set(xlim=(0, dataset[['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']].max().max()))
facet.add_legend()
plt.title('First Call Resolution for Days.', fontsize=12)
plt.ylabel('Probability Density', fontsize=12)
plt.xlabel('First Call Resolution (FCR)', fontsize=12)
plt.show()
# Visualize the dispersion of FCR values in a given day and identify the outliers.
# Create a color set that matches the probability density chart.
my_pal = {'Sunday': '#4E79A7','Monday': '#F28E2B', 'Tuesday': '#59A14F', 'Wednesday': '#E15759', 'Thursday': '#B6992D', 'Friday': '#499894', 'Saturday': '#B07AA1'}
# Generate a boxplot using Seaborn.
dataset_boxplot = pd.DataFrame(data = dataset, columns = ['Sunday','Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'])
plt.figure(figsize=(15,5))
sns.boxplot(x="value", y="variable", data=pd.melt(dataset_boxplot), color='#cccccc')
sns.swarmplot(x="value", y="variable", data=pd.melt(dataset_boxplot), palette=my_pal, alpha=0.5)
plt.title('First Call Resolution by Days.', fontsize=12)
plt.ylabel('Days', fontsize=12)
plt.xlabel('FCR', fontsize=12)
plt.show()
# Transform the dataset (transform, bin and create dummy variables) and split it. In addition, we log the experiments and their plots so they can be viewed later with MLflow.
reg_fcr = setup(data=dataset, target='Actual Value', session_id=786, transformation=True, normalize=False, train_size=0.75, numeric_features=['Sunday', 'Monday', 'Saturday', 'Thursday', 'Tuesday', 'Wednesday', 'Friday', 'Total number of calls', 'CSat', 'total coaching', 'total coaching improved', 'FCR Week before'], remove_outliers=True, outliers_threshold=0.05, remove_multicollinearity=True, multicollinearity_threshold=0.9, feature_selection=True, feature_interaction=True, silent=False, ignore_features=['Agent_ID'], combine_rare_levels=True, polynomial_features=True, trigonometry_features=True, feature_selection_threshold=0.4, feature_selection_method='classic', folds_shuffle=True, pca=True, log_experiment=True, experiment_name='reg_fcr_experiments', log_plots=True)
Setup Succesfully Completed.
Description | Value | |
---|---|---|
0 | session_id | 786 |
1 | Transform Target | False |
2 | Transform Target Method | None |
3 | Original Data | (102, 19) |
4 | Missing Values | False |
5 | Numeric Features | 16 |
6 | Categorical Features | 2 |
7 | Ordinal Features | False |
8 | High Cardinality Features | False |
9 | High Cardinality Method | None |
10 | Sampled Data | (96, 19) |
11 | Transformed Train Set | (72, 31) |
12 | Transformed Test Set | (24, 31) |
13 | Numeric Imputer | mean |
14 | Categorical Imputer | constant |
15 | Normalize | False |
16 | Normalize Method | None |
17 | Transformation | True |
18 | Transformation Method | yeo-johnson |
19 | PCA | True |
20 | PCA Method | linear |
21 | PCA Components | 0.990000 |
22 | Ignore Low Variance | False |
23 | Combine Rare Levels | True |
24 | Rare Level Threshold | 0.100000 |
25 | Numeric Binning | False |
26 | Remove Outliers | True |
27 | Outliers Threshold | 0.050000 |
28 | Remove Multicollinearity | True |
29 | Multicollinearity Threshold | 0.900000 |
30 | Clustering | False |
31 | Clustering Iteration | None |
32 | Polynomial Features | True |
33 | Polynomial Degree | 2 |
34 | Trignometry Features | True |
35 | Polynomial Threshold | 0.100000 |
36 | Group Features | False |
37 | Feature Selection | True |
38 | Features Selection Threshold | 0.400000 |
39 | Feature Interaction | True |
40 | Feature Ratio | False |
41 | Interaction Threshold | 0.010000 |
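The setup summary reports "Transformation Method: yeo-johnson". Outside Pycaret, the same power transform is available through scikit-learn's PowerTransformer; a standalone sketch on a synthetic skewed feature (the exponential data below is purely illustrative, standing in for a column like 'Total number of calls'):

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Synthetic right-skewed feature (illustrative only).
rng = np.random.default_rng(786)
calls = rng.exponential(scale=170, size=(102, 1))

# Yeo-Johnson (the method in the setup summary) makes the distribution
# more Gaussian-like; standardize=True also zero-centers and unit-scales it.
pt = PowerTransformer(method='yeo-johnson', standardize=True)
calls_t = pt.fit_transform(calls)

print(calls_t.mean().round(4), calls_t.std().round(4))
```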
# Below is a list of models that Pycaret can use for regression. The ID for each regression can be used to include or exclude models for various functions.
models()
Name | Reference | Turbo | |
---|---|---|---|
ID | |||
lr | Linear Regression | sklearn.linear_model.LinearRegression | True |
lasso | Lasso Regression | sklearn.linear_model.Lasso | True |
ridge | Ridge Regression | sklearn.linear_model.Ridge | True |
en | Elastic Net | sklearn.linear_model.ElasticNet | True |
lar | Least Angle Regression | sklearn.linear_model.Lars | True |
llar | Lasso Least Angle Regression | sklearn.linear_model.LassoLars | True |
omp | Orthogonal Matching Pursuit | sklearn.linear_model.OMP | True |
br | Bayesian Ridge | sklearn.linear_model.BayesianRidge | True |
ard | Automatic Relevance Determination | sklearn.linear_model.ARDRegression | False |
par | Passive Aggressive Regressor | sklearn.linear_model.PAR | True |
ransac | Random Sample Consensus | sklearn.linear_model.RANSACRegressor | True |
tr | TheilSen Regressor | sklearn.linear_model.TheilSenRegressor | True |
huber | Huber Regressor | sklearn.linear_model.HuberRegressor | True |
kr | Kernel Ridge | sklearn.kernel_ridge.KernelRidge | False |
svm | Support Vector Machine | sklearn.svm.SVR | True |
knn | K Neighbors Regressor | sklearn.neighbors.KNeighborsRegressor | True |
dt | Decision Tree | sklearn.tree.DecisionTreeRegressor | True |
rf | Random Forest | sklearn.ensemble.RandomForestRegressor | True |
et | Extra Trees Regressor | sklearn.ensemble.ExtraTreesRegressor | True |
ada | AdaBoost Regressor | sklearn.ensemble.AdaBoostRegressor | True |
gbr | Gradient Boosting Regressor | sklearn.ensemble.GradientBoostingRegressor | True |
mlp | Multi Level Perceptron | sklearn.neural_network.MLPRegressor | False |
xgboost | Extreme Gradient Boosting | xgboost.readthedocs.io | True |
lightgbm | Light Gradient Boosting Machine | github.com/microsoft/LightGBM | True |
catboost | CatBoost Regressor | https://catboost.ai | True |
# We could call compare_models() without assigning it to a variable. Here, however, we select the top 5 models using n_select and assign them to the top5 variable for later use in stacking and blending. We have excluded the 'ransac' (Random Sample Consensus) and 'knn' (K-Nearest Neighbors) models and reduced the default fold value from 10 to 5.
top5 = compare_models(n_select=5, exclude=['ransac', 'knn'], sort='RMSE', fold=5)
Model | MAE | MSE | RMSE | R2 | RMSLE | MAPE | TT (Sec) | |
---|---|---|---|---|---|---|---|---|
0 | Lasso Regression | 3.7130 | 25.3413 | 4.9250 | 0.2483 | 0.0538 | 0.0407 | 0.0035 |
1 | Elastic Net | 3.7677 | 25.4430 | 4.9550 | 0.2274 | 0.0541 | 0.0413 | 0.0037 |
2 | Orthogonal Matching Pursuit | 4.0510 | 27.4890 | 5.1716 | 0.1303 | 0.0556 | 0.0438 | 0.0094 |
3 | CatBoost Regressor | 4.0196 | 28.5505 | 5.2430 | 0.1355 | 0.0574 | 0.0443 | 1.3449 |
4 | Bayesian Ridge | 4.0471 | 28.6559 | 5.2558 | 0.1271 | 0.0571 | 0.0442 | 0.0048 |
5 | Random Forest | 4.1942 | 29.8542 | 5.4258 | -0.1348 | 0.0589 | 0.0455 | 0.1358 |
6 | AdaBoost Regressor | 3.9844 | 31.9570 | 5.6004 | -0.0837 | 0.0613 | 0.0441 | 0.0612 |
7 | Extra Trees Regressor | 4.4468 | 31.7656 | 5.6088 | -0.1164 | 0.0609 | 0.0483 | 0.0992 |
8 | Light Gradient Boosting Machine | 4.3154 | 34.5705 | 5.8095 | -0.1048 | 0.0630 | 0.0472 | 0.0104 |
9 | Gradient Boosting Regressor | 4.3556 | 35.3551 | 5.8438 | -0.4162 | 0.0636 | 0.0474 | 0.0508 |
10 | Lasso Least Angle Regression | 4.5154 | 36.6081 | 5.9376 | -0.1017 | 0.0647 | 0.0497 | 0.0039 |
11 | Support Vector Machine | 3.9780 | 40.7379 | 6.1438 | -0.1249 | 0.0671 | 0.0453 | 0.0032 |
12 | Extreme Gradient Boosting | 4.8463 | 48.0118 | 6.5678 | -1.5531 | 0.0720 | 0.0522 | 0.0316 |
13 | Decision Tree | 4.6632 | 56.7530 | 6.9887 | -2.0352 | 0.0770 | 0.0507 | 0.0042 |
14 | Ridge Regression | 5.3578 | 51.6905 | 7.0107 | -0.8219 | 0.0750 | 0.0572 | 0.0049 |
15 | TheilSen Regressor | 5.8157 | 55.3237 | 7.2968 | -0.8380 | 0.0782 | 0.0621 | 0.7838 |
16 | Linear Regression | 5.5428 | 57.1311 | 7.3383 | -1.0322 | 0.0787 | 0.0591 | 0.0040 |
17 | Passive Aggressive Regressor | 5.6631 | 56.8023 | 7.4761 | -1.3372 | 0.0786 | 0.0602 | 0.0118 |
18 | Least Angle Regression | 5.8003 | 63.1026 | 7.7016 | -1.2282 | 0.0830 | 0.0617 | 0.0127 |
19 | Huber Regressor | 6.1915 | 61.8013 | 7.7522 | -1.3569 | 0.0824 | 0.0656 | 0.0258 |
# Sometimes you want to include the output of compare_models() as a screenshot in a report, but the yellow highlights make it difficult to read. Pycaret has thought of that: the pull() function returns the results as a plain DataFrame that can be sorted by any metric, ascending or descending.
pull().sort_values(by='RMSE', ascending=True)
Model | MAE | MSE | RMSE | R2 | RMSLE | MAPE | TT (Sec) | |
---|---|---|---|---|---|---|---|---|
0 | Lasso Regression | 3.7130 | 25.3413 | 4.9250 | 0.2483 | 0.0538 | 0.0407 | 0.0035 |
1 | Elastic Net | 3.7677 | 25.4430 | 4.9550 | 0.2274 | 0.0541 | 0.0413 | 0.0037 |
2 | Orthogonal Matching Pursuit | 4.0510 | 27.4890 | 5.1716 | 0.1303 | 0.0556 | 0.0438 | 0.0094 |
3 | CatBoost Regressor | 4.0196 | 28.5505 | 5.2430 | 0.1355 | 0.0574 | 0.0443 | 1.3449 |
4 | Bayesian Ridge | 4.0471 | 28.6559 | 5.2558 | 0.1271 | 0.0571 | 0.0442 | 0.0048 |
5 | Random Forest | 4.1942 | 29.8542 | 5.4258 | -0.1348 | 0.0589 | 0.0455 | 0.1358 |
6 | AdaBoost Regressor | 3.9844 | 31.9570 | 5.6004 | -0.0837 | 0.0613 | 0.0441 | 0.0612 |
7 | Extra Trees Regressor | 4.4468 | 31.7656 | 5.6088 | -0.1164 | 0.0609 | 0.0483 | 0.0992 |
8 | Light Gradient Boosting Machine | 4.3154 | 34.5705 | 5.8095 | -0.1048 | 0.0630 | 0.0472 | 0.0104 |
9 | Gradient Boosting Regressor | 4.3556 | 35.3551 | 5.8438 | -0.4162 | 0.0636 | 0.0474 | 0.0508 |
10 | Lasso Least Angle Regression | 4.5154 | 36.6081 | 5.9376 | -0.1017 | 0.0647 | 0.0497 | 0.0039 |
11 | Support Vector Machine | 3.9780 | 40.7379 | 6.1438 | -0.1249 | 0.0671 | 0.0453 | 0.0032 |
12 | Extreme Gradient Boosting | 4.8463 | 48.0118 | 6.5678 | -1.5531 | 0.0720 | 0.0522 | 0.0316 |
13 | Decision Tree | 4.6632 | 56.7530 | 6.9887 | -2.0352 | 0.0770 | 0.0507 | 0.0042 |
14 | Ridge Regression | 5.3578 | 51.6905 | 7.0107 | -0.8219 | 0.0750 | 0.0572 | 0.0049 |
15 | TheilSen Regressor | 5.8157 | 55.3237 | 7.2968 | -0.8380 | 0.0782 | 0.0621 | 0.7838 |
16 | Linear Regression | 5.5428 | 57.1311 | 7.3383 | -1.0322 | 0.0787 | 0.0591 | 0.0040 |
17 | Passive Aggressive Regressor | 5.6631 | 56.8023 | 7.4761 | -1.3372 | 0.0786 | 0.0602 | 0.0118 |
18 | Least Angle Regression | 5.8003 | 63.1026 | 7.7016 | -1.2282 | 0.0830 | 0.0617 | 0.0127 |
19 | Huber Regressor | 6.1915 | 61.8013 | 7.7522 | -1.3569 | 0.0824 | 0.0656 | 0.0258 |
# We can tune our top 5 models dynamically with a higher iteration count (n_iter) to search a larger space for more optimal hyperparameters.
tuned_top5 = [tune_model(i, n_iter=120, optimize='RMSE', fold=5) for i in top5]
MAE | MSE | RMSE | R2 | RMSLE | MAPE | |
---|---|---|---|---|---|---|
0 | 3.6922 | 28.7731 | 5.3641 | 0.2916 | 0.0599 | 0.0419 |
1 | 2.9477 | 11.8495 | 3.4423 | 0.0112 | 0.0351 | 0.0300 |
2 | 4.3477 | 30.2222 | 5.4975 | 0.3182 | 0.0601 | 0.0476 |
3 | 5.1968 | 43.0441 | 6.5608 | 0.1431 | 0.0724 | 0.0579 |
4 | 4.0851 | 29.4094 | 5.4230 | -0.1452 | 0.0578 | 0.0440 |
Mean | 4.0539 | 28.6597 | 5.2575 | 0.1238 | 0.0571 | 0.0443 |
SD | 0.7414 | 9.9248 | 1.0089 | 0.1740 | 0.0121 | 0.0090 |
<Figure size 576x396 with 0 Axes>
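Under the hood, tune_model() runs a randomized search over a predefined hyperparameter grid. The same idea can be sketched with scikit-learn's RandomizedSearchCV; the alpha distribution and the synthetic regression problem below are illustrative assumptions, not Pycaret's actual grid:

```python
from scipy.stats import uniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the 72-row transformed training set.
X, y = make_regression(n_samples=72, n_features=31, noise=10, random_state=786)

# Randomized search: sample 120 candidate alphas, score each with
# 5-fold cross-validated RMSE, and keep the best.
search = RandomizedSearchCV(
    Lasso(),
    param_distributions={'alpha': uniform(0.001, 1.0)},
    n_iter=120,
    cv=5,
    scoring='neg_root_mean_squared_error',
    random_state=786,
)
search.fit(X, y)
print(search.best_params_)
```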
# Blending models is an ensemble method of combining different machine learning algorithms; for regression, the final prediction is the average of the individual models' predictions. Let's try building a blended model from our top 5 models and evaluate the results.
blender_specific = blend_models(estimator_list=tuned_top5, fold=5, optimize='RMSE', choose_better=False)
MAE | MSE | RMSE | R2 | RMSLE | MAPE | |
---|---|---|---|---|---|---|
0 | 3.4107 | 24.5344 | 4.9532 | 0.3960 | 0.0554 | 0.0387 |
1 | 2.7098 | 10.7143 | 3.2733 | 0.1059 | 0.0334 | 0.0276 |
2 | 4.2827 | 27.5811 | 5.2518 | 0.3778 | 0.0574 | 0.0468 |
3 | 4.6942 | 38.4357 | 6.1997 | 0.2349 | 0.0692 | 0.0527 |
4 | 3.4236 | 23.0011 | 4.7959 | 0.1043 | 0.0514 | 0.0369 |
Mean | 3.7042 | 24.8533 | 4.8948 | 0.2438 | 0.0534 | 0.0406 |
SD | 0.7026 | 8.8923 | 0.9458 | 0.1262 | 0.0116 | 0.0086 |
<Figure size 576x396 with 0 Axes>
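As the printed model parameters confirm, blend_models() produces a scikit-learn VotingRegressor, which averages the base models' predictions. A minimal standalone sketch on synthetic data (the estimator choices and alphas below are illustrative, not the tuned values):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import BayesianRidge, ElasticNet, Lasso

# Synthetic stand-in for the training data.
X, y = make_regression(n_samples=96, n_features=10, noise=5, random_state=786)

# A VotingRegressor fits each estimator and averages their predictions.
blender = VotingRegressor([
    ('lasso', Lasso(alpha=0.8)),
    ('en', ElasticNet(alpha=0.8, l1_ratio=0.26)),
    ('br', BayesianRidge()),
])
blender.fit(X, y)
print(blender.predict(X[:3]).round(2))
```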
# Below is a view of the model parameters.
blender_specific
VotingRegressor(estimators=[('Lasso_0', Lasso(alpha=0.777, copy_X=True, fit_intercept=True, max_iter=1000, normalize=False, positive=False, precompute=False, random_state=786, selection='cyclic', tol=0.0001, warm_start=False)), ('Elastic Net_1', ElasticNet(alpha=0.8300000000000001, copy_X=True, fit_intercept=True, l1_ratio=0.26, max_iter=1000, normalize=False, positiv... tol=None)), ('CatBoost Regressor_3', <catboost.core.CatBoostRegressor object at 0x7f9aff702d60>), ('Bayesian Ridge_4', BayesianRidge(alpha_1=0.001, alpha_2=0.0001, alpha_init=None, compute_score=False, copy_X=True, fit_intercept=True, lambda_1=0.3, lambda_2=0.3, lambda_init=None, n_iter=300, normalize=False, tol=0.001, verbose=False))], n_jobs=-1, verbose=False, weights=None)
# Stacking models is an ensemble method based on meta-learning, where a meta model is trained on the predictions of multiple base estimators to generate the final prediction. Let's try building a stacked model from our top 5 models and evaluate the results.
stacker_specific = stack_models(estimator_list=tuned_top5[1:], meta_model=tuned_top5[0], fold=5, optimize='RMSE', choose_better=False)
MAE | MSE | RMSE | R2 | RMSLE | MAPE | |
---|---|---|---|---|---|---|
0 | 3.6171 | 27.1878 | 5.2142 | 0.3307 | 0.0584 | 0.0411 |
1 | 3.0085 | 14.1218 | 3.7579 | -0.1784 | 0.0386 | 0.0306 |
2 | 4.0868 | 27.8063 | 5.2732 | 0.3727 | 0.0582 | 0.0451 |
3 | 5.0266 | 39.0431 | 6.2484 | 0.2228 | 0.0686 | 0.0557 |
4 | 3.3589 | 22.7910 | 4.7740 | 0.1125 | 0.0510 | 0.0360 |
Mean | 3.8196 | 26.1900 | 5.0535 | 0.1720 | 0.0550 | 0.0417 |
SD | 0.6985 | 8.0747 | 0.8073 | 0.1972 | 0.0099 | 0.0085 |
<Figure size 576x396 with 0 Axes>
# Below is a view of the model parameters.
stacker_specific
StackingRegressor(cv=5, estimators=[('ElasticNet_0', ElasticNet(alpha=0.8300000000000001, copy_X=True, fit_intercept=True, l1_ratio=0.26, max_iter=1000, normalize=False, positive=False, precompute=False, random_state=786, selection='cyclic', tol=0.0001, warm_start=False)), ('OrthogonalMatchingPursuit_1', OrthogonalMatchingPursuit(fit_intercept=True, n_nonzero_coef... compute_score=False, copy_X=True, fit_intercept=True, lambda_1=0.3, lambda_2=0.3, lambda_init=None, n_iter=300, normalize=False, tol=0.001, verbose=False))], final_estimator=Lasso(alpha=0.777, copy_X=True, fit_intercept=True, max_iter=1000, normalize=False, positive=False, precompute=False, random_state=786, selection='cyclic', tol=0.0001, warm_start=False), n_jobs=-1, passthrough=True, verbose=0)
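The printed parameters show that stack_models() builds a scikit-learn StackingRegressor: the base estimators' out-of-fold predictions become inputs to the meta model (final_estimator). A standalone sketch on synthetic data (estimator choices and alphas are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import BayesianRidge, ElasticNet, Lasso

# Synthetic stand-in for the training data.
X, y = make_regression(n_samples=96, n_features=10, noise=5, random_state=786)

# Base models' cross-validated predictions feed the meta model;
# passthrough=True also gives the meta model the raw features.
stacker = StackingRegressor(
    estimators=[('en', ElasticNet(alpha=0.8)), ('br', BayesianRidge())],
    final_estimator=Lasso(alpha=0.8),
    cv=5,
    passthrough=True,
)
stacker.fit(X, y)
print(stacker.predict(X[:3]).round(2))
```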
# We can use Pycaret's built-in plot_model() function to generate side-by-side plots: the Cook's Distance Outliers and t-SNE Manifold charts.
fig = plt.figure(figsize=(20,30))
ax = fig.add_subplot(5,2,1)
plot_model(blender_specific, plot='cooks', save=True, verbose=False, scale=1.1)
ax = fig.add_subplot(5,2,2)
plot_model(blender_specific, plot='manifold', save=True, verbose=False, scale=1.1)
plt.savefig('plots_cooks_and_manifold.png', dpi=300, pad_inches=0.25)
plt.show()
# We can use Pycaret's built-in plot_model() function to generate side-by-side plots: the Residuals, Prediction Error, and Cross-Validation (learning) charts. Let's compare the blended and stacked model plots in a side-by-side comparison.
fig = plt.figure(figsize=(25,20))
ax = fig.add_subplot(3,2,1)
plot_model(blender_specific, plot='residuals', save=True, verbose=False, scale=1.1)
ax = fig.add_subplot(3,2,2)
plot_model(stacker_specific, plot='residuals', save=True, verbose=False, scale=1.1)
ax = fig.add_subplot(3,2,3)
plot_model(blender_specific, plot='error', save=True, verbose=False, scale=1.1)
ax = fig.add_subplot(3,2,4)
plot_model(stacker_specific, plot='error', save=True, verbose=False, scale=1.1)
ax = fig.add_subplot(3,2,5)
plot_model(blender_specific, plot='learning', save=True, verbose=False, scale=1.1)
ax = fig.add_subplot(3,2,6)
plot_model(stacker_specific, plot='learning', save=True, verbose=False, scale=1.1)
plt.savefig('plots_blender_vs_stacker.png', dpi=300, pad_inches=0.25)
plt.show()
# We can execute the predict_model() function to generate the blended model's predictions on the hold-out test set.
pred_tuned_blender = predict_model(blender_specific)
Model | MAE | MSE | RMSE | R2 | RMSLE | MAPE | |
---|---|---|---|---|---|---|---|
0 | Voting Regressor | 4.5013 | 35.0045 | 5.9165 | -0.7728 | 0.0619 | 0.0479 |
# Similarly, we can execute the predict_model() function to generate the stacked model's predictions on the hold-out test set.
pred_tuned_stacker = predict_model(stacker_specific)
Model | MAE | MSE | RMSE | R2 | RMSLE | MAPE | |
---|---|---|---|---|---|---|---|
0 | Stacking Regressor | 4.5898 | 39.7708 | 6.3064 | -1.0142 | 0.0657 | 0.049 |
# The blended model seems to perform better on both our train and test sets, so let us finalize it. The finalize_model() function trains the model on the entire dataset.
finalize_blender = finalize_model(blender_specific)
finalize_blender
VotingRegressor(estimators=[('Lasso_0', Lasso(alpha=0.777, copy_X=True, fit_intercept=True, max_iter=1000, normalize=False, positive=False, precompute=False, random_state=786, selection='cyclic', tol=0.0001, warm_start=False)), ('Elastic Net_1', ElasticNet(alpha=0.8300000000000001, copy_X=True, fit_intercept=True, l1_ratio=0.26, max_iter=1000, normalize=False, positiv... tol=None)), ('CatBoost Regressor_3', <catboost.core.CatBoostRegressor object at 0x7f9b02cd9820>), ('Bayesian Ridge_4', BayesianRidge(alpha_1=0.001, alpha_2=0.0001, alpha_init=None, compute_score=False, copy_X=True, fit_intercept=True, lambda_1=0.3, lambda_2=0.3, lambda_init=None, n_iter=300, normalize=False, tol=0.001, verbose=False))], n_jobs=-1, verbose=False, weights=None)
<Figure size 576x396 with 0 Axes>
# predict_model() can then be executed with the final blended model over the entire dataset and the predictions saved to a CSV file.
pred_final_blender = predict_model(finalize_blender, data=dataset)
pred_final_blender.to_csv('pred_final_blender.csv')
pred_final_blender.describe()
Agent_ID | Friday | Monday | Saturday | Sunday | Thursday | Tuesday | Wednesday | tenure | Total number of calls | Assistance | Recommend | CSat | total coaching | total coaching improved | Actual Value | FCR Week before | Label | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 | 102.000000 |
mean | 371097.882353 | 95.650593 | 94.096218 | 95.552565 | 95.186928 | 94.907817 | 94.968364 | 96.544911 | 14.235294 | 173.941176 | 95.832315 | 95.622119 | 4.807622 | 1.607843 | 0.882353 | 96.036176 | 95.578725 | 96.181909 |
std | 10742.442598 | 4.355132 | 10.465289 | 3.558242 | 3.941495 | 5.453729 | 4.675461 | 3.829237 | 6.846516 | 77.234608 | 1.784522 | 1.997818 | 0.089011 | 1.775281 | 1.074402 | 5.700526 | 4.674600 | 2.478407 |
min | 353039.000000 | 77.777778 | 0.000000 | 80.555556 | 80.000000 | 75.000000 | 80.000000 | 83.333333 | 3.000000 | 52.000000 | 89.655172 | 87.162162 | 4.413793 | 0.000000 | 0.000000 | 75.000000 | 80.000000 | 86.818100 |
25% | 362108.000000 | 93.750000 | 92.445055 | 93.873767 | 93.284134 | 92.307692 | 92.938312 | 95.000000 | 8.000000 | 116.250000 | 94.711729 | 94.666667 | 4.767000 | 0.000000 | 0.000000 | 94.120000 | 92.920000 | 95.324600 |
50% | 371781.000000 | 96.153846 | 94.935897 | 95.833414 | 96.013439 | 96.000000 | 95.861872 | 97.329060 | 15.000000 | 158.000000 | 95.806136 | 95.589688 | 4.819492 | 1.000000 | 1.000000 | 100.000000 | 96.490000 | 96.715000 |
75% | 380697.750000 | 100.000000 | 100.000000 | 97.556895 | 97.568007 | 100.000000 | 98.146168 | 100.000000 | 17.000000 | 226.250000 | 96.988326 | 97.113066 | 4.865194 | 2.000000 | 1.000000 | 100.000000 | 100.000000 | 97.755875 |
max | 388627.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 100.000000 | 33.000000 | 388.000000 | 100.000000 | 100.000000 | 4.984127 | 9.000000 | 4.000000 | 100.000000 | 100.000000 | 100.294000 |
# We can use Pycaret's built-in plot_model() function to generate Residuals and Error plots for the finalized blended model.
fig = plt.figure(figsize=(9,10))
ax = fig.add_subplot(2,1,1)
plot_model(finalize_blender, plot='residuals', save=True, verbose=False, scale=1.1)
ax = fig.add_subplot(2,1,2)
plot_model(finalize_blender, plot='error', save=True, verbose=False, scale=1.1)
plt.savefig('plots_pred_final_blender.png', dpi=300, pad_inches=0.25)
plt.show()
# An interesting view is comparing the Actual Values and Predicted Values (Label) in a histogram over the entire dataset. This shows the difference between the two distributions: the Predicted Values are more tightly concentrated around their peak and skewed to the left.
plt.figure(figsize=(15,5))
sns.set_style("whitegrid")
sns.distplot(pred_final_blender["Actual Value"],
bins=20,
kde=False,
color="#c6690c")
sns.distplot(pred_final_blender["Label"],
bins=20,
kde=False,
color="#664697")
plt.title("Distribution between Actual Value and Predicted Value (Label)")
plt.ylabel("Count")
plt.xlabel("FCR Value")
plt.xlim((74,101))
plt.legend(('Actual Value', 'Predicted Value (Label)'), ncol=2, loc='upper left', fontsize=12)
<matplotlib.legend.Legend at 0x7f9b1cfd4af0>
# We can plot the Predicted Value (Label) and Actual Value over the entire dataset.
sns.regplot(x="Actual Value", y="Label", data=pred_final_blender, lowess=False, scatter_kws ={'s':50}, line_kws={"color": "#664697"}, color="#c6690c")
plt.title("Linear Relationship between Actual Value and Predicted Value (Label)")
plt.ylabel("Predicted Value (Label)")
plt.xlabel("Actual Value")
plt.xlim((74,101))
plt.legend(('Best Fit', 'Actual Value vs Predicted Value (Label)'), ncol=2, loc='upper left', fontsize=12)
<matplotlib.legend.Legend at 0x7f9aff06ddf0>
# We can examine the residuals of the Predicted Values (Label) against the Actual Values in a residual plot over the entire dataset.
sns.residplot(x="Actual Value", y="Label", data=pred_final_blender, lowess=False, scatter_kws ={'s':50}, line_kws={"color": "#664697"}, color="#c6690c")
plt.title("Residuals for the Predicted values in Final Blend Model")
plt.ylabel("Residuals")
plt.xlabel("Actual Value")
plt.xlim((74,101))
plt.legend(('Best Fit', 'Predicted Value (Label)'), ncol=2, loc='upper left', fontsize=12)
<matplotlib.legend.Legend at 0x7f9aff71ae50>
We presented the results of our experiment to the client, thinking they would be impressed. Initially, they were excited about leveraging the models to predict performance. However, an RMSE of ~4.5 was not going to be viewed positively by employees and their supervisors.
We therefore decided to approach the problem differently. Instead of predicting a point value, we determined that predicting whether an employee's performance would increase or decrease would be quite beneficial. In particular, if we can determine that an employee's performance is going to decrease, the supervisor can preempt it and try to mitigate the drop before it occurs.
# Generating the classification label based on the regression analysis.
pred_final_blender.loc[(pred_final_blender['Actual Value'] >= pred_final_blender['FCR Week before']) & (pred_final_blender['Label'] > pred_final_blender['FCR Week before']), 'Pred_Class'] = 'True Positive'
pred_final_blender.loc[(pred_final_blender['Actual Value'] < pred_final_blender['FCR Week before']) & (pred_final_blender['Label'] < pred_final_blender['FCR Week before']), 'Pred_Class'] = 'True Negative'
pred_final_blender.loc[(pred_final_blender['Actual Value'] >= pred_final_blender['FCR Week before']) & (pred_final_blender['Label'] < pred_final_blender['FCR Week before']), 'Pred_Class'] = 'False Negative'
pred_final_blender.loc[(pred_final_blender['Actual Value'] < pred_final_blender['FCR Week before']) & (pred_final_blender['Label'] > pred_final_blender['FCR Week before']), 'Pred_Class'] = 'False Positive'
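The four .loc rules above can be expressed more compactly with np.select; a sketch on a tiny hypothetical frame (the numbers are made up, and rows matching no rule stay unlabeled, as in the original):

```python
import numpy as np
import pandas as pd

# Tiny hypothetical frame with the three columns the labeling rules use.
df = pd.DataFrame({
    'Actual Value':    [96.0, 90.0, 95.0, 88.0],
    'Label':           [97.0, 85.0, 91.0, 93.0],
    'FCR Week before': [94.0, 92.0, 94.0, 92.0],
})

actual_up = df['Actual Value'] >= df['FCR Week before']
pred_up = df['Label'] > df['FCR Week before']
pred_down = df['Label'] < df['FCR Week before']

# np.select applies the first matching condition per row.
df['Pred_Class'] = np.select(
    [actual_up & pred_up, ~actual_up & pred_down,
     actual_up & pred_down, ~actual_up & pred_up],
    ['True Positive', 'True Negative', 'False Negative', 'False Positive'],
    default='Unlabeled',  # Label exactly equal to last week's FCR matches no rule
)
print(df['Pred_Class'].tolist())
# → ['True Positive', 'True Negative', 'False Negative', 'False Positive']
```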
# Save the classification results to a CSV file and store the per-class counts for computing classification metrics.
pred_final_blender
pred_final_blender.to_csv('pred_final_blender.csv')
cf_values = pred_final_blender['Pred_Class'].value_counts()
cf_values
True Positive     47
True Negative     29
False Negative    23
False Positive     3
Name: Pred_Class, dtype: int64
# Create confusion matrix table, along with labels, counts and percentages.
# Note: .loc assumes all four classes occur in Pred_Class; if one may be absent,
# use cf_values.reindex([...], fill_value=0) instead to avoid a KeyError.
cf_matrix = np.array([[cf_values.loc['True Positive'], cf_values.loc['False Negative']],
                      [cf_values.loc['False Positive'], cf_values.loc['True Negative']]])
group_names = ['True Positive', 'False Negative\n(Type II Error)', 'False Positive\n(Type I Error)', 'True Negative']
group_counts = ['{0:0.0f}'.format(value) for value in
cf_matrix.flatten()]
group_percentages = ['{0:.2%}'.format(value) for value in
cf_matrix.flatten()/np.sum(cf_matrix)]
labels = [f'{v1}\n{v2}\n{v3}' for v1, v2, v3 in
zip(group_names,group_counts,group_percentages)]
labels = np.asarray(labels).reshape(2,2)
# Create confusion matrix plot using Seaborn
ax = plt.subplot()
plt.rcParams.update({'font.size': 14})
sns.heatmap(cf_matrix, annot=labels, fmt='', cmap='PuOr')
# Set labels, title and ticks
ax.set_xlabel('Predicted Class', fontsize=14)
ax.set_ylabel('Actual Class', fontsize=14)
ax.set_title('Confusion Matrix', fontsize=14, fontweight='bold')
ax.xaxis.set_ticklabels(['Performance Increase', 'Performance Decrease'], fontsize=12)
ax.yaxis.set_ticklabels(['Performance Increase', 'Performance Decrease'], va='center', fontsize=12)
# Below are the classification metrics.
accuracy = '{0:.2%}'.format((cf_values.loc['True Positive'] + cf_values.loc['True Negative']) / (cf_values.loc['True Positive'] + cf_values.loc['False Positive'] + cf_values.loc['True Negative'] + cf_values.loc['False Negative']))
print('Accuracy: ', accuracy)
sensitivity = '{0:.2%}'.format((cf_values.loc['True Positive']) / (cf_values.loc['True Positive'] + cf_values.loc['False Negative']))
print('Sensitivity (Recall): ', sensitivity)
specificity = '{0:.2%}'.format((cf_values.loc['True Negative']) / (cf_values.loc['True Negative'] + cf_values.loc['False Positive']))
print('Specificity (True Negative Rate): ', specificity)
precision = '{0:.2%}'.format((cf_values.loc['True Positive']) / (cf_values.loc['True Positive'] + cf_values.loc['False Positive']))
print('Precision (Positive Predictive Value): ', precision)
f1 = '{0:.2%}'.format((2 * cf_values.loc['True Positive']) / (2 * cf_values.loc['True Positive'] + cf_values.loc['False Positive'] + cf_values.loc['False Negative']))
print('F1 Score (harmonic mean of Precision and Sensitivity): ', f1)
Accuracy:  74.51%
Sensitivity (Recall):  67.14%
Specificity (True Negative Rate):  90.62%
Precision (Positive Predictive Value):  94.00%
F1 Score (harmonic mean of Precision and Sensitivity):  78.33%
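As a sanity check, these metrics follow directly from the four confusion-matrix counts above (TP=47, TN=29, FN=23, FP=3). The snippet below recomputes them from first principles:

```python
# Recompute the classification metrics from the raw confusion-matrix counts.
tp, tn, fn, fp = 47, 29, 23, 3

accuracy    = (tp + tn) / (tp + tn + fp + fn)  # share of correct direction calls
sensitivity = tp / (tp + fn)                   # recall on "performance increase"
specificity = tn / (tn + fp)                   # recall on "performance decrease"
precision   = tp / (tp + fp)                   # how often a predicted increase is real
f1          = 2 * precision * sensitivity / (precision + sensitivity)

print(f'Accuracy:    {accuracy:.2%}')     # 74.51%
print(f'Sensitivity: {sensitivity:.2%}')  # 67.14%
print(f'Specificity: {specificity:.2%}')  # 90.62%
print(f'Precision:   {precision:.2%}')    # 94.00%
print(f'F1 Score:    {f1:.2%}')           # 78.33%
```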
An accuracy above 70% makes this model much more likely to move into production. We can predict with reasonable confidence whether an employee's performance will increase or decrease. Better yet, for employees we expect to decline, we can get ahead of it and ask the supervisor to coach and support them to mitigate the decrease in performance.
We can see how versatile Pycaret is for experimenting with real-world scenarios. Most companies do not have terabytes of data, so as citizen data scientists we can use Pycaret in many everyday business situations to make more meaningful, data-driven decisions.
# Note: since we enabled log_experiment and log_plots in the setup() function, Pycaret leverages MLflow so all experiments can be logged and analyzed. This can be accomplished with a simple one-line command and viewed in your browser.
!mlflow ui
[2020-09-04 12:00:01 -0500] [16848] [INFO] Starting gunicorn 20.0.4
[2020-09-04 12:00:01 -0500] [16848] [INFO] Listening at: http://127.0.0.1:5000 (16848)
[2020-09-04 12:00:01 -0500] [16848] [INFO] Using worker: sync
[2020-09-04 12:00:01 -0500] [16850] [INFO] Booting worker with pid: 16850
^C
[2020-09-04 12:14:16 -0500] [16848] [INFO] Handling signal: int
[2020-09-04 12:14:16 -0500] [16850] [INFO] Worker exiting (pid: 16850)
Matplotlib. Retrieved August 24, 2020, from https://matplotlib.org
Numpy. Retrieved August 24, 2020, from https://numpy.org/
Pandas. Retrieved August 24, 2020, from https://pandas.pydata.org/
Pycaret. (2020, May 11). Retrieved August 24, 2020, from https://pycaret.org/
Python. Retrieved August 24, 2020, from https://www.python.org/
Scikit-learn. Retrieved August 24, 2020, from https://scikit-learn.org/
Seaborn. Retrieved August 24, 2020, from https://seaborn.pydata.org/