This lesson is devoted to a classification model with multiple categories, which is the natural generalization of the logistic model.
We will use the following libraries.
You can use the following commands to install them from the console.
# !conda install -c anaconda seaborn
# !conda install -c intel scikit-learn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
#
from tensorflow.keras.models import Model
#
from tensorflow.keras.layers import Dense, Input, Activation, Dropout
#
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.utils import plot_model
#
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix
#
#from sklearn.model_selection import KFold
print("Versión de Tensorflow:", tf.__version__)
Versión de Tensorflow: 2.4.1
Given the output of the adder, say $y=\mathbf{w}'\mathbf{x} + b$, the relu activation function is defined by
$$ \text{relu}(y) = \begin{cases} 0, & \text{if } y\le 0,\\ y, & \text{otherwise.} \end{cases} $$
Given values $x_1,\ldots, x_n$, the softmax function is defined by
$$ \text{softmax}(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}. $$
That is, softmax transforms the values into a probability distribution.
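To see both definitions in action, here is a minimal NumPy sketch (the `relu` and `softmax` helpers below are our own illustration; Keras supplies its own implementations):
# minimal NumPy sketch of the two activation functions (illustration only)
def relu(y):
    # zero for y <= 0, identity otherwise
    return np.maximum(0.0, y)

def softmax(x):
    # subtracting the maximum is a standard numerical-stability trick;
    # it does not change the result
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(relu(np.array([-1.0, 0.0, 2.5])))          # [0.  0.  2.5]
print(softmax(np.array([1.0, 2.0, 3.0])))        # a probability vector
print(softmax(np.array([1.0, 2.0, 3.0])).sum())  # 1.0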
This dataset was introduced by Sir Ronald Fisher.
We download the data from the Internet using tf.keras.utils and then load it into pandas DataFrames.
# column names for the data
col_names = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
target_dimensions = ['Setosa', 'Versicolor', 'Virginica']
# read the data
training_data_path = tf.keras.utils.get_file("iris_training.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv")
test_data_path = tf.keras.utils.get_file("iris_test.csv", "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv")
training = pd.read_csv(training_data_path, names=col_names, header=0)
test = pd.read_csv(test_data_path, names=col_names, header=0)
test.head()
 | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
---|---|---|---|---|---|
0 | 5.9 | 3.0 | 4.2 | 1.5 | 1 |
1 | 6.9 | 3.1 | 5.4 | 2.1 | 2 |
2 | 5.1 | 3.3 | 1.7 | 0.5 | 0 |
3 | 6.0 | 3.4 | 4.5 | 1.6 | 1 |
4 | 5.5 | 2.5 | 4.0 | 1.3 | 1 |
The target variable has three categories. We use one-hot encoding to transform the class labels into binary vectors.
y_train= pd.DataFrame(to_categorical(training.Species))
y_train.columns = target_dimensions
y_test = pd.DataFrame(to_categorical(test.Species))
y_test.columns = target_dimensions
y_test
 | Setosa | Versicolor | Virginica |
---|---|---|---|
0 | 0.0 | 1.0 | 0.0 |
1 | 0.0 | 0.0 | 1.0 |
2 | 1.0 | 0.0 | 0.0 |
3 | 0.0 | 1.0 | 0.0 |
4 | 0.0 | 1.0 | 0.0 |
5 | 0.0 | 1.0 | 0.0 |
6 | 1.0 | 0.0 | 0.0 |
7 | 0.0 | 0.0 | 1.0 |
8 | 0.0 | 1.0 | 0.0 |
9 | 0.0 | 0.0 | 1.0 |
10 | 0.0 | 0.0 | 1.0 |
11 | 1.0 | 0.0 | 0.0 |
12 | 0.0 | 0.0 | 1.0 |
13 | 0.0 | 1.0 | 0.0 |
14 | 0.0 | 1.0 | 0.0 |
15 | 1.0 | 0.0 | 0.0 |
16 | 0.0 | 1.0 | 0.0 |
17 | 1.0 | 0.0 | 0.0 |
18 | 1.0 | 0.0 | 0.0 |
19 | 0.0 | 0.0 | 1.0 |
20 | 1.0 | 0.0 | 0.0 |
21 | 0.0 | 1.0 | 0.0 |
22 | 0.0 | 0.0 | 1.0 |
23 | 0.0 | 1.0 | 0.0 |
24 | 0.0 | 1.0 | 0.0 |
25 | 0.0 | 1.0 | 0.0 |
26 | 1.0 | 0.0 | 0.0 |
27 | 0.0 | 1.0 | 0.0 |
28 | 0.0 | 0.0 | 1.0 |
29 | 0.0 | 1.0 | 0.0 |
y_train_species = training.pop('Species')
#test.drop(['Species'], axis=1, inplace=True)
y_test_species = test.pop('Species') # pops the column into y_test_species
#
# If you need to add the recoded labels back to the dataframes, use these lines
#training = training.join(y_train)
#test = test.join(y_test)
# create the StandardScaler object
scaler = StandardScaler()
# fit the scaler parameters
scaler.fit(training)
print(scaler.mean_)
# scale training and test sets
x_train = scaler.transform(training)
x_test = scaler.transform(test)
# labels (no scaling required)
[5.845 3.065 3.73916667 1.19666667]
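As a quick sanity check (our addition, not in the original notebook): StandardScaler computes $z = (x - \mu)/\sigma$ per column, so the transformed training columns should have mean close to 0 and standard deviation close to 1.
# sanity check: standardized training columns have mean ~0 and std ~1
print(x_train.mean(axis=0).round(6))
print(x_train.std(axis=0).round(6))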
The Keras functional API is considerably more flexible and powerful than the Sequential model.
# With the functional API we need the Input layer, which turns the input
# directly into a TensorFlow tensor
#
inputs = Input(shape=(4,),name='capa_entrada')
#
# build the model layer by layer
x = Activation('relu')(inputs)
x = Dense(8, activation='relu',name='primera_capa_oculta')(x)
x = Dropout(0.2)(x)
x = Dense(16, activation='relu', name='segunda_capa_oculta')(x)
x = Dropout(0.2)(x)
outputs = Dense(3, activation='softmax', name='capa_salida')(x)
# now create the model
model_iris = Model(inputs=inputs, outputs=outputs)
model_iris.summary()
plot_model(model_iris, to_file='../Imagenes/iris_model.png',
show_shapes=True)
Model: "model_22" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= capa_entrada (InputLayer) [(None, 4)] 0 _________________________________________________________________ activation_16 (Activation) (None, 4) 0 _________________________________________________________________ primera_capa_oculta (Dense) (None, 8) 40 _________________________________________________________________ dropout_20 (Dropout) (None, 8) 0 _________________________________________________________________ segunda_capa_oculta (Dense) (None, 16) 144 _________________________________________________________________ dropout_21 (Dropout) (None, 16) 0 _________________________________________________________________ capa_salida (Dense) (None, 3) 51 ================================================================= Total params: 235 Trainable params: 235 Non-trainable params: 0 _________________________________________________________________
model_iris.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy']
)
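As an aside (our note, not part of the lesson): since `y_train_species` holds integer class labels, an equivalent setup could skip the one-hot step and use sparse categorical crossentropy instead, along these lines:
# alternative sketch: integer labels with sparse categorical crossentropy
#model_iris.compile(optimizer='adam',
#                   loss='sparse_categorical_crossentropy',
#                   metrics=['accuracy'])
#model_iris.fit(x_train, y_train_species, batch_size=16, epochs=200, validation_split=0.1)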
# simple callback that prints a dot at the end of each epoch
class PrintDot(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        print('.', end='')
epochs = 200
history = model_iris.fit(x_train, y_train,
batch_size= 16,
epochs= epochs,
validation_split=0.1, verbose=0,
callbacks=[PrintDot()])
print('\nDone')
print('Final loss and accuracy results\n')
# show the tail of the training history
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
hist.tail()
........................................................................................................................................................................................................
Done
Final loss and accuracy results
 | loss | accuracy | val_loss | val_accuracy | epoch |
---|---|---|---|---|---|
195 | 0.303951 | 0.842593 | 0.360190 | 0.916667 | 195 |
196 | 0.281763 | 0.842593 | 0.361281 | 0.916667 | 196 |
197 | 0.290119 | 0.861111 | 0.361293 | 0.916667 | 197 |
198 | 0.292439 | 0.898148 | 0.360291 | 0.916667 | 198 |
199 | 0.293885 | 0.907407 | 0.360623 | 0.916667 | 199 |
def plot_history(history):
    hist = pd.DataFrame(history.history)
    hist['epoch'] = history.epoch

    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.plot(hist['epoch'], hist['loss'],
             label='Training loss')
    plt.plot(hist['epoch'], hist['val_loss'],
             label='Validation loss')
    plt.ylim([0, 2])
    plt.legend()

    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.plot(hist['epoch'], hist['accuracy'],
             label='Training accuracy')
    plt.plot(hist['epoch'], hist['val_accuracy'],
             label='Validation accuracy')
    plt.ylim([0, 1])
    plt.legend()
    plt.show()
plot_history(history)
# predict on the test set
y_pred = model_iris.predict(x_test)
y_pred_c = np.argmax(y_pred, axis=1)
cm = confusion_matrix(y_test_species, y_pred_c)
print("Our accuracy is {}%".format(((cm[0][0] + cm[1][1]+ cm[2][2])/y_test_species.shape[0])*100))
Our accuracy is 93.33333333333333%
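The diagonal of the confusion matrix divided by the total number of test samples is precisely the accuracy. As a cross-check (our addition), sklearn's accuracy_score should report the same 93.33%:
# cross-check: accuracy_score should match the diagonal-based computation above
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test_species, y_pred_c))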
sns.heatmap(cm,annot=True)
plt.savefig('h.png')
inputs = x_train
outputs = model_iris(inputs)
outputs.numpy().round(2)[:10]
array([[0.  , 0.01, 0.99],
       [0.51, 0.44, 0.05],
       [0.  , 0.38, 0.62],
       [0.8 , 0.18, 0.02],
       [1.  , 0.  , 0.  ],
       [0.98, 0.02, 0.  ],
       [1.  , 0.  , 0.  ],
       [0.  , 0.02, 0.98],
       [0.02, 0.79, 0.19],
       [1.  , 0.  , 0.  ]], dtype=float32)
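Since the output layer uses softmax, each row of these predictions is a probability vector over the three species. A small check (our addition; note that calling the model directly on a NumPy array returns a tf.Tensor):
# each softmax row is a probability distribution, so rows sum to 1
print(outputs.numpy().sum(axis=1).round(2)[:5])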
# Sequential model
#layer_2 = tf.keras.models.Model(
#    inputs=model_iris.inputs,
#    outputs=model_iris.get_layer(name='segunda_capa_oculta').output,
#)
# functional API
# 1. create a new model
# 2. compile it
# 3. predict
inputs = x_train
model = Model(model_iris.input, model_iris.get_layer(name='segunda_capa_oculta').output)
model.compile()
output = model.predict(inputs)
output.shape
(120, 16)
plot_data = np.hstack([output, np.array(y_train_species).reshape(y_train_species.shape[0],1)])
plot_data = pd.DataFrame(plot_data)
plot_data
 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2.467787 | 0.000000 | 0.731057 | 0.000000 | 0.510913 | 0.000000 | 0.000000 | 0.000000 | 1.827167 | 3.498898 | 2.964854 | 1.161228 | 1.649435 | 2.353893 | 0.0 | 0.0 | 2.0 |
1 | 0.000000 | 0.230899 | 0.000000 | 0.215546 | 0.000000 | 0.113442 | 0.213827 | 0.396466 | 0.000000 | 0.100383 | 0.180319 | 0.579343 | 0.000000 | 0.000000 | 0.0 | 0.0 | 1.0 |
2 | 0.981748 | 0.000000 | 0.150882 | 0.000000 | 0.072097 | 0.000000 | 0.000000 | 0.000000 | 1.005695 | 1.802041 | 1.445802 | 0.958601 | 0.674698 | 1.188161 | 0.0 | 0.0 | 2.0 |
3 | 0.000000 | 0.418447 | 0.000000 | 0.336955 | 0.000000 | 0.276314 | 0.321667 | 0.520616 | 0.000000 | 0.000000 | 0.088640 | 0.445429 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 |
4 | 0.000000 | 2.559006 | 0.097274 | 1.694688 | 0.000000 | 2.285218 | 1.820933 | 1.972289 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
115 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.115792 | 0.183403 | 0.435880 | 0.432974 | 0.678667 | 0.000000 | 0.090053 | 0.0 | 0.0 | 1.0 |
116 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.171450 | 0.110357 | 0.347058 | 0.385124 | 0.669227 | 0.000000 | 0.025612 | 0.0 | 0.0 | 1.0 |
117 | 0.000000 | 0.230899 | 0.000000 | 0.215546 | 0.000000 | 0.113442 | 0.213827 | 0.396466 | 0.000000 | 0.100383 | 0.180319 | 0.579343 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 |
118 | 0.000000 | 0.230899 | 0.000000 | 0.215546 | 0.000000 | 0.113442 | 0.213827 | 0.396466 | 0.000000 | 0.100383 | 0.180319 | 0.579343 | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.0 |
119 | 0.000000 | 0.230899 | 0.000000 | 0.215546 | 0.000000 | 0.113442 | 0.213827 | 0.396466 | 0.000000 | 0.100383 | 0.180319 | 0.579343 | 0.000000 | 0.000000 | 0.0 | 0.0 | 1.0 |
120 rows × 17 columns
from sklearn.manifold import TSNE
# reduce dimensionality with t-SNE
tsne = TSNE(n_components=2, verbose=1, perplexity=50, n_iter=1000, learning_rate=200)
tsne_results = tsne.fit_transform(output)
[t-SNE] Computing 119 nearest neighbors...
[t-SNE] Indexed 120 samples in 0.000s...
[t-SNE] Computed neighbors for 120 samples in 0.011s...
[t-SNE] Computed conditional probabilities for sample 120 / 120
[t-SNE] Mean sigma: 1.722451
[t-SNE] KL divergence after 250 iterations with early exaggeration: 49.608604
[t-SNE] KL divergence after 700 iterations: 0.032179
labels = [target_dimensions[i] for i in y_train_species]
#['Setosa', 'Versicolor', 'Virginica']
# visualize with seaborn
df_tsne = pd.DataFrame(tsne_results, columns=['x', 'y'])
df_tsne['label'] = labels
sns.lmplot(x='x', y='y', data=df_tsne, hue='label', fit_reg=False)
<seaborn.axisgrid.FacetGrid at 0x7f99a4611bb0>
What are your conclusions?