Binary classification using neural networks

Find the pima-diabetes.csv file in the ./data/ folder.
Can we correctly classify diabetes given all the input variables?

In [0]:

from google.colab import files
uploaded = files.upload()

In [0]:

import pandas as pd
data = pd.read_csv('pima-diabetes.csv', delimiter=',')
data.head()

In [0]:

import numpy as np
# np.genfromtxt() fills missing or non-numeric fields with NaN;
# np.loadtxt() would raise an error on them instead.
dataset = np.genfromtxt('pima-diabetes.csv', delimiter=',', skip_header=1)

In [0]:

# The float formatter set on the second line takes priority over `precision`
np.set_printoptions(precision=2)
np.set_printoptions(formatter={'float': '{: 0.1f}'.format})

print('')
print(dataset.shape)
print('')
print(dataset[0:5])

In [0]:

X = dataset[:, :-1]
Y = dataset[:, -1]

In [0]:

mean = X.mean(axis=0)
X -= mean
std = X.std(axis=0)
X /= std
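
The cell above standardizes every feature column to zero mean and unit standard deviation, so that features measured on very different scales contribute comparably during training. A minimal sketch of the same step on a small made-up array (the name `toy` and its values are illustrative only and are not taken from the dataset):

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 features on very different scales.
toy = np.array([[1.0, 100.0],
                [2.0, 200.0],
                [3.0, 300.0],
                [4.0, 400.0]])

toy_mean = toy.mean(axis=0)  # per-column mean
toy_std = toy.std(axis=0)    # per-column (population) standard deviation
toy_scaled = (toy - toy_mean) / toy_std

print(toy_scaled.mean(axis=0))  # close to 0 for each column
print(toy_scaled.std(axis=0))   # close to 1 for each column
```

After scaling, both columns hold the same values, because they differed only by a constant factor. A column with zero standard deviation would cause a division by zero here, so a constant feature would need to be dropped or handled separately.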

Design a neural network

In [0]:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(8, input_dim=X.shape[1], activation='relu'))  # one input per feature column
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

Draw the network architecture

What is the total number of parameters, and how is it computed?

In [0]:

model.summary()
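
One way to check the count reported by `model.summary()` by hand: each `Dense` layer has one weight per input-output pair plus one bias per output unit. The sketch below assumes the dataset has 8 input columns (compare this with `X.shape[1]`); `dense_param_count` is a hypothetical helper written for this illustration, not a Keras function.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a stack of fully connected layers.

    layer_sizes lists the number of units per layer, starting with the inputs.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total

# Assumption: 8 input features, followed by the 8-4-1 Dense layers defined above.
print(dense_param_count([8, 8, 4, 1]))  # (8*8+8) + (8*4+4) + (4*1+1) = 113
```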

In [0]:

from tensorflow.keras.utils import plot_model
plot_model(model, show_layer_names=True, show_shapes=True)

In [0]:

model.compile(loss='binary_crossentropy', optimizer = 'rmsprop', metrics=['accuracy'])
model.fit(X, Y, epochs = 256, verbose = 1)
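
The `binary_crossentropy` loss used above penalizes predicted probabilities that are far from the true 0/1 label. The following is a rough sketch of the formula in plain Python, not Keras's own implementation; the labels, probabilities, and the clipping constant `eps` are made up for illustration.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy for labels in {0, 1} and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical labels and predicted probabilities, for illustration only.
labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]  # close to the true labels
hesitant = [0.6, 0.4, 0.6, 0.4]   # correct side of 0.5, but less certain

print(binary_crossentropy(labels, confident))
print(binary_crossentropy(labels, hesitant))
```

Both sets of predictions would round to the correct class, yet the less certain one has the higher loss. This is why the loss can keep decreasing during training even when accuracy stays flat.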

In [0]:

print('True labels (first 10 training samples):')
print(Y[:10])
prediction = model.predict(X)  # sigmoid outputs: probabilities between 0 and 1
print('Predicted probabilities (first 10):')
print(prediction[0:10].T)

Evaluating binary predictions

In [0]:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Round the predicted probabilities to 0/1 class labels once and reuse them.
# Note: these scores are computed on the same data the model was trained on.
predicted_labels = prediction.round().ravel()
accuracy = accuracy_score(Y, predicted_labels)
precision = precision_score(Y, predicted_labels)
recall = recall_score(Y, predicted_labels)
f1score = f1_score(Y, predicted_labels)
print("Accuracy: %.2f%%" % (accuracy * 100.0))
print("Precision: %.2f%%" % (precision * 100.0))
print("Recall: %.2f%%" % (recall * 100.0))
print("F1-score: %.2f" % (f1score))
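
To make the four scores above less of a black box, here is a small sketch that derives them from the confusion-matrix counts. `binary_metrics` is a hypothetical helper, the labels and probabilities are invented for illustration, and a 0.5 threshold is assumed for turning probabilities into classes.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from true and predicted 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels and predicted probabilities, for illustration only.
true_labels = [1, 0, 1, 1, 0, 0, 1, 0]
probabilities = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.55]
thresholded = [1 if p > 0.5 else 0 for p in probabilities]

metrics = binary_metrics(true_labels, thresholded)
print(metrics)
```

In this toy example there are 3 true positives, 2 false positives, 1 false negative and 2 true negatives. Precision (3/5) is therefore lower than recall (3/4): the predictions catch most positive cases but also raise some false alarms.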
