Brainome's predictors can be easily integrated into your Python application.
This notebook assumes brainome is installed as per notebook brainome_101_Quick_Start.
The data sets are titanic_train.csv, titanic_validate.csv, and titanic_predict.csv.
Predictors require numpy and optionally scipy to generate a confusion matrix.
!python3 -m pip install brainome --quiet
!brainome --version
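Before generating a predictor, it can help to confirm that the packages mentioned above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of brainome.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be located for import (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# numpy is required by the predictor; scipy is optional (confusion matrix).
print("missing required:", missing_packages(["numpy"]))
print("missing optional:", missing_packages(["scipy"]))
```

An empty list means every package in that group was found.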
# download data sets
import urllib.request as request
response1 = request.urlretrieve('https://download.brainome.ai/data/public/titanic_validate.csv', 'titanic_validate.csv')
response2 = request.urlretrieve('https://download.brainome.ai/data/public/titanic_predict.csv', 'titanic_predict.csv')
%ls -lh titanic_validate.csv titanic_predict.csv
Build the predictor from the training data. The predictor filename is predictor_301.py.
!brainome https://download.brainome.ai/data/public/titanic_train.csv -rank -y -o predictor_301.py -modelonly -q
print('\nCreated predictor_301.py')
!ls -lh predictor_301.py
Start by importing the predictor_301.py source file into your program. The predictor also requires numpy.
Calling help(predictor) displays the pydoc for it.
import numpy as np # predictors require numpy
import predictor_301 as predictor
help(predictor)
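If the predictor file does not sit next to your program, a plain import statement will not find it. One possible approach, sketched here with the standard library's importlib, is to load the module from an explicit path. The function name `load_predictor` and the example path in the comment are assumptions for illustration, not part of brainome.

```python
import importlib.util
import sys
from pathlib import Path


def load_predictor(path, module_name="predictor"):
    """Load a Python source file as a module (hypothetical helper)."""
    path = Path(path)
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load a module from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register the module before executing it, as the importlib docs recommend.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


# Example (hypothetical path):
# predictor = load_predictor("models/predictor_301.py")
```

The returned module can then be used the same way as the one imported with `import predictor_301 as predictor`.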
Given a test data set, the predictor will compare predictions with expected outcomes.
For this exercise, we read the data set into a pandas data frame; your method may differ.
%pip install pandas --quiet
import pandas as pd
validate_data = pd.read_csv('titanic_validate.csv', na_values=[], na_filter=False)
validate_values = validate_data.values
clean_values = predictor.__preprocess_and_clean_in_memory(validate_values)
count, correct_count, num_TP, num_TN, num_FP, num_FN, num_class_1, num_class_0, preds = predictor.validate(clean_values)
print(' Test Predictions '.center(80, '-'))
print(preds)
true_labels = clean_values[:, -1]
mtrx, stats = predictor.__confusion_matrix(np.array(true_labels).reshape(-1), np.array(preds).reshape(-1), True)
print(' Confusion Matrix '.center(80, '-'))
print(mtrx)
print(' Statistics '.center(80, '-'))
print(stats)
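The counts returned by the validation step can also be summarized by hand. The sketch below assumes the four counts mean what their names suggest (true/false positives and negatives, with class 1 as the positive class); the function name `summarize_counts` is hypothetical, and the numbers in the example call are made up for illustration rather than taken from the Titanic data.

```python
def summarize_counts(num_tp, num_tn, num_fp, num_fn):
    """Derive accuracy, precision, and recall from confusion-matrix counts."""
    total = num_tp + num_tn + num_fp + num_fn

    def ratio(numerator, denominator):
        # Return NaN rather than raising when a denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(num_tp + num_tn, total),
        "precision": ratio(num_tp, num_tp + num_fp),
        "recall": ratio(num_tp, num_tp + num_fn),
    }


# Illustrative counts only, not real results.
print(summarize_counts(num_tp=40, num_tn=50, num_fp=5, num_fn=5))
```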
The predictor can also classify a single passenger.
passenger = [881,2,"Shelley, Mrs. William (Imanita Parrish Hall)","female",25,0,1,230433,26,"","S"]
prediction = predictor.predict([passenger])[0]
print(passenger[2], prediction)
Given a chunk of data, the predictor will return classification predictions for each record.
predict_data = pd.read_csv('titanic_predict.csv', na_values=[], na_filter=False)
predict_values = predict_data.values
predictions_output = predictor.predict(predict_values)
print(' Batch Predictions '.center(80, '-'))
print(predictions_output)
Some data sets cannot be fully loaded into memory and must instead be streamed instance by instance.
import csv
with open("./titanic_predict.csv", "r") as csv_file:
    data_reader = csv.reader(csv_file)
    header = next(data_reader)
    first = True
    for row in data_reader:
        prediction = predictor.predict([row])[0]
        probability = predictor.predict([row], return_probabilities=True)
        if first:
            first = False
            print(header[0], 'Prediction', "\t", probability[0])
        print(row[0], prediction, "\t", probability[1])
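Streaming one row at a time keeps memory use low but calls the predictor once per record. A possible middle ground is to read the file in small batches. The sketch below assumes only that the prediction function accepts a list of rows and returns one prediction per row, as in the batch example earlier; `iter_batches`, `predict_stream`, and the stand-in function `fake_predict` are hypothetical names, and the sample CSV is made up.

```python
import csv
import io
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of up to batch_size rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def predict_stream(csv_file, predict_fn, batch_size=256):
    """Yield (first column, prediction) pairs without loading the whole file."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, if any
    for batch in iter_batches(reader, batch_size):
        for row, prediction in zip(batch, predict_fn(batch)):
            yield row[0], prediction


def fake_predict(rows):
    """Stand-in for predictor.predict; labels every row 'unknown'."""
    return ["unknown" for _ in rows]


sample = io.StringIO("PassengerId,Sex\n1,female\n2,male\n3,female\n")
for passenger_id, prediction in predict_stream(sample, fake_predict, batch_size=2):
    print(passenger_id, prediction)
```

With a real predictor, `predict_fn` would be `predictor.predict` and `csv_file` an open file handle; the batch size is a tuning choice that trades memory for fewer calls.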