Running brainome in five easy steps:
Pip will automatically include dependencies.
!python3 -m pip install brainome
print("\n\nChecking brainome version number:")
!brainome --version
Sometimes pip requires the --user parameter in order to install successfully:
python3 -m pip install brainome --user
The Titanic data set is commonly used as an introduction to data science. It is a passenger manifest of the Titanic that records whether each passenger survived the disaster. For more information, refer to kaggle.com/c/titanic
import urllib.request as request
response1 = request.urlretrieve('https://download.brainome.ai/data/public/titanic_train.csv', 'titanic_train.csv')
response2 = request.urlretrieve('https://download.brainome.ai/data/public/titanic_validate.csv', 'titanic_validate.csv')
response3 = request.urlretrieve('https://download.brainome.ai/data/public/titanic_predict.csv', 'titanic_predict.csv')
%ls -lh titanic_train.csv titanic_validate.csv titanic_predict.csv
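Outside IPython the `%ls` magic is not available, so a plain-Python check can confirm that the three downloads landed on disk. This is only a minimal sketch; the helper name `report_files` is hypothetical and not part of brainome.

```python
from pathlib import Path

def report_files(names):
    """Return {name: size in bytes, or None if the file is missing}."""
    sizes = {}
    for name in names:
        path = Path(name)
        sizes[name] = path.stat().st_size if path.is_file() else None
    return sizes

# File names are the ones used by the download cell above.
expected = ["titanic_train.csv", "titanic_validate.csv", "titanic_predict.csv"]
for name, size in report_files(expected).items():
    print(name, "missing" if size is None else f"{size} bytes")
```

A `None` entry means the corresponding download did not complete and should be retried.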
The goal of training is to predict which passengers survived the disaster.
The passenger roster contains 11 features (PassengerId, Cabin_Class, Name, etc.) for 800 passengers that can be used to create a model. The target column is 'Survived'.
You can download the training data at titanic_train.csv
# preview uses pandas to read and display csv data
%pip install pandas --quiet
import pandas as pd
pd.read_csv('titanic_train.csv')
| | PassengerId | Cabin_Class | Name | Sex | Age | Sibling_Spouse | Parent_Children | Ticket_Number | Fare | Cabin_Number | Port_of_Embarkation | Survived |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S | died |
1 | 2 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C | survived |
2 | 3 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S | survived |
3 | 4 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S | survived |
4 | 5 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S | died |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
795 | 796 | 2 | Otter, Mr. Richard | male | 39.0 | 0 | 0 | 28213 | 13.0000 | NaN | S | died |
796 | 797 | 1 | Leader, Dr. Alice (Farnham) | female | 49.0 | 0 | 0 | 17465 | 25.9292 | D17 | S | survived |
797 | 798 | 3 | Osman, Mrs. Mara | female | 31.0 | 0 | 0 | 349244 | 8.6833 | NaN | S | survived |
798 | 799 | 3 | Ibrahim Shawah, Mr. Yousseff | male | 30.0 | 0 | 0 | 2685 | 7.2292 | NaN | C | died |
799 | 800 | 3 | Van Impe, Mrs. Jean Baptiste (Rosalie Paula Go... | female | 30.0 | 1 | 1 | 345773 | 24.1500 | NaN | S | died |
800 rows × 12 columns
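Before training, it can help to see how the target column is distributed. The sketch below is illustrative: the helper `class_balance` is hypothetical, and the small frame only copies three columns from the first five rows of the preview above so that the example runs without the CSV file.

```python
import pandas as pd

# First five rows of the preview above, reduced to three columns.
sample = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5],
    "Sex": ["male", "female", "female", "female", "male"],
    "Survived": ["died", "survived", "survived", "survived", "died"],
})

def class_balance(df, target="Survived"):
    """Return the fraction of rows that fall into each target class."""
    return df[target].value_counts(normalize=True)

balance = class_balance(sample)
print(balance)
```

For the full training set, the same helper can be applied to `pd.read_csv('titanic_train.csv')`.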
In its simplest invocation, brainome will automatically measure your data, identify the best model, build it, train it, and validate it.
It will automatically split your data into training and validation.
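To make the idea of a training/validation split concrete, here is a generic random hold-out split. It is an assumption-laden sketch, not brainome's own splitting logic, which this tutorial does not describe; the function name, the 20% fraction, and the seed are all chosen for illustration.

```python
import random

def holdout_split(rows, validation_fraction=0.2, seed=0):
    """Shuffle rows and return (training_rows, validation_rows)."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_validation = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

# Row indices stand in for the 800 passengers in the training file.
train_rows, validation_rows = holdout_split(range(800), validation_fraction=0.2)
print(len(train_rows), len(validation_rows))  # prints: 640 160
```

Every row ends up in exactly one of the two parts, so the model is never validated on a row it was trained on.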
The output is Python source code in predictor_101.py.
!brainome titanic_train.csv --yes -o predictor_101.py
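The `!` prefix only works inside a notebook. From a regular Python script, the same command line could be launched with `subprocess`. This is a sketch: the helper `build_brainome_command` is hypothetical, and it passes only the arguments already shown in the cell above.

```python
import shutil
import subprocess
from pathlib import Path

def build_brainome_command(csv_path, output_path):
    """Assemble the same arguments as the notebook cell above."""
    return ["brainome", csv_path, "--yes", "-o", output_path]

command = build_brainome_command("titanic_train.csv", "predictor_101.py")
print(" ".join(command))

# Run only if the brainome executable and the input file are both available.
if shutil.which("brainome") is not None and Path("titanic_train.csv").is_file():
    subprocess.run(command, check=False)
```

Passing the arguments as a list avoids shell quoting problems if a file name contains spaces.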
Open predictor_101.py to browse the predictor's source code. Notice it is on the order of 38k bytes.
%ls -lh predictor_101.py
%pycat predictor_101.py
Running your predictor on an unseen data set demonstrates its effectiveness.
You can download the validation data at titanic_validate.csv
!python3 predictor_101.py -validate titanic_validate.csv
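One common way to quantify effectiveness on labeled data is accuracy, the fraction of rows whose predicted label matches the actual one. The sketch below illustrates that calculation only; it is not necessarily the set of statistics the predictor prints. The `actual` labels come from the first five rows of the training preview, and the `predicted` labels are made up for the example.

```python
def accuracy(actual, predicted):
    """Return the fraction of positions where the two label lists agree."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

actual = ["died", "survived", "survived", "survived", "died"]
predicted = ["died", "survived", "died", "survived", "died"]  # hypothetical predictions
print(accuracy(actual, predicted))  # prints: 0.8
```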
Run your predictor on an unlabeled data set to generate predictions for other passengers.
You can download the prediction data at titanic_predict.csv
!python3 predictor_101.py titanic_predict.csv > predictions_101.csv
pd.read_csv('predictions_101.csv')