AutoDL Challenge Tutorial


Before we start, you need to know that

  • We target applications of multi-label image classification.
  • Raw data are provided, in TFRecord format.
  • We impose restrictions on training time and resources to push the state-of-the-art further.
  • This notebook uses sample data. Download larger datasets from the website of the challenge.
In [ ]:
from os.path import join
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
In [ ]:
inges_dir = 'AutoDL_ingestion_program/'           # Ingestion program directory
score_dir = 'AutoDL_scoring_program/'             # Scoring program directory
model_dir = 'AutoDL_sample_code_submission/'      # Where the submitted model code lives
baseline_dir = 'AutoDL_simple_baseline_models/'   # Some baseline methods are implemented here

output_dir = 'AutoDL_scoring_output'
detailed_results_page = join(output_dir, 'detailed_results.html')

from sys import path
path.append(model_dir); path.append(inges_dir); path.append(score_dir); path.append(baseline_dir)

Part 1: Dataset Overview

Let's start with a sample dataset miniciao, which can be found in ./AutoDL_sample_data/miniciao

In [ ]:
data_dir = 'AutoDL_sample_data'            # Change this directory and the dataset as needed
data_name = 'miniciao'
!ls $data_dir
In [ ]:
# read train / test datasets
from dataset import AutoDLDataset  # the module 'dataset' is defined in AutoDL_ingestion_program/
D_train = AutoDLDataset(join(data_dir, data_name, data_name + '.data', 'train'))
D_test = AutoDLDataset(join(data_dir, data_name, data_name + '.data', 'test'))
In [ ]:
# show important meta-information about the dataset
print("Dataset path: ", D_train.get_metadata().get_dataset_name())
print("Image shape: ",  D_train.get_metadata().get_tensor_size(0))
print("Dataset size: ", D_train.get_metadata().size())
print("Output size: ",  D_train.get_metadata().get_output_size())
print("Class labels: ", D_train.get_class_labels())
In [ ]:
# show sample images

It should be noted that:

  • in some datasets the image shape is not fixed, i.e. some images are larger or smaller than others; when the meta-information reports (-1, -1) as the tensor size (image size), the image shapes in that dataset are not identical;
  • not all datasets have 3 channels;
  • although this sample dataset seems to have exactly one label per image, this is not true of all datasets.

It is therefore up to you to adapt your model to these differences between datasets.
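For instance, the channel issue can be handled with a small preprocessing helper. Below is a minimal sketch assuming images arrive as NumPy arrays; the helper name `to_rgb` is ours, not part of the starting kit:

```python
import numpy as np

def to_rgb(image):
    """Ensure an image array has 3 channels.

    Accepts shapes (H, W), (H, W, 1) or (H, W, 3).
    """
    if image.ndim == 2:           # (H, W) -> (H, W, 1)
        image = image[..., np.newaxis]
    if image.shape[-1] == 1:      # (H, W, 1) -> (H, W, 3) by channel replication
        image = np.repeat(image, 3, axis=-1)
    return image

print(to_rgb(np.zeros((28, 28))).shape)  # (28, 28, 3)
```

A similar branch can deal with variable image sizes, e.g. by resizing every image to a fixed shape before batching.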

Part 2: Model, Prediction and Metrics

In [ ]:
# copy simple model
model_simple = join(baseline_dir, 'linear', '')  # choose one simple baseline model
model_submit = join(model_dir, '')               # directory where the submitted model code must live
!cp $model_simple* $model_submit
In [ ]:
# set the time budget and instantiate the model with the dataset metadata
from model import Model
time_budget = 1200  # seconds; the default time budget of the challenge
M = Model(D_train.get_metadata())
In [ ]:
# train the model for a certain time
M.train(D_train.get_dataset(), remaining_time_budget=time_budget)
In [ ]:
# get prediction by calling test method
prediction = M.test(D_test.get_dataset(), remaining_time_budget=time_budget)

For each prediction made at a timestamp $t$, we compute the ROC AUC $AUC_i$ of each (binary) class $i$, then average over all classes and normalize: \begin{equation*} AUC = \frac{1}{C} \sum_{i=1}^C AUC_i, \quad NAUC = 2 \times AUC - 1. \end{equation*}

$NAUC$ is also called the Gini coefficient in some contexts. The learning curve can then be plotted as $NAUC$ vs. time. Let's denote the learning curve by $s(t)$. Since $s(t)$ is defined as the $NAUC$ of the most recent prediction made before timestamp $t$, $s(t)$ is a step function.
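To make the metric concrete, here is a standalone NumPy version of the computation. The official implementation is `autodl_auc` in the scoring program; this sketch uses the rank-based (Mann-Whitney) form of the ROC AUC, and the function names are ours:

```python
import numpy as np

def roc_auc(y_true, y_score):
    """ROC AUC via the Mann-Whitney U statistic (ties get the average rank)."""
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    for v in np.unique(y_score):          # average ranks for tied scores
        mask = y_score == v
        ranks[mask] = ranks[mask].mean()
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def nauc(solution, prediction):
    """Average per-class ROC AUC, rescaled from [0, 1] to [-1, 1]."""
    aucs = [roc_auc(solution[:, i], prediction[:, i])
            for i in range(solution.shape[1])]
    return 2 * np.mean(aucs) - 1
```

A perfect predictor gets $NAUC = 1$, a random one $\approx 0$, and a systematically inverted one $-1$.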

As this challenge aims to push forward the state-of-the-art in the any-time learning setting, we use a performance metric related to the whole learning curve (instead of only the last point). This metric is computed as follows.

  • In order to normalize time interval [0, T] to the [0, 1] interval, we perform a time transformation by $$\tilde{t}(t) = \frac{\log (1 + t / t_0)}{\log( 1 + T / t_0)}$$ where $T$ is the time budget (of default value 1200 seconds = 20 minutes) and $t_0$ is a reference time amount (of default value 60 seconds).
  • Then we compute the area under the learning curve using the formula \begin{equation*} \begin{aligned} ALC &= \int_0^1 s(t) d\tilde{t}(t) \\ &= \int_0^T s(t) \tilde{t}'(t) dt \\ &= \frac{1}{\log (1 + T/t_0)} \int_0^T \frac{s(t)}{ t + t_0} dt. \\ \end{aligned} \end{equation*} We see that $s(t)$ is weighted by $1/(t + t_0)$, giving stronger importance to predictions made at the beginning of the learning curve.
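Since $s(t)$ is a step function, the integral above has a closed form on each flat segment: $\int_{t_k}^{t_{k+1}} \frac{dt}{t + t_0} = \log\frac{t_{k+1} + t_0}{t_k + t_0}$. A minimal sketch (the function name and signature are ours, not the scoring program's):

```python
import numpy as np

def alc(times, naucs, T=1200.0, t0=60.0):
    """Area under a step learning curve after the log time transform.

    times: increasing prediction timestamps in seconds, times[0] >= 0
    naucs: NAUC of the prediction at each timestamp; the curve holds
           each value until the next timestamp (step function).
    """
    edges = np.concatenate([np.asarray(times, float), [T]])
    naucs = np.asarray(naucs, float)
    # integral of s(t)/(t + t0) over each flat segment, in closed form
    segments = naucs * np.log((edges[1:] + t0) / (edges[:-1] + t0))
    return segments.sum() / np.log(1 + T / t0)

# a constant curve s(t) = 0.5 from t = 0 gives ALC = 0.5
print(alc([0.0], [0.5]))
```

Note that the same $NAUC$ reached at $t = 0$ yields a higher ALC than when it is reached late in the budget, which is exactly the any-time learning incentive.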

This gives the evaluation score used for one task. Later, once the ALC score has been computed for every task, the final score is the average rank over all tasks. It should be emphasized that multi-class classification metrics are not used, i.e., each class is scored independently.
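As an illustration of the final ranking (all numbers below are made up, and ties are ignored for simplicity):

```python
import numpy as np

# ALC scores of 3 participants (rows) on 4 tasks (columns) -- made-up values
alc_scores = np.array([[0.62, 0.40, 0.55, 0.70],
                       [0.58, 0.45, 0.60, 0.65],
                       [0.50, 0.30, 0.52, 0.60]])

# rank within each task: 1 = best (highest ALC)
ranks = alc_scores.shape[0] - alc_scores.argsort(axis=0).argsort(axis=0)

final_score = ranks.mean(axis=1)  # average rank per participant; lower is better
print(final_score)  # [1.5 1.5 3. ]
```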

Let's see in the following how the scores are computed.

In [ ]:
# calculate scores
from score import autodl_auc, accuracy
from libscores import read_array
solution_file = join(data_dir, data_name + '/' + data_name + '.solution')
solution = read_array(solution_file)

acc = accuracy(solution, prediction)  # note: accuracy is not the evaluation metric of the challenge
current_bac = autodl_auc(solution, prediction)
# print('Number of test examples: %d \n\t\t Solution \t\t\t\t\t Prediction ' % len(solution))
# [print(z) for z in zip(solution, prediction)]
print("Classification Accuracy: ", acc)
print("Normalized Area Under ROC Curve (NAUC) = {:.4f}.".format(current_bac))
print("ALC can be read from the result page as shown in the next part.")

Part 3: Test and Submission

It is important that you test your submission files before submitting them. All you have to do to make a submission is modify the file in the AutoDL_sample_code_submission/ directory, then run this test to make sure everything works fine. This is the actual program that will be run on the server to test your submission.

In [ ]:
# run local test
!python run_local_test.py -code_dir=./AutoDL_sample_code_submission -dataset_dir=AutoDL_sample_data/miniciao
In [ ]:
# result report: render the detailed results page produced by the scoring program
from IPython.core.display import display, HTML
display(HTML(open(detailed_results_page).read()))

From the learning curve we see that the predictions are only made at the beginning, then the training is stopped and no more predictions are made. This is due to several reasons:

  • the linear baseline trains the model for only one epoch by default. This is specified by the attribute self.num_epochs_we_want_to_train in the class Model. (When this number of training epochs is reached, the model sets self.done_training to True; the ingestion program then stops the whole train/predict process and the scoring program does the final evaluation);
  • the dataset miniciao is very small: it contains only 100 examples;
  • the neural network of this linear baseline is very simple; in fact it has no hidden layer at all. All these factors together make training (and testing/predicting) fast.

You are invited to change the value of self.num_epochs_we_want_to_train in the class Model and/or change the arguments (typically code_dir and dataset_dir) passed in the local test cell above to try different algorithms on different datasets, and hopefully get better performance than we did. :)

Prepare a ZIP file ready for submission

In [ ]:
# compress model to be submitted
from data_io import zipdir
submission_filename = 'mysubmission.zip'  # choose any file name for the ZIP to upload
zipdir(submission_filename, model_dir)
print("Submit this file: " + submission_filename)

Next steps

If you run the above cells successfully, congratulations! You are all set! To get a better score on the challenge, you need to design your model carefully so that it learns better and faster on different datasets.

You don't need to write everything from scratch; you can instead start from our provided baseline models. Basically, you need to write three functions (they can all be found in the baseline code):

  • preprocess_tensor_4d (optional) for preprocessing data, e.g. resizing images or converting grayscale images to RGB
  • input_function (optional) for reading batches
  • model_fn (mandatory) for defining your own model: CNN, ResNet, Inception, etc.

For instructions on writing model_fn, you are invited to consult the reference documentation linked from the challenge website.

Good luck!
