Interpreting numeric split points in H2O POJO tree-based models¶

This notebook explains how to correctly interpret the split points that appear in POJOs of H2O tree-based models.

Motivation: we have seen users parse H2O POJOs and translate the Java code into another representation (SQL statements, ...). While we do not encourage using POJOs this way, we want to clarify how to interpret the numerical values correctly.

Concept of floating point numbers in computers¶

Computers and software like H2O use a floating-point representation of real numbers. In this representation, sequences of bits (0/1) store the number with limited precision. H2O mainly uses 32-bit and 64-bit floating-point representations.

Let's take a look at one example of a floating-point number, 25.695312, and use the 32-bit and 64-bit representations to compare the behavior.

In [247]:

import numpy as np

In [248]:

f32 = np.float32("25.695312")
f32

Out[248]:

25.695312

In [249]:

f64 = np.float64("25.695312")
f64

Out[249]:

25.695312

If we compare the two numbers, we will see they are not actually the same number:

In [250]:

f32 == f64

Out[250]:

False

When two numbers are compared, their precision is first adjusted to be the same. This typically means the lower-precision number is converted to the higher-precision representation. In this case f32 will be converted to the float64 representation. We can do the same thing explicitly:

In [251]:

np.float64(f32) == f64

Out[251]:

False

The comparison returned False because the converted number is actually different:

In [252]:

np.float64(f32)

Out[252]:

25.6953125

Notice the 7th decimal digit after the conversion.

In [253]:

np.float64(f32) - f64

Out[253]:

4.999999987376214e-07

In [254]:

np.float64(f32) > f64

Out[254]:

True
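To see where 25.6953125 comes from, the following sketch reproduces the conversion using only the Python standard library. The helper to_float32 is a name made up for this illustration (it is not part of NumPy or H2O); it rounds a 64-bit value to the nearest 32-bit float and widens it back, and the widening step is exact. Between 16 and 32, neighbouring 32-bit floats are 2**-19 apart, so the decimal 25.695312 snaps to the nearest value on that grid.

```python
import struct
from fractions import Fraction


def to_float32(x: float) -> float:
    """Hypothetical helper: round x to the nearest 32-bit float, then widen it back to 64 bits."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


f64 = 25.695312
widened = to_float32(f64)

# Exact value stored by the 32-bit representation: 3289/128 = 25.6953125
print(Fraction(widened))

# The 64-bit value is a different binary fraction, slightly below the 32-bit one
print(widened - f64)
print(widened > f64)
```

The same decimal text therefore denotes two different numbers depending on the precision it is read in, which is why the comparisons above do not report equality.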

Examining GBM POJO¶

Understanding how computers compare numbers of different precision is critical for correctly interpreting split points in tree-based POJOs. Let's now train a simple GBM model.

In [255]:

import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator

In [256]:

# Connect to a running H2O cluster (h2o.init() starts a local one if none is found)
h2o.init()

Checking whether there is an H2O instance running at http://localhost:54321 . connected.

H2O_cluster_uptime:	09 secs
H2O_cluster_timezone:	America/New_York
H2O_data_parsing_timezone:	UTC
H2O_cluster_version:	3.35.0.99999
H2O_cluster_version_age:	2 hours and 53 minutes
H2O_cluster_name:	mkurka
H2O_cluster_total_nodes:	1
H2O_cluster_free_memory:	7.094 Gb
H2O_cluster_total_cores:	16
H2O_cluster_allowed_cores:	16
H2O_cluster_status:	locked, healthy
H2O_connection_url:	http://localhost:54321
H2O_connection_proxy:	{"http": null, "https": null}
H2O_internal_security:	False
H2O_API_Extensions:	Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4
Python_version:	3.8.2 final

In [257]:

from h2o.utils.shared_utils import _locate # private function. used to find files within h2o git project directory.

df = h2o.upload_file(path=_locate("smalldata/logreg/prostate.csv"))

Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%

In [258]:

# Remove ID from training frame
train = df.drop("ID")

In [259]:

# For VOL & GLEASON, a zero really means "missing"
vol = train['VOL']
vol[vol == 0] = None
gle = train['GLEASON']
gle[gle == 0] = None

In [260]:

# Convert CAPSULE to a logical factor
train['CAPSULE'] = train['CAPSULE'].asfactor()

In [261]:

# Run GBM
my_gbm = H2OGradientBoostingEstimator(ntrees=1, seed=1234)

my_gbm.train(y="CAPSULE", training_frame=train)

gbm Model Build progress: |██████████████████████████████████████████████████████| (done) 100%
Model Details
=============
H2OGradientBoostingEstimator :  Gradient Boosting Machine
Model Key:  GBM_model_python_1636137917875_1


Model Summary:

		number_of_trees	number_of_internal_trees	model_size_in_bytes	min_depth	max_depth	mean_depth	min_leaves	max_leaves	mean_leaves
0		1.0	1.0	360.0	5.0	5.0	5.0	24.0	24.0	24.0


ModelMetricsBinomial: gbm
** Reported on train data. **

MSE: 0.22019689456071448
RMSE: 0.4692514193486414
LogLoss: 0.6319753099030868
Mean Per-Class Error: 0.20582476749877632
AUC: 0.8816907085888687
AUCPR: 0.8515845076604194
Gini: 0.7633814171777373

Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.4008312811161997:

		0	1	Error	Rate
0	0	176.0	51.0	0.2247	(51.0/227.0)
1	1	29.0	124.0	0.1895	(29.0/153.0)
2	Total	205.0	175.0	0.2105	(80.0/380.0)

Maximum Metrics: Maximum metrics at their respective thresholds

	metric	threshold	value	idx
0	max f1	0.400831	0.756098	10.0
1	max f2	0.379840	0.831486	16.0
2	max f0point5	0.429293	0.783866	6.0
3	max accuracy	0.429293	0.807895	6.0
4	max precision	0.463528	1.000000	0.0
5	max recall	0.372774	1.000000	18.0
6	max specificity	0.463528	1.000000	0.0
7	max absolute_mcc	0.412406	0.595958	7.0
8	max min_per_class_accuracy	0.404036	0.777778	9.0
9	max mean_per_class_accuracy	0.404036	0.794175	9.0
10	max tns	0.463528	227.000000	0.0
11	max fns	0.463528	121.000000	0.0
12	max fps	0.363105	227.000000	19.0
13	max tps	0.372774	153.000000	18.0
14	max tnr	0.463528	1.000000	0.0
15	max fnr	0.463528	0.790850	0.0
16	max fpr	0.363105	1.000000	19.0
17	max tpr	0.372774	1.000000	18.0

Gains/Lift Table: Avg response rate: 40.26 %, avg score: 40.30 %

	group	cumulative_data_fraction	lower_threshold	lift	cumulative_lift	response_rate	score	cumulative_response_rate	cumulative_score	capture_rate	cumulative_capture_rate	gain	cumulative_gain	kolmogorov_smirnov
0	1	0.084211	0.463528	2.483660	2.483660	1.000000	0.463528	1.000000	0.463528	0.209150	0.209150	148.366013	148.366013	0.209150
1	2	0.128947	0.457452	2.337562	2.432973	0.941176	0.457452	0.979592	0.461420	0.104575	0.313725	133.756248	143.297319	0.309320
2	3	0.157895	0.444791	2.032086	2.359477	0.818182	0.444791	0.950000	0.458372	0.058824	0.372549	103.208556	135.947712	0.359333
3	4	0.218421	0.432692	1.835749	2.214348	0.739130	0.436693	0.891566	0.452364	0.111111	0.483660	83.574879	121.434759	0.444013
4	5	0.300000	0.429622	1.682479	2.069717	0.677419	0.430389	0.833333	0.446389	0.137255	0.620915	68.247944	106.971678	0.537215
5	6	0.426316	0.404036	1.241830	1.824417	0.500000	0.412442	0.734568	0.436330	0.156863	0.777778	24.183007	82.441701	0.588350
6	7	0.521053	0.392412	0.827887	1.643230	0.333333	0.395728	0.661616	0.428948	0.078431	0.856209	-17.211329	64.322968	0.561055
7	8	0.660526	0.383949	0.562338	1.414994	0.226415	0.385145	0.569721	0.419699	0.078431	0.934641	-43.766186	41.499362	0.458870
8	9	0.763158	0.379840	0.445785	1.284652	0.179487	0.380533	0.517241	0.414432	0.045752	0.980392	-55.421485	28.465179	0.363652
9	10	0.813158	0.373285	0.261438	1.221736	0.105263	0.373285	0.491909	0.411902	0.013072	0.993464	-73.856209	22.173573	0.301834
10	11	1.000000	0.363105	0.034981	1.000000	0.014085	0.364467	0.402632	0.403039	0.006536	1.000000	-96.501887	0.000000	0.000000


Scoring History:

		timestamp	duration	number_of_trees	training_rmse	training_logloss	training_auc	training_pr_auc	training_lift	training_classification_error
0		2021-11-05 14:45:28	0.022 sec	0.0	0.490428	0.674064	0.500000	0.402632	1.00000	0.597368
1		2021-11-05 14:45:28	0.182 sec	1.0	0.469251	0.631975	0.881691	0.851585	2.48366	0.210526

Variable Importances:

	variable	relative_importance	scaled_importance	percentage
0	GLEASON	20.125320	1.000000	0.496931
1	PSA	8.138151	0.404374	0.200946
2	VOL	6.416112	0.318808	0.158426
3	DPROS	5.819649	0.289170	0.143698
4	AGE	0.000000	0.000000	0.000000
5	RACE	0.000000	0.000000	0.000000
6	DCAPS	0.000000	0.000000	0.000000

Out[261]:

In [262]:

# Get the POJO
my_gbm.download_pojo()

/*
  Licensed under the Apache License, Version 2.0
    http://www.apache.org/licenses/LICENSE-2.0.html

  AUTOGENERATED BY H2O at 2021-11-05T14:45:28.555-04:00
  3.35.0.99999
  
  Standalone prediction code with sample test data for GBMModel named GBM_model_python_1636137917875_1

  How to download, compile and execute:
      mkdir tmpdir
      cd tmpdir
      curl http://192.168.86.229:54321/3/h2o-genmodel.jar > h2o-genmodel.jar
      curl http://192.168.86.229:54321/3/Models.java/GBM_model_python_1636137917875_1 > GBM_model_python_1636137917875_1.java
      javac -cp h2o-genmodel.jar -J-Xmx2g -J-XX:MaxPermSize=128m GBM_model_python_1636137917875_1.java

     (Note:  Try java argument -XX:+PrintCompilation to show runtime JIT compiler behavior.)
*/
import java.util.Map;
import hex.genmodel.GenModel;
import hex.genmodel.annotations.ModelPojo;

@ModelPojo(name="GBM_model_python_1636137917875_1", algorithm="gbm")
public class GBM_model_python_1636137917875_1 extends GenModel {
  public hex.ModelCategory getModelCategory() { return hex.ModelCategory.Binomial; }

  public boolean isSupervised() { return true; }
  public int nfeatures() { return 7; }
  public int nclasses() { return 2; }

  // Names of columns used by model.
  public static final String[] NAMES = NamesHolder_GBM_model_python_1636137917875_1.VALUES;
  // Number of output classes included in training data response column.
  public static final int NCLASSES = 2;

  // Column domains. The last array contains domain of response column.
  public static final String[][] DOMAINS = new String[][] {
    /* AGE */ null,
    /* RACE */ null,
    /* DPROS */ null,
    /* DCAPS */ null,
    /* PSA */ null,
    /* VOL */ null,
    /* GLEASON */ null,
    /* CAPSULE */ GBM_model_python_1636137917875_1_ColInfo_7.VALUES
  };
  // Prior class distribution
  public static final double[] PRIOR_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684};
  // Class distribution used for model building
  public static final double[] MODEL_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684};

  public GBM_model_python_1636137917875_1() { super(NAMES,DOMAINS,"CAPSULE"); }
  public String getUUID() { return Long.toString(4988040225257658559L); }

  // Pass in data in a double[], pre-aligned to the Model's requirements.
  // Jam predictions into the preds[] array; preds[0] is reserved for the
  // main prediction (class for classifiers or value for regression),
  // and remaining columns hold a probability distribution for classifiers.
  public final double[] score0( double[] data, double[] preds ) {
    java.util.Arrays.fill(preds,0);
    GBM_model_python_1636137917875_1_Forest_0.score0(data,preds);
    preds[2] = preds[1] + -0.3945120960889672;
    preds[2] = 1./(1. + Math.min(1e19, Math.exp(-(preds[2]))));
    preds[1] = 1.0-preds[2];
    preds[0] = hex.genmodel.GenModel.getPrediction(preds, PRIOR_CLASS_DISTRIB, data, 0.4008312811161997);
    return preds;
  }
}
// The class representing training column names
class NamesHolder_GBM_model_python_1636137917875_1 implements java.io.Serializable {
  public static final String[] VALUES = new String[7];
  static {
    NamesHolder_GBM_model_python_1636137917875_1_0.fill(VALUES);
  }
  static final class NamesHolder_GBM_model_python_1636137917875_1_0 implements java.io.Serializable {
    static final void fill(String[] sa) {
      sa[0] = "AGE";
      sa[1] = "RACE";
      sa[2] = "DPROS";
      sa[3] = "DCAPS";
      sa[4] = "PSA";
      sa[5] = "VOL";
      sa[6] = "GLEASON";
    }
  }
}
// The class representing column CAPSULE
class GBM_model_python_1636137917875_1_ColInfo_7 implements java.io.Serializable {
  public static final String[] VALUES = new String[2];
  static {
    GBM_model_python_1636137917875_1_ColInfo_7_0.fill(VALUES);
  }
  static final class GBM_model_python_1636137917875_1_ColInfo_7_0 implements java.io.Serializable {
    static final void fill(String[] sa) {
      sa[0] = "0";
      sa[1] = "1";
    }
  }
}

class GBM_model_python_1636137917875_1_Forest_0 {
  public static void score0(double[] fdata, double[] preds) {
    preds[1] += GBM_model_python_1636137917875_1_Tree_0_class_0.score0(fdata);
  }
}
class GBM_model_python_1636137917875_1_Tree_0_class_0 {
  static final double score0(double[] data) {
    double pred =      (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 6.5f ? 
         (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5f ? 
             (data[6 /* GLEASON */] < 5.5f ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.44375f ? 
                    -0.16740088f : 
                     (data[5 /* VOL */] < 35.319237f ? 
                        -0.0842475f : 
                        -0.16740088f)) : 
                 (data[2 /* DPROS */] < 1.5f ? 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 25.695312f ? 
                        -0.09571693f : 
                        -0.16740088f) : 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 23.359375f ? 
                        -0.07830798f : 
                        0.005835324f))) : 
             (Double.isNaN(data[4]) || data[4 /* PSA */] < 6.645508f ? 
                 (data[5 /* VOL */] < 4.4484377f ? 
                    -0.007490538f : 
                     (data[4 /* PSA */] < 3.6390624f ? 
                        -0.16740088f : 
                        -0.1258242f)) : 
                 (data[6 /* GLEASON */] < 5.5f ? 
                    -0.05400991f : 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.1625f ? 
                        0.17277204f : 
                        0.040482566f)))) : 
         (data[4 /* PSA */] < 14.730078f ? 
             (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5f ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 7.1585937f ? 
                     (data[4 /* PSA */] < 7.995f ? 
                        0.10977705f : 
                        -0.039472606f) : 
                    -0.12363595f) : 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 17.264843f ? 
                     (Double.isNaN(data[4]) || data[4 /* PSA */] < 8.267187f ? 
                        0.1524198f : 
                        -0.042670812f) : 
                    0.22390914f)) : 
             (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 7.5f ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 24.657812f ? 
                     (data[4 /* PSA */] < 18.55625f ? 
                        0.24836601f : 
                        0.11424766f) : 
                    0.01078493f) : 
                 (data[4 /* PSA */] < 22.60625f ? 
                    0.12363594f : 
                    0.24836601f))));
    return pred;
  } // constant pool size = 94B, number of visited nodes = 23, static init size = 0B
}

Please take a close look at the POJO code. You should see statements like this one:

Double.isNaN(data[5]) || data[5 /* VOL */] < 25.695312f ? -0.09571693f : -0.16740088f

This code represents one split decision in a GBM tree. data represents a single input row. The split decision looks at column VOL, stored as element 5 of the data array, to decide whether the observation should go to the left or the right sub-tree. A missing value (NaN) is sent to the left sub-tree by the Double.isNaN check.

It is important to notice that data is defined as a double array:

double[] data

This means data is represented by 64-bit floating-point numbers. The split point itself, however, is written in 32-bit precision. In Java code this is expressed by the f suffix on the number, e.g. 25.695312f.

This means we have the same scenario as outlined at the beginning of this notebook: we are comparing numbers of two different precisions and we need to pay attention to how the numbers are interpreted.
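The single split quoted above can be mirrored in plain Python as a rough sketch. The function name score_vol_split and the helper to_float32 are assumptions introduced here for illustration; they are not part of H2O or of the generated POJO. The sketch keeps the two details that matter when translating: a NaN goes to the left branch, and every literal with an f suffix is rounded to 32 bits before it is compared with the 64-bit input.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Hypothetical helper: round x to the nearest 32-bit float, widened back to 64 bits."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def score_vol_split(data):
    """Mirror of: Double.isNaN(data[5]) || data[5] < 25.695312f ? -0.09571693f : -0.16740088f"""
    vol = data[5]
    # NaN (missing) takes the left branch, as does any value below the 32-bit threshold
    if math.isnan(vol) or vol < to_float32(25.695312):
        return to_float32(-0.09571693)  # left leaf
    return to_float32(-0.16740088)      # right leaf


row = [0.0, 0.0, 0.0, 0.0, 0.0, 25.695312]
missing = [0.0, 0.0, 0.0, 0.0, 0.0, float("nan")]

print(score_vol_split(row))      # left leaf: 25.695312 is below the widened threshold
print(score_vol_split(missing))  # left leaf: NaN is routed left
```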

In [263]:

data = np.array([0, 0, 0, 0, 0, np.float64(25.695312)])
data[5]

Out[263]:

25.695312

The Java comparison rewritten in Python would look like this:

In [264]:

data[5] < np.float32(25.695312)

Out[264]:

True

This means that the observation represented by the array data should go to the left sub-tree of the current node. If we ignored the fact that the split point uses 32-bit precision and treated it as a 64-bit number, the comparison would be false and we would wrongly send the observation to the right sub-tree.

In [265]:

data[5] < np.float64(25.695312)

Out[265]:

False
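For anyone translating POJO thresholds into another system (for example SQL), one possible approach is to convert each literal into the exact 64-bit value that Java compares against, and emit that value instead of the text found in the POJO. The sketch below is illustrative only: pojo_literal_to_double is a hypothetical helper, it handles plain decimal literals with an optional f/F suffix and nothing else, and it rounds through a 64-bit float first, which could in principle differ from Java's direct decimal-to-float parsing in rare borderline cases.

```python
import struct


def pojo_literal_to_double(literal: str) -> float:
    """Hypothetical helper: return the 64-bit value a POJO numeric literal stands for."""
    text = literal.strip()
    if text.endswith(("f", "F")):
        # 32-bit literal: round to float32, then widen (the widening is exact)
        return struct.unpack("<f", struct.pack("<f", float(text[:-1])))[0]
    # No suffix: Java treats the literal as a 64-bit double already
    return float(text)


threshold = pojo_literal_to_double("25.695312f")
print(repr(threshold))

# Using the converted threshold reproduces the POJO's decision for this row
print(25.695312 < threshold)
```

Writing the converted value into the target representation keeps the comparison in 64-bit arithmetic while preserving the branch the POJO would take.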

Expert options¶

Forcing split point in POJO to be written in 64-bit precision¶

H2O allows users to modify the POJO output by setting the property sys.ai.h2o.java.output.doubles. Setting this property to true causes the POJO generator to output split points in 64-bit precision (doubles) instead of the default 32-bit precision.

We can set this property even on a running H2O instance by invoking a rapids expression.

In [266]:

h2o.rapids("(setproperty \"{}\" \"{}\")".format("sys.ai.h2o.java.output.doubles", "true"))["string"]

Out[266]:

'Old values of sys.ai.h2o.java.output.doubles (per node): null'

In [267]:

my_gbm.download_pojo()

/*
  Licensed under the Apache License, Version 2.0
    http://www.apache.org/licenses/LICENSE-2.0.html

  AUTOGENERATED BY H2O at 2021-11-05T14:45:28.619-04:00
  3.35.0.99999
  
  Standalone prediction code with sample test data for GBMModel named GBM_model_python_1636137917875_1

  How to download, compile and execute:
      mkdir tmpdir
      cd tmpdir
      curl http://192.168.86.229:54321/3/h2o-genmodel.jar > h2o-genmodel.jar
      curl http://192.168.86.229:54321/3/Models.java/GBM_model_python_1636137917875_1 > GBM_model_python_1636137917875_1.java
      javac -cp h2o-genmodel.jar -J-Xmx2g -J-XX:MaxPermSize=128m GBM_model_python_1636137917875_1.java

     (Note:  Try java argument -XX:+PrintCompilation to show runtime JIT compiler behavior.)
*/
import java.util.Map;
import hex.genmodel.GenModel;
import hex.genmodel.annotations.ModelPojo;

@ModelPojo(name="GBM_model_python_1636137917875_1", algorithm="gbm")
public class GBM_model_python_1636137917875_1 extends GenModel {
  public hex.ModelCategory getModelCategory() { return hex.ModelCategory.Binomial; }

  public boolean isSupervised() { return true; }
  public int nfeatures() { return 7; }
  public int nclasses() { return 2; }

  // Names of columns used by model.
  public static final String[] NAMES = NamesHolder_GBM_model_python_1636137917875_1.VALUES;
  // Number of output classes included in training data response column.
  public static final int NCLASSES = 2;

  // Column domains. The last array contains domain of response column.
  public static final String[][] DOMAINS = new String[][] {
    /* AGE */ null,
    /* RACE */ null,
    /* DPROS */ null,
    /* DCAPS */ null,
    /* PSA */ null,
    /* VOL */ null,
    /* GLEASON */ null,
    /* CAPSULE */ GBM_model_python_1636137917875_1_ColInfo_7.VALUES
  };
  // Prior class distribution
  public static final double[] PRIOR_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684};
  // Class distribution used for model building
  public static final double[] MODEL_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684};

  public GBM_model_python_1636137917875_1() { super(NAMES,DOMAINS,"CAPSULE"); }
  public String getUUID() { return Long.toString(4988040225257658559L); }

  // Pass in data in a double[], pre-aligned to the Model's requirements.
  // Jam predictions into the preds[] array; preds[0] is reserved for the
  // main prediction (class for classifiers or value for regression),
  // and remaining columns hold a probability distribution for classifiers.
  public final double[] score0( double[] data, double[] preds ) {
    java.util.Arrays.fill(preds,0);
    GBM_model_python_1636137917875_1_Forest_0.score0(data,preds);
    preds[2] = preds[1] + -0.3945120960889672;
    preds[2] = 1./(1. + Math.min(1e19, Math.exp(-(preds[2]))));
    preds[1] = 1.0-preds[2];
    preds[0] = hex.genmodel.GenModel.getPrediction(preds, PRIOR_CLASS_DISTRIB, data, 0.4008312811161997);
    return preds;
  }
}
// The class representing training column names
class NamesHolder_GBM_model_python_1636137917875_1 implements java.io.Serializable {
  public static final String[] VALUES = new String[7];
  static {
    NamesHolder_GBM_model_python_1636137917875_1_0.fill(VALUES);
  }
  static final class NamesHolder_GBM_model_python_1636137917875_1_0 implements java.io.Serializable {
    static final void fill(String[] sa) {
      sa[0] = "AGE";
      sa[1] = "RACE";
      sa[2] = "DPROS";
      sa[3] = "DCAPS";
      sa[4] = "PSA";
      sa[5] = "VOL";
      sa[6] = "GLEASON";
    }
  }
}
// The class representing column CAPSULE
class GBM_model_python_1636137917875_1_ColInfo_7 implements java.io.Serializable {
  public static final String[] VALUES = new String[2];
  static {
    GBM_model_python_1636137917875_1_ColInfo_7_0.fill(VALUES);
  }
  static final class GBM_model_python_1636137917875_1_ColInfo_7_0 implements java.io.Serializable {
    static final void fill(String[] sa) {
      sa[0] = "0";
      sa[1] = "1";
    }
  }
}

class GBM_model_python_1636137917875_1_Forest_0 {
  public static void score0(double[] fdata, double[] preds) {
    preds[1] += GBM_model_python_1636137917875_1_Tree_0_class_0.score0(fdata);
  }
}
class GBM_model_python_1636137917875_1_Tree_0_class_0 {
  static final double score0(double[] data) {
    double pred =      (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 6.5 ? 
         (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5 ? 
             (data[6 /* GLEASON */] < 5.5 ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.443750381469727 ? 
                    -0.16740088164806366 : 
                     (data[5 /* VOL */] < 35.319236755371094 ? 
                        -0.08424749970436096 : 
                        -0.16740088164806366)) : 
                 (data[2 /* DPROS */] < 1.5 ? 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 25.6953125 ? 
                        -0.0957169309258461 : 
                        -0.16740088164806366) : 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 23.359375 ? 
                        -0.07830797880887985 : 
                        0.005835324060171843))) : 
             (Double.isNaN(data[4]) || data[4 /* PSA */] < 6.6455078125 ? 
                 (data[5 /* VOL */] < 4.448437690734863 ? 
                    -0.007490538060665131 : 
                     (data[4 /* PSA */] < 3.6390624046325684 ? 
                        -0.16740088164806366 : 
                        -0.1258241981267929)) : 
                 (data[6 /* GLEASON */] < 5.5 ? 
                    -0.05400991067290306 : 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.162500381469727 ? 
                        0.17277203500270844 : 
                        0.04048256576061249)))) : 
         (data[4 /* PSA */] < 14.730077743530273 ? 
             (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5 ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 7.158593654632568 ? 
                     (data[4 /* PSA */] < 7.994999885559082 ? 
                        0.1097770482301712 : 
                        -0.03947260603308678) : 
                    -0.12363594770431519) : 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 17.264842987060547 ? 
                     (Double.isNaN(data[4]) || data[4 /* PSA */] < 8.267187118530273 ? 
                        0.1524198055267334 : 
                        -0.04267081245779991) : 
                    0.2239091396331787)) : 
             (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 7.5 ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 24.657812118530273 ? 
                     (data[4 /* PSA */] < 18.556249618530273 ? 
                        0.24836601316928864 : 
                        0.11424765735864639) : 
                    0.010784929618239403) : 
                 (data[4 /* PSA */] < 22.606250762939453 ? 
                    0.12363594025373459 : 
                    0.24836601316928864))));
    return pred;
  } // constant pool size = 94B, number of visited nodes = 23, static init size = 0B
}

In the modified POJO output you can now see that the original split is coded as:

Double.isNaN(data[5]) || data[5 /* VOL */] < 25.6953125 ? -0.0957169309258461 : -0.16740088164806366

Notice the last decimal place and observe that there is no longer an f suffix at the end of the number. Compare it to the original version:

Double.isNaN(data[5]) || data[5 /* VOL */] < 25.695312f ? -0.09571693f : -0.16740088f

The 64-bit output might be more natural for users trying to understand what the POJO is doing when it decides how a given observation should traverse the tree.
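As a sanity check, the two spellings of the split quoted above can be compared numerically. The sketch below, which uses a hypothetical to_float32 helper and a deliberately simple regular expression that only matches numbers containing a decimal point, widens each 32-bit literal from the default POJO and compares it with the corresponding literal from the 64-bit POJO. For this split, the values should agree, suggesting that the property changes how the numbers are printed rather than the model itself.

```python
import re
import struct


def to_float32(x: float) -> float:
    """Hypothetical helper: round x to the nearest 32-bit float, widened back to 64 bits."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Matches decimal numbers such as 25.695312 or -0.0957; skips array indices like data[5]
NUMBER = re.compile(r"-?\d+\.\d+")

line_32 = "Double.isNaN(data[5]) || data[5 /* VOL */] < 25.695312f ? -0.09571693f : -0.16740088f"
line_64 = "Double.isNaN(data[5]) || data[5 /* VOL */] < 25.6953125 ? -0.0957169309258461 : -0.16740088164806366"

widened = [to_float32(float(tok)) for tok in NUMBER.findall(line_32)]
doubles = [float(tok) for tok in NUMBER.findall(line_64)]

for w, d in zip(widened, doubles):
    print(w, d, w == d)
```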

Convert existing MOJO into POJO with 64-bit precision number representation¶

Suppose we already have a MOJO model that was created by an older H2O version and we want to see what the POJO would look like with numbers represented in 64 bits.

For this use case, H2O provides a conversion tool, MojoConvertTool, as part of h2o.jar.

In [268]:

mojo_path = my_gbm.download_mojo()
mojo_path

Out[268]:

'/Users/mkurka/git/h2o/h2o-3/GBM_model_python_1636137917875_1.zip'

In [269]:

# Find h2o.jar (this is using internal functions)
from h2o.backend import H2OLocalServer
h2o_jar = H2OLocalServer()._find_jar()

In [270]:

# Invoke MojoConvertTool without arguments to print out usage instructions
import subprocess
subprocess.call(["java", "-cp", h2o_jar, "water.tools.MojoConvertTool"], stderr=subprocess.STDOUT, shell=False)

java -cp h2o.jar water.tools.MojoConvertTool source_mojo.zip target_pojo.java

Out[270]:

In [271]:

# Add path to MOJO file and write output to "pojo.java"
subprocess.call(["java", "-cp", h2o_jar, "water.tools.MojoConvertTool", mojo_path, "pojo.java"], stderr=subprocess.STDOUT, shell=False)

Starting local H2O instance to facilitate MOJO to POJO conversion.

14:45:29.416 [main] INFO  hex.tree.xgboost.util.NativeLibrary - Loaded library from lib/osx_64/libxgboost4j_minimal.dylib (/var/folders/v1/fkjmcbkd11v2mrm4dm6345ym0000gn/T/libxgboost4j_minimal6279070988842798503.dylib)
11-05 14:45:29.543 127.0.0.1:54321       79164        main  INFO water.default: ----- H2O started  -----
11-05 14:45:29.544 127.0.0.1:54321       79164        main  INFO water.default: Build git branch: master
11-05 14:45:29.544 127.0.0.1:54321       79164        main  INFO water.default: Build git hash: b9ba1af5f07c6dbc6369e41113ea43947109e054
11-05 14:45:29.544 127.0.0.1:54321       79164        main  INFO water.default: Build git describe: jenkins-master-5625-7-gb9ba1af5f0
11-05 14:45:29.544 127.0.0.1:54321       79164        main  INFO water.default: Build project version: 3.35.0.99999
11-05 14:45:29.544 127.0.0.1:54321       79164        main  INFO water.default: Build age: 2 hours and 53 minutes
11-05 14:45:29.545 127.0.0.1:54321       79164        main  INFO water.default: Built by: 'mkurka'
11-05 14:45:29.545 127.0.0.1:54321       79164        main  INFO water.default: Built on: '2021-11-05 11:51:34'
11-05 14:45:29.545 127.0.0.1:54321       79164        main  INFO water.default: Found H2O Core extensions: [XGBoost, KrbStandalone]
11-05 14:45:29.545 127.0.0.1:54321       79164        main  INFO water.default: Processed H2O arguments: [-disable_web, -ip, localhost, -disable_net]
11-05 14:45:29.545 127.0.0.1:54321       79164        main  INFO water.default: Java availableProcessors: 16
11-05 14:45:29.545 127.0.0.1:54321       79164        main  INFO water.default: Java heap totalMemory: 491.0 MB
11-05 14:45:29.546 127.0.0.1:54321       79164        main  INFO water.default: Java heap maxMemory: 7.11 GB
11-05 14:45:29.546 127.0.0.1:54321       79164        main  INFO water.default: Java version: Java 1.8.0_311 (from Oracle Corporation)
11-05 14:45:29.546 127.0.0.1:54321       79164        main  INFO water.default: JVM launch parameters: []
11-05 14:45:29.546 127.0.0.1:54321       79164        main  INFO water.default: JVM process id: 79164@michals-mbp.lan
11-05 14:45:29.546 127.0.0.1:54321       79164        main  INFO water.default: OS version: Mac OS X 10.16 (x86_64)
11-05 14:45:29.547 127.0.0.1:54321       79164        main  INFO water.default: Machine physical memory: 32.00 GB
11-05 14:45:29.548 127.0.0.1:54321       79164        main  INFO water.default: Machine locale: en_US
11-05 14:45:29.549 127.0.0.1:54321       79164        main  INFO water.default: X-h2o-cluster-id: 1636137928927
11-05 14:45:29.549 127.0.0.1:54321       79164        main  INFO water.default: User name: 'mkurka'
11-05 14:45:29.549 127.0.0.1:54321       79164        main  INFO water.default: IPv6 stack selected: false
11-05 14:45:29.549 127.0.0.1:54321       79164        main  INFO water.default: H2O node running in unencrypted mode.
11-05 14:45:30.081 127.0.0.1:54321       79164        main  INFO water.default: Kerberos not configured
11-05 14:45:30.081 127.0.0.1:54321       79164        main  INFO water.default: Log dir: '/tmp/h2o-mkurka/h2ologs'
11-05 14:45:30.081 127.0.0.1:54321       79164        main  INFO water.default: Cur dir: '/Users/mkurka/git/h2o/h2o-3'
11-05 14:45:30.087 127.0.0.1:54321       79164        main  INFO water.default: Subsystem for distributed import from HTTP/HTTPS successfully initialized
11-05 14:45:30.088 127.0.0.1:54321       79164        main  INFO water.default: HDFS subsystem successfully initialized
11-05 14:45:30.090 127.0.0.1:54321       79164        main  INFO water.default: S3 subsystem successfully initialized
11-05 14:45:30.102 127.0.0.1:54321       79164        main  INFO water.default: GCS subsystem successfully initialized
11-05 14:45:30.103 127.0.0.1:54321       79164        main  INFO water.default: Flow dir: '/Users/mkurka/h2oflows'
11-05 14:45:30.108 127.0.0.1:54321       79164        main  INFO water.default: Cloud of size 1 formed [localhost/127.0.0.1:54321]
11-05 14:45:30.116 127.0.0.1:54321       79164        main  INFO water.default: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV]
11-05 14:45:30.117 127.0.0.1:54321       79164        main  INFO water.default: XGBoost extension initialized
11-05 14:45:30.117 127.0.0.1:54321       79164        main  INFO water.default: KrbStandalone extension initialized
11-05 14:45:30.117 127.0.0.1:54321       79164        main  INFO water.default: Registered 2 core extensions in: 377ms
11-05 14:45:30.117 127.0.0.1:54321       79164        main  INFO water.default: Registered H2O core extensions: [XGBoost, KrbStandalone]
11-05 14:45:30.305 127.0.0.1:54321       79164        main  INFO hex.tree.xgboost.XGBoostExtension: Found XGBoost backend with library: xgboost4j_minimal
11-05 14:45:30.306 127.0.0.1:54321       79164        main  WARN hex.tree.xgboost.XGBoostExtension: Your system supports only minimal version of XGBoost (no GPUs, no multithreading)!
11-05 14:45:30.406 127.0.0.1:54321       79164        main  INFO water.default: Registered: 257 REST APIs in: 289ms
11-05 14:45:30.406 127.0.0.1:54321       79164        main  INFO water.default: Registered REST API extensions: [Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4]
11-05 14:45:30.506 127.0.0.1:54321       79164        main  INFO water.default: Registered: 311 schemas in 100ms
11-05 14:45:30.506 127.0.0.1:54321       79164        main  INFO water.default: Locking cloud to new members, because H2O is started in a single node configuration.

Converting /Users/mkurka/git/h2o/h2o-3/GBM_model_python_1636137917875_1.zip to pojo.java...
11-05 14:45:30.642 127.0.0.1:54321       79164     FJ-1-15  INFO water.default: Starting model Generic_model_1636137928927_1
11-05 14:45:30.747 127.0.0.1:54321       79164     FJ-1-15  INFO water.default: Completing model Generic_model_1636137928927_1
DONE

Out[271]:

In [272]:

# Display the content of the POJO
with open('pojo.java', 'r') as f:
    print(f.read())

/*
  Licensed under the Apache License, Version 2.0
    http://www.apache.org/licenses/LICENSE-2.0.html

  AUTOGENERATED BY H2O at 2021-11-05T14:45:30.759-04:00
  3.35.0.99999
  
  Standalone prediction code with sample test data for GBMModel named Generic_model_1636137928927_1

  How to download, compile and execute:
      mkdir tmpdir
      cd tmpdir
      curl http:/localhost/127.0.0.1:54321/3/h2o-genmodel.jar > h2o-genmodel.jar
      curl http:/localhost/127.0.0.1:54321/3/Models.java/Generic_model_1636137928927_1 > Generic_model_1636137928927_1.java
      javac -cp h2o-genmodel.jar -J-Xmx2g -J-XX:MaxPermSize=128m Generic_model_1636137928927_1.java

     (Note:  Try java argument -XX:+PrintCompilation to show runtime JIT compiler behavior.)
*/
import java.util.Map;
import hex.genmodel.GenModel;
import hex.genmodel.annotations.ModelPojo;

@ModelPojo(name="Generic_model_1636137928927_1", algorithm="gbm")
public class Generic_model_1636137928927_1 extends GenModel {
  public hex.ModelCategory getModelCategory() { return hex.ModelCategory.Binomial; }

  public boolean isSupervised() { return true; }
  public int nfeatures() { return 7; }
  public int nclasses() { return 2; }

  // Names of columns used by model.
  public static final String[] NAMES = NamesHolder_Generic_model_1636137928927_1.VALUES;
  // Number of output classes included in training data response column.
  public static final int NCLASSES = 2;

  // Column domains. The last array contains domain of response column.
  public static final String[][] DOMAINS = new String[][] {
    /* AGE */ null,
    /* RACE */ null,
    /* DPROS */ null,
    /* DCAPS */ null,
    /* PSA */ null,
    /* VOL */ null,
    /* GLEASON */ null,
    /* CAPSULE */ Generic_model_1636137928927_1_ColInfo_7.VALUES
  };
  // Prior class distribution
  public static final double[] PRIOR_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684};
  // Class distribution used for model building
  public static final double[] MODEL_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684};

  public Generic_model_1636137928927_1() { super(NAMES,DOMAINS,"CAPSULE"); }
  public String getUUID() { return Long.toString(4988040225257658559L); }

  // Pass in data in a double[], pre-aligned to the Model's requirements.
  // Jam predictions into the preds[] array; preds[0] is reserved for the
  // main prediction (class for classifiers or value for regression),
  // and remaining columns hold a probability distribution for classifiers.
  public final double[] score0( double[] data, double[] preds ) {
    java.util.Arrays.fill(preds,0);
    Generic_model_1636137928927_1_Forest_0.score0(data,preds);
    preds[2] = preds[1] + -0.3945120960889672;
    preds[2] = 1./(1. + Math.min(1e19, Math.exp(-(preds[2]))));
    preds[1] = 1.0-preds[2];
    preds[0] = hex.genmodel.GenModel.getPrediction(preds, PRIOR_CLASS_DISTRIB, data, 0.4008312811161997);
    return preds;
  }
}
// The class representing training column names
class NamesHolder_Generic_model_1636137928927_1 implements java.io.Serializable {
  public static final String[] VALUES = new String[7];
  static {
    NamesHolder_Generic_model_1636137928927_1_0.fill(VALUES);
  }
  static final class NamesHolder_Generic_model_1636137928927_1_0 implements java.io.Serializable {
    static final void fill(String[] sa) {
      sa[0] = "AGE";
      sa[1] = "RACE";
      sa[2] = "DPROS";
      sa[3] = "DCAPS";
      sa[4] = "PSA";
      sa[5] = "VOL";
      sa[6] = "GLEASON";
    }
  }
}
// The class representing column CAPSULE
class Generic_model_1636137928927_1_ColInfo_7 implements java.io.Serializable {
  public static final String[] VALUES = new String[2];
  static {
    Generic_model_1636137928927_1_ColInfo_7_0.fill(VALUES);
  }
  static final class Generic_model_1636137928927_1_ColInfo_7_0 implements java.io.Serializable {
    static final void fill(String[] sa) {
      sa[0] = "0";
      sa[1] = "1";
    }
  }
}

class Generic_model_1636137928927_1_Forest_0 {
  public static void score0(double[] fdata, double[] preds) {
    preds[1] += Generic_model_1636137928927_1_Tree_0_class_0.score0(fdata);
  }
}
class Generic_model_1636137928927_1_Tree_0_class_0 {
  static final double score0(double[] data) {
    double pred =      (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 6.5f ? 
         (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5f ? 
             (data[6 /* GLEASON */] < 5.5f ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.44375f ? 
                    -0.16740088f : 
                     (data[5 /* VOL */] < 35.319237f ? 
                        -0.0842475f : 
                        -0.16740088f)) : 
                 (data[2 /* DPROS */] < 1.5f ? 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 25.695312f ? 
                        -0.09571693f : 
                        -0.16740088f) : 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 23.359375f ? 
                        -0.07830798f : 
                        0.005835324f))) : 
             (Double.isNaN(data[4]) || data[4 /* PSA */] < 6.645508f ? 
                 (data[5 /* VOL */] < 4.4484377f ? 
                    -0.007490538f : 
                     (data[4 /* PSA */] < 3.6390624f ? 
                        -0.16740088f : 
                        -0.1258242f)) : 
                 (data[6 /* GLEASON */] < 5.5f ? 
                    -0.05400991f : 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.1625f ? 
                        0.17277204f : 
                        0.040482566f)))) : 
         (data[4 /* PSA */] < 14.730078f ? 
             (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5f ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 7.1585937f ? 
                     (data[4 /* PSA */] < 7.995f ? 
                        0.10977705f : 
                        -0.039472606f) : 
                    -0.12363595f) : 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 17.264843f ? 
                     (Double.isNaN(data[4]) || data[4 /* PSA */] < 8.267187f ? 
                        0.1524198f : 
                        -0.042670812f) : 
                    0.22390914f)) : 
             (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 7.5f ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 24.657812f ? 
                     (data[4 /* PSA */] < 18.55625f ? 
                        0.24836601f : 
                        0.11424766f) : 
                    0.01078493f) : 
                 (data[4 /* PSA */] < 22.60625f ? 
                    0.12363594f : 
                    0.24836601f))));
    return pred;
  } // constant pool size = 94B, number of visited nodes = 23, static init size = 0B
}
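
In the 32-bit POJO above, each split point is a Java `float` literal (note the `f` suffix), while `data[]` holds `double` values. Java widens the literal to `double` before the comparison, just as NumPy did in the first part of this notebook. The following is a minimal sketch of what that means for the `VOL < 25.695312f` split. The `vol` value is a hypothetical input chosen for illustration; it does not come from the training data.

```python
import numpy as np

# Split point exactly as printed in the 32-bit POJO (Java literal 25.695312f).
split_f32 = np.float32("25.695312")

# The threshold the POJO effectively uses: the float widened to double.
effective_split = np.float64(split_f32)

# The threshold a naive translation would use if it read the printed
# digits directly as a 64-bit number (for example in a SQL statement).
naive_split = np.float64("25.695312")

# Hypothetical VOL value that falls between the two thresholds.
vol = 25.6953122

goes_left_pojo = vol < effective_split   # decision made by the POJO
goes_left_naive = vol < naive_split      # decision made by the naive translation

print(effective_split, goes_left_pojo, goes_left_naive)
```

For such an input the two translations take different branches of the tree, so the printed digits of a 32-bit split point should not be copied into a 64-bit comparison as they are.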

In [273]:

# Set the system property sys.ai.h2o.java.output.doubles=true so that the POJO is written with 64-bit (double) literals
subprocess.call(["java", "-Dsys.ai.h2o.java.output.doubles=true", "-cp", h2o_jar, "water.tools.MojoConvertTool", mojo_path, "pojo64.java"], stderr=subprocess.STDOUT, shell=False)

Starting local H2O instance to facilitate MOJO to POJO conversion.

14:45:31.502 [main] INFO  hex.tree.xgboost.util.NativeLibrary - Loaded library from lib/osx_64/libxgboost4j_minimal.dylib (/var/folders/v1/fkjmcbkd11v2mrm4dm6345ym0000gn/T/libxgboost4j_minimal978915340387551523.dylib)
11-05 14:45:31.628 127.0.0.1:54321       79166        main  INFO water.default: ----- H2O started  -----
11-05 14:45:31.628 127.0.0.1:54321       79166        main  INFO water.default: Build git branch: master
11-05 14:45:31.628 127.0.0.1:54321       79166        main  INFO water.default: Build git hash: b9ba1af5f07c6dbc6369e41113ea43947109e054
11-05 14:45:31.628 127.0.0.1:54321       79166        main  INFO water.default: Build git describe: jenkins-master-5625-7-gb9ba1af5f0
11-05 14:45:31.629 127.0.0.1:54321       79166        main  INFO water.default: Build project version: 3.35.0.99999
11-05 14:45:31.629 127.0.0.1:54321       79166        main  INFO water.default: Build age: 2 hours and 53 minutes
11-05 14:45:31.629 127.0.0.1:54321       79166        main  INFO water.default: Built by: 'mkurka'
11-05 14:45:31.629 127.0.0.1:54321       79166        main  INFO water.default: Built on: '2021-11-05 11:51:34'
11-05 14:45:31.629 127.0.0.1:54321       79166        main  INFO water.default: Found H2O Core extensions: [XGBoost, KrbStandalone]
11-05 14:45:31.629 127.0.0.1:54321       79166        main  INFO water.default: Processed H2O arguments: [-disable_web, -ip, localhost, -disable_net]
11-05 14:45:31.630 127.0.0.1:54321       79166        main  INFO water.default: Java availableProcessors: 16
11-05 14:45:31.630 127.0.0.1:54321       79166        main  INFO water.default: Java heap totalMemory: 491.0 MB
11-05 14:45:31.630 127.0.0.1:54321       79166        main  INFO water.default: Java heap maxMemory: 7.11 GB
11-05 14:45:31.630 127.0.0.1:54321       79166        main  INFO water.default: Java version: Java 1.8.0_311 (from Oracle Corporation)
11-05 14:45:31.630 127.0.0.1:54321       79166        main  INFO water.default: JVM launch parameters: [-Dsys.ai.h2o.java.output.doubles=true]
11-05 14:45:31.631 127.0.0.1:54321       79166        main  INFO water.default: JVM process id: 79166@michals-mbp.lan
11-05 14:45:31.631 127.0.0.1:54321       79166        main  INFO water.default: OS version: Mac OS X 10.16 (x86_64)
11-05 14:45:31.631 127.0.0.1:54321       79166        main  INFO water.default: Machine physical memory: 32.00 GB
11-05 14:45:31.631 127.0.0.1:54321       79166        main  INFO water.default: Machine locale: en_US
11-05 14:45:31.633 127.0.0.1:54321       79166        main  INFO water.default: X-h2o-cluster-id: 1636137931013
11-05 14:45:31.633 127.0.0.1:54321       79166        main  INFO water.default: User name: 'mkurka'
11-05 14:45:31.633 127.0.0.1:54321       79166        main  INFO water.default: IPv6 stack selected: false
11-05 14:45:31.633 127.0.0.1:54321       79166        main  INFO water.default: H2O node running in unencrypted mode.
11-05 14:45:32.130 127.0.0.1:54321       79166        main  INFO water.default: Kerberos not configured
11-05 14:45:32.130 127.0.0.1:54321       79166        main  INFO water.default: Log dir: '/tmp/h2o-mkurka/h2ologs'
11-05 14:45:32.130 127.0.0.1:54321       79166        main  INFO water.default: Cur dir: '/Users/mkurka/git/h2o/h2o-3'
11-05 14:45:32.136 127.0.0.1:54321       79166        main  INFO water.default: Subsystem for distributed import from HTTP/HTTPS successfully initialized
11-05 14:45:32.137 127.0.0.1:54321       79166        main  INFO water.default: HDFS subsystem successfully initialized
11-05 14:45:32.140 127.0.0.1:54321       79166        main  INFO water.default: S3 subsystem successfully initialized
11-05 14:45:32.151 127.0.0.1:54321       79166        main  INFO water.default: GCS subsystem successfully initialized
11-05 14:45:32.152 127.0.0.1:54321       79166        main  INFO water.default: Flow dir: '/Users/mkurka/h2oflows'
11-05 14:45:32.158 127.0.0.1:54321       79166        main  INFO water.default: Cloud of size 1 formed [localhost/127.0.0.1:54321]
11-05 14:45:32.164 127.0.0.1:54321       79166        main  INFO water.default: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV]
11-05 14:45:32.165 127.0.0.1:54321       79166        main  INFO water.default: XGBoost extension initialized
11-05 14:45:32.166 127.0.0.1:54321       79166        main  INFO water.default: KrbStandalone extension initialized
11-05 14:45:32.166 127.0.0.1:54321       79166        main  INFO water.default: Registered 2 core extensions in: 376ms
11-05 14:45:32.166 127.0.0.1:54321       79166        main  INFO water.default: Registered H2O core extensions: [XGBoost, KrbStandalone]
11-05 14:45:32.353 127.0.0.1:54321       79166        main  INFO hex.tree.xgboost.XGBoostExtension: Found XGBoost backend with library: xgboost4j_minimal
11-05 14:45:32.353 127.0.0.1:54321       79166        main  WARN hex.tree.xgboost.XGBoostExtension: Your system supports only minimal version of XGBoost (no GPUs, no multithreading)!
11-05 14:45:32.467 127.0.0.1:54321       79166        main  INFO water.default: Registered: 257 REST APIs in: 301ms
11-05 14:45:32.467 127.0.0.1:54321       79166        main  INFO water.default: Registered REST API extensions: [Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4]
11-05 14:45:32.564 127.0.0.1:54321       79166        main  INFO water.default: Registered: 311 schemas in 96ms
11-05 14:45:32.565 127.0.0.1:54321       79166        main  INFO water.default: Locking cloud to new members, because H2O is started in a single node configuration.

Converting /Users/mkurka/git/h2o/h2o-3/GBM_model_python_1636137917875_1.zip to pojo64.java...
11-05 14:45:32.700 127.0.0.1:54321       79166     FJ-1-15  INFO water.default: Starting model Generic_model_1636137931013_1
11-05 14:45:32.804 127.0.0.1:54321       79166     FJ-1-15  INFO water.default: Completing model Generic_model_1636137931013_1
DONE

Out[273]:

In [274]:

# Display the content of the POJO with 64-bit number representation
with open('pojo64.java', 'r') as f:
    print(f.read())

/*
  Licensed under the Apache License, Version 2.0
    http://www.apache.org/licenses/LICENSE-2.0.html

  AUTOGENERATED BY H2O at 2021-11-05T14:45:32.815-04:00
  3.35.0.99999
  
  Standalone prediction code with sample test data for GBMModel named Generic_model_1636137931013_1

  How to download, compile and execute:
      mkdir tmpdir
      cd tmpdir
      curl http:/localhost/127.0.0.1:54321/3/h2o-genmodel.jar > h2o-genmodel.jar
      curl http:/localhost/127.0.0.1:54321/3/Models.java/Generic_model_1636137931013_1 > Generic_model_1636137931013_1.java
      javac -cp h2o-genmodel.jar -J-Xmx2g -J-XX:MaxPermSize=128m Generic_model_1636137931013_1.java

     (Note:  Try java argument -XX:+PrintCompilation to show runtime JIT compiler behavior.)
*/
import java.util.Map;
import hex.genmodel.GenModel;
import hex.genmodel.annotations.ModelPojo;

@ModelPojo(name="Generic_model_1636137931013_1", algorithm="gbm")
public class Generic_model_1636137931013_1 extends GenModel {
  public hex.ModelCategory getModelCategory() { return hex.ModelCategory.Binomial; }

  public boolean isSupervised() { return true; }
  public int nfeatures() { return 7; }
  public int nclasses() { return 2; }

  // Names of columns used by model.
  public static final String[] NAMES = NamesHolder_Generic_model_1636137931013_1.VALUES;
  // Number of output classes included in training data response column.
  public static final int NCLASSES = 2;

  // Column domains. The last array contains domain of response column.
  public static final String[][] DOMAINS = new String[][] {
    /* AGE */ null,
    /* RACE */ null,
    /* DPROS */ null,
    /* DCAPS */ null,
    /* PSA */ null,
    /* VOL */ null,
    /* GLEASON */ null,
    /* CAPSULE */ Generic_model_1636137931013_1_ColInfo_7.VALUES
  };
  // Prior class distribution
  public static final double[] PRIOR_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684};
  // Class distribution used for model building
  public static final double[] MODEL_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684};

  public Generic_model_1636137931013_1() { super(NAMES,DOMAINS,"CAPSULE"); }
  public String getUUID() { return Long.toString(4988040225257658559L); }

  // Pass in data in a double[], pre-aligned to the Model's requirements.
  // Jam predictions into the preds[] array; preds[0] is reserved for the
  // main prediction (class for classifiers or value for regression),
  // and remaining columns hold a probability distribution for classifiers.
  public final double[] score0( double[] data, double[] preds ) {
    java.util.Arrays.fill(preds,0);
    Generic_model_1636137931013_1_Forest_0.score0(data,preds);
    preds[2] = preds[1] + -0.3945120960889672;
    preds[2] = 1./(1. + Math.min(1e19, Math.exp(-(preds[2]))));
    preds[1] = 1.0-preds[2];
    preds[0] = hex.genmodel.GenModel.getPrediction(preds, PRIOR_CLASS_DISTRIB, data, 0.4008312811161997);
    return preds;
  }
}
// The class representing training column names
class NamesHolder_Generic_model_1636137931013_1 implements java.io.Serializable {
  public static final String[] VALUES = new String[7];
  static {
    NamesHolder_Generic_model_1636137931013_1_0.fill(VALUES);
  }
  static final class NamesHolder_Generic_model_1636137931013_1_0 implements java.io.Serializable {
    static final void fill(String[] sa) {
      sa[0] = "AGE";
      sa[1] = "RACE";
      sa[2] = "DPROS";
      sa[3] = "DCAPS";
      sa[4] = "PSA";
      sa[5] = "VOL";
      sa[6] = "GLEASON";
    }
  }
}
// The class representing column CAPSULE
class Generic_model_1636137931013_1_ColInfo_7 implements java.io.Serializable {
  public static final String[] VALUES = new String[2];
  static {
    Generic_model_1636137931013_1_ColInfo_7_0.fill(VALUES);
  }
  static final class Generic_model_1636137931013_1_ColInfo_7_0 implements java.io.Serializable {
    static final void fill(String[] sa) {
      sa[0] = "0";
      sa[1] = "1";
    }
  }
}

class Generic_model_1636137931013_1_Forest_0 {
  public static void score0(double[] fdata, double[] preds) {
    preds[1] += Generic_model_1636137931013_1_Tree_0_class_0.score0(fdata);
  }
}
class Generic_model_1636137931013_1_Tree_0_class_0 {
  static final double score0(double[] data) {
    double pred =      (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 6.5 ? 
         (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5 ? 
             (data[6 /* GLEASON */] < 5.5 ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.443750381469727 ? 
                    -0.16740088164806366 : 
                     (data[5 /* VOL */] < 35.319236755371094 ? 
                        -0.08424749970436096 : 
                        -0.16740088164806366)) : 
                 (data[2 /* DPROS */] < 1.5 ? 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 25.6953125 ? 
                        -0.0957169309258461 : 
                        -0.16740088164806366) : 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 23.359375 ? 
                        -0.07830797880887985 : 
                        0.005835324060171843))) : 
             (Double.isNaN(data[4]) || data[4 /* PSA */] < 6.6455078125 ? 
                 (data[5 /* VOL */] < 4.448437690734863 ? 
                    -0.007490538060665131 : 
                     (data[4 /* PSA */] < 3.6390624046325684 ? 
                        -0.16740088164806366 : 
                        -0.1258241981267929)) : 
                 (data[6 /* GLEASON */] < 5.5 ? 
                    -0.05400991067290306 : 
                     (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.162500381469727 ? 
                        0.17277203500270844 : 
                        0.04048256576061249)))) : 
         (data[4 /* PSA */] < 14.730077743530273 ? 
             (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5 ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 7.158593654632568 ? 
                     (data[4 /* PSA */] < 7.994999885559082 ? 
                        0.1097770482301712 : 
                        -0.03947260603308678) : 
                    -0.12363594770431519) : 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 17.264842987060547 ? 
                     (Double.isNaN(data[4]) || data[4 /* PSA */] < 8.267187118530273 ? 
                        0.1524198055267334 : 
                        -0.04267081245779991) : 
                    0.2239091396331787)) : 
             (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 7.5 ? 
                 (Double.isNaN(data[5]) || data[5 /* VOL */] < 24.657812118530273 ? 
                     (data[4 /* PSA */] < 18.556249618530273 ? 
                        0.24836601316928864 : 
                        0.11424765735864639) : 
                    0.010784929618239403) : 
                 (data[4 /* PSA */] < 22.606250762939453 ? 
                    0.12363594025373459 : 
                    0.24836601316928864))));
    return pred;
  } // constant pool size = 94B, number of visited nodes = 23, static init size = 0B
}
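
Comparing the two POJOs, every `float` literal in the first one corresponds to a longer `double` literal in the second one (for example `25.695312f` and `25.6953125`). If only the 32-bit POJO is available, the same values can be recovered by parsing each literal as a 32-bit float and then widening it. The sketch below shows one way to do this. The regular expression and the helper name `widen_float_literals` are assumptions made for this illustration and are not part of H2O; the pattern only covers plain decimal literals like the ones printed above.

```python
import re
import numpy as np

# Matches Java float literals such as 25.695312f or -0.16740088f.
# Assumption: literals are plain decimals, optionally with an exponent.
FLOAT_LITERAL = re.compile(r"(-?\d+\.\d+(?:[eE][-+]?\d+)?)f\b")


def widen_float_literals(java_source):
    """Replace each float literal with its value widened to a double."""
    def widen(match):
        # Parse as 32-bit first, then convert to a 64-bit Python float.
        value = float(np.float32(match.group(1)))
        return repr(value)
    return FLOAT_LITERAL.sub(widen, java_source)


# One split taken from the 32-bit POJO above.
line32 = "data[5 /* VOL */] < 25.695312f ? -0.09571693f : -0.16740088f"
print(widen_float_literals(line32))
```

The rewritten line should carry the same numeric values as the corresponding line of the 64-bit POJO, although the exact digit strings printed by Python and by Java may differ in formatting.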